That's basically all it is. It's a readme file that is guaranteed to be read. So the agent doesn't spend 10 minutes trying to re-configure the toolchain because the first command it guessed didn't work.
MCP tool calls are usually sequential and therefore waste a lot of tokens. There is some research from Anthropic (I think there was also a blog post from Cloudflare) on how code sandboxes are actually a more efficient interface for LLM agents, because they are really good at writing code and at combining multiple "calls" into one piece of code. Another data point is that code is more deterministic and reliable, so you reduce LLM hallucination.
What do the calls being sequential have to do with tokens? Do you just mean that the LLM has to think every time it gets a response (as opposed to being able to compose them)?
LLMs can use CLI interfaces to compose multiple tool calls, filter the outputs etc. instead of polluting their own context with a full response they know they won't care about. Command line access ends up being cleaner than the usual MCP-and-tool-calls workflow. It's not just Anthropic, the Moltbot folks found this to be the case too.
That makes sense! The only flaw here imo is that sometimes that thinking is useful. Sub-agents for tool calls imo make a nice sort of middle ground where they can both be flexible and save context. Maybe we need some tool call composing feature, a la io_uring :)
I usually have a long running note per-project and whenever I need to context switch, I add a "Next Step: ..." line at the bottom of the doc. So I can jump right back in when I come back.
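For example (the project and details here are invented), the bottom of such a note might look like:

```
## payments-service refactor
2024-03-12: extracted RetryPolicy, tests green
2024-03-14: started moving config to env vars

Next Step: wire the max-attempts env var into RetryPolicy,
then delete the old hardcoded constant
```

The "Next Step" line is the whole trick: it's the one thing future-you actually needs.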
This is a powerful technique that has helped me a lot in the past as well, especially for projects I rarely progressed on (mostly private stuff; the work topics are more streamlined).
Nowadays in my private projects I often use a combination of the git commit messages and comments left in the code to indicate where to continue. Of course, this is not useful for work either.
For work I like to use the ticket system and a separate text file and a paper notebook each to a slightly different effect.
The text file is a log of what was done, kept per day and grouped by ticket, typically ~10 lines for a day. The notebook contains meeting notes, design thoughts, general notes etc. and is very verbose (often six or more pages per day, A4 paper), but sometimes helps to identify how/why/when a given decision was taken. The ticket contains what might also benefit others, such as technical insights, meeting summaries (derived and summarized after the meeting from the paper notebook), summaries of important (design or product) decisions etc.
I use a very similar setup. I initially used nix to manage dev tools, but have since switched to mise and can't recommend it enough https://mise.jdx.dev/
Yeah, I'm just confused why someone would go from a completely deterministic dependency management system back to a dice-rolling one, especially when LLMs now exist and all the top-tier ones are excellent at the Nix language.
Because I myself am never going to use anything else ever again, unless it's a derivative of the same idea, because it's the only one that makes sense.
This is important; I feel like a lot of people are falling into the "stop liking what I don't like" way of thinking. Further, there are a million different ways to apply an AI helper in software development. You can adjust your workflow in whatever way works best for you... or leave it as is.
You're right, though I think a lot of the pushback is due to the way companies are pushing AI usage onto employees. Not that complaining on HN will help anything...
That's literally what they are. It's a dead simple self describing JSONRPC API that you can understand if you spend 5 seconds looking at it. I don't get why people get so worked up over it as if it's some big over-engineered spec.
I can run an MCP on my local machine and connect it to an LLM FE in a browser.
I can use the GitHub MCP without installing anything on my machine at all.
I can run agents as root in a VM and give them access to things via an MCP running outside of the VM without giving them access to secrets.
It's an objectively better solution than just giving it CLIs.
All true except that CLI tools are composable and don't pollute your context when run via a script. The missing link for MCP would be a CLI utility to invoke it.
How does the agent know what CLIs/tools it has available? If there's an `mcpcli --help` that dumps the tool calls, we've just moved the problem.
The composition argument is compelling, though. Instead of CLIs, what if the agent could write code where the tools are made available as functions?
> what if the agent could write code where the tools are made available as functions?
Exactly, that would be of great help.
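A rough sketch of what that could look like, with invented tool names (`search_issues`, `get_issue`) standing in for generated bindings: the agent writes one script that composes several "calls", and only the final, filtered output ever reaches its context.

```python
import json

# Hypothetical stubs standing in for MCP tools exposed as plain functions.
# In a real "code mode" setup, the runtime would generate these bindings
# from the server's tool definitions.
def search_issues(query):
    return [{"id": 1, "title": "Login fails", "open": True},
            {"id": 2, "title": "Typo in docs", "open": False}]

def get_issue(issue_id):
    return {"id": issue_id, "comments": ["stack trace attached"]}

# One script composes several "tool calls"; intermediate results stay
# local instead of round-tripping through the model's context each time.
open_issues = [i for i in search_issues("bug") if i["open"]]
details = {i["id"]: get_issue(i["id"])["comments"] for i in open_issues}
print(json.dumps(details))
```

Two tool calls, arbitrary filtering in between, and one small JSON line of context consumed, versus a full response in context after every call.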
> If there's an `mcpcli --help` that dumps the tool calls, we've just moved the problem.
I see I worded my comment completely wrong... My bad. Indeed MCP tool definitions should probably be in context. What I dislike about MCP is that the IO immediately goes into context for the AI Agents I've seen.
Example: very early on, when Cursor had just received beta MCP support, I tried a Google Maps MCP from somewhere on the net and asked Cursor "Find me boxing gyms in Amsterdam". The MCP call then dumped a massive HATEOAS-annotated JSON response, causing Cursor to run out of context immediately. If it had been a CLI tool instead, Cursor could have wrapped it in, say, a `jq` to keep the context clean(er).
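The filtering step itself is trivial; here is the same idea in Python instead of `jq`, against a made-up miniature of a Places-style response (real ones carry far more metadata: links, geometry, photo references, ...):

```python
import json

# Invented miniature of a Places-style API response for illustration.
raw = json.dumps({
    "results": [
        {"name": "Golden Gloves Gym", "vicinity": "Amsterdam-West",
         "_links": {"self": "https://maps.example/v1/places/abc123"}},
        {"name": "Ringside Boxing", "vicinity": "De Pijp",
         "_links": {"self": "https://maps.example/v1/places/def456"}},
    ],
    "next_page_token": "opaque-token",
})

# Keep only the fields the user asked about; the links, tokens and other
# metadata never enter the model's context.
slim = [{"name": r["name"], "area": r["vicinity"]}
        for r in json.loads(raw)["results"]]
print(json.dumps(slim))
```

With a CLI tool the agent can bolt this on itself; with the MCP flows I've seen, the raw response goes straight into context.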
I mean, what was keeping Cursor from running jq there? It's just a matter of being integrated poorly, which is largely why there was a rethink of "we just made this harder on ourselves, let's accomplish this with skills instead".