I use Cursor daily, and I have worked on agents using LangChain. Maybe we are doing something wrong, but even using SOTA models, unless we explicitly say which MCP tool to call, it sometimes uses anything, while other times it can do a passable job. So now our mandate is to spell everything out to the LLM so it doesn't add a non-existent column like created_at or updated_at to our queries.
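For what it's worth, "spelling it out" can also be enforced in code rather than just in the prompt. A minimal sketch (table and column names are hypothetical, and a real check would use a proper SQL parser rather than a regex):

```python
# Hypothetical sketch: reject LLM-generated queries that reference columns
# we don't actually have, before they reach the database.
import re

KNOWN_COLUMNS = {"id", "email", "name"}  # note: no created_at / updated_at

def check_columns(sql: str) -> list[str]:
    """Return identifiers in the query that aren't real columns."""
    # Crude tokenizer for illustration only; use a real SQL parser in practice.
    tokens = set(re.findall(r"[a-z_]+", sql.lower()))
    keywords = {"select", "from", "where", "and", "or", "users", "limit"}
    return sorted(tokens - KNOWN_COLUMNS - keywords)

print(check_columns("SELECT id, email, created_at FROM users"))
# ['created_at']
```

The same allowlist can be pasted into the prompt, so the model and the guardrail agree on what exists.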
I've used every SOTA for day to day work, and at best they save some effort. They can't do everything yet
Precisely. I always find myself thinking that maybe I'm just too dumb to use these LLMs properly, but that would defeat the purpose of them being useful haha.
And I keep reading people who heap praise on AI, like the staff engineer at Google who weirdly praised a competitor's LLM. They miss one important part: AI is good for end-to-end problems that are already solved. Asking it to write a load balancer will result in a perfect solution because it has access to very well-written load balancers already.
The real moat is writing something custom, and this is where it struggles.
On a contrary note: if LLMs really are that helpful, why are QA teams still needed? Wouldn't the LLM magically write the best code?
Since LLMs have been shoved into everyone's work schedule, we're seeing more frequent outages. In 2025: two Azure outages, then an AWS outage, and last week two Snowflake outages.
Either LLMs are not the panacea that they're marketed to be or something is deeply wrong in the industry
Yes, it is both. If something is forced top-down as a productivity booster, it probably isn't one! I remember back in the day when I had to fight management to use Python for something; writing our tooling in Python genuinely gave us a productivity boost. If LLMs were really that great, we would be the ones fighting to use them.
I use Cursor on a daily basis. It is good for certain use cases and horribly bad for others. Read the below with that in mind! I am not an LLM skeptic.
It is wild that people are so confident in AI that they're not testing the code at all.
What are we doing as programmers? Reducing the typing and testing time? We still have to write the prompt in English and do the software design ourselves; otherwise AI systems write a billion lines of code just to add two numbers.
This hype machine should show tangible outputs. And before anyone says they're entitled not to share their hidden talents: then they should stop publishing articles as well.
Not to mention that juniors can now put the entire problem statement into an AI chatbot, which spits out _some_ code. Said juniors then don't understand half the code, run it, and raise the PR. They don't get a pat on the back for it, and it raises countless bugs later on. This is much worse because they don't develop skills of their own; they blindly copy from AI.
We use MCP at work. Due to a typo, the model ran absolutely random queries on our database in most cases. We had initially kept it open-ended, but after that we wrote custom tools that took a defined input, returned a defined output, and were strictly described in the prompt. Only then did it work fine.
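To make the "custom tools" idea concrete, here is a rough sketch of the shape such a tool can take (the tool name, fields, and in-memory lookup are all made up; in practice this would wrap a parameterized query):

```python
# Hypothetical narrow tool: instead of letting the model write arbitrary SQL,
# expose one function per task with a fixed input and output shape.
from dataclasses import dataclass

@dataclass
class OrderSummary:
    order_id: int
    status: str

def get_order_status(order_id: int) -> OrderSummary:
    """Tool exposed to the model: exactly one int in, one record out.
    The prompt describes this signature verbatim, nothing more."""
    if not isinstance(order_id, int) or order_id <= 0:
        raise ValueError("order_id must be a positive integer")
    # Stand-in for a parameterized query against the real database.
    fake_db = {42: "shipped"}
    return OrderSummary(order_id, fake_db.get(order_id, "unknown"))

print(get_order_status(42).status)  # shipped
```

The validation happens in the tool, so even a garbled tool call can't turn into a random query.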
Also, since Firefox is FOSS, and any model has presumably been trained on the codebase of at least Firefox if not also Chromium, it's not a shock that agents are able to generate similar code!
Until I read this blog I was under the impression that everyone wrote Python or other scripts and used GitHub Actions to just call them!
That way you can test everything on a local machine before deployment.
Also, as other commenters have said, bash is not a good option. Use Python or some other language and write reusable scripts. If not for this, then for the off chance that it'll be migrated to some other CI/CD platform.
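The pattern is just a plain script with an argument parser and an exit code, so the CI step is one line (the script name and flag here are hypothetical):

```python
# check_version.py: hypothetical CI step as a plain, portable Python script.
# The workflow just runs `python check_version.py --min 3.9`, so the same
# script can be tested locally and survives a move to another CI/CD platform.
import argparse
import sys

def main(argv=None) -> int:
    parser = argparse.ArgumentParser(description="Fail if Python is too old")
    parser.add_argument("--min", default="3.9", help="minimum X.Y version")
    args = parser.parse_args(argv)
    required = tuple(int(p) for p in args.min.split("."))
    if sys.version_info[:2] < required:
        print(f"Python {args.min}+ required")
        return 1
    print("ok")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Because all the logic lives in `main()`, you can unit-test it too, not just run it in CI.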