
What I want to see at this point are more screencasts, write-ups, anything really, that depict the entire process of how someone expertly wrangles these products to produce non-trivial features. There are AI influencers who make very impressive (and entertaining!) content about building uhhh more AI tooling, hello worlds and CRUD. There are experienced devs presenting code bases supposedly almost entirely generated by AI, who, when pressed, will admit they basically throw away all the code the AI generates and are merely inspired by it. Single-shot prompt to full app (what's being advertised) rapidly turns into "well, it's useful to get motivated when starting from a blank slate" (ok, so is my oblique strategies deck, but that one doesn't cost 200 quid a month).

This is just what I observe on HN; I don't doubt there are actual devs (rather than the larping evangelist AI maxis) out there who actually get use out of these things, but they are pretty much invisible. If you are enthusiastic about your AI use, please share how the sausage gets made!




From the article

  Important: there is a lot of human coding, too. I almost always go in after an AI does work and iterate myself for awhile, too.
Some people like to think for a while (and read docs) and just write it right on the first go. Some people like to build slowly and get a sense of where to go at each step. But in all of those steps, there's a heavy dose of expertise needed from the person doing the work. And this expertise does not come for free.

I can use agentic workflows fine and generate code like anyone else. But the process is not enjoyable and there's no actual gain, especially in an enterprise setting where you're going to use the same stack for years.


These things are amazing for maintenance programming on very large codebases (think 50-100 million lines of code or more, where the people who wrote the code no longer work there and it's not open source, so "just google it or check Stack Overflow" isn't an option at all).

A huge amount of effort goes into just searching for what relevant APIs are meant to be used without reinventing things that already exist in other parts of the codebase. I can send ten different instantiations of an agent off to go find me patterns already in use in code that should be applied to this spot but aren't yet. It can also search through a bug database quite well and look for the exact kinds of mistakes that the last ten years of people just like me made solving problems just like the one I'm currently working on. And it finds a lot.
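To make the "send ten instantiations of an agent off" part a bit more concrete, here is a minimal sketch of that fan-out, assuming a hypothetical command-line agent invoked as agent -p "<prompt>" that prints its findings to stdout (swap in whatever tool you actually use; the prompts and paths are made up):

  # Fan several independent agent searches out in parallel and collect the answers.
  # Assumption: a hypothetical "agent" CLI that accepts -p "<prompt>" and writes to stdout.
  import subprocess
  from concurrent.futures import ThreadPoolExecutor

  QUESTIONS = [
      "Find existing retry/backoff helpers in this repo I should reuse instead of writing my own.",
      "Find the established pattern for paginating results from the internal storage API.",
      "Search the exported bug reports in ./bugs/ for past mistakes in connection-pool handling.",
  ]

  def ask(question: str) -> str:
      # Each call is an independent, throwaway agent session scoped to one question.
      result = subprocess.run(["agent", "-p", question],
                              capture_output=True, text=True, timeout=1800)
      return "## " + question + "\n" + result.stdout.strip() + "\n"

  with ThreadPoolExecutor(max_workers=len(QUESTIONS)) as pool:
      for report in pool.map(ask, QUESTIONS):
          print(report)

The point isn't the script; it's that each question becomes a cheap, parallel, disposable search whose output you read and then throw away.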

Is this better than having the engineer who wrote the code and knows it very well? Hell no. But you don't always have that. And at the largest scale you really can't, because it's too large to fit in any one person's memory. So the work certainly does devolve into searching and reading and summarizing a lot of the time.



This is definitely closer to what I had in mind, but it's still rather useless because it just shows what winning the lottery is like. What I am really looking for is neither the "Claude oneshot this" nor the "I gave up and wrote everything by hand" case, but a realistic, "dirty" day-to-day work example. I wouldn't even mind if it was a long video (though some commentary would be nice in that case).


I don't think you should consider this "winning the lottery"; the author has been using these tools for a while.

The sibling comment with the writeup by the creator of Ghostty shows stuff in more detail and has a few cases of the agent breaking, though it also involves more "coding by hand".


I think the point is that you want to see typical results or a typical process. How does it go when you use it 10 times, or 100 times? What results can you expect in general?

There's a lot of wishful thinking going around in this space and something more informative than cherrypicking is desperately needed.

Not least because lots of capable/smart people have no idea which way to jump when it comes to this stuff. They've trained themselves not to blindly hack out solutions through trial and error, but this essentially requires that approach to work.


Yeah that's a good point and the sibling comment seems to be pointing in the same direction. You could take a look at Steve Yegge's beads (https://steve-yegge.medium.com/introducing-beads-a-coding-ag..., https://github.com/steveyegge/beads) but the writeup is not super detailed.

I think your last point is pretty important: all that we see is done by experienced people, and today we don't have a good way of teaching "how to effectively use AI agents" other than telling people "use them a lot, apply software engineering best practices like testing". That is a big issue, compounded by the fact that this stuff is new, there are lots of different tools, and they evolve all the time. I don't have a better answer here than "many programmers that I respect have tried these tools and are sticking with them rather than going back" (with exceptions, like Karpathy's nanochat), and "the best way to learn today is to use them, a lot".

As for "what are they really capable of", I can't give a clear answer. They do make easy stuff easier, especially outside of your comfort zone, and seem to make hard stuff come up more often and earlier (I think because you do stuff outside your comfort zone/core experience zone ; or because you know have to think more carefully about design over a shorter period of time than before with less direct experience with the code, kind of like in Steve Yegge's case ; or because when hard stuff comes up it's stuff they are less good at handling so that means you can't use them).

The lower bound seems to be "small CLI tool". The upper bound seems to be "language learning app with paid users" (sottaku I think? the dev talks about it on Twitter; lots of Japanese domain knowledge needed to check the app itself), "implementing a model in PyTorch by someone who didn't know how to code before" (00000005 seconds or something like that on Twitter; has used all these models and tools a lot), and "reporting security issues that were missed in cURL". The middle is "a very experienced dev shipping a feature faster, while doing other things, on a semi-mature codebase" (Ghostty), and also "useful code reviews". That's about the best I can give you, I think.


I'm not sure you understood what I'm looking for. If I'm searching for a good Rails screencast to get a feeling for how it's used, a blog post consisting of "rails new" is useless to me. I know that these tools can one-shot tasks, but that doesn't help me when they can't.



