
Thanks, sharing what I learned about how coding agents work was my main goal with the article. Personally, I was a bit surprised by how much of the "magic" comes directly from the underlying LLM.

The shell command can run anything, really. When I tested it, the agent asked me several times to run the tests, and then I could watch it fix the tests over a few iterations. Very interesting to observe.
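
To make the idea concrete, here is a minimal sketch of what a shell tool with a confirmation step might look like. It is not the code from the article: the method name `run_shell_command` and the `confirm:` parameter are hypothetical, and the only assumption is that the tool hands the command's output back to the LLM as plain text.

```ruby
require "open3"

# Hypothetical tool: asks the user before running a command proposed by the
# LLM, then returns the exit status and combined stdout/stderr as text.
DEFAULT_CONFIRM = lambda do |command|
  print "Run `#{command}`? [y/N] "
  $stdin.gets.to_s.strip.downcase == "y"
end

def run_shell_command(command, confirm: DEFAULT_CONFIRM)
  # The refusal is reported back as text so the LLM can react to it.
  return "User declined to run: #{command}" unless confirm.call(command)

  # capture2e merges stderr into stdout, so the LLM sees everything in order.
  output, status = Open3.capture2e(command)
  "exit status: #{status.exitstatus}\n#{output}"
end
```

Returning the exit status alongside the output is a design choice in this sketch: it lets the model tell a failing test run apart from a passing one without having to read the whole log.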

If I were to improve this to be a better Ruby agent (which I don't plan to do, at least not yet), I would probably try adding some RSpec/Minitest-specific tools that would parse the test output and present it back to the LLM in a cleaned-up format.
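
As a rough illustration of the cleaned-up format, here is a sketch that condenses the output of `rspec --format json` into a few lines. The helper name `summarize_rspec_json` and the `max_message_lines` parameter are made up for this example, and it assumes the JSON contains an `examples` array (with `status`, `full_description`, `file_path`, `line_number`, and `exception`) plus a `summary_line`, which is my understanding of RSpec's JSON formatter.

```ruby
require "json"

# Hypothetical helper: turns RSpec JSON formatter output into a short
# plain-text report that lists only the failing examples.
def summarize_rspec_json(json_text, max_message_lines: 5)
  report = JSON.parse(json_text)
  failures = report.fetch("examples", []).select { |ex| ex["status"] == "failed" }

  # Start with the one-line summary, falling back to a simple failure count.
  lines = [report["summary_line"] || "#{failures.size} failure(s)"]

  failures.each do |ex|
    exception = ex["exception"] || {}
    # Keep only the first few lines of the failure message to save tokens.
    message = exception["message"].to_s.strip.lines
                                  .first(max_message_lines)
                                  .map(&:rstrip)
                                  .join("\n    ")
    lines << "FAILED: #{ex["full_description"]} (#{ex["file_path"]}:#{ex["line_number"]})"
    lines << "    #{message}" unless message.empty?
  end

  lines.join("\n")
end
```

The point is that passing examples, backtraces, and timing data are dropped, so the LLM only sees which examples failed, where, and why.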



Do you know of examples of other agents with more defined tools, to use as inspiration/etc?

(Like - what would it look like to clean up test results for an LLM?)


Why stop there? Give it a Capybara tool and make it a full TDD agent.


That's a very neat idea, maybe even add something like browser-use to allow it to implement a Rails app and try it out automatically. I think you should try it. :)

I'm being serious. This sounds like a fun project but I have to turn my attention to other projects for the near future. This was more of an experiment for me, but it would be cool to see someone try out that idea.



