Hacker News

Why not preprompt with ```json {


Yes, you can pre-fill the assistant's response with "```json {" or even just "{", and that should increase the likelihood of getting valid JSON back, but it's still not guaranteed. It's not nearly reliable enough for a production use case, even on a bigger (8B) model.
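A minimal sketch of the prefill trick, assuming a generic chat template (the turn markers are hypothetical and vary by model, and the `complete()` call to the model is left out):

```python
import json

# Prefill the assistant turn so the model continues inside a JSON block.
PREFILL = "```json\n{"

def build_prompt(user_msg: str) -> str:
    # Hypothetical chat-template markers; real ones depend on the model.
    return f"<|user|>\n{user_msg}\n<|assistant|>\n{PREFILL}"

def parse_completion(completion: str) -> dict:
    # The model continues after the prefilled "{", so re-attach it and
    # drop anything after the closing fence. json.loads still raises if
    # the model drifted off format -- which is why this isn't guaranteed.
    body = "{" + completion.split("```")[0]
    return json.loads(body)
```

In practice you'd wrap `parse_completion` in a retry loop, since nothing stops the model from emitting prose or malformed JSON after the prefill.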

I'd recommend the Ollama or vLLM inference servers. They support a `response_format="json"`-style parameter (by implementing grammars on top of the base model). That makes it reliable enough for production use, but in my experience the quality of the response decreases slightly when a grammar is applied.
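A sketch of what the two request bodies look like, as I understand the APIs: Ollama's `/api/chat` takes a top-level `format` field, while vLLM's OpenAI-compatible server takes `response_format` (model names here are placeholders):

```python
def ollama_chat_body(model: str, prompt: str) -> dict:
    # Ollama's /api/chat accepts format="json" to constrain decoding
    # to valid JSON.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "format": "json",
        "stream": False,
    }

def vllm_chat_body(model: str, prompt: str) -> dict:
    # vLLM's OpenAI-compatible endpoint expresses the same idea via
    # response_format, backed by guided (grammar-constrained) decoding.
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }
```

Either body would then be POSTed to the respective server; the grammar guarantees the output parses, though not that the content is sensible.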


Grammars are best, but if you read their comment, they're apparently using Ollama in a situation that doesn't support them.




