Yes, you can pre-fill the assistant's response with "```json {" or even just "{", and that does increase the likelihood of getting valid JSON back, but it's still not guaranteed. It's nowhere near reliable enough for a production use case, even on a bigger (8B) model.
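For illustration, here's a minimal sketch of the prefill trick against a raw completions-style endpoint, where you append the opening brace to the prompt yourself. The URL, model name, and `<|user|>`/`<|assistant|>` template tokens are all placeholders; you'd swap in whatever your server and model actually use:

```python
import json
from openai import OpenAI

# Assumption: a local OpenAI-compatible server (e.g. vLLM or
# llama.cpp's server) exposing the legacy completions endpoint.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

PREFILL = "{"  # seed the assistant turn with the opening brace

# Assumption: the model's chat template looks roughly like this.
prompt = (
    "<|user|>\nList three primary colors as JSON with a 'colors' key.\n"
    "<|assistant|>\n" + PREFILL  # model continues from the prefill
)

resp = client.completions.create(
    model="my-local-model",  # placeholder model name
    prompt=prompt,
    max_tokens=200,
)

raw = PREFILL + resp.choices[0].text  # re-attach the prefilled brace
try:
    data = json.loads(raw)
except json.JSONDecodeError:
    data = None  # still possible: the model can drift out of JSON
print(data)
```

Note the `try/except` at the end: even with the prefill, you have to be prepared for unparseable output, which is exactly the reliability problem.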
I'd recommend the Ollama or vLLM inference servers instead. Both support constrained JSON output (Ollama via a `format: "json"` parameter, vLLM via the OpenAI-compatible `response_format`), implemented as a grammar applied on top of the base model. That makes the output shape reliable enough for production use, though in my experience the response quality drops slightly when a grammar is applied.
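Here's a short sketch of the Ollama variant, assuming a local Ollama server on its default port with the `llama3` model pulled (swap in whatever model you actually run):

```python
import json
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",  # placeholder: any pulled model
        "messages": [
            {
                "role": "user",
                "content": "List three primary colors as JSON "
                           "with a 'colors' key.",
            }
        ],
        "format": "json",  # grammar-constrained decoding: valid JSON out
        "stream": False,
    },
    timeout=120,
)
data = json.loads(resp.json()["message"]["content"])
print(data)
```

One thing that helps with the quality drop: still ask for JSON explicitly in the prompt and describe the shape you want, so the grammar is only enforcing what the model was already trying to produce rather than fighting it.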