I haven't run any of these models yet; I had just assumed CodeGen was less performant at "understanding" prompts. You are right that it's probably enough, especially if fine-tuning is an option.

Now I wonder: as the codebase grows, how often, and in what way, should such fine-tuning take place?