
I wonder if FauxPilot's models (Salesforce CodeGen family) can be quantized and run on the CPU. I was able to run the 350M model on my machine, but it wasn't able to compete with Copilot in any way. Salesforce claims in their GitHub description[1] that their model is competitive with OpenAI Codex. Maybe their largest 16B model is, but I haven't been able to try it.

[1] https://github.com/salesforce/CodeGen



We will add quantized CodeGen for fast inference on CPUs up on cformers (https://github.com/NolanoOrg/cformers/) by later today.


> by later today

Wow, that's the timeframe things are moving at right now, we better get used to it!


Whoa, is there a PR or wiki about this?


4-bit GPTQ, maybe?



