
We will add quantized CodeGen, for fast inference on CPUs, to cformers (https://github.com/NolanoOrg/cformers/) by later today.


> by later today

Wow, that's the pace things are moving at right now. We'd better get used to it!


Whoa, is there a PR or wiki about this?


4bit GPTQ maybe?



