Hacker News

Wouldn't an eGPU defeat the purpose of having fast memory bandwidth? Have you tried it with Stable Diffusion?


40 Gbps of USB4 is plenty. I've tried these PyTorch tests https://github.com/aime-team/pytorch-benchmarks/ and saw only a ~10% drop in performance. There's no drop at all for LLM inference if the model is already loaded into VRAM.
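A back-of-the-envelope sketch of why the link speed mostly matters for the one-time model load, not for token generation. The model size and per-parameter byte count below are illustrative assumptions, not measurements from the benchmark above:

```python
# Why USB4 mainly affects model *load* time, not LLM inference speed
# once the weights sit in VRAM. Numbers are illustrative assumptions.

usb4_gbps = 40                     # USB4 nominal link rate, bits/s
usb4_gb_per_s = usb4_gbps / 8     # ~5 GB/s theoretical ceiling

params = 7e9                       # e.g. a 7B-parameter model (assumption)
bytes_per_param = 2                # fp16 weights
model_gb = params * bytes_per_param / 1e9   # 14 GB of weights

load_seconds = model_gb / usb4_gb_per_s
print(f"One-time load over USB4: ~{load_seconds:.1f} s for {model_gb:.0f} GB")

# During generation the weights are streamed from VRAM at the GPU's own
# memory bandwidth (hundreds of GB/s), so the USB4 link is out of the loop.
```

Real-world throughput over PCIe tunneling will be lower than the 5 GB/s ceiling, which stretches the load time but still leaves steady-state inference untouched.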


Wow, that makes sense now, if you can load the entire model into VRAM. What eGPU dock and GPU setup do you use, if you don't mind?


The ADT-Link UT3G is the eGPU dock; you just need an ordinary PSU to power it and the GPU (I use an old one that I had lying around). That said, expect to spend some time fixing quirks to make everything work. E.g., I had to use "nvidia error 43 fixer" on Windows and to find a correct configuration on my NixOS system (you need to load the Nvidia driver into the kernel during boot, etc.). Here is how it looks: https://imgur.com/a/qySDN4n
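For the NixOS side, a sketch of the relevant configuration. The option names are real NixOS options, but the exact set of modules and values needed will depend on your card and driver generation, so treat this as a starting point rather than the configuration used above:

```nix
{ config, ... }:
{
  # Load the proprietary driver early in boot so the eGPU is up
  # before the desktop session starts.
  boot.initrd.kernelModules = [ "nvidia" "nvidia_modeset" "nvidia_uvm" "nvidia_drm" ];
  services.xserver.videoDrivers = [ "nvidia" ];

  # Authorize Thunderbolt/USB4 devices (bolt daemon).
  services.hardware.bolt.enable = true;
}
```

After a rebuild, `lspci` and `nvidia-smi` are the usual sanity checks that the card enumerates and the driver sees it.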



