Software table walks perform poorly on modern out-of-order processors because taking the miss exception forces the core to finish every older instruction in flight and then redirect the front end to the exception vector. This can take several hundred cycles. A hardware table walk, by contrast, can take fewer than 20 cycles when it hits in the next-level cache.
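To put the gap in perspective, here is a rough back-of-the-envelope sketch. The 300-cycle figure is just an assumed stand-in for "several hundred cycles", the 20-cycle figure is the upper bound quoted above, and the miss count is hypothetical. It treats every miss as a flat cost and ignores any overlap with other work, so read it as an illustration rather than a measurement.

```python
# Flat-cost model: every miss costs the same number of cycles and nothing
# overlaps with other work. Real hardware is more complicated than this.

SOFTWARE_WALK_CYCLES = 300  # assumption: stand-in for "several hundred cycles"
HARDWARE_WALK_CYCLES = 20   # upper bound quoted for a next-level cache hit


def walk_overhead_cycles(misses: int, cycles_per_miss: float) -> float:
    """Total cycles spent servicing translation misses under the flat model."""
    return misses * cycles_per_miss


misses = 1_000_000  # hypothetical miss count for some workload

software = walk_overhead_cycles(misses, SOFTWARE_WALK_CYCLES)
hardware = walk_overhead_cycles(misses, HARDWARE_WALK_CYCLES)
ratio = software / hardware

print(f"software walk: {software:,.0f} cycles")
print(f"hardware walk: {hardware:,.0f} cycles")
print(f"ratio:         {ratio:.1f}x")
```

Under these assumed numbers the software path spends 15x as many cycles on misses. The exact ratio depends entirely on the per-miss costs you plug in.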