This was a fun read; I didn't know about rseq until today! Before this, I had assumed that the naive busy-wait approach was what you'd typically do in a thread, or at least that most threads loop in that manner. I knew that signals and such were a problem, but I didn't think just wanting to stop a thread would be so hard! :)
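For anyone following along, here is roughly what I had in mind by the naive version, as a minimal sketch. I'm assuming it means a worker loop that polls a stop flag between units of work; `worker` and `run_for` are names I made up for illustration, and Python is used only for brevity.

```python
import threading
import time


def worker(stop, counter):
    # Naive pattern: poll the stop flag between units of work.
    while not stop.is_set():
        counter[0] += 1      # stand-in for one bounded unit of real work
        time.sleep(0.001)    # keeps the sketch from spinning at 100% CPU


def run_for(seconds):
    """Run the worker for roughly `seconds`, then stop it cooperatively.

    Returns (iterations, still_alive).
    """
    stop = threading.Event()
    counter = [0]
    t = threading.Thread(target=worker, args=(stop, counter))
    t.start()
    time.sleep(seconds)
    stop.set()   # request shutdown
    t.join()     # returns once the loop has observed the flag and exited
    return counter[0], t.is_alive()
```

This only works as long as every unit of work is bounded. If the thread is parked in a blocking call, it never gets back around to checking the flag, which I guess is where the trouble starts.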
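IIRC rseq was originally proposed by Google to support userspace data structures that are per-CPU rather than per-thread, with tcmalloc's per-CPU caches as a main use case. The version that was eventually merged into Linux came from Mathieu Desnoyers, who also maintains liburcu, the userspace read-copy-update (RCU) library.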
Hopefully this improves eventually? Who knows?