qsort in general executes one call per comparison, this doubles that to two. Whether the std::sort method is better depends on the cost of the comparison: for cheap comparisons, having it inlined into the sorting routine is better. For expensive comparisons, it's likely better to have only a single instantiation of the sorting algorithm that uses function calls.
I'm curious: Is there a programming language or a compiler that is actually smart enough to make that distinction? In principle, C++ compilers wouldn't have to monomorphize template instantiations all the time: it's in principle possible to compile templates in a way that adds a hidden "type" parameter to the function which essentially acts as an out-of-band vtable to provide the operations that the template uses (yes, there are ABI issues).
There are a _lot_ of operations in C++ which behave wildly differently across types. Polymorphically compiling templates would either involve some significant performance hits, or significant limitations on how much polymorphism could actually take place. Unfortunately, the language is not designed to make it easy to avoid monomorphizing templates.
For example, due to operator overloading, given "T a, b;", you don't know whether "a + b" should be a simple addition or a function call. For full genericity, you'd want to compile this as a vtable'd addition, but that's going to severely harm performance for primitive types. So you'll have to monomorphize for primitive types. This applies to every operator in C++, of which there are a lot.
But that's just performance. Worse, the semantics of operators can depend on whether they are overloaded or not. For example, "a && b" short-circuits normally, but does not short-circuit if "operator&&" is overloaded. So now you need a variant for whether or not operator&& is overloaded (similar for "operator||", and even "operator,").
I'm fairly sure I've only scratched the surface of weird things that come up with compiling C++ templates. There's loads of other fun examples of template weirdness, like for example the program that throws a syntax error if and only if an integer template parameter is composite: https://stackoverflow.com/a/14589567/1204143
Well yes: monomorphization is an important optimization, but I'd think of it like inlining: whether a function is inlined is subject to a heuristic that aims to get the majority of the benefit of inlining while reducing code size blow-up.
Monomorphization would ideally be treated the same way.
Yes, there are aspects of C++ that make that so difficult that it's just not realistic, but really I was only bringing up C++ because the parent poster certainly meant the C++ std::sort. The same argument could also be applied to other languages that have less baggage, e.g. Rust comes to mind.
It’s a bit orthogonal to monomorphization, but I believe C++ compilers can occasionally deduplicate identical template instantiations, where “identical” means they compile to the same assembly even though the types are different.
One next step that I don't think C++ compilers do yet is deduplicate functions that differ only in constants used or functions called. This could help with std::sort-style functions where the outer algorithm can work on generic pointers and only the comparison is a call to a different function.
In his talk "Speed Is Found In The Minds of People" [0], Andrei Alexandrescu tries to make a sort really fast for medium sized vector and then closes on thoughts on this precise distinction.
If this question intress you, it is worth watching.
A C++ compiler is not guaranteed/required to inline the comparison (just very likely to do so if it's small). Can you explain what advantage you see in doing things the qsort way over a (possibly but not necessarily comparison-inlined) templated sort function? What difference does it make to have multiple (differently typed) std::sort instantiations vs just one type-agnostic one (qsort-style)?
If the comparisons are expensive enough to warrant not inlining them, then I very strongly doubt you'll find a meaningful speed difference between qsort and std::sort. I don't know if you're concerned about code size and instruction cache etc., but I doubt that matters if comparisons are expensive.
Yes, code size and everything that comes with it is the concern. Individually it's not much, but C++ template compilation does tend to lead to death by a thousand cuts.
I'm curious: Is there a programming language or a compiler that is actually smart enough to make that distinction? In principle, C++ compilers wouldn't have to monomorphize template instantiations all the time: it's in principle possible to compile templates in a way that adds a hidden "type" parameter to the function which essentially acts as an out-of-band vtable to provide the operations that the template uses (yes, there are ABI issues).