In short, the maximum possible speed is the same (+/- some nitpicks), but there can be significant differences in typical code, and it's hard to define what's a realistic typical example.
The big one is multi-threading. In Rust, whether you use threads or not, all globals must be thread-safe, and the borrow checker requires memory access to be shared XOR mutable. When writing single-threaded code takes 90% of the effort of writing a multi-threaded one, Rust programmers may as well sprinkle threads all over the place regardless of whether that's a 16x improvement or a 1.5x improvement. In C, the cost/benefit analysis is different. Even just spawning a thread is going to make somebody complain that they can't build the code on their platform due to C11/pthread/openmp. The risk of having to debug heisenbugs means that code typically won't be made multi-threaded unless really necessary, and even then preferably kept to simple cases or very coarse-grained splits.
To be honest, I think a lot of the justification here is just a difference in standard library and ease of use.
I wouldn't consider there to be any notable effort in making threads build on target platforms in C relative to normal effort levels in C, but it's objectively more work than `std::thread::spawn(move || { ... });`.
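For concreteness, a complete toy version of that one-liner (hypothetical values, just to show the full ceremony):

```rust
use std::thread;

fn main() {
    let data = vec![1, 2, 3];
    // `move` hands ownership of `data` to the new thread, so the
    // compiler statically rules out unsynchronized sharing.
    let handle = thread::spawn(move || data.iter().sum::<i32>());
    println!("sum = {}", handle.join().unwrap());
}
```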
Despite the benefits, I don't actually think memory safety really plays a role in the usage rate of parallelism. Case in point: Go has no compile-time protection against data races, with both races and atomicity issues being easy to introduce, and yet it relies much more heavily on concurrency (with the degree of parallelism managed by the runtime) and with much less consideration than Rust. After all, `go f()` is even easier.
(As a personal anecdote, I've probably run into more concurrency-related heisenbugs in Go than I ever did in C, with C heisenbugs more commonly being memory mismanagement in single-threaded code with complex object lifetimes/ownership structures...)
He straight-ported some C code to Rust and found the Rust code outperformed it by ~30% or something. The culprit ended up being that in C, he was using a hash table library he'd been copy-pasting between projects for years. In Rust, he used BTreeMap from the standard library, which turns out to be much better optimized.
This isn't evidence Rust is faster than C. I mean, you could just backport that BTreeMap to C and get exactly the same performance in C code. At the limit, I think both languages perform basically the same.
But most people aren't going to do that.
If we're comparing normal Rust to normal C - whatever that means - then I think Rust takes the win here. Even Bryan Cantrill - one of the best C programmers you're likely to ever run into - isn't using a particularly well optimized hash table implementation in his C code. The quality of the standard tools matters.
When we talk about C, we're really talking about an ecosystem of practice. And in that ecosystem, having a better standard library will make the average program better.
The only real question I have with this is: did the program have to hit any specific performance target?
I could write a small utility in Python that would be completely acceptable for use but at the same time be 15x slower than an implementation in another language.
So how do you compare code across languages when neither was written for performance, given one may have some set of functions that happens to favour one language in that particular app?
I think to compare you have to at least have the goal of performance for both when testing. If he needed his app to be 30% faster he would have made it so, but it didn't need to be, so he didn't. Which doesn't make it great for comparison.
Edit: I also see that your reply to the guy above was specifically about the point that the libs by themselves can help performance with no extra work, and I do agree with you on that.
Honestly I'm not quite sure what point you're making.
> If he needed his app to be 30% faster he would have made it so
Would he have? Improving performance by 30% usually isn't so easy. Especially not in a codebase which (according to Cantrill) was pretty well optimized already.
The performance boost came to him as a surprise. As I remember the story, he had already made the C code pretty fast and didn't realise his C hash table implementation could be improved that much. The fact Rust gave him a better map implementation out of the box is great, because it means he didn't need to be clever enough to figure those optimizations out himself.
It's not an apples-to-apples comparison. But I don't think comparing the world's fastest C code to the world's fastest Rust code is a good comparison either, since most programmers don't write code like that. It's usually incidental, low-effort performance differences that make a programming language "fast" in the real world. Like a good B-tree implementation just shipping with the language.
I did feel my post was a bit unneeded when I added my edit :)
My point about the 30% was that the speedup you mentioned he got in Rust came down to, essentially, better algorithms in the Rust library he used. Once you know that, it's hard to say that Rust itself is 'faster', but the point is valid and I accept that he gained performance by using the Rust library.
My other point was that the speed of his code probably didn't matter at the time. If it had been a problem in the past, he probably would have taken the time to profile and gain some more speed. Sure, you can't gain speed that isn't there to be had, but as you pointed out, it wasn't a language issue, it was a library implementation issue.
He could have arbitrarily picked a different program that used a good C library, and the results would have been reversed.
I also agree that most devs are not working down at that level of optimisation, so the default libraries can help. But at the same time, it mostly doesn't matter if something takes 30% longer if that overall time is not a problem. If you are working on something where speed really matters and you are trying to shave off milliseconds, then you have to be the kind of developer who can work C or Rust at that level.
What I think it illustrates more is how much classic languages could gain by having a serious overhaul of their standard library and maybe even a rebrand if that's the expected baseline of a conformant implementation.
> If he needed his app to be 30% faster he would have made it so
That still validates what the parent wrote: "In short, the maximum possible speed is the same (+/- some nitpicks), but there can be significant differences in typical code".
> He straight-ported some C code to Rust and found the Rust code outperformed it by ~30% or something. The culprit ended up being that in C, he was using a hash table library he'd been copy-pasting between projects for years. In Rust, he used BTreeMap from the standard library, which turns out to be much better optimized.
Are you surprised? Rust is never inherently faster than C. When it appears faster, it boils down to library quality and algorithm choice, not the language.
Also worth noting that hash tables and B-trees have fundamentally different performance characteristics. If BTreeMap won, it is either the hash table implementation, or access patterns that favor B-tree cache locality. Neither says anything about Rust vs C. It is a library benchmark, not a language one.
And especially having performant and actively maintained default choices built in. With C, as described in the post you responded to, you'll typically end up building a personal collection of dusty old libraries that work well enough most of the time.
I think Rust projects will accumulate their own cruft over time, they are just younger. And the Rust ecosystem's churn (constant breakage, edition migrations, dependency hell in Cargo.lock) creates its own class of problems.
Either way, I would like to reiterate that the comparison is flawed at a more fundamental level because hash tables and B-trees are different data structures with different performance characteristics. O(1) average lookup vs O(log n) with cache-friendly ordered traversal. These are not interchangeable.
If BTreeMap outperformed his hash table, that is either because the hash table implementation was poor, or because the access patterns favored B-tree cache locality. Neither tells you anything about Rust vs C. It is a data structure benchmark.
More importantly, choosing between a hash table and a tree is an architectural decision with real trade-offs. It is not something that should be left to "whatever the standard library defaults to". If you are picking data structures without understanding why, that is on you, not on C's lack of a blessed standard library (BTW one size cannot fit all).
> If BTreeMap outperformed his hash table, that is either because the hash table implementation was poor, or because the access patterns favored B-tree cache locality. Neither tells you anything about Rust vs C. It is a data structure benchmark.
The specific thing it tells you about Rust vs C is that Rust makes using an optimized BTreeMap the default, much-easier thing to do when actually writing code. This is a developer experience feature rather than a raw language performance feature, since you could in principle write an equally-performant BTreeMap in C. But in practice Bryan Cantrill wasn't doing that.
> More importantly, choosing between a hash table and a tree is an architectural decision with real trade-offs. It is not something that should be left to "whatever the standard library defaults to". If you are picking data structures without understanding why, that is on you, not on C's lack of a blessed standard library (BTW one size cannot fit all).
The Rust standard library provides both a hash table and a b-tree map, and it's pretty easy to pull in a library that provides a more specialized map data structure if you need one for something (because in general it's easier to pull in any library for anything in a Rust project set up the default way). Again, a better developer experience that leads to developers making better decisions writing their software, rather than a fundamentally more performant language.
> the Rust ecosystem's churn (constant breakage, edition migrations, dependency hell in Cargo.lock) creates its own class of problems.
What churn? Rust hasn't broken compatibility since 1.0, over a decade ago. These days it feels like Rust changes slower than C and C++.
> Either way, I would like to reiterate that the comparison is flawed at a more fundamental level because hash tables and B-trees are different data structures with different performance characteristics. O(1) average lookup vs O(log n) with cache-friendly ordered traversal. These are not interchangeable.
They're mostly interchangeable when used as a map! In Rust code, in most cases you can just replace HashMap with BTreeMap. In practice, O(log n) and O(1) are very similar bounds owing to how slowly log(n) grows with respect to n (log2 of a million is only ~20). Cache locality often matters much more than an O(log n) factor in your algorithm.
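As a sketch of how drop-in the swap usually is (toy word count, assuming nothing beyond std):

```rust
use std::collections::BTreeMap; // swap in HashMap here and nothing else changes

fn main() {
    let mut counts: BTreeMap<&str, u32> = BTreeMap::new();
    for word in ["a", "b", "a"] {
        // entry()/or_insert() is identical on both map types
        *counts.entry(word).or_insert(0) += 1;
    }
    assert_eq!(counts.get("a"), Some(&2));
}
```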
If you read the actual article, you'll see that Cantrill benchmarked his library using Rust's B-tree and hash table implementations. Both maps outperformed his C-based hash table implementation.
> Neither tells you anything about Rust vs C.
It tells you Rust's standard library has a faster hash map implementation than Bryan Cantrill's. If you need a hash table, you're almost certainly better off using Rust's than rolling your own in C.
One point of clarification: the C version does not have (and never had) a hash table; the C version had a BST (an AVL tree). Moreover, the "Rust hash table implementation" is in fact still B-tree based; the hash table described in the post is a much more nuanced implementation detail. The hash table implementation has really nothing to do with the C/Rust delta -- which is entirely a BST/B-tree delta. As I described in the post, implementing a B-tree in C is arduous -- and implementing a B-tree in C as a library would be absolutely brutal (because a B-tree relies on moving data). As I said in the piece, the memory safety of Rust is very much affecting performance here: it allows for the much more efficient data structure implementation.
I wouldn't consider implementing a B-tree in C any more "arduous" than implementing any other notable container/algorithm in C, nor would making a library be "brutal" as moving data really isn't an issue. Libraries are available if you need them.
Quite frankly, writing the same in Rust seems far, far more "arduous", and you'd only realistically be writing something using BTreeMap because someone else did the work for you.
However, being right there in std makes use much easier than searching around for an equivalent library to pull into your C codebase. That's the benefit.
I don't often do this, but I'm sorry, you don't know what you're talking about. If you bother to try looking for B-tree libraries in C, you will quickly find that they are either (1) the equivalent of undergraduate projects that are not used in production systems or (2) woven pretty deeply into a database implementation. This is because the memory model of C makes a B-tree library nasty: it will either have low performance or a very complicated interface -- and that is because moving data is emphatically an issue.
Can you mention 3 cases of breakage the language has had in the last, let's say, 5 years? I've had colleagues in different companies responsible for updating company-wide language toolchains tell me that in their experience updating Rust was the easiest of their bunch.
> edition migrations
One can write Rust 2015 code today and have access to pretty much every feature from the latest version. Upgrading editions (at your leisure) can be done most of the time just by using rustfix, but even if done by hand, the idea that they are onerous is overstating their effect.
Last time I checked there were <100 checks in the entire compiler for edition gates, with many checks corresponding to the same feature. Adding support for new features that don't affect prior editions, and by extension existing code (like the async/await keywords, or support for k# and r# tokens), is precisely the point of editions.
> When it appears faster, it boils down to library quality and algorithm choice, not the language.
That's a thin, thin line of argumentation. The distinction between the ecosystem and language may as well not exist.
A lot of improvements of modern languages come down to convenience, and the more convenient something is, the more likely it is to be used. So it is meaningful to say that the average Rust program will perform better than the average C program given that there exist standard, well-performing, generic data structure libraries in Rust.
> It is a library benchmark, not a language one.
If you have infinite time to tune performance, perhaps. It is also meaningful to say that while importing a library may take a minute, writing equivalently performant code in C may take an hour.
> (As a personal anecdote, I've probably run into more concurrency-related heisenbugs in Go than I ever did in C, with C heisenbugs more commonly being memory mismanagement in single-threaded code with complex object lifetimes/ownership structures...)
Is that beyond just "concurrency is tricky and a language that makes it easier to add concurrency will make it easier to add sneaky bugs"? I've definitely run into that, but have never written concurrent C to compare the ease of heisenbug-writing.
> Despite the benefits, I don't actually think memory safety really plays a role in the usage rate of parallelism.
I can see what you mean with explicit things like thread::spawn, but I think Tokio is a major exception. Multithreaded by default seems like it would be an insane choice without all the safety machinery. But we have the machinery, so instead most of the async ecosystem is automatically multithreaded, and it's mostly fine. (The biggest problems seem to be the Send bounds, i.e. the machinery again.) Cargo test being multithreaded by default is another big one.
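For illustration, a minimal sketch, assuming the tokio crate with its default multi-threaded runtime:

```rust
#[tokio::main] // defaults to the multi-threaded, work-stealing runtime
async fn main() {
    // Each task may land on any worker thread; the Send bounds on
    // tokio::spawn are the machinery that makes this safe.
    let handles: Vec<_> = (0..4)
        .map(|i| tokio::spawn(async move { i * 2 }))
        .collect();
    for h in handles {
        println!("{}", h.await.unwrap());
    }
}
```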
You raise a good point here. When I think about writing multi-threaded code, three things come to mind about why it is so easy in Java and C#: (1) The standard library has lots of support for concurrency. (2) Garbage collection. (3) Debuggers have excellent support for multi-threaded code.
Not really, especially as garbage collection doesn't achieve memory safety. Safety-wise, it only helps avoid UAF due to lifecycle errors.
Garbage collection is primarily just a way to handle non-trivial object lifecycles without manual effort. Parallelism often brings non-trivial object lifecycles with it, but that is not the hard part of parallelism.
In plain C, the common pattern is to try to keep lifecycles trivial, and the moment that either doesn't make sense or isn't possible, you usually just add a reference count member to the struct.
In both Go and C, all types used in concurrent code need to be reviewed for thread-safety and have appropriate serialization applied - in the C case, this just also includes the refcnt itself. And yes, you could have a UAF or leak if you don't call ref/unref correctly, but that's unrelated to parallelism - it's just everyday life in manual memory management land.
The issues with parallelism are the same in Go and C: you might have invalid application states, whether due to missing serialization - e.g., forgetting to lock things appropriately, or accidentally using types that are not thread-safe at all - or due to business logic flaws (say, two threads both sleeping, each waiting for the other to trigger an event and wake it up).
> (As a personal anecdote, I've probably run into more concurrency-related heisenbugs in Go than I ever did in C, with C heisenbugs more commonly being memory mismanagement in single-threaded code with complex object lifetimes/ownership structures...)
Yes. All `&mut` references in Rust are equivalent to C's `restrict`-qualified pointers. In the past I measured a ~15% real-world performance improvement in one of my projects due to this (rustc has/had a flag, `-Zmutable-noalias`, where you can turn this on/off; it was disabled by default for quite some time due to codegen bugs in LLVM).
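A minimal sketch of the kind of code where this pays off (toy function, not from that project):

```rust
// In C, `a` and `b` could point to the same int, so *b would have to be
// reloaded after every store through *a. In Rust, a live &mut can't
// alias a shared &, so the compiler may keep *b in a register.
fn add_twice(a: &mut i32, b: &i32) -> i32 {
    *a += *b;
    *a += *b; // no reload of *b needed here
    *a
}
```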
I was confused by this at first since `&T` clearly allows aliasing (which is what C's `restrict` is about). But I realize that Steve meant just the optimization opportunity: you can be guaranteed that (in the absence of UB), the data behind the `&T` can be known to not change in the absence of a contained `UnsafeCell<T>`, so you don't have to reload it after mutations through other pointers.
Yes. It's a bit tricky to think about, because while it is literally called 'noalias', what it actually means is more subtle. I already linked to a version of the C spec below, https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pdf but if anyone is curious, this part is in "6.7.4.2 Formal definition of restrict" on page 122.
In some ways, this is kind of the core observation of Rust: "shared xor mutable". Aliasing is only an issue if the aliasing leads to mutability. You can frame it in terms of aliasing if you have to assume all aliases can mutate, but if they can't, then that changes things.
I used to use it, but very rarely, since it's instant UB if you get it wrong. In tiny codebases which you can hold in your head it's probably practical to sprinkle it everywhere, but in anything bigger it's quite risky.
Nevertheless, I don't write normal everyday C code anymore since Rust has pretty much made it completely obsolete for the type of software I write.
restrict works by making some situations undefined behavior that would otherwise be defined without it. It is probably unwise to use casually or habitually.
But of course the only thing restrict does in C is potentially introduce certain kinds of undefined behavior into a program that would be correct without it (and the optimizer is then free to assume the code is never invoked in a way that would trigger that UB).
Aliasing info is gold dust to a compiler in various situations, although its absence in the past can mean that compilers start smoking crack when it's suddenly provided.
The simplest example is `memcpy(dst, src, len)` and similar iterative byte-copying operations. If the function did not use noalias, the compiler wouldn't be free to optimize individual byte reads/writes into register-sized ones, as the destination may overlap with the source. In practice this means up to 8x more CPU instructions per copy operation on a 64-bit machine.
Note that memcpy specifically may already be implemented this way under the hood, because its contract requires non-overlapping (noalias) pointers; but I imagine similar iterative copying operations can be optimized in the same ad-hoc manner when aliasing information is baked in, like it is with Rust.
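For comparison, a sketch of the Rust analogue, where the borrow rules already encode the no-overlap guarantee without any annotation:

```rust
// `dst: &mut [u8]` can never overlap `src: &[u8]`, so the compiler is
// free to widen these byte writes into register-sized or SIMD copies.
fn copy_bytes(dst: &mut [u8], src: &[u8]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d = *s;
    }
}
```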
Say you have 2 pointers (that might overlap). You (or the compiler) keep a value read through the first pointer in a register, since the value is needed multiple times.
You then write through the second pointer. Now the value you kept in the register is invalidated, since you might have overwritten that memory through the overlapping pointers.
Yes. Specifically, since Rust's design prevents shared mutability, if you have 2 mutable data structures you know that they don't alias, which makes auto-vectorization a whole lot easier.
What about generics (equivalent to templates in C++), which allow compile-time optimizations all the way down that may not be possible if the implementation is hidden behind a void*?
Unless you use `dyn`, all code is monomorphized, and that code on its own will get optimized.
This does come with code-bloat. So the Rust std sometimes exposes a generic function (which gets monomorphized), but internally passes it off to a non-generic function.
This is to avoid monomorphizing the underlying code more than necessary.
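The pattern looks roughly like this (hypothetical names; std::fs uses this shape for its path-taking functions):

```rust
use std::path::{Path, PathBuf};

// The thin generic shim is monomorphized once per caller type...
pub fn config_path<P: AsRef<Path>>(dir: P) -> PathBuf {
    inner(dir.as_ref())
}

// ...but immediately forwards to a single non-generic function, so the
// real body is compiled and optimized only once.
fn inner(dir: &Path) -> PathBuf {
    dir.join("config.toml")
}
```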
> This does come with code-bloat. So the Rust std sometimes exposes a generic function (which gets monomorphized), but internally passes it off to a non-generic function.
There's no free lunch here. Reducing the amount of code that's monomorphised reduces the code emitted & improves compile times, but it reduces the scope of the code that's exposed to the input type, which reduces optimisation opportunities.
In C, the only way to write a monomorphized hash table or array list involves horribly ugly macros that are difficult to write and debug. Rust does monomorphization by default, but you can also use `&dyn Trait` for vtable-like behaviour if you prefer.
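A toy sketch of both flavours:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Circle { r: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 { std::f64::consts::PI * self.r * self.r }
}

// Monomorphized: a separate, fully inlinable copy per concrete T.
fn total_area<T: Shape>(shapes: &[T]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

// Dynamic dispatch: one copy, called through a vtable, much like a
// hand-rolled C struct of function pointers.
fn total_area_dyn(shapes: &[&dyn Shape]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}
```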
I think the way Rust checks borrows also makes it a lot more feasible to avoid allocations/copies; not because it is impossible to do in C, but because doing it in C requires writing very careful documentation and relying on the caller to actually read it. In (safe) Rust this is all checked by the compiler, such that libraries can leverage it without blowing their complexity budget.
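For instance, an API can hand out borrowed views instead of copies, with the lifetime contract enforced by the signature rather than by documentation (toy sketch):

```rust
// Returns a view into `input` rather than an allocation; the signature
// itself guarantees the result cannot outlive the buffer it borrows.
fn first_word(input: &str) -> &str {
    input.split_whitespace().next().unwrap_or("")
}
```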
> When writing single-threaded code takes 90% of the effort of writing a multi-threaded one
That "when" is doing some heavy lifting! More seriously: You raise a very interesting point. When I moved from C++ to Java (10+ years ago), I was initially so nervous to add threads to my Java code. Why? Because it was (then) difficult and dangerous to do it in C++. C++ debuggers were awful, so I didn't think I could debug problems with multi-threaded C++ code. (Of course, the C++ ecosystem has drastically improved in the last 10 years, so I am sure it is now much more pleasant (and safe) to write multi-threaded C++ code.) When I finally sat down to add threads to some Java code, I could not believe how easy it was, including debugging. As a result, going forward, I was much more likely to add threads to my Java... or even start with a multi-threaded design, even if there is only a modest performance improvement.
The Rust version of this is "turn .iter() into .par_iter()."
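With rayon (assuming that crate), the diff really is that small:

```rust
use rayon::prelude::*;

fn sum_of_squares(v: &[i64]) -> i64 {
    // The entire change from the sequential version is iter -> par_iter.
    v.par_iter().map(|x| x * x).sum()
}
```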
It's also true that for both, it's not always as easy as "just make the for loop parallel." Stylo is significantly more complex than that.
> to this I sigh in chrome.
I'm actually a Chrome user. Does Chrome do what Stylo does? I didn't think it did, but I also haven't really paid attention to the internals of any browsers in the last few years.
Concurrency is easy by default. The hard part is when you are trying to be clever.
You write concurrent code in Rust pretty much in the same way as you would write it in OpenMP, but with some extra syntax. Rust catches some mistakes automatically, but it also forces you to do some extra work. For example, you often have to wrap shared data in Arc when you convert single-threaded code to use multiple threads. And some common patterns are not easily available due to the limited ownership model. For example, you can't get mutable references to items in a shared container by thread id or loop iteration.
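For example, a toy sketch of sharing a read-only Vec across two threads:

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // The Vec must be wrapped in Arc before it can be moved into
    // multiple threads; the borrow checker rejects plain sharing.
    let data = Arc::new(vec![1, 2, 3, 4]);
    let handles: Vec<_> = (0..2)
        .map(|i| {
            let data = Arc::clone(&data);
            // each thread sums every other element
            thread::spawn(move || data.iter().skip(i).step_by(2).sum::<i32>())
        })
        .collect();
    let total: i32 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 10);
}
```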
> For example, you can't get mutable references to items in a shared container by thread id or loop iteration.
This would be a good candidate for a specialised container that internally used unsafe. Well, for thread id at least: since the user of the API doesn't provide the id, you could mark the API safe, as you wouldn't have to worry about incorrect inputs.
Loop iteration would be an input to the API, so you'd mark the API unsafe.
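A rough sketch of what such a container might look like (hypothetical and illustrative only; the index-taking method stays unsafe, per the above):

```rust
use std::cell::UnsafeCell;

// One slot per worker. Sound only if no two threads ever touch the
// same index, which is exactly the contract the unsafe method states.
struct Slots<T> {
    slots: Vec<UnsafeCell<T>>,
}

// SAFETY: disjoint indices mean disjoint memory; T: Send allows each
// value to be mutated from whichever thread owns its index.
unsafe impl<T: Send> Sync for Slots<T> {}

impl<T> Slots<T> {
    fn new(mut init: impl FnMut() -> T, n: usize) -> Self {
        Self { slots: (0..n).map(|_| UnsafeCell::new(init())).collect() }
    }

    // SAFETY: caller must ensure no other thread uses `i` concurrently
    // (e.g. one fixed index per thread or per loop iteration).
    unsafe fn get_mut(&self, i: usize) -> &mut T {
        &mut *self.slots[i].get()
    }
}
```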
> Even just spawning a thread is going to make somebody complain that they can't build the code on their platform due to C11/pthread/openmp.
This matches squarely with my experience, but it's not limited to threading, and Rust evades a large swath of these problems by relatively limited platform support. I look forward to the day I can run Rust wherever I run C!
While Rust doesn't have C's platform coverage, it has (by my last check) better coverage than something like CPython currently does.
The big thing though is that Rust is honest about its tiers of support, whereas for many projects "supported platform" often means, for minor platforms, "it still compiles (at least we think it does; when the maintainer tries it and it fails, they will fix it)".
Not to be too glib though, there are obviously tools out there that have as much or more rigor than Rust and cover more platforms. Just... "supported platforms" means different things in different contexts.
All too common (not just with compilers) for someone to port the subset they care about and declare it done. Rust's decision to create standards of compliance and be conscious about which platforms are viable targets and which don't meet their needs is a completely valid way to ensure that whole classes of trouble never come up, despite complaints from some.
In C, one can build data structures with pointers that would require reference counting and heap allocation in Rust. The performance would also depend on what kind of CPU/features it is compiled for.
I'm still confused as to why Linux requires linking against TBB for multithreading, thus breaking CMake configs that don't add an if(linux) branch for TBB. That stuff should be included by default without any effort by the developer.
I don't know the details since I'm mainly a Windows dev, but when porting to Linux, TBB has always been a huge pain in the ass, since it's suddenly an additional required dependency pulled in by gcc. Using C++ and std::thread.
Also clang, and in general the parallel algorithms aren't available on platforms that TBB doesn't support.
C++26 will get another similar dependency, because BLAS algorithms are going to be added, but apparently the expectation is to build on top of battle-tested C/Fortran BLAS implementations.
CPUs are most energy efficient sitting idle doing nothing, so finishing work sooner in wall-clock time usually helps despite overheads.
Energy usage is most affected by high clock frequencies, and CPUs will boost clocks for single-threaded code.
Threads waiting on cache misses let the CPU make use of hyperthreading, which is actually energy efficient (you get context switching in hardware).
You can waste energy in pathological cases if you overuse spinlocks or spawn so many threads that bookkeeping takes more work than what the threads do, but helper libraries for multithreading all have thread pools, queues, and dynamic work splitting to avoid extreme cases.
Most of the time low speed up is merely Amdahl's law – even if you can distribute work across threads, there's not enough work to do.
Multithreading does not make code more efficient. It still takes the same amount of work and power (slightly more).
On a backend system where you already have multiple processes using various cores (databases, web servers, etc) it usually doesn’t make sense as a performance tool.
And on an embedded device you want to save power so it also rarely makes sense.
According to [1], the most important factor for the power consumption of code is how long the code takes to run. Code that spreads over multiple cores is generally more power efficient than code that runs sequentially, because the power consumption of multiple cores grows less than linearly (that is, it requires less than twice as much power to run two cores as it does one core).
Therefore if parallelising code reduces the runtime of that code, it is almost always more energy efficient to do so. Obviously if this is important in a particular context, it's probably worth measuring it in that context (e.g. embedded devices), but I suspect this is true more often than it isn't true.
> Therefore if parallelising code reduces the runtime of that code, it is almost always more energy efficient to do so
Only if it leads to better utilisation. But in the scenario that the parent comment suggests, it does not lead to better utilisation as all cores are constantly busy processing requests.
Throughput as well as CPU time across cores remains largely the same regardless of whether or not you parallelise individual programs/requests.
That's true, which is why I added the caveat that this is only true if parallelising reduces the overall runtime - if you can get in more requests per second through parallelisation. And the flip side of that is that if you're able to perfectly utilise all cores then you're already running everything in parallel.
That said, I suspect it's a rare case where you really do have perfect core utilisation.
> Multithreading does not make code more efficient. It still takes the same amount of work and power (slightly more).
In addition to my sibling comments I would like to point out that multithreading quite often can save power. Typically the power consumption of an all-core load is within 2x the power consumption of a single-core load, while being many times faster, assuming your task parallelizes well. This makes sense because a fully loaded single core still needs all the L3 cache mechanisms, all the DRAM controller mechanisms, etc. to run at full speed. A fully idle system, on the other hand, can consume very little power if it idles well (which admittedly many CPUs do not do).
Edit:
I would also add that if your system is running a single-threaded database and a single-threaded web server, that still leaves over a hundred underutilized cores on many modern server-class CPUs.
If you use a LAMP-style architecture with a scripting language handling requests and querying a database, you can utilize N cores without ever writing a single line of multithreaded code.
Each web request can happen in a thread/process and their queries and spawns happen independently as well.
Multithreading can make an application more responsive, and more performant to the end user. If multithreading causes an end user to have to wait less, the code is more performant.
> Are people making user facing apps in rust with uis?
We are talking not only about Rust, but also about C and C++. There are lots of C++ UI applications. Rust poses itself as an alternative to C++, so it is definitely intended to be used for UI applications too - it was created to write a browser!
At work I am using tools such as uv [1] and ruff [2], which are user-facing (although not GUI), and I definitely appreciate a 16x speedup if possible.
The engine being written in C++ does not mean the application is. You're conflating the platform with what is being built on top of it. Your logic would mean that all Python applications should be counted as C applications.
When a basic question is asked, a basic answer is given. I didn’t say that I think that’s the coolest or most interesting answer. It’s just the most obvious, straightforward one. It’s not even about Rust!
(And also, I don’t think things like work stealing queues are relevant to editors, but maybe that’s my own ignorance.)
You cannot have it both ways though. Either these are meaningful examples of Rust's benefits, or they are not worth mentioning.
In a thread about Rust's concurrency advantages, these editors were cited as examples. "Don't block the UI thread" as justification only works if Rust actually provides something novel here. If it is just basic threading that every language has done for decades, it should not have been brought up as evidence in the first place.
Plus, if things like work-stealing queues and complex synchronization are not relevant to editors, then editors are a poor example for demonstrating Rust's concurrency story in the first place.
Well, what about small CLI tools like ripgrep? Does multithreading not matter when we open a large number of files and process them? What about compilers?
Sure. But the more obviously parallel the problem is (visiting N files) the less compelling complex synchronization tools are.
To over-explain: if you just need to make N forks of the same logic, then it's very easy to do this correctly in C. The cases where I'm going to carefully maintain shared mutable state with locking are the cases where the parallelism is less efficient (Amdahl's law).
Java style apps that just haphazardly start threads are what rust makes safer. But that’s a category of program design I find brittle and painful.
The example you gave of a compiler is canonically implemented as multiple processes making .o files from .c files, not threads.
> The example you gave of a compiler is canonically implemented as multiple processes making .o files from .c files, not threads.
This is a huge limitation of C's compilation model, and basically every other language since then does it differently, so I'm not sure that's a good example. You do want some "interconnection" between translation units, or at least less fine-grained units.
It reminds me of the joke where someone claims "I can do math very fast", gets tested with a multiplication, and immediately answers with some total bollocks number.
- "That's not even close"
- "Yeah, but it was fast"
Sure, it's not a trivial problem, but why wouldn't we want better compilation results/developer ergonomics at the price of more compiler complexity and some minimal performance penalty?
And it's not like the performance comes without its own set of negatives: header-only libraries, for example, are a hack that grew directly out of this compilation model.