> The trouble is that we software engineers have spent so long working in an artificially deterministic world that we're not used to designing and evaluating probabilistic quality control systems for computer output.
I think that's a mischaracterization and not really accurate. As a trade, we're familiar with probabilistic/non-deterministic components and how to approach them.
You were closer when you used quotes around "AI Engineer" -- many of the loudest people involved in generative AI right now have little to no grounding in engineering at all. They aren't used to looking at their work through "fit for purpose" concerns, compromises, efficiency, limits, constraints, etc -- whether that work uses AI or not.
The rest of us are variously either working quietly, getting drowned out, or patiently waiting for our respected colleagues-in-engineering to document, demonstrate, and mature these very promising tools for us.
And didn't those small or large subsets of occasional or consistent bad reasoners we may sometimes have called "users" (within the privacy of our four walls) reinforce the idea of a proper, reasonable stance -- both by contrast and by forcing us to look at things objectively while trying to understand their "rants"?
Take all computers and make it so all memory has a 0.1-5% chance of bit flipping any second (depending on cost and temperature). Suppose this just became a fundamental truth of reality: any bit, anywhere in memory. It would completely turn SWE work on its head.
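To make the hypothetical concrete, here's a toy sketch (mine, not from anyone upthread) of what that memory would behave like, with every bit getting an independent chance to flip on each tick:

```python
import random

def tick(memory: bytearray, flip_prob: float = 0.001) -> None:
    """Flip each bit in the buffer independently with probability flip_prob."""
    for i in range(len(memory)):
        for bit in range(8):
            if random.random() < flip_prob:
                memory[i] ^= 1 << bit  # flip this one bit in place

mem = bytearray(b"hello world")
for _ in range(60):  # one simulated minute at one tick per second
    tick(mem)
print(mem)  # very likely no longer b"hello world"
```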
This is kind of how traditional engineering is, since reality is analog and everything is on a spectrum interacting with everything else all the time.
There is no simple function where you put in 1 and get out 0. Everything in reality is "put in 1 +/- 0.25, get out 0 +/- 0.25." It's the reason the complexity of hardware is trivial compared to the complexity of software.
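To put numbers on that, a minimal sketch of a "function" as classic engineering sees one, using the tolerances above (the part name is illustrative):

```python
import random

def inverter(x: float) -> float:
    """Ideal spec: put in 1, get out 0. Real part: slop on both ends."""
    x_seen = x + random.uniform(-0.25, 0.25)             # input tolerance
    return (1.0 - x_seen) + random.uniform(-0.25, 0.25)  # output tolerance

print([round(inverter(1.0), 3) for _ in range(5)])  # clustered near 0, never exactly 0
```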
That's not really engaging with the point because you're suggesting turning all of our tools into something grossly unreliable. Of course that's a radical shift from what anybody's used to and undermines every practice in the trade.
But your mistake just reinforces what I wrote, because it's the same mistake the "loud people" make when they think about generative AI. They imagine it as a wholesale replacement for how projects are implemented, and even for how they're built in the first place.
But the many experienced engineers looking at generative AI recognize it as one of many tools they can turn to while building a project that fulfills their requirements. And like all their tools, it has capabilities, costs, and limitations that need to be considered. That it's sometimes non-deterministic is not a new kind of cost or limitation. It's a challenging one, but not a novel one, and one just mindfully (or analytically) considers whether and how that non-determinism can be leveraged, minimized, etc. That is engineering, and it's what many of us have been doing with all sorts of tools for decades.
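To make that concrete, a minimal sketch (my framing, not a claim about any particular library) of treating a non-deterministic producer like any other component: validate its output and bound the retries, rather than trusting any single result:

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def with_validation(produce: Callable[[], T],
                    acceptable: Callable[[T], bool],
                    max_attempts: int = 3) -> T:
    """Sample a non-deterministic producer until an output passes the checks."""
    for _ in range(max_attempts):
        candidate = produce()
        if acceptable(candidate):
            return candidate
    raise RuntimeError("no acceptable output within the retry budget")

# Usage (hypothetical names): wrap an LLM call, a noisy sensor read, etc.
# result = with_validation(lambda: model.generate(prompt), is_valid_json)
```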
Perhaps I am not explaining this well. What you call grossly unreliable and a radical shift from what any [SWE] is used to, is called Tuesday afternoon for a mechanical, electrical, civil, chemical, etc. etc. engineer. Call them classic engineers.
Statistical outputs are the only outputs of classic engineering. You have never in your life assigned x = 5 and then later queried it and gotten back x = 4.83. But that happens all the time in classic engineering, to the point that it is classic engineering.
That's what the OP is trying to get across. LLMs are statistical systems that need statistical management. SWEs don't deal with statistical systems because, like you said:
>[statistical software systems would be] turning all of our tools into something grossly unreliable. Of course that's a radical shift from what anybody's used to and undermines every practice in the trade.
Which is exactly why OP is saying SWEs need a new approach here.
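For what "statistical management" can look like in code, a minimal sketch (names and thresholds are illustrative, not OP's): accept or reject a batch of outputs based on its distribution, the way a classic engineer accepts a batch of parts, instead of asserting a single exact value:

```python
import statistics

def accept_batch(samples: list[float], target: float,
                 mean_tol: float, max_stdev: float) -> bool:
    """Pass if the sample mean is near target and the spread stays bounded."""
    return (abs(statistics.mean(samples) - target) <= mean_tol
            and statistics.stdev(samples) <= max_stdev)

readings = [4.83, 5.12, 4.97, 5.05, 4.91]  # "x = 5", as actually measured
print(accept_batch(readings, target=5.0, mean_tol=0.1, max_stdev=0.2))  # True
```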
You seem to be saying that because we don't only deal with "statistical systems," we never deal with them at all, or otherwise aren't institutionally or professionally familiar with them.
This is simply not the case.
Your career path may have only ever used deterministic components that you could fully and easily model in your head as such -- assigning to and reading from some abstract construct like the variable in your example. I don't really believe this is true for you, but it's what you seem to be letting yourself believe.
But for many of the rest of us, and for the trade as a whole, we already use many tools and interface with many components that are inherently non-deterministic.
Sometimes this non-determinism is a deliberate effect of the program itself, as with generative AI models or chaotic/noisy signal generators. In fact, such components are used in developing generative AI models. They didn't come out of nowhere!
Other times, this non-determinism is from non-software components that we interface with, like sensors or controllers.
Sometimes we combine both into things like random number generators with specific distribution characteristics, which we use to engineer specific solutions like cryptography products.
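In Python, for instance, that split is right in the standard library: `secrets` draws unpredictable bytes for cryptographic material, while `random` shapes pseudo-random output into a chosen distribution for simulation or noise modelling:

```python
import random
import secrets

key = secrets.token_bytes(32)           # unpredictable: suitable for crypto
sensor_noise = random.gauss(0.0, 0.25)  # shaped: Gaussian noise for simulation
print(key.hex(), sensor_noise)
```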
Regardless, the trade has collectively been doing it every day, for decades longer than anybody on this forum has been alive.
Software engineering is not all token CRUD apps and research notebooks or whatever. We also build cryptography products, firmware for embedded systems, training systems for machine learning, etc -- all of which bring experience with leveraging non-deterministic components as some of the pieces, exactly like we quiet, diligent engineers are already doing with generative AI.
You're missing his point. He's saying that if you make a program, you expect it to do X reliably. X may include "send an email, or kick off this workflow, or add this to the log, or crash," but you don't expect it to, for example, "delete system32 and shut down the computer." LLMs have essentially unconstrained outputs, whereas the above-mentioned program couldn't possibly delete anything or shut down your computer, because nothing even close to that is in the code.
Please do not confuse this example with agentic AI losing the plot, that's not what I'm trying to say.
Edit: a better example is that when you build an autocomplete plugin for your email client, you don't expect it to also be able to play chess. But look what happened.
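One way to read the gap is that the constraint has to be put back deliberately. A minimal sketch (entirely hypothetical names) of refusing to act on anything outside a closed set of allowed actions:

```python
ALLOWED_ACTIONS = {"send_email", "start_workflow", "append_log"}

def constrain(model_reply: str) -> str:
    """Map free-form model output onto a finite action set, or refuse."""
    action = model_reply.strip().lower()
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"refusing out-of-spec action: {action!r}")
    return action

print(constrain("send_email"))   # ok
# constrain("delete system32")   # -> ValueError, never reaches execution
```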
> I think that's a mischaracterization and not really accurate. As a trade, we're familiar with probabilistic/non-deterministic components and how to approach them.
> You were closer when you used quotes around "AI Engineer" -- many of the loudest people involved in generative AI right now have little to no grounding in engineering at all. They aren't used to looking at their work through "fit for purpose" concerns, compromises, efficiency, limits, constraints, etc -- whether that work uses AI or not.
> The rest of us are variously either working quietly, getting drowned out, or patiently waiting for our respected colleagues-in-engineering to document, demonstrate, and mature these very promising tools for us.
Everything else you said is 100% right, though.