"We found that due to these weak defaults, MongoDB’s causal sessions did not preserve causal consistency by default: users needed to specify both write and read concern majority (or higher) to actually get causal consistency. MongoDB closed the issue, saying it was working as designed, and updated their isolation documentation to note that even though MongoDB offers “causal consistency in client sessions”, that guarantee does not hold unless users take care to use both read and write concern majority. A detailed table now shows the properties offered by weaker read and write concerns."
That sounds like a valid redress, or am I missing something?
Kyle's point is that it's arguably valid but certainly unhelpful: the default settings are liable to lead to data loss. Moreover, he draws attention specifically to transactions, which you would expect to make things safer; in fact, a rather arcane part of the documentation notes that you need to manually specify both read and write concerns on every transaction individually if you want transactions to behave consistently, regardless of the concerns specified at the database level.
Basically, there are a large number of pitfalls that it's very easy to fall into unless you have an encyclopaedic knowledge of the documentation, and you need to ignore some of the words that are used (like "transaction" or "ACID") because they carry connotations that either do not apply or only apply if you do extra work to make it so.
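To make that pitfall concrete, here's roughly what that corner of the docs demands, sketched in pymongo (connection string and names are invented):

```python
from pymongo import MongoClient
from pymongo.read_concern import ReadConcern
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")

# Collection-level safety settings -- per the docs Kyle cites,
# transactions ignore these entirely.
orders = client.shop.get_collection(
    "orders",
    read_concern=ReadConcern("majority"),
    write_concern=WriteConcern("majority"),
)

with client.start_session() as session:
    # Both concerns have to be restated on every single transaction;
    # omit them and the transaction runs with the weaker defaults.
    with session.start_transaction(
        read_concern=ReadConcern("snapshot"),
        write_concern=WriteConcern("majority"),
    ):
        orders.insert_one({"status": "paid"}, session=session)
```

Note how the collection-level settings buy you nothing once you open a transaction; that's the trap.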
> the default settings are liable to lead to data loss
In Mongo's defense, the defaults are similar to what you would likely have with a replicated MySQL/Postgres cluster (a single node accepting writes with slaves replicating from there; no concept of write concern). My assumption is that he means the case where the primary dies before the writes have replicated to the secondaries; that is exactly how master-slave setups fail too. Maybe there are systems folks can use to get write-concern-like guarantees in those databases, but in the companies I've worked for we didn't have them, and we definitely didn't have automated failovers.
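(For anyone who hasn't used Mongo: "write concern" is just how many nodes must acknowledge a write before the call returns. A rough pymongo sketch, names invented:)

```python
from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")
events = client.app.events

# w=1 (the default the report criticizes): only the primary acks, so if
# it dies before replicating, the write vanishes -- the classic
# master-slave failure mode described above.
events.insert_one({"kind": "signup"})

# w="majority": the call blocks until a majority of the replica set has
# the write, so a single node failure can no longer silently drop it.
events.with_options(
    write_concern=WriteConcern(w="majority")
).insert_one({"kind": "signup"})
```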
How is this any different than DynamoDB, where you specify whether you want eventual or strong consistency? DDB also does eventually consistent reads by default.
Is the argument that Mongo’s documentation isn’t clear?
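For reference, the DDB toggle is per-request (boto3 sketch; the table name is invented):

```python
import boto3

table = boto3.resource("dynamodb").Table("users")

# Default: eventually consistent read -- may return stale data.
stale_ok = table.get_item(Key={"id": "42"})

# Opt-in strongly consistent read.
fresh = table.get_item(Key={"id": "42"}, ConsistentRead=True)
```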
"In order to obtain snapshot isolation, users must be careful not only to set the read concern to snapshot for each transaction, but also to set write concern for each transaction to majority. Astonishingly, this applies even to read-only transactions."
"This behavior might be surprising, but to MongoDB’s credit, most of this behavior is clearly laid out in the transactions documentation… MongoDB offers database and collection-level safety settings precisely so users can assume all operations interacting with those databases or collections use those settings; ignoring read and write concern settings when users perform (presumably) safety-critical operations is surprising!"
There is a difference between “Mongo’s documentation sucks” and “Mongo is technically deficient”. The former can be corrected by updating the documentation.
Yes, I agree: as far as the end user is concerned, they are losing data either way.
I think the implication here is that "Mongo's documentation is deliberately bad in order to hide their technical deficiencies," i.e. they're hoping people will use the defaults, be impressed by the speed, and never realize until it's too late that they're not getting the consistency they were promised.
DynamoDB conditional writes are strongly consistent. Defaulting to inconsistent reads was reckless and I would never defend that, but the worst case is non-repeatable stale results, never lost writes.
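That is, the conditional-write path always evaluates against the latest committed item (boto3 sketch, names invented):

```python
import boto3

table = boto3.resource("dynamodb").Table("accounts")

# The condition is checked against the current committed item, never a
# stale replica, so a concurrent writer trips the condition and gets an
# error instead of being silently overwritten.
table.put_item(
    Item={"id": "42", "balance": 100},
    ConditionExpression="attribute_not_exists(id)",
)
```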
That's the right way to disclose a dangerous default, but defaults should be as safe as possible, and users should think very carefully about whether they can get away with opting out. Consistency failures can be very non-intuitive, and hard to clean up after.
So now we shouldn’t ever trust a project because they don’t have good technical writers?
I don’t have a dog in the Mongo fight. I haven’t done an implementation on top of it in years, and the next time I do something with “Mongo” it will probably be AWS’s DocumentDB with Mongo support. That’s based on AWS’s own code and storage tier and doesn’t have the same characteristics as Mongo proper.
For what it's worth, Document DB doesn't support a lot of the Mongo API, such as $$ROOT in aggregations, and it can't use indices on (paraphrased) "SELECT * FROM x WHERE id IN [list]" if the list length is > 10.
If you ask me, if there's something worse than Mongo it's Document DB.
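For the curious, the query I paraphrased is, in Mongo syntax (pymongo sketch, names invented):

```python
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

# Per my experience above: once this list grows past 10 entries,
# Document DB reportedly stops using the _id index and scans instead.
ids = ["a1", "a2", "a3"]  # imagine a longer list here
docs = list(client.db.x.find({"_id": {"$in": ids}}))
```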
Given a choice between the limitations of DynamoDB and those of DocumentDB, DocumentDB is still far more versatile; it’s built on top of the same storage engine and is just as reliable.
And yes, I know most of the data modeling tricks around using GSIs and LSIs.
> So now we shouldn’t ever trust a project because they don’t have good technical writers?
> the newer MongoDB 4.2.6 has more problems including “retrocausal transactions” where a transaction reverses order so that a read can see the result of a future write.