
> how do you argue for that

Zero downtime and zero data loss are impossible, both theoretically and practically. Even a Galera cluster is going to have downtime if a node fails (timeouts expiring, etc.).

Really, it comes down to which trade-offs the company is willing to accept: the potential for stale reads in exchange for high availability, slow or aborted writes for strong consistency, downtime for reduced complexity, etc.

A few notes about Galera: its performance is going to degrade severely if one of the nodes decides it needs to be rebuilt - both that node and the donor node will effectively be out of the loop, with the network link between them saturated for the duration of the state snapshot transfer. That degraded state doesn't require the node to go down; it can happen spontaneously.

Galera also imposes limits on write frequency and location - if you're doing many writes, you don't want to split those writes between nodes, since every write has to be certified against every other pending transaction on every node before it can commit.

An automated RDBMS master failover can complete in under 30 seconds, easily. You can also typically run master-master without a risk of split brain by assigning even and odd primary keys to the two masters.
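In MySQL/MariaDB the even/odd split is usually done with the auto-increment settings, so the two masters can never generate colliding keys. A minimal sketch (hypothetical my.cnf fragments):

```ini
# Master A - generates odd primary keys (1, 3, 5, ...)
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 1

# Master B - generates even primary keys (2, 4, 6, ...)
[mysqld]
auto_increment_increment = 2
auto_increment_offset    = 2
```

Both masters step by 2 but start at different offsets, so inserts on either side stay disjoint even while replication lags.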

> cloud VMs are expected to fail

Yup, but that timeframe is usually on the order of months or years. And if you use externally backed disk stores (at the cost of read/write speed), you can be back up in a matter of minutes with a single master. Even a full restore from backups can typically be done in 10-15 minutes, constrained mostly by your network speed.
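The restore time is easy to sanity-check with back-of-envelope arithmetic; the sizes and link speed below are assumptions, not measurements:

```python
# Restore time is dominated by moving the backup over the network.
backup_size_gb = 100   # assumed backup size
link_gbps = 1          # assumed usable network bandwidth

seconds = backup_size_gb * 8 / link_gbps  # GB -> Gb, then divide by Gb/s
print(f"{seconds / 60:.1f} minutes")      # ~13.3 minutes at 1 Gbps
```

At an assumed 100 GB over 1 Gbps that lands right in the 10-15 minute range; a 10 Gbps link cuts it to under two minutes.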


