
Why do you think NoSQL means not important data?

MongoDB eats data but I am not sure every NoSQL database does.

NoSQL does add certain kinds of complexity, but it also simplifies certain problems. It depends on where your hard problems are.



Because in my two decades in the industry I have seen RDBMSs make the world spin and NoSQL DBs destroy companies and families.

MongoDB and CouchDB eat data for breakfast; I know that from first-hand experience. And all the other DBs that claim they don't keep cropping up on Aphyr's blog.

I ain't saying that all NoSQL dbs are useless. I'm just saying that proposing and choosing an RDBMS solution is going to be the right choice for 99% of the projects.

Yes, most people think that they belong in that 1% with the infrastructure problems and big data of Google, FB and Twitter but... they don't.


In the last 2 decades in the industry as well I've never lost data with MongoDB, Riak or Cassandra but have with Oracle, DB2 and PostgreSQL. After all, databases are just software and there will always be bugs. Some people just get tripped up by different ones.

And you are woefully ignorant to think an RDBMS is the right choice for 99% of projects, especially since you think that the remaining 1% of users are purely worried about scalability. Hint: think about the schema problems associated with storing auto generated features from deep learning models.


>In the last 2 decades in the industry as well I've never lost data with MongoDB, Riak or Cassandra but have with Oracle, DB2 and PostgreSQL

Yet every test proves otherwise. Also, use Google to see how people have lost data with MongoDB. Mongo is not considered a serious piece of technology by any scientist or engineer I know. Postgres, though, is universally considered an engineering marvel.

>Hint: think about the schema problems associated with storing auto generated features from deep learning models.

Hint: The problem you mentioned? Even less than 1%.

Calling me ignorant doesn't change reality, you know.


Is data loss something inherent to NoSQL tech, or just the result of poor implementations?

If it's the latter, why haven't there been any reliable NoSQL implementations?

Perhaps it's well suited to non-transactional, lo-fi data?


NoSQL DBs usually target distributed environments.

So... enter CAP theorem. There's no free lunch. People think we can simply throw away half a century's worth of science because JSON and schemaless are teh awesome derp derp.

Implementation is surely an issue, if you take into account that the MongoDB guys had to acquire another company [1] in order to overcome their abysmal write performance. And yet there were people, and benchmarks, trying to tell us that Mongo was faster than the RDBMS alternatives. All this circa 2009-2012.

You know what's faster than everything? Writing to /dev/null ;)

Anyway, depending on your use case there might be a NoSQL DB out there that fills your needs and actually delivers what it claims it can deliver. But it's hard to sift through all the ad-driven, buzzword-ridden infomercials that get thrown around by start-up companies in the DB domain.

Also, DBs are like filesystems; even if the math/science is correct, one needs at least a decade of proven track record before you can say that it works as advertised.

[1] http://www.informationweek.com/software/information-manageme...


> NoSQL DBs usually target distributed environments. So... enter CAP theorem.

Surely FB is not running MySQL on a single machine. Perhaps I am misunderstanding you, but saying SQL DBs don't face the issues of distribution seems a little strange.

Distribution comes into picture from shape and size of the data not data saving/retrieval techniques. yea?


FB and all big companies are a very bad example. They have a ton of resources and usually don't use vanilla products, since they have the engineering capacity to support their own forked versions; see, e.g., their own version of PHP.

Also, distributing reads is easy; writes... not so much. NoSQL systems usually offer distributed writes with the caveat of eventual consistency. RDBMS have referential integrity and other constraints which by definition cannot migrate into a distributed environment. Or at least there's no one-size-fits-all solution.

> Distribution comes into picture from shape and size of the data not data saving/retrieval techniques. yea?

Most definitely not. It has nothing to do with the shape and size of the data. Also, there's no such thing as "distribution" in our context, only "distributed", from "distributed computing" [1], and it has everything to do with data saving and retrieval :)

[1] https://en.wikipedia.org/wiki/Distributed_computing
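
To make the "caveat of eventual consistency" concrete, here is a toy sketch in Python. It is not how any particular NoSQL product works; the Replica class, the sync function and the last-write-wins rule are all assumptions chosen for illustration. Two replicas accept writes independently and reconcile later by timestamp:

```python
class Replica:
    """Toy replica (hypothetical): stores key -> (timestamp, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, value, ts):
        # Last-write-wins: keep the incoming value only if it is newer.
        current = self.data.get(key)
        if current is None or ts > current[0]:
            self.data[key] = (ts, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(a, b):
    """Exchange entries in both directions; the newest timestamp wins."""
    for src, dst in ((a, b), (b, a)):
        for key, (ts, value) in list(src.data.items()):
            dst.write(key, value, ts)


a, b = Replica(), Replica()
a.write("cart", ["book"], ts=1)   # client 1 talks to replica a
b.write("cart", ["pen"], ts=2)    # client 2 talks to replica b
before = (a.read("cart"), b.read("cart"))  # replicas disagree here
sync(a, b)
after = (a.read("cart"), b.read("cart"))   # replicas agree again
```

Before the sync the two replicas return different answers for the same key. After it they agree, but under this rule the write with the older timestamp is silently dropped, which is one way a store can "eat" data without any bug in the implementation.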


>RDBMS have referential integrity and other constraints which by definition cannot migrate into a distributed environment.

So:

Use an RDBMS if your data can be handled by a single machine (or if you have the resources of FB)? The "99% of people need an RDBMS" argument boils down to: 99% of people have data that can be handled by a single-machine RDBMS.

Is that a good conclusion?


The single machine shouldn't be the deciding factor.

If your application is like most apps (far more reads than writes), then you can easily distribute the load across multiple machines. If you have more writes than reads (quite rare, but still), then scaling an RDBMS will be challenging.

In this case, if eventual consistency is something you can live with, a NoSQL store might be best for you.
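
As a rough sketch of the read-heavy case in Python: send every write to a single primary and spread reads across replicas. The Router class and the node names are hypothetical, and the SELECT check is a deliberately crude assumption, not how a real proxy or driver classifies statements.

```python
import itertools


class Router:
    """Hypothetical router: writes go to one primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # simple round robin

    def route(self, sql):
        # Crude classification: anything that is not a SELECT counts as a write.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._replicas) if is_read else self.primary


router = Router("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users"),
    router.route("SELECT * FROM orders"),
    router.route("INSERT INTO users VALUES (1)"),
]
```

Adding replicas raises read capacity, but every write still lands on the one primary, which is why the write-heavy case is the hard one. Replicas also usually lag the primary a little, so a read issued right after a write may see stale data.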


MongoDB doesn't eat data any more than any other database.

It's used by eBay, Foursquare, Adobe, Facebook etc.

And NoSQL databases underpin most of the popular websites around today. It's nonsense to assume all of that data isn't valuable.


Foursquare? For real?

Like what's gonna happen if they have a couple of corrupt records? A minor inconvenience at worst?

Is anyone gonna lose millions? Nah. Anyone gonna die? Nah. Anyone gonna get sued? Naaaaaah

Also, Facebook uses MySQL for their primary data. Pretty sure it's the same for eBay. Don't know about Adobe; I bet it's the same deal there too.

People get so excited when they hear some big company is using X, but they have no clue in what capacity it's used. I can guarantee you that all the data that matters, that needs to be consistent and whole, is in some kind of RDBMS.


MongoDB is used at Facebook for Parse, at eBay for analytics and at Adobe for Experience Manager. All are pretty important parts of their business, the last one in particular: data loss there would cause the biggest shockwave in the web community.

But there's no point discussing it with you, since you think Sony PlayStation Network, Apple iCloud, Office 365 etc. aren't important data to these companies.


Parse is absolute garbage. That's why it was shuttered: https://techcrunch.com/2016/01/28/facebook-shutters-its-pars...


Have you actually used Parse? Obviously not, because otherwise you wouldn't dare mention that POC in this discussion. Hint: search around for people's experiences with it.

There's no point discussing with me, because you can't have a coherent debate. Analytics data are neither critical nor primary. You really have to reread what I said.



