Google sued for data-mining students' email (sophos.com)
65 points by sp8 on March 19, 2014 | 72 comments


What do they mean by "Google is reading our mail" - has Google reached sentience?

Just to display it on GMail, one of Google's servers has to read the mail. If they claim that Google has achieved sentience, the implications could be very far reaching...

I mean perhaps this lawsuit will clarify when algorithms will be considered sentient.


It's interesting how corporations are only people when it's convenient.


By "reading" I think they mean analyzing the content versus merely serving up text.


the exact same arguments could be applied to the NSA


If they were an email provider, yes.


Except that the NSA doesn't have a website for checking my mail and I am not explicitly sending my mail to them.


Your post had nothing to do with that. You claimed Google wasn't "reading" email because it was all automatic scripts, behind some facetious rhetoric.

The situation with the NSA collecting and reading email is no different, specifically in terms of data-mining the contents.


I wasn't claiming anything, I am asking what it is supposed to mean. It's interesting that a machine reading something can be described as the action of a juristic person. I do think the implications would be that web mail becomes impossible.

What exact part of "reading" is a privacy violation?


The plaintiffs are seeking payouts for millions of Gmail users. The financial damages would amount to $100 per day of each day of violation for every individual who sent or received an email message using Google Apps for Education during a two-year period beginning in May 2011.

So assuming at least 1 million users, they are seeking:

1,000,000 users * $100 per user per day * 365 days per year * 2 years = ~$73 billion in damages.
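As a quick check of that arithmetic, here is a minimal sketch. The 1 million user count is the assumption made above, not a figure from the suit, and the variable names are only illustrative.

```python
# Rough check of the damages estimate above.
USERS = 1_000_000        # assumed lower bound on affected users (not from the suit)
DAMAGES_PER_DAY = 100    # dollars per user per day of violation
DAYS_PER_YEAR = 365      # ignores the leap day in 2012
YEARS = 2                # two-year period beginning May 2011


def total_damages(users: int, per_day: int, days_per_year: int, years: int) -> int:
    """Total claimed damages in dollars under the assumptions above."""
    return users * per_day * days_per_year * years


total = total_damages(USERS, DAMAGES_PER_DAY, DAYS_PER_YEAR, YEARS)
print(f"${total:,}")  # $73,000,000,000
```

Every extra million users adds another $73 billion under the same assumptions.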

I'm not commenting on whether data-mining on Google's part was right or wrong, but why isn't there any limit to the amount that companies can be sued for?

It seems impractical for the people suing to sue for around 73 billion dollars when the product/service is essentially free.


The cost of the product is surely irrelevant here: If the product damages your life (in an unexpected way), you should be able to sue to recoup those damages.

In this case, presumably the individuals may have been signed up to Google Apps for Education without their direct consent (or, it was an unstated requirement of their courses to sign up).

Also note: companies have access to liability insurance to cover them for being sued out of existence. Google presumably have a budget for lawsuits.


The problem seems to be that there actually haven't been any damages. Not a single person has been negatively affected by these targeted ads and nobody has had their personal information actually read. They have no chance with this lawsuit.


My experience with Google Apps is when you create a new user the first time they log in they are presented with the T&Cs they have to agree to before they can use the account.


If this Apps for Education platform is targeted at children, are many unable to consent to terms perhaps?


also (1) children in school may have no choice but to agree, and (2) just because you agree to something in a contract doesn't necessarily mean it's enforceable. For example there are some rights you can't sign away in a contract.


Apps for Education targets Universities.


Quoting from the article: "Apps for Education is used by K-12 schools and institutions of higher education throughout the world..."

Go to the Apps for Education page at Google, select "Customers", choose "K12" and it lists 10 primary and secondary education school systems as profiled "customer stories".

For example, "Saline Area Schools is a suburban/rural public school district of 5,450 students K-12 located in southeast Michigan, about 50 miles west of Detroit." ... "Encouraged by the success with faculty and staff, the Saline IT team will soon roll Google Apps out to their 3,200 students in grades 5-12."

https://docs.google.com/file/d/0B5AOHQcS-cAeODQ3MGVkNGEtNzJm...


This is the full list of places where they're used: http://www.google.com/enterprise/apps/education/customers.ht...

Most are universities.


That is not "the full list". That is a list of "customer stories", and certainly a subset of the full list.

In fact, it's the site I suggested you visit. Next, use the pull-down under "all types" to see that the three targets are "K12", "Organization", and "University".


There are two premises behind imposing damages in civil suits:

1) Restitution for the injured parties. Hard to measure in a case like this, but you can essentially think of this as the digital equivalent of conversion (i.e. using someone else's property for personal profit without permission). So the fair restitution might be whatever dollar figure the person would have been willing to have been paid to have that information data mined.

2) Deterrence of harmful behavior. The proper measure for this should be (dollar amount of profit from choosing the harmful course of action) / (probability of getting caught). I.e. if you do something harmful and profit $10 million, and have a 10% chance of getting caught, then assessing restitution and punitive damages of $100 million makes taking the harmful course of action economically irrational.

An arbitrary cap, or a cap based on the price of the product, is not relevant to either measure.
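The deterrence measure in (2) can be sketched in a few lines. This only illustrates the formula as stated above, with a hypothetical function name; it is not how any court actually computes damages.

```python
def deterrent_damages(profit: float, probability_caught: float) -> float:
    """Damages that make the harmful action break even in expectation.

    Implements (profit from the harmful action) / (probability of getting caught).
    """
    if not 0 < probability_caught <= 1:
        raise ValueError("probability_caught must be in (0, 1]")
    return profit / probability_caught


# The example from the comment: $10 million profit, 10% chance of getting caught.
damages = deterrent_damages(10_000_000, 0.10)

# Expected penalty = damages * probability, which equals the profit,
# so the expected gain from the harmful action is zero.
expected_penalty = damages * 0.10
```

Anything above that figure makes the expected gain negative. The price of the product never enters the calculation.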


> but why isn't there any limit to the amount that companies can be sued for?

Because there is no limit on the damage that they could have done (regardless of the cost of the service). In many cases non-customers may be harmed (e.g. Deepwater Horizon), so the cost of the service has no relevance to the amount of potential damage that can be caused.

I'm not claiming that this claim is reasonable or that the amount of damage claimed here is remotely correct.


Initial damage claims are often made on the assumption that once all appeals and negotiations are done you'll be very lucky to walk away with even 5% of the original number.


Gmail is free to you, but their ads can earn more money by mining the info in your email. Is this really free here? Your email content has value.


Exactly. If anything, Google should pay the user for the privilege of mining their email. Maybe not much, since they do provide such a useful service, but even $20 per year would be pennies on the dollar for the value of the user's info. It wouldn't be enough for me to start using their service again, but for a lot of people it would be.


Small companies can be sued into bankruptcy. Why should large companies be exempt?


The suit maintains that, because such non-Gmail users who send emails to Gmail users never signed on to Google's terms of services, they can never have given, in Google's terms, "implied consent" to scan their email.

This. One victim is the private mailing list: There's always at least one sap who subscribes using google mail.

My personal hope is that suits like this will one day push them to discontinue gmail.


But this is nonsense. The sender doesn't need to consent; the recipient has all the legal power over his mail. If I want to engage a company to keep files on all my paper mail, and all associated metadata, there's nothing you can do to stop me.


> If I want to engage a company to keep files on all my paper mail, and all associated metadata, there's nothing you can do to stop me.

(my emphasis)

Well, in Norway I could report the company for storing personal data without a license to do so. Essentially any database (even list of phone numbers[1]) that contains personal data (name, contact information, any other information) is regulated. The laws have been somewhat modernized, so that it is assumed that it is ok for a school to give out (and keep) contact information to households/parents of a class -- or for a business to keep a database of customers, for example -- but you most certainly are not allowed to "just keep a lot of data on people because you can".

If you were to do it as a private person, you would most likely never get caught, of course -- but as a business you'd essentially be committing a crime -- and could face (rather steep) fines.

This is why Norway (along with a few other European countries) has taken a rather dim view of Facebook -- and note that with Facebook, users do consent in general (barring the shadow profiles, information uploaded by users regarding non-users etc).


Oh, look, a troll.



If you send an email to a gmail user of course you are consenting for their email service to process the email. Things like spam filtering and preloading images depend on this.


If you send a letter to an address of course you are consenting for the postman to read your mail.

This is all about moving from offline to online.

In the ancient days pre-email it was a non-trivial task to read every single letter being sent in a country, categorize them all and then profile people based on them.

And the countries that did that weren't somewhere you wanted to live.

In the modern age it's a "feature" that companies provide because "if you're not paying for it, you're the product".

Not that I think it's necessarily bad, but boundaries have to be set for this new age and this lawsuit is simply part of that process. Just look at it from a different perspective.

(And note that spam filters & image preloading are problems that only need to be solved because of the same new trivial cost of automated malicious actions)


>If you send a letter to an address of course you are consenting for the postman to read your mail.

It's more like if the person you are sending to hires an assistant to sort and organize their mail for them (which people actually do.) You then go and sue the assistant.


If the assistant reads confidential mail that they have not been approved to read, yes, they can get sued for that.

Many people sign NDAs. If you have an assistant read mails covered by the NDA, then the party who sent the mail can sue.

and so on and so on.


Perhaps, but that seems pretty unreasonable. The assistant doesn't know that, and we can't ban assistants entirely because some people aren't responsible with their own mail.

I'm not sure how common it is, and it's probably less so now with email, but I'd guess a lot of important people do have their assistants screen their confidential mail. For example, I know politicians often have people help with their mail.


Let's pull apart this analogy a bit more.

A new cleaning company called CleanU creates an app where people can easily order office cleaning with a single press of a button. In the ToS/service/liability contract on page 43, it says in legalese: "any document found by our assistants might be used to improve service and decrease costs".

This might sound like a great way to start a legit industrial espionage service, but alas, it would be illegal under several laws. First, a judge would ask whether there had been a meeting of the minds, and would declare the contract void since no buyer could possibly have agreed to such terms. Second, as a product (sale), one could ask whether "theft" could reasonably be expected when one purchases cleaning. Since it can't, the contract can be voided in that way. Third, if the intent is to hide an otherwise illegal activity under the assumption that people do not read EULAs, that would qualify for criminal charges under mens rea, thus fraud.

In all, reading people's mail because you managed to get them to click a box during registration is a deal on very unstable legal ground. Lawsuits like this will explore exactly how unstable it is.


But this isn't a cleaning service. It should be reasonably expected that they automatically process your mail for spam filtering and image caching and other parts of the service. They are not using the service for espionage.


And Gmail isn't sold as an advertisement service. The espionage for profit is not the same as filtering and image caching. That is the crossed line which this kind of lawsuit is asking about.

Cleaning services are also reasonably expected to clean desks. They might even temporarily hold confidential documents in their hands while doing so. However, once they start reading the documents and interpreting them, a line has been crossed which no legal fine print can fix.


You miss the point.

It's not that there's an assistant, it's that there's just one assistant reading everyone's mail. They can take all the titbits to make a story, and they've got a perfect memory.

This is one of the many differences between the new and old worlds: the sheer scope of the damage they could wreak.


You could be consenting for the email service to do reasonable processing on each individual email (eg filtering and preloading). Sending an email doesn't necessarily have to give the receiving agent permission to build a profile for the specific sender that maps all the emails they send to the service to multiple, separate recipients on to a single graph.

Also, with Google's "Google Apps for Business", a user can use their own domain for their email. Sending someone an email doesn't inform the sender that they could be contributing to a Google profile.


You can send me an email and I can do quite a bit with it at that point, regardless of what you want me to do with it. That includes generating ads for me to look at to pay for my free email. You might have a copyright claim, but the implied license you give by sending me that email would go quite far.


I wouldn't argue that the recipient, eg the Google email account holder, has much claim on how Google profiles him. He has chosen to use Google's service. I'm talking about the sender.

For example I use onion2k@myemailprovider.com to send emails to alice@gmail.com, bob@gmail.com and charlie@gmail.com. Google now have a profile for onion2k@myemailprovider.com with a graph that includes nodes for alice@, bob@ and charlie@ and edges for the relationships between the four people, and links to anything we've discussed. If david@gmail.com then sends an email to onion2k@myemailprovider.com Google know that david@ has a second order link to alice@, bob@ and charlie@ despite the fact that they have never communicated or informed Google of this relationship, because Google have a profile based on onion2k@myemailprovider.com - an address that Google might not have a justifiable reason to be building a profile on.

Whether or not you believe that is a reasonable use of data is up to you. Some people think it's not.
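The scenario above can be sketched as a small contact graph. This only illustrates the hypothetical; it makes no claim about what Google actually stores, and the function names are made up for the example.

```python
from collections import defaultdict


def build_contact_graph(messages):
    """Build an undirected graph from (sender, recipient) pairs."""
    graph = defaultdict(set)
    for sender, recipient in messages:
        graph[sender].add(recipient)
        graph[recipient].add(sender)
    return graph


def second_order_contacts(graph, address):
    """Addresses reachable in exactly two hops, excluding direct contacts."""
    direct = graph.get(address, set())
    indirect = set()
    for neighbour in direct:
        indirect |= graph.get(neighbour, set())
    return indirect - direct - {address}


# The mail flow described above, as seen by the receiving provider.
messages = [
    ("onion2k@myemailprovider.com", "alice@gmail.com"),
    ("onion2k@myemailprovider.com", "bob@gmail.com"),
    ("onion2k@myemailprovider.com", "charlie@gmail.com"),
    ("david@gmail.com", "onion2k@myemailprovider.com"),
]

graph = build_contact_graph(messages)
linked_to_david = second_order_contacts(graph, "david@gmail.com")
# linked_to_david contains alice@, bob@ and charlie@, none of whom david@ ever emailed.
```

The only node that ties david@ to the other three is the non-Gmail address, which is the one that never agreed to anything.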


If david@gmail.com then sends an email to onion2k@myemailprovider.com Google know that david@ has a second order link to alice@, bob@ and charlie@

I'm not sure at all that it is clear what the nature of that link is. I have no insight into Gmail's email relationship modeling but I suspect that it doesn't automatically build a relationship between david@gmail.com and other people that onion2k contacted. For one thing, it would be impractical to do this across all possible relationships -- there would be polynomial growth of the adjacency matrix with no information gain to justify it. Ask yourself, what does it mean that david contacted onion2k and onion2k, at some point in time, contacted alice, bob, and charlie? Unless you already know something nontrivial about these people, you assume they are independent.

I guess my point is that data analysis can be very costly; it is the reason that humans have a hard time dealing with meaningful relationships spanning more than about 150 people[1]. Even a company with Google's resources would be overwhelmed if it had to store and take into account a large number of mostly (I mean, vastly mostly) meaningless relationships. At the least, it takes away resources from the relationship graphs that do matter and which are actionable, such as the graphs of spammers.

[1] http://en.wikipedia.org/wiki/Dunbar%27s_number
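To put a rough number on that growth, here is a minimal sketch that counts the possible pairs among n addresses. It assumes a dense matrix that tracks every possible pair, which is a simplification, and the function name is only illustrative.

```python
def possible_pairs(n_addresses: int) -> int:
    """Number of distinct unordered pairs among n addresses: n * (n - 1) / 2."""
    return n_addresses * (n_addresses - 1) // 2


dunbar_pairs = possible_pairs(150)         # 11,175 pairs
million_pairs = possible_pairs(1_000_000)  # 499,999,500,000 pairs
```

The count grows with the square of the number of addresses, so a million addresses already means roughly half a trillion candidate relationships.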


I don't understand. Any email provider can reconstruct that kind of relationship. What's special about Google in that case?


Google aren't special in this case. They're simply the one we know do data mining on email for their ad network, so they're the one we're talking about. If it came to light that other email providers who also run ad networks (or share data with ad networks) were doing the same thing then people would ask the same questions of them.


This is assuming that Google profiles onion2k@myemailprovider.com

But does Google profile it?


"but the implied license you give by sending me that email would go quite far."

There's nothing implied in sending an email beyond "I want you to read the contents". If there's more I want you to do, those wishes would be in the email itself.

30 years ago, if I wrote a paper letter to a company, I was not implying that I want them to build up a profile of me based on my stated info compared to the stated info of other people, where my letter came from, type of stationery, quality of my grammar, etc., then send me coupons that match the behavior they expect of people of my profile. I am implying that I want them to read my letter and take care of my request.


>30 years ago, if I wrote a paper letter to a company, I was not implying that I want them to build up a profile of me based on my stated info compared to the stated info of other people, where my letter came from, type of stationery, quality of my grammar, etc., then send me coupons that match the behavior they expect of people of my profile. I am implying that I want them to read my letter and take care of my request.

Yeah, but it essentially always happened. You know those warranty cards you get with new products? They have always been used to build marketing databases.

http://www.nytimes.com/2003/12/25/technology/do-you-really-n...

http://www.bankrate.com/brm/news/advice/20030421a1.asp


That wouldn't be a parallel situation. The parallel would be you hire some company to sort and collate your mail (and toss out your junk mail for you), and in exchange they slip in some coupons that match the behavior they expect of people that get the kind of mail that you do. You may not like that trade, but fortunately you don't have to make it (or you can go get the business version).

Meanwhile, no, you have no right to dictate how I contract out how I handle my mail by sending me a letter. You gave me that letter, and you'd have a difficult time making a case even if I then proceeded to make thousands of physical copies of it and distributed it at a street corner.


I wasn't reacting to what you're allowed to do with it, or what rights I may or may not have. It was the 'implication' part stated earlier. There's no implication when I send a message that I want anything other than the message to be read by the recipient. That's it.


Google is not handling your mail, it is handling the receiver's email.


You don't have any rights in a physical letter you send away. There would be no issue with the recipient engaging a service to handle letters that happen to be sent to them, and that service processing those letters however they saw fit, including by recording information about the sender(s).

I guess what I'm saying is... what's your point?

edit:

You appear to be misreading your parent comment. The parallel situation they describe is precisely that the recipient engages a company to process their mail (read the description: "you hire some company to sort and collate your mail (and toss out your junk mail for you), and in exchange they slip in some coupons [for] people who get the kind of mail that you do". All of that applies to receiving mail, not sending it.)

And this is precisely what's happening with Google: if you send mail to my gmail address, I've engaged Google to process it on my behalf. You have no say in the matter, nor can you.


Yes, I think I misread it.


How do I know whether someone is a gmail user or not when they're using their own domain?


You can check out the mx settings for the domain using one of the free online tools. If they have already sent you a message then you can look at the raw source of the email to see who sent it and how it was routed.
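For the second approach, a minimal sketch using Python's standard `email` module to pull the `Received` headers out of a raw message. The sample message, its hostnames and its addresses are invented for illustration; real headers will look different.

```python
from email import message_from_string

# Invented sample: a message that arrived from a custom-domain sender.
RAW_MESSAGE = """\
Received: from mail-out.google.com (mail-out.google.com [192.0.2.10]) by mx.myemailprovider.com; Wed, 19 Mar 2014 09:59:59 -0700
From: someone@customdomain.example
To: onion2k@myemailprovider.com
Subject: hello

Hi there.
"""


def received_hops(raw: str) -> list:
    """Return the Received headers, which record each server that handled the mail."""
    message = message_from_string(raw)
    return message.get_all("Received") or []


def routed_through(raw: str, host_fragment: str) -> bool:
    """True if any Received header mentions the given host fragment."""
    return any(host_fragment in hop.lower() for hop in received_hops(raw))


sent_via_google = routed_through(RAW_MESSAGE, "google.com")
```

This only tells you how one particular message was routed. It says nothing about where your reply ends up if the address forwards elsewhere.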


It doesn't matter anyways. I might have a robot read every personal email I get to my entire household and I wouldn't need to disclose that to every person that sends me email.

I can forward all of them to a virtual personal assistant in Mumbai and have them deal with it all, just sending me a daily summary and maybe writing some auto-responses for me, and there'd be no need for me to disclose that fact to every person that writes me an email.

You can do the same thing with physical mail as well...there are services that can receive your mail, scan it, and send you stuff that seems relevant. No need for people to be able to figure out that you're doing that.

The fact is that I can contract that sort of thing out however I like. It's my mail once I receive it. I think it should be required for students to be protected when they have to use an email system when attending a school, so go ahead and get a firm statement from google on that, but the main part of this class action, the people sending email to people with gmail addresses, are barking up the wrong tree.


You can check out the mx settings for the domain using one of the free online tools.

Except that people who are on regular GMail will probably use e-mail forwarding provided by their domain registrar. Especially since the free Google Apps for domains is gone.


For a long time I had used an email that forwarded to gmail. Nobody who used that email would know it was going to gmail. (I have mostly moved away from Google products since.)


Firstly, the EULA (which nobody reads) does cover this. Then there's the aspect of you get what you pay for, and again it's completely their choice how they wish to spend their money, or not in this case, as it is a free service they are using.

Also the customised advertising is an option you can opt out of, so once again I'm not understanding the issue here beyond grabbing some headlines for mistakes that were avoidable on many levels.


Also the customised advertising is an option you can opt out of, so once again I'm not understanding the issue here beyond grabbing some headlines for mistakes that were avoidable on many levels.

As far as I understood from previous coverage and reading the privacy policy, the point is that e-mails are mined even if ads are turned off for the domain. The resulting profiles are then used for showing contextual advertisements in services that are not in Google Apps (e.g. Google search and Google+). Google's lawyers have also admitted that this is true. IANAL, but reading the privacy policy and the ToS for Business accounts, it seems to be the same there.

Of course, you can completely opt out of interest-based ads, both on Google services and on Google ads across the web. But I assume that profiles are still built, if not used.

A related problem is that persons sending e-mail to a GMail address (which could be hidden behind a non-gmail domain) never consented to the ToS and their e-mails are profiled. To which Google's reaction was: "all users of email must necessarily expect that their emails will be subject to automated processing." [1] IMO there is a difference between scanning e-mail for spam and viruses, and using the content to build a profile of the sender or receiver.

[1] http://www.theguardian.com/technology/2013/aug/14/google-gma...


Do we have a proof that Google profiles non Gmail users?

Take into account that processing the email is not the same as profiling the user.


May be based upon the assumption that if Facebook did it, then Google must also do it.


It is really interesting how people get so concerned when Google "reads" their email to target ads at them, but they expect Google to "read" their email to filter spam. The machine act of processing the email text is the same in both cases, yet only one is offensive to users.


[deleted]


The virus argument is to online life as the terrorist argument is to offline life. Everybody uses it as a cover to get at you for their own purposes.

Recently, our glorious government here in Germany decided that there should be a new, secure way of online communication called De-Mail, implemented by the usual suspects (ISPs and the like).

Secure means end-to-end encryption, right?

Wrrronk! They encrypt the transport to and from the servers, but decrypt and reencrypt the data on the server, "to scan for viruses".

Because viruses are really the most pressing problem today.


For any email service to provide the features people expect (spam filtering, for one), it needs to 'read' the emails.


You have the option to opt out: don't use gmail. It's not a reasonable expectation that every single web service should be legally required to make every single feature opt-out-able.


It's kind of hard to do your job or complete your education without ever using your company or school email address. I suppose you could counter-argue that you can opt out of going to university or working for a company that uses Google Apps, but that is starting to get kind of silly in my mind.


If you believe that strongly that none of your personal emails should fall into the hands of Google, how is it silly to choose a school / job that will respect that belief?

I realize that for the majority of people who would simply 'rather Google not have their information', the trade-off and the additional effort on their part are simply not worth it.

My point is that everyone who is 'forced to use Google mail' is unconsciously weighing their objection to it against the inconvenience it would cause them to avoid using it, and chooses to use Google (is forced to, if you prefer) because the possibly ill-informed cost-benefit analysis tells them using Google is worth it.

I am sorry if this whole comment comes off trollish, but the idea that people are forced to do things they are strongly against because they do not find the alternative convenient is, I believe, a symptom of entitlement, which is a problem with society (not a new problem) that I believe should be combated.

Edit to clarify - My quotes come from nowhere and are not actual quotes, just hypothetical, probably straw man statements that I have come to expect when issues like this are discussed.


The public school you attend is not generally viewed as a free market decision as only certain members of society have the economic freedom to 'select' a different one, and public schools have legal obligations to protect the privacy of their students.

I imagine most people here are not working in K-12 education and so are not familiar with these issues, but it should be understood that many public school districts are making use of Google Apps for education in a capacity which is either explicitly or implicitly compulsory for students, including creation of private accounts on minor students' behalf.

If Google is not being entirely forthcoming on precisely what personally-identifiable information they are retaining about minor students they are potentially in a great legal mess here.


The public school your children attend is very much a decision, as people commonly choose the location they live based on that. As evidence of this, perceived quality of school district can be a major factor in property values.

As a minor, many of your choices are of course left to your guardians. If as a minor you have a problem with their decision making, then that's really an entirely different matter.

You are compromising by saying 'it's not worth finding a new place to live, possibly a new job, to avoid having my children use Google'. That is very much an analysis on your part.


Then that is your company's problem, not Google. And that's not the argument I was making anyways.


[deleted]


Google tell that


Is this a serious comment? Opt in for mail search, spam filtering and anti virus filtering?



