Google Drive Found Leaking Private Data (collaboristablog.com)
156 points by srikar on July 9, 2014 | 73 comments


I didn't think "Anyone with link..." setting promised any kind of security. Honestly, I don't think this was a 'security hole', more like a digital equivalent of a home owner hiding house keys under the carpet, hoping no one will look.


That's not the issue. Let's say Alice shares a link to a Drive doc with Bob, https://drive/secretlink. If that document has an embedded link to Eve's website, http://some/thirdparty, and Bob clicks the link, then Eve (as the administrator of the third-party site) will see the HTTP Referer as https://drive/secretlink, and she will be able to access Alice's document.
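
To make the mechanism concrete, here is a minimal sketch of what Eve's side would see. It is not anyone's actual server code: the hostnames `drive.example` and `some.example` and the helper `referer_from_request` are made up for illustration. The only real piece is the standard `Referer` request header.

```python
from typing import Optional


def referer_from_request(raw_request: str) -> Optional[str]:
    """Return the Referer header value from a raw HTTP request, or None."""
    # Headers end at the first blank line; the first line is the request line.
    header_block = raw_request.split("\r\n\r\n", 1)[0]
    for line in header_block.split("\r\n")[1:]:
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "referer":
            return value.strip()
    return None


# Hypothetical request that Bob's browser sends to Eve's site after he
# clicks the embedded link inside Alice's shared document.
EXAMPLE_REQUEST = (
    "GET /thirdparty HTTP/1.1\r\n"
    "Host: some.example\r\n"
    "Referer: https://drive.example/secretlink\r\n"
    "\r\n"
)

if __name__ == "__main__":
    # Whatever this prints is a URL Eve can now open herself.
    print(referer_from_request(EXAMPLE_REQUEST))
```

If the document URL is the only credential, anything that records this header (access logs, analytics) ends up holding a working key to the document.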


Correct, which violates least surprise and is almost certainly not intended behavior.

However, the meta-point rishabhsagar touches on is that with an authentication-free access model, this is but one of possibly many potential failure modes. The risk surface is of undefined size, but probably larger than your IT professionals are comfortable with.


I guess it depends on what you're expecting. I expect Referer headers to be sent when I click a link or load a resource (SSL restrictions notwithstanding: https://tools.ietf.org/html/rfc2616#section-15.1.3).


I think rishabhsagar's metaphor of security through obscurity is indeed the issue. If the secretlink is protected only by the difficulty of guessing the URL, and not by an additional layer of authentication for the person you are sharing it with, then it amounts to hiding something (a key) in plain sight and hoping for the best. Granted, I'm unsure if Google has a mechanism to throttle/block attempts at guessing Drive URLs.


Google doesn't really need to throttle attempts at guessing Drive URLs - they are long enough that it is totally infeasible for anyone to guess them. The guessability of these links is not the weakest link in finding these documents - it's that they can easily be shared around and you don't know who has the link and who doesn't. This may be okay for your usage - it's a tradeoff between usability (it's easy to share such a link with your friends instead of granting each one permission individually) and security (your friends could forward the link to others without you knowing, which they couldn't do if you'd only granted them the permission).


I expect that they have a throttling mechanism, but nevertheless ...

I just checked one of my shared documents. It has a 44-character “random” string; it’s alphanumeric with a few symbols. It looks like a version of base64, but let’s assume that it has only 50 characters to choose from, so there are 50^44 = 5.7E74 possible addresses (2.9E79 if we assume base64). (Assuming they are using something like a cryptographically secure pseudorandom number generator.)

There are 7E9 living people; assume that each one shares fewer than 1000 documents, so there are fewer than 7E12 used addresses. Taking the base64 figure, only one in 2.9E79 / 7E12 = 4.2E66 addresses has a document.

For a brute-force attack, let's assume that the attacker uses every valid IPv4 address (256^4 = 4.3E9) to do 1000000 tries per second each, so there are 4.3E15 tries per second.

So the expected time to guess an address is 4.2E66 / 4.3E15 = 9.8E50 seconds, that is 3.1E43 years. (For comparison, the universe is only 1.4E10 years old.)
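
For anyone who wants to redo the arithmetic, here is a small sketch that recomputes the estimate for both alphabet sizes mentioned above (50 symbols and base64's 64). The constants are the assumptions from this comment (44-character IDs, 7E9 people with 1000 documents each, every IPv4 address making 1000000 tries per second); the function name `years_to_guess` is just a label for this example.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

ID_LENGTH = 44                           # characters in the random part of the URL
USED_ADDRESSES = 7e9 * 1000              # 7E9 people times 1000 documents each
TRIES_PER_SECOND = 256**4 * 1_000_000    # every IPv4 address, 1E6 tries/s each


def years_to_guess(alphabet_size, id_length=ID_LENGTH,
                   used=USED_ADDRESSES, rate=TRIES_PER_SECOND):
    """Rough time to hit one valid address, following the estimate above."""
    keyspace = alphabet_size ** id_length
    addresses_per_document = keyspace / used   # one address in this many is valid
    seconds = addresses_per_document / rate
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    for alphabet_size in (50, 64):
        years = years_to_guess(alphabet_size)
        print(f"{alphabet_size} symbols: about {years:.1e} years")
```

Either way the result is dozens of orders of magnitude beyond the age of the universe, so the exact alphabet size does not change the conclusion.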


That's in the ideal "system is performing exactly as intended" case, though.

You're assuming that there isn't, for example, a timing attack on the string comparison function. And it doesn't have to be just their server either. It could be, for example, an intermediate proxy server that leaks timing information.
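
As an illustration of the kind of comparison I mean (a sketch only; nothing here is known about how Google actually compares document IDs): a naive comparison can return as soon as it finds a mismatch, so its running time depends on how many leading characters were right. Python's standard library has `hmac.compare_digest` for a comparison designed to avoid that. The function names below are made up for the example.

```python
import hmac


def naive_equal(candidate: str, secret: str) -> bool:
    """Early-exit comparison: running time depends on the matching prefix."""
    if len(candidate) != len(secret):
        return False
    for a, b in zip(candidate, secret):
        if a != b:
            return False  # returns sooner the earlier the mismatch occurs
    return True


def constant_time_equal(candidate: str, secret: str) -> bool:
    """Comparison via hmac.compare_digest, which is designed to resist timing analysis."""
    return hmac.compare_digest(candidate.encode(), secret.encode())


if __name__ == "__main__":
    secret_id = "hypothetical-document-id"
    print(naive_equal("hypothetical-document-iX", secret_id))
    print(constant_time_equal("hypothetical-document-id", secret_id))
```

Both functions give the same answers; the difference is only in what an attacker with a stopwatch could learn from the naive one.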


Yes, I suspect the more probable leaks are due to malware and mistakes (someone wants to post a kitten picture, but makes a mistake and pastes the doc URL).

And your comment is interesting. Are the proxy servers expected to be secure against a timing attack?

(Also, the proxy administrator may be able to see the logs ...)


If you had 2^64 carpets then the key would be pretty safe.


> I didn't think "Anyone with link..." setting promised any kind of security

I think it is reasonable for someone to assume it is a capability, and to be surprised by the cascading vulnerability.


Regardless of this particular issue, it's ironic that people think their Google docs are private.

Doesn't Google already have the right to parse your documents in Drive to show you ads?

They just recently pledged to stop parsing the paid Google Apps for Business emails to build ad preferences to show on other Google properties like YouTube.

I wonder if they're already scanning documents, and if we'd even know unless they were forced to stop making misleading statements and acknowledge it in a court case, like in the lawsuit over ad profiling of students' email in Google Apps for Education.

Not to mention that Schmidt or Nadella could have read your email or seen your company docs this morning and traded stocks based on them, and Google/Microsoft are not even legally obliged to inform you that it happened.


This incident comes to mind.

http://gawker.com/5637234/gcreep-google-engineer-stalked-tee...

Hope they have better controls now so that snooping on your data is not so easy for a Google employee. But they're under no legal obligation since you sign away your rights when you upload data to their server. No one in that case would have a legitimate case against Google in court.


HTTP referers are evil. I've been using RefControl[0] to block 3rd party referers for years now.

[0] http://www.stardrifter.org/refcontrol/

The web wasn't built with privacy in mind. 3rd party cookies and HTTP Referers are just the low hanging fruit.


Why do you consider them evil? It's useful for a destination server to be given insight into the previous url and it doesn't expose any private information.

I suppose one might consider their previous url private information, but if that's the case you've got a lot more to worry about than http referers.


It's a violation of browsing privacy. It's no one's business how or why I arrived at a webpage.


By what edict?

The general default behaviour has always been to let an http server know where you're coming from so that it can take whatever actions appropriate. I don't see how or why there is a fundamental violation of some "browsing privacy" rule here.


More to worry about, such as?

URLs aren't protected any less than cookies are, and cookies are the standard way of securing login tokens.

Heck with URLs you get the 'secure flag' cookie option for free!


Your browsing behaviour in general is being recorded via user patterns, user agent strings, browser configuration, ip address, etc. An interested party can, in general, find out where your browser has been regardless of referer strings. What is so special about the url? It shouldn't contain any information that is meant to be secure.


Yes they are. Cookies are subject to the same origin policy. The Referer header is not.


I think you misread. grannyg00se said there is a lot more to worry about than http referers. We don't need a reminder that referer is a problem.


Perhaps this is a good reason to make referers subject to the same origin policy?


Becomes less and less of an issue as sites switch to HTTPS though, right?


Referrers are still sent if you're clicking an https link on an https site, iirc.


Yes

> Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP request if the referring page was transferred with a secure protocol.

https://tools.ietf.org/html/rfc2616#section-15.1.3
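
Spelled out as code, the quoted rule is roughly the following. This is a sketch of the RFC text only, not of any particular browser's implementation, and `should_send_referer` is a name invented for the example.

```python
from urllib.parse import urlsplit


def should_send_referer(referring_url: str, destination_url: str) -> bool:
    """Apply the RFC 2616 section 15.1.3 rule quoted above.

    The Referer is withheld only when a secure page links to a
    non-secure destination; every other combination sends it.
    """
    source_scheme = urlsplit(referring_url).scheme.lower()
    destination_scheme = urlsplit(destination_url).scheme.lower()
    return not (source_scheme == "https" and destination_scheme == "http")


if __name__ == "__main__":
    doc = "https://drive.example/secretlink"  # hypothetical shared document
    print(should_send_referer(doc, "http://some.example/page"))
    print(should_send_referer(doc, "https://some.example/page"))
```

Note that the HTTPS to HTTPS case falls through to "send", which is exactly the case the article is about.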


This never made sense to me. Why was this behavior defined this way, and why has no browser challenged it? A more rational rule would be something like "only send the header if the referer url is http, or if the referer and destination have exactly matching hosts".


Because the spec and implementations came along well before we started putting HTTPS on everything, or even most/many things. Given the number of HTTPS->HTTPS links you would have run across in regular practice the spec and your proposal were probably more or less identical in practice.

As for why it's still that way... I'm sure no one has bothered to really think about it since.


Huh, for some reason I thought there was a same domain policy there, but you're right. I guess the idea was just to prevent leaking URLs to a completely passive observer.

FWIW, the web would survive without Referer, but it is genuinely useful to site owners, especially in aggregate. Maybe a compromise would be to trim it to just domain rather than full path?
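
A rough sketch of what that trimming could look like, assuming the browser simply drops the path, query, and fragment before sending the header. The function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def trim_referer_to_origin(referer: str) -> str:
    """Keep only the scheme and host of a referring URL, dropping everything else."""
    parts = urlsplit(referer)
    return f"{parts.scheme}://{parts.netloc}/"


if __name__ == "__main__":
    # The secret part of the link lives in the path, so it never leaves the browser.
    print(trim_referer_to_origin("https://drive.example/secretlink?usp=sharing"))
```

Site owners would still learn which site sent the visitor, which covers most aggregate uses, while capability-style URLs stay private.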


This "vulnerability" only affected HTTPS links, the article said.


I'm glad Google fixed this, but if something is important you really shouldn't be securing it merely by giving it an obscure URL.

Google Drive makes it very easy to say "only these named people" should have access, or "only people who have the link AND a google account for your company"


>"only people who have the link AND a google account for your company"

I'm not on my work computer, but I can't remember there ever being an option to let only people with a company Google account see this. If that were the case, why wouldn't that just be on by default (which I would argue is everyone's expected behavior on corporate Google Drive)?


So, if I attempt to share a document, I have these options available to me: http://imgur.com/efTUXgY

They include:

     Public on the Web
     Anyone with the link
     lynch.us
     People at lynch.us with the link
     Specific people

Additionally, if I go to https://admin.google.com/AdminHome#AppDetails:service=Drive+..., I have the ability to set the domain-wide default sharing settings, as seen here: http://imgur.com/RL0eUhZ


At least for me, the default sharing option is "specific people": only those named as being able to view it can view it. (Though, by default, they can also add other people.) Having the link is irrelevant.


It shows up if you are on a Google Apps account.


Google is correct to say this is a relatively obscure issue, and a relatively small increment in loss of security. Who would consider sharing a Drive link to be "secure" by any definition of the word? It can leak all over the place by numerous means. For starters, an email recipient of the link might be using an unencrypted connection for downloading the link over wifi.


That's a very poorly worded security setting. If you're building a service where people can share something set as "Anyone with link...", you really ought to make it very clear that means it's open for anyone to download. The setting should really be named 'Remove privacy settings - allow anyone to download'. 'with link' implies some level of security that just isn't there. Even if Google proxies links within the document, there's always the possibility that someone could accidentally send a link to the file to someone, or that someone could shoulder surf it, or even guess it if it's simple enough.


I don't know if that's quite fair. Infeasible to guess urls do have a level of security that's significantly higher than implied by "Remove privacy settings - allow anyone to download".

Yes, someone could shoulder surf it, but we don't tell you that your bank account has no security because someone could shoulder surf you entering your password.


My bank account has 2 factor authentication for that very reason.


If I know your bank's routing number (not a secret) and your account number, I can create a demand draft to take money out of your account. https://en.wikipedia.org/wiki/Demand_draft


What does that have to do with anything? If you issue fraudulent demand drafts, the bank will trace the destination account and send lawyers after you. If you trace referrers and open the origin URL, it's doubtful whether anyone will trace it or have legal recourse against you.


Not guaranteed.

I'm aware of a story where a local credit union assigned account numbers strictly sequentially. A customer setting up direct-withdrawal typo'd their account number by omitting a digit, i.e. their account number was '12345' and they entered '1234'.

Since the numbers are sequential, '1234' happened to exist. For about a year, the company in question cheerfully direct-withdrew from the inappropriate account, and the original owner of '1234' never noticed, never complained, or had their complaints ignored. To my knowledge, the error was never rectified.

End of the day, direct-draft is a badly-architected system from a security standpoint.


That's not two-factor authentication.


Yes, but it's reasonable to assume that if I keep a URL secret and it's a very long and complex URL, that nobody will be able to guess it and find the document.

I mean after all, we assume that's true for passwords.


But we're more accepting of someone crawling our site trying random URLs than we are of people trying hundreds of passwords.


Not really. If you try to guess a long and complex URL or a long and complex password you will either move at a glacial pace or DOS the site.

URL-shortened links can be an issue, but raw google docs links have crypto-length randomy numbers.


I've always thought of 'Anyone with the link' as functionally equivalent to 'Anyone.' I will continue to think so after this referrer-header change.

The first time you email that link out, you have technologically released your ability to predict who will view the document (was the e-mail sent over secure channels end-to-end? Did it go to a trusted party who won't reshare it? Did you typo the e-mail address and send it to an undesired party? Did a recipient print it out and leave a printed copy lying around in an accessible conference room? Was the printout shredded after use? Was the shredding functionally irreversible? Is the shredding being done by a third-party that might lose some loads of documents on the way to the shred facility? Did you leave the link in your local machine's pastebuffer then walk away without locking your terminal? Etc., etc., etc.).

Even when the link is functionally unguessable, allowing non-authenticated access is just "security through obscurity."


It provides as much security as you have trust in the people you share it with. If I set one of my documents to "anyone with the link" just so that I can access it when I'm not logged in, and I'm careful to not share the link with anyone else, then I'd argue that it still provides some security because I trust myself.

If I shared a document with only you (directly, not using the "anyone with the link" permission), I couldn't stop you from taking a screenshot and sending that to whomever you pleased (or printing them out and leaving them in a conference room, like you suggested). Sharing documents online in general can be troubling if you don't trust the people you send them to. It's easier to share a link than share a screenshot, though, and a link would provide continued access going forward, and could remove deniability that the screenshot was faked.


But without authentication, there's no way to know whether the people who claim to be the people one shared the link with are the people one actually shared the link with.

Whole-document transformation and copying are a concern, but I think that's usually treated as a different category of issue from "The server I trust to store the data securely gave it up to some anonymous person who passed a correctly-formatted request to it because the server can't know any better."


Yes, certainly this particular bug was serious and a real problem that needed to be fixed. The link should not be shared accidentally like that.

I was just making the case that "Anyone with the link" still provides some security. Just like you don't know if the people accessing the document are the ones that you shared the link with, you don't know that the people you shared the document contents themselves with aren't showing others without your knowledge.


Seriously, I'm not surprised that the average user might not grasp this concept (which is why it's good that Google dealt with this referrer issue), but it's bizarre that so many people on this thread don't. When it comes to pictures etc, it's generally prudent to assume "If you post it on Facebook, it's effectively public"; if anything this is even more true for the "anyone with the link" sharing setting.


> there's always the possibility that someone could accidentally send a link to the file to someone

I think that would be implied when the setting says anyone with the link.

It would be the same if I attached the document to an email; someone could still forward the email.


> "The security hole, which has now been patched by Google"

This has only been fixed for new links. All existing links are still vulnerable.

From Google's Blog:

>"Today’s update to Drive takes extra precaution by ensuring that newly shared documents with hyperlinks to third-party HTTPS websites will not inadvertently relay the original document’s URL."


To clarify, I think what they mean is that it's possible that those old links have already been compromised, even though none of the documents (new or old) are vulnerable anymore.

The bug has been patched so that any document with the "anyone with the link" permission will no longer leak its location in the referrer when someone clicks an HTTPS link in it. But it's possible that that happened in the past and the link was already leaked (and unfortunately, they can't exactly fix that). So if you have an old document, it's not vulnerable anymore, but it may at one time have been vulnerable so you might want to update it to have a new link (following the instructions in the Google blog post).


I helped draft the original blog post :-) To clarify a bit more and help folks evaluate their individual risk, it's worth noting that the impact is limited to a fairly specific scenario.

In essence, you needed to have a non-native document format uploaded to Drive without converting it (PDF is a good example); explicitly share this document with others using a particular setting ("anyone with the link"); and then preview it in the web UI and follow an outgoing HTTPS link (HTTP wouldn't be a problem).


That's still pretty bad; there could be millions of leaked document URLs in the logs of servers all over the internet, and users haven't been notified.

It looks like Microsoft Onedrive did a similar thing too:

https://blog.onedrive.com/update-for-shared-links/

>"We chose not to disable all previously shared links, because the change only applies to a small fraction of shared files. If customers disable and then re-share a document, this will prevent further access to a document that might have been accessed."


Highly doubt it's millions. You have to upload a non-native format into Drive, modify the default security settings, and be linking to another HTTPS site and someone has to follow that link. AND the information in the original document has to be sensitive. The odds of that are very, very low.


Maybe I'm misunderstanding, but if the document in Google Drive is served over HTTPS, then the referrer when a user visits a linked site should only show the hostname (ie drive.google.com) not the full URL, right?


How does the fix work? Does it prevent the browser from sending the referrer URL? Or maybe load all documents from the same URL with the document ID in a POST request instead of GET?


Normally most outgoing links that are meant to be private are bounced through a redirection that results in the referrer being a generic Google redirection page. That's probably the solution that was applied here.
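
A minimal sketch of that kind of bounce, assuming a generic redirector endpoint. The URL `https://redirect.example/url`, the `q` parameter, and the function names are invented for this example and are not Google's actual scheme.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical redirector endpoint; the real one and its parameters may differ.
REDIRECTOR = "https://redirect.example/url"


def bounce_link(target_url: str) -> str:
    """Rewrite an outgoing link so the click goes through the redirector first."""
    return f"{REDIRECTOR}?q={quote(target_url, safe='')}"


def bounce_target(bounced_url: str) -> str:
    """What the redirector would read back before forwarding the browser."""
    return parse_qs(urlsplit(bounced_url).query)["q"][0]


if __name__ == "__main__":
    bounced = bounce_link("https://some.example/page?id=1")
    print(bounced)
    print(bounce_target(bounced))
```

The third-party site then sees the redirector page as the referrer rather than the private document URL.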


But it says this was only a problem for documents that were not converted to Google Sheets etc. I don't think editing everyone's PDFs and every other kind of document that has clickable URLs is the easiest way to solve this.


Looks like it only applies to a "Preview" feature, so I imagine they're already parsing the contents to render the preview and are either adding a redirect or a no-follow attribute.

Native-native docs wouldn't have a referrer because they wouldn't be rendered by the browser, so links would be a direct hop.


How is this different than DropBox? Maybe dropbox is more obscure and google more open, hence finding this vulnerability is actually a good thing. Just thinking aloud.


Dropbox posted a blog article about their approach to this issue in May — https://blog.dropbox.com/2014/05/web-vulnerability-affecting...

Disclaimer: I work for Dropbox


Anyone with the link could also share the link publicly. I think if your document contains sensitive information you should be restricting the access anyway.


I've always wondered what percentage of people understood what "anyone with the link" meant as a security setting.


I think the referrer info itself is a security hole. Browsers should disable it. Or, you can use a plugin for now.


Thousands of people working in enterprise are being banned from using Google Drive right now.


This is news? Come on! You give anyone a link to your data and you expect security! Hello! If I gave folk a key to my house I doubt I'll have any of my A/V equipment or computers when I come back after a long weekend. Why should I expect my data to be any safer!?


>This is news? Come on! You give anyone a link to your data and you expect security!

Yes. I expect security from my bank, from my insurance company, from state agencies, from my email provider, etc. I surely don't expect them to leak my data, and if they do, cause of a bug or incompetence, I want them to fix it.

And I want to be informed when they have breaches or fail to secure my data.

>Hello! If I gave folk a key to my house I doubt I'll have any my A/V equipment or computers when I come back after a long weekend.

People hire babysitters, cleaning staff etc, give them the key to their house, and expect to have their A/V equipment when they come back.

If a cleaning person is reckless and, e.g., leaves the door unlocked when he leaves, or a babysitter brings her pals over and has a party with my stuff, they get fired and/or sued, and people hire a more trustworthy person. Businesses that want to keep our private data should be held to the same, or actually much higher, standards.

The "helloooo, is this news, of course it's unsafe, whaddaya expected" etc attitude doesn't help raise the bar on data safety.


> The "helloooo, is this news, of course it's unsafe, whaddaya expected" etc attitude doesn't help raise the bar on data safety.

I would think just the opposite is true. Perpetuating the idea that "Anyone with a link" means anything other than "Anyone" doesn't help raise the bar on data safety.

I'm glad Google made the enhancement to this, but wouldn't classify it as a security issue.


https://www.facebook.com/notes/facebook-engineering/protecti...

Facebook Engineering's entry on various methods of hiding referrers. This was 4 years ago, so some of these techniques might not still work.


Wow, don't read the comments on that blog.


[deleted]


Did you read the article?



