
Not good. These tools (from search engines to AI) are increasingly part of our brains, and we should have confidentiality in using them. I already think too much about everything I put into ChatGPT, since my default assumption is it will all be made public. Now I also have to consider the possibility that random discussions will be used against me and taken out of context if I'm ever accused of committing a crime. (Like all the weird questions I ask about anonymous communications and encryption!) So everything I do with these tools will be with an eye towards the fact that it's all preserved and I'll have to explain it, which has a huge chilling effect on using the system. Just make it easy for me not to log history.


> These tools (from search engines to AI) are increasingly part of our brains, and we should have confidentiality in using them.

But you do, just like you have confidentiality in what you write in your diary.


> Not good. These tools (from search engines to AI) are increasingly part of our brains, and we should have confidentiality in using them.

Don't expect that from products with advertising business models


OpenAI and Anthropic do not have advertising business models


OpenAI is clearly moving in that direction; look at their recent verbiage and hiring.


yet, but surely they will move that way over time?


If you're not the customer, you're most likely the product.


I love the saying, but there's something of an exception here. Both companies very openly have singularity business models.


Business models make money. If your business model is theoretical, we call that "research" instead.

Real research is done on prior art. We research things because prior studies tell us that they are feasible; otherwise you are wasting valuable time on a snipe hunt. There is no reproducible or substantial evidence that AGI or singularities exist. It is another Big Tech marketing lie, no different from the reneged-on "don't be evil" or "privacy is a human right" mottos.


Thanks for the interesting response! I disagree on a few points, though:

  If your business model is theoretical, we call that "research" instead.
Are/were Uber and Lyft "research" companies, then? Is Reddit a research company? Edison Electric?

  There is no reproducible or substantial evidence that AGI or singularities exist
There is also no substantial evidence that the sun will rise tomorrow, that climate change will continue, or a million other things that are critical to science. Physical science is empirical in that it inherently requires physical experiments, but that is not the only cognitive tool in play by a long shot.

Regardless: tell them, not me! I'm just reporting what I'd say is an objective fact: they are planning based on scientific predictions of an intelligence explosion -- at least a soft/cybernetic one if not a scarily-fast/purely-digital one.

  It is another Big Tech marketing lie
I think there's a single fact that counters this common sentiment: there is no way in hell that they'd break ground on the largest private infrastructure projects in human history as a marketing stunt. Companies are woefully-shortsighted these days, but that would be another level of foolishness altogether.

They very well may be wrong of course, but I think you're doing yourself a disservice to assume they're lying about it.


> There is also no substantial evidence that the sun will rise tomorrow

You are wasting my time with facetious arguments. There is no point having a rational discussion about the future potential of AI if we cannot take things like reality for granted.

If you want to argue in defense of AI, do it. Pointing to authority is one third of rhetoric; the other two thirds are emotional investment and logical coherence. If you don't have real proof that AGI exists, you're trying to make a point with emotions that people don't empathize with and authority that isn't authoritative. Cite sources, dammit.


> There is also no substantial evidence that the sun will rise tomorrow

the boosters are getting desperate


It’s just basic epistemology… like literally day one stuff. Too advanced for HN, I guess :(


name a similar sized tech company that hasn't


yet


Serious question. Why should someone have more privacy in a software system than they do within their home?


I have enormous privacy in my home. I can open up any book and read it with nobody logging what I read. I can destroy any notes I take and know they'll stay destroyed. I can even visit the library and do all these things in an environment with massive information access; only the card catalog usage might get logged, and I probably still don't have to tie usage to my identity because once upon a time it was totally normal to make knowledge tools publicly-accessible without the need for authentication credentials.


They maybe (not taking a stance) shouldn't, but I don't think this argument is as simple as one thinks. Doing surveillance on someone's home generally requires a court order beforehand. And depending on the country (I don't believe this applies to the US), words spoken at home also enjoy extended legal protection, i.e. they can't subpoena a friend you had a discussion with.

Now the real question is, do you consider it a conversation or a letter. Any opened¹ letters you have lying around at home can be grabbed with a court-ordered search warrant. But a conversation—you might need the warrant beforehand? It's tricky.

(Again, exact legal situation depends on the country.)

¹ Secrecy of correspondence frequently only applies to letters in sealed envelopes. But then you can get another warrant for the correspondence…


Honest question, why consider the personal home, letters or spoken words at all, considering most countries around the world already have ample and far more applicable laws/precedent for cloud hosted private documents?

For the LLM input, that maps 1:1 to documents a person has written and uploaded to cloud storage. And I don't see how generated output could weigh into that at all.


A simple answer to this is: I use local storage or end-to-end encrypted cloud backup for private stuff, and I don't for work stuff. And I make those decisions on a document-by-document basis, since I have the choice of using both technologies.
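
For what it's worth, a minimal sketch of what that document-by-document choice can look like in practice, assuming Python with the third-party cryptography package and made-up file names (an illustration, not a prescription):

  # Encrypt a private document locally before any cloud sync sees it.
  from cryptography.fernet import Fernet

  key = Fernet.generate_key()            # stays on my machine, never uploaded
  fernet = Fernet(key)

  with open("diary.txt", "rb") as f:     # hypothetical private document
      ciphertext = fernet.encrypt(f.read())

  with open("diary.txt.enc", "wb") as f: # only this blob goes to the provider
      f.write(ciphertext)

The provider only ever holds ciphertext; work documents can simply skip the step.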

The question you are asking is: should I approach my daily search tasks with the same degree of thoughtfulness and caution that I do with my document storage choices, and do I have the same options? And the answers I would give are:

* As a consumer I don't want to have to think about this. I want to be able to answer some private questions or have conversations with a trusted confidant without those conversations being logged to my identity.

* As an OpenAI executive, I would also probably not want my users to have to think about this risk, since a lot of the future value in AI assistants is the knowledge that you can trust them like members of your family. If OpenAI can't provide that, something else will.

* As a member of a society, I really do not love the idea that we're using legal standards developed for 1990s email to protect citizens from privacy violations involving technologies that can think and even testify against you.


> [...] should I approach my daily search tasks with the same degree of thoughtfulness and caution that I do with my document storage choices [...]

Then treat them with the same degree of thoughtfulness and caution with which you have treated web searches on Google, Bing, DuckDuckGo or Kagi for the last decade.

Again, there is no confidant or entity here, at least no more so than the search algorithms we have been using for decades.

> I really do not love the idea that we're using legal standards developed for 1990s email to protect citizens [...]

Fair, but again, that is in no way connected to LLMs. I still see no reason presented why LLM input should be treated any differently to cloud hosted files or web search requests.

You want better privacy? Me too, but that is not in any way connected to or changed by LLMs being commonplace. By the same logic, I find any attempt to restrict a specific social media company over privacy and algorithmic concerns laughable if the laws remain such that any local competitors are allowed to carry out the same invasions.


It's not at all clear how easy it is to obtain a user's search history, when users don't explicitly log in to those services (e.g., incognito/Private browsing), and don't keep history on their local device. I've been trying to find a single example of a court case where this happened, and my Google/ChatGPT searches are coming up completely empty. Tell me if you can find one.

The closest I can find is "keyword warrants" where police ask for users who searched on a given term, but that's not quite the same thing as an exhaustive search history.

Certainly my personal intuition is that historically there has been a lot of default privacy for non-logged in "incognito" web search, which used to be most search -- and is also I think why we came to trust search so much. I expect that will change going forward, and most LLMs require user logins right from the jump.

As far as the "I can see no reason" why LLMs should be treated differently than email, well, there are plenty of good reasons why we should. If you're saying "we can't change the law," you clearly aren't paying attention to how the law has been changing around tech priorities like cryptocurrency recently. AI is an even bigger priority, so a lot of opportunity for big legal changes. Now's the time to make proposals.


> [..] single example of a court case where this happened, and my Google/ChatGPT searches are coming up completely empty.

A massive amount, which is part of why I am both surprised and starting to feel like this discussion stems from some people being unaware of the tracking they have tolerated for decades. These have been discussed to no end, covered by the usual suspects like the EFF, and constantly get (re)reported across the media in "Incognito mode is not incognito" pieces.

Heck, some I know from memory [0], the rest one could find with a simple ten sec search [1].

> [...] my personal intuition is that historically there has been a lot of default privacy for non-logged in "incognito" web search [...]

There has not. No need for intuition or to believe me, just read the privacy information Google provides [2] whenever you access their sites (whether in an incognito instance or otherwise) as part of the cookie banner (and in the decade beforehand if one looked for it).

> As far as the "I can see no reason" why LLMs should be treated differently than email, well, there are plenty of good reasons why we should.

Not email. Never said email. If you are going to use quotation marks, please quote accurately ("I still see no reason presented why LLM input should be treated any differently to cloud hosted files or web search requests." is what I wrote and means something very different), as I do for you.

Neither you, nor anyone else has provided a reason why LLM input is inherently different to other files hosted online. Happy to read those "plenty of good reasons", but they have yet to be shared.

> If you're saying "we can't change the law," [...]

I did not. I asked why existing laws should be applied differently in case of LLM input and/or changes are somehow needed for LLMs specifically or suddenly.

This really seems to come down to LLMs "feeling" different to some people because they can be anthropomorphized, and that feeling somehow warranting different treatment, when it is an illusion.

Considering your belief that "historically there has been a lot of default privacy for non-logged in "incognito" web search", it honestly sounds like you see less room for stricter regulation than my long-immersed-in-this-topic self does, if I am being fully honest.

If I could implement any change, I would start with more consistent and transparent information of users at all times, which might dispel some misconceptions and help users make more informed decisions, even if they don't read the privacy policy.

Always liked a traffic light system as a concept. Then again, that is what Chrome already tells you when opening incognito mode and somehow there still seem to be assumptions that are not accurate about what that actually does and doesn't do.

TL;DR:

Yes, search engine providers are able to identify users in incognito mode. Such tracking has always been public information, not least because they have to include it in their privacy policy.

Yes, such tracking has been used in court cases, in the US and elsewhere, to identify users and link them to their search requests done whilst using such modes.

No, LLM input is no different to search requests or files hosted online. Or at least, no one has said why LLM input is different, happy to hear arguments to the contrary though.

[0] https://www.classaction.org/media/brown-et-al-v-google-llc-e... (Google was forced to remediate billions (yes, with a b) of “Incognito” browsing records which, according to plaintiffs, precisely identified users at the time, including being able to link them to their existing, not logged in, Google accounts. Note that this is one of two (US-specific) cases I knew off the top of my head; the other was the Walshe murder, though there is no (public) information on whether incognito was used in that case: https://www.youtube.com/watch?v=cnA6XwVQUHY)

[1] https://law.justia.com/cases/colorado/supreme-court/2023/23s... and https://www.documentcloud.org/documents/23794040-j-s10032-22...

[2] https://policies.google.com/privacy?hl=en-US ("When you’re not signed in [...] we store the information we collect with unique identifiers tied to the browser, application, or device you’re using.", "This information is collected regardless of which browser or browser mode you use [...] third party sites and apps that integrate our services may still share information with Google.", I think you get the point. There never was any "default privacy for non-logged in "incognito" web search" and I can assure you, that data has always been more than sufficient to fingerprint a unique user)
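
To make the fingerprinting point concrete, here is a minimal sketch (purely illustrative, attribute values invented, not taken from Google's policy) of how a handful of traits observable even in a "private" window can be reduced to a stable identifier without any login:

  # Illustrative only: derive a stable pseudo-identifier from browser traits.
  import hashlib

  attributes = {                          # invented example values
      "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ...",
      "screen": "2560x1440x24",
      "timezone": "Europe/Berlin",
      "language": "en-US,en;q=0.9",
      "fonts": "Arial,DejaVu Sans,Noto Sans",
  }

  canonical = "|".join(f"{k}={v}" for k, v in sorted(attributes.items()))
  fingerprint = hashlib.sha256(canonical.encode()).hexdigest()
  print(fingerprint[:16])                 # same browser, same identifier

The same browser produces the same hash on every visit, logged in or not, which is all an identifier needs to be.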


I was retained as an expert witness in some of the cases involving Google, so of course I’m aware that Google keeps logs. (In general on HN I’ve found it’s always helpful to assume the person you’re arguing with might be a domain expert on the topic you’re arguing about; it’s saved me some time in the past.)

But Google’s internal logging is not the question I’m asking. I’m saying: can you find a single criminal case in the literature where police caused Google to disgorge a complete browsing history on someone who took even modest steps not to record it (i.e., browsed logged out)? Other than keyword search warrants, there doesn’t seem to be much. This really surprised me, since as an expert I “know” that Google has enough internal data to reconstruct this information. Yet from the outside (the experience that matters to people) they’ve managed to operate a product where real-world privacy expectations have been pretty high if you take even modest steps. I think this is where we get many of our privacy expectations from: the actual real-world lived expectations of privacy are much closer to what we want than what’s theoretically possible, or what will be possible in a future LLM-enabled surveilled world.


> [...] can you find a single criminal case in the literature where police caused Google to disgorge a complete browsing history on someone who took even modest steps not to record it (i.e., browsed logged out)?

Can you first point to me making a claim that would require such a case? Or can you, alternatively, point to why there is a need for change rather than just continue to apply the same level of legal protections to LLM service providers?

The fact that this started with a report about a user's ChatGPT account, and that you felt the need to move us towards people using commercially hosted LLMs without an account (because getting five queries in before OpenAI forces you to sign in is a realistic use case), I let slide up to this point, because whether we are talking about incognito-mode access or a user with an account doesn't change the fact that no one here has said why using Chat.com is different to Google.com. I just wanted to call it into memory, because it is not very expert-like, same as with the (mis)quoting.

To make it simple, this is my question, the only thing I'd like to have answered:

When self-hosted websites first became a thing, governments across the globe did not write new editorial legislation specifically for them. They simply applied what was already established for print media/speech.

In this context, why should LLM input be handled differently to data hosted online?

If this completely irrelevant red herring is absolutely necessary for you, use file-sharing services with no login requirement and the legal requirements that apply there. It doesn't change anything about the question.


I think there is a non-zero chance they had no idea about this guy until OpenAI employees uncovered this, reported it, and additional cell phone data backed up the entire thing.


Why do employees need to be involved? It's AI. It is entirely capable of doing the surveillance, monitoring and reporting entirely by itself. If not now, then in the near future.


Just give the AI-to-user relationship a protection like attorney-client privilege.

Edit: AI has already passed the bar exam.


It only "passes the bar exam" when AI, or some other flawed process, is the examiner. See e.g. https://doi.org/10.1007/s10506-024-09396-9 for a debunk.


That's not a debunk. "Calls into question" does not equal "in truth, it failed the exam."


No, it’s a debunk. ChatGPT-4 scored in the 48th percentile (15th percentile in essays) amongst individuals that passed the bar exam. That’s very poor performance.


Thus it scored higher than almost half the humans who passed the test. In other words it too passed the bar.


Attorney-client privilege has limits. For obvious reasons I haven’t read any affidavits associated with the warrant, but it sure sounds like this would fall outside the bounds of attorney-client privilege.


With an attorney you have a clear sense of when you pass outside of that privilege. With a friend or colleague you have a social sense of what's going to remain confidential, plus memories aren't perfect. "Preserving, recording and reporting every word" is not the same as any of these things. This cannot be the world we all have to live in going forward; it's not safe or healthy.


Seems natural to extend privilege here. People are using it as a therapist.


There are a lot of counterarguments I could bring up, but just off the top: plainly, just because people use LLMs as therapists, lawyers, doctors or deities doesn't make LLMs any of those things.

My personal beliefs (we should not rely on models for such things at this stage, let's not anthropomorphize, etc.) to one side, let me ask: do you think that if I used my friend Steve, who is not a lawyer but sounds very convincingly like one, to advise me on a legal dispute, that should be covered by attorney-client privilege?

Cause, even given the scenario that LLMs suddenly become perfectly reliable enough to verifiably carry out legal/medical/etc. services to a point where they can actually be accepted into day-to-day practice by actual professionals and the companies are willing to take on the financial risks of any malpractice for using their models in such areas (as part of enterprise offerings for an extra fee of course), that still wouldn't and shouldn't mean that your run-of-the-mill private ChatGPT instance has the same privileges or protections that we afford to e.g. patient data when handled digitally as part of medical practice. At best (again, I dislike anthropomorphizing models, but it is easier to talk about such a scenario this way), a hypothetical ChatGPT that provides 100% accurate legal information would be akin to a private person who just happens to know a lot about the law, but never got accredited and does not have the same responsibilities.

Again though, we are far from that hypothetical anyways; "people" using LLMs that way does not change this fact. I know, unfortunately, there are people who are convinced that current-day LLMs have already attained Godhood and are merely biding their time, and that doesn't become real either just because they act according to their assumptions.

I really struggle to understand, and I see no cogent arguments across this comment section for, why current-day LLMs in such a scenario should be treated differently to e.g. PKM software or a cloud-hosted diary, rather than being afforded the same legal protections (or lack thereof, depending on viewpoint, personal stance and your local data privacy laws).


You'll find these laws privileging certain folks are contoured and controlled by the individuals who have already been granted such privilege to discourage and limit competition. Not because it's good in any way for the client.

Protectionism hurts all of society to benefit a few.


Perhaps this is a language barrier, but I genuinely do not understand what is meant by this. Like, what does this have to do with protectionism, who are the "folks" in this case, etc. Honestly asking.


Doctors control who can be a doctor, what is required to be a doctor, what doctors can and can't do, and that people are forced to go to them for Healthcare ... all to protect their personal income. Not to better Healthcare. Not to expand access to Healthcare. But precisely to make it cost more to get. They are hurting society to benefit themselves.

Milton Friedman explains it to doctors here: https://m.youtube.com/watch?v=ss5PxPlnmFk


Yeah, politely, respectfully, no.

Don't know where to start, but I want to assure you, no matter where on this planet you live, Medical Doctors are generally not at fault for high costs of care. Depending on which health care system we are talking about, the particulars may be different, but no, MDs are not interested in worsening patient care for their own benefit. Kinda difficult considering the amount of uncompensated labor and stress compared to other higher paying occupations. Ask a trainee/resident/equivalent for your local health care system if you want some details.

And people are "forced" to go to an MD for medical treatment in the same way they are "forced" to go to any other domain specific expert, it is where the experience and liability lie because they have undertaken the time, training and exams to ideally assure a specific level of care.

Incidentally, this has absolutely zero to do with LLMs, which are cloud-hosted software, not an entity, a being or anything of the sort, and so shouldn't receive any special considerations beyond what we afford to cloud-hosted content. I couldn't find anything on patient data processing in that Milton Friedman collection you linked, and as that was his area of work, it was purely US-centric. Medical care is, however, the purview of medical professionals outside the US as well, including in countries with far better patient outcomes. If there is an applicable argument, just quote it directly rather than linking a collection of clips.

To bring this back to the topic at hand, LLMs can be and are being used in medical practice already. And neither did doctors prevent that, nor did that require a law change, because, as stated before, it is merely data being input and processed. There are EU MDR-certified apps for skin cancer, there are on-prem LLM solutions that adhere to existing patient privacy regulations, etc.

Basically, Doctors do not stand in the way of LLM usage (neither could they, nor do they have the time) and even if they wanted to, LLM input and output is just data and gets treated accordingly.


I can represent myself in court, but I can't prescribe my own medication. If one does not go to the doctor to get those drugs they'll die, so yes: forced.

All you assured me of is that you didn't watch the video.



