TotalRecall: Extracts and displays data from the Windows 11 Recall feature (github.com/xaitax)
101 points by lsde on June 4, 2024 | hide | past | favorite | 94 comments


Security aside, I'm genuinely unsure what problem Recall is even meant to solve.

Like most AI products and features announced over the past 18 months, it feels like a bunch of product people got into a meeting where they looked at the capabilities of the latest OpenAI model and then started spitballing feature ideas based not on user needs but on what GPTs can do. "Oh, these models can do OCR... why don't we screenshot everything that a user has ever done, OCR it, and make it searchable!"

I'm genuinely interested to know if there are use cases that people see for this beyond the obvious retroactive infostealing on a grand scale. What would you do with this feature if you had it?


I think this is the feature everybody wants. Not in Windows, not from Microsoft, probably not yet, but it is so obvious and useful that I'm sure this will be a default thing for a future human.

Imagine never forgetting anything. That tune, what your spouse told you to get, a meeting summary, recall or replay a dear memory. People _without_ this ability would be outliers and strangely left behind. I know there was a Black Mirror on this, but imagine this to be safe, private, local, toggleable even. Life becomes a book you can flip back.

Humanity has longed for this for ages. From written diaries to personal data exports of social networks... for better or worse, it's a natural consequence of our life becoming digital.


I agree that it will be super useful and can't wait for the OSS version of it on Linux that respects my privacy but only kinda does the same thing with a less useful LLM at its core.

That said, I wonder if general folks will see the toggling of this in the same way they initially did for incognito mode? Namely, that you would only use it to watch porn or do something nefarious. The classic "If you don't have anything to hide, why hide?" History has obviously proven that first impression to be a rather reductive and silly one. Plus cultural norms/opinions around incognito mode are very neutral now.

Or will this be like turning a bodycam off for cops? Imagine the dystopia of a boss asking why you turned Recall off for 30 minutes while you were on your lunch break...

I guess it is good that we have pocket supercomputers.


I mean, there's already software to record your screen for companies. Getting access to recall would be the same as installing tracking software, which they can and do already use


> I know there was a Black Mirror on this, but imagine this to be safe, private, local, toggleable even.

Wasn't it private by default, local, and erasable in Black Mirror? And what does safe mean?


> Imagine never forgetting anything

You mean imagine forgetting everything because you don't need to remember anything since that is done by your personal agent - until the battery is dead, the model crashes, your neurocannula gets clogged, an EMP takes out your memory module or an update accidentally wipes your life.

Dear Mr. Poisonborz,

During a routine update of your MegaMind Ti666GX module a mishap occurred which wiped the module and its cloud backup clean. Since you did not purchase the optional MindCare insurance and there does not seem to be a recent off-line backup we are sad to inform you that your life's experiences have been lost for good. To make up for this inconvenience we can offer you a 1 year free MyStorey subscription which will help you fill in the blanks using our state of the art Experiencer technology - just tell it what you want to remember and it will create the memory for you.


You assume this has to be some convoluted black box, a proprietary service on some corporate servers, but people are not stupid, not after what happened in the last decade - look how even leading LLMs become local-first and offline already. The stored data should also remain simple - images, text files, video.


Have you ever heard about notes...


Limitless aka Rewind has been doing it for a while, starting on macOS and expanding: https://www.rewind.ai

I gave it a shot for a few months and it was useful to go back to catch things that I couldn't quite recall. It was handy to ask it things instead of doing direct search sometimes. It was helpful for work, where I tend to take notes on things but never review them because of their volume (and SNR). I wound up uninstalling it because it was not the 'superpower' they alluded to in the marketing material, but it was interesting nonetheless. I also whipped up my own version using Ollama to provide access to a decent LLM (Command-R) on my LAN, recording screenshots, text, and transcripts to LanceDB so I could do combined search by vector and keyword and then re-rank. Also a fun project to learn from, but not that useful in the end.
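For anyone curious what "combined search by vector and keyword and then re-rank" can look like, here is a minimal sketch. It is not the setup described above: there is no LanceDB or Ollama here, the documents and three-number "embeddings" are made up, and reciprocal rank fusion (RRF) is swapped in as one simple way to merge the two rankings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_rank(docs, query):
    """Rank doc ids by how often the query words occur in the text."""
    words = query.lower().split()
    scores = {
        doc_id: sum(text.lower().count(w) for w in words)
        for doc_id, (text, _vec) in docs.items()
    }
    hits = [d for d in scores if scores[d] > 0]
    return sorted(hits, key=lambda d: scores[d], reverse=True)


def vector_rank(docs, query_vec):
    """Rank all doc ids by cosine similarity to the query vector."""
    return sorted(docs, key=lambda d: cosine(docs[d][1], query_vec), reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: each ranking adds 1 / (k + position)."""
    fused = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=fused.get, reverse=True)


# Made-up captures: id -> (extracted text, toy embedding).
docs = {
    "standup": ("standup notes: deploy moved to friday", [0.9, 0.1, 0.0]),
    "recipe": ("screenshot of a pasta recipe", [0.0, 0.2, 0.9]),
    "ticket": ("ticket about the failed deploy", [0.8, 0.3, 0.1]),
}

query_text = "deploy friday"
query_vec = [1.0, 0.2, 0.0]  # stand-in for an embedded query

ranked = rrf([keyword_rank(docs, query_text), vector_rank(docs, query_vec)])
```

In a real pipeline the vectors would come from an embedding model and the final step could be a dedicated re-ranker; RRF is just the smallest thing that shows the merge.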


> What would you do with this feature if you had it?

I would disable it.


I trust that disabling it will disable interacting with it. I'm less convinced that it will disable the collection of the data, especially given the occasional stories of updates changing people's settings.

Instead, I'm going back to Linux for everything but gaming. Nothing of any actual import will happen in Windows. My most recent purchase I left as a plain, single-boot Windows setup to see if WSL was all that it was cracked up to be.

I thought I'd finally accepted that it wasn't as unpleasant as I anticipated, but there's gotta be a line in the sand somewhere.


If the goal was to train a large model to interact with the OS, this data would be extremely useful. But that can't possibly be their ultimate goal, because they've assured us that the data remains local. I do wonder if there's a gray area where verbatim data remains local, but maybe a substantially transformed version like a UI state-transition diagram is shared with MSFT to "enable product improvements".


This is completely my opinion, but this data is definitely going to be accessible to law enforcement/intelligence agencies when it's more widely used and running by default. All the compute and AI work is done locally and a simple database is given on request.


I agree completely. The AI rush feels exactly identical to the cryptocurrency rush where people saw a novel technology and started trying to find a problem to solve with it. It feels like it's going nowhere fast, at least to me. The overlap between tasks that are important, time-consuming, amenable to AI and also non-critical enough that I can tolerate serious errors in the output seems very small.


https://techcrunch.com/2023/04/24/greywing-seagpt/

This helps container ships operate. Errors matter here. They solved them. Please show me where the scam is here?


You are pointing to the wrong set of users there. It is not the PC user who will use that information, but someone further up the "security" food chain. Those are the users the feature is intended for.


> Security aside, I'm genuinely unsure what problem Recall is even meant to solve.

The same problem as https://atuin.sh/ but for everything rather than just shell command history.


I would like to have full text search of websites I've been to, since I often don't remember enough exact details to re-find something. That doesn't need the AI model or screenshots though.

A retroactive ability to screenshot recent things would also be nice to have.
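The full-text search half of this really doesn't need a model. A rough sketch using SQLite's FTS5 extension from Python, assuming your SQLite build ships FTS5 (most do, but it is not guaranteed); the `visited_pages` table, its columns, and the sample rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per visited page, all columns indexed.
conn.execute("CREATE VIRTUAL TABLE visited_pages USING fts5(url, title, body)")
conn.executemany(
    "INSERT INTO visited_pages (url, title, body) VALUES (?, ?, ?)",
    [
        (
            "https://example.com/hatchway",
            "Old blog post",
            "It rather involved being on the other side of this airtight hatchway",
        ),
        (
            "https://example.com/pasta",
            "Dinner",
            "Boil the pasta for nine minutes",
        ),
    ],
)


def search_history(conn, terms):
    """Return URLs of pages matching every term, best match first."""
    rows = conn.execute(
        "SELECT url FROM visited_pages WHERE visited_pages MATCH ? ORDER BY rank",
        (terms,),
    )
    return [url for (url,) in rows]


hits = search_history(conn, "airtight hatchway")
```

The `terms` string is interpreted with FTS5 query syntax, so a few half-remembered words are treated as "all of these must appear", which is roughly the use case here.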


Emacs' BBDB had similar functionality (without the AI): https://elpa.gnu.org/packages/bbdb.html


"Summarize what I worked on last week"


If an intruder gets into my local account, I'm far more worried about them stealing my cookies or accessing company IP than my browser history.

With cookies and browser access, they could get into my emails, family photos, bank accounts, and even read desktop notifications from my phone's SMSs.

For developers, the real risk lies in the variety of dependencies our apps have, which could get compromised.

So, this isn't really news. There are also tools to access all your iMessage history from a Mac, for example.

I believe the feature is really useful, and for sure you can turn it off.


It's more than just browser history.

What if the screenshot that is saved is taken while you have a password in plain sight?

Companies will use it to check on their employees.

Hackers will get material to extort you. "Interesting porn you watched three weeks ago". No need to catch you in the act. It's enough to get access some time later.

Abusers can control their partners.


When do you have a password in plain sight? On the other hand, a key logger that extracts the passphrase for my password manager and steals the database file of it would be a disaster. I’d rather have an attacker browse through years of screenshots.


> When do you have a password in plain sight?

When I generate a new password.


Or when you click the little eye icon to show you a password when you’re typing it in or looking at it in your password manager


Recall is just a massive trove of data and there is no single way it can be abused. For example, key loggers are extremely noisy (bunch of keystrokes being dumped into a log) and an attacker could use the data to see when the last time the user logged into their password manager was to narrow down the search.


>On the other hand, a key logger that extracts the passphrase for my password manager and steals the database file of it would be a disaster

Too bad that it's not either/or, but on top of that.

So they can steal your current and past secrets.

The total package.


Oh, why care about key loggers? They are rare. On the other hand, a serial killer in my house would kill me literally!


How is it useful unless you suffer from complete amnesia?

Since when is storing plaintext passwords on disk not a security concern anymore, which is precisely what this tool will do?


Here's what a SQL search looks like against it:

    SELECT c1, c2 
    FROM WindowCaptureTextIndex_content 
    WHERE c1 LIKE '%{search_term}%'
      OR c2 LIKE '%{search_term}%'
https://github.com/xaitax/TotalRecall/blob/28a9f75de005a3d82...
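If you want to try a query like this yourself, here is a minimal Python sketch. Only the table name `WindowCaptureTextIndex_content` and the columns `c1` and `c2` come from the snippet above. The in-memory database, the sample rows, and the `search_captures` helper are made up for illustration and stand in for a real Recall file. It also binds the search term as a parameter rather than formatting it into the SQL string.

```python
import sqlite3


def search_captures(conn, search_term):
    """Return (c1, c2) rows where either column contains search_term."""
    pattern = f"%{search_term}%"
    query = (
        "SELECT c1, c2 "
        "FROM WindowCaptureTextIndex_content "
        "WHERE c1 LIKE ? OR c2 LIKE ?"
    )
    return conn.execute(query, (pattern, pattern)).fetchall()


# Stand-in database: a plain table with made-up rows, not a real Recall file.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE WindowCaptureTextIndex_content (c1 TEXT, c2 TEXT)")
conn.executemany(
    "INSERT INTO WindowCaptureTextIndex_content VALUES (?, ?)",
    [
        ("Inbox - Mail", "Your invoice is attached"),
        ("Password Manager", "New password generated"),
        ("Terminal", "ls -la"),
    ],
)

results = search_captures(conn, "password")
```

SQLite's `LIKE` is case-insensitive for ASCII by default, so "password" also matches "Password Manager".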


I understand that this feature is highly controversial and I myself have mixed feelings about it. But I don’t understand what is being “pwned” here: isn’t providing a photographic memory of everything you do on your PC exactly what this is supposed to do? Yes, you can access the data through other means than the official UI - this is the case for every software that runs on my PC.


> isn’t providing a photographic memory of everything you do on your PC exactly what this is supposed to do?

Yes, but look at the Q&A at the bottom: apparently Microsoft told the BBC that hackers would have to have physical access to the device in order to access the data. This repo proves that's nonsense because all an attacker would have to do is install this code on your computer, which is something we already know they could do.

> Yes, you can access the data through other means than the official UI - this is the case for every software that runs on my PC.

I don't have software on my PC that indiscriminately takes screenshots of what I'm working on every few seconds, OCRs it, and indexes it in a convenient searchable database. A hacker can get a ton of information off my computer as it is, but a lot of what Recall will be saving has been hitherto ephemeral.


So MS lied in an interview. Or the press person was not very knowledgeable. But how would that even work, that data can only be accessed locally? How does a computer decide if the intent to access a file is coming from the user in front of the PC or from someone who installed malware that sends keystrokes or mouse clicks on behalf of the user?


https://doublepulsar.com/recall-stealing-everything-youve-ev...

Apple maintains some databases on macOS that are not sudo-accessible.


Yes, this is what Raymond Chen calls being on the other side of the airtight hatchway. Once you're admin, you can do anything. You can install keyloggers, install remote access tools, etc.

The only thing that Recall enables is that it creates this data. And since the data exists, it can be stolen. Data that doesn't exist cannot be stolen.


Previously, malware/stalkerware could only monitor what happened after it was installed. It didn’t get much if it was purged quickly.

Now, immediate treasure trove upon installation. No matter how quickly you catch it.

There’s a fairly substantial difference here.


As you point out though, the data does exist. It just tends to be more ephemeral. There's some question as to whether the ephemeral aspect of it is more feature or bug.


See the README under "Q: But the BBC said data cannot be accessed remotely by hackers."


I am not sure if that justifies this.

If hackers pwn your desktop computer and you do not notice it, they get all the data you have access to yourself. Nothing new here.


If MS claims the data is not accessible [by malware] remotely and can only be accessed if the attacker has physical access, a POC that just fopens the file and prints it is a fair exploit POC. No matter how trivial that looks to you.


Where's the disproof of that? Am I missing something?

"this is wrong" is not a sufficient counterargument.


The repo is along the lines of "2+2=4". GP said there is nothing interesting in that, to which I pointed to the QnA entry which shows that MS did tell a journalist that "2+2=22", so to speak. What else is there to disprove?

The significance of the repo is not to show 2+2=4 but that 2+2 != 22


The BBC quote says "a would-be hacker". I interpreted that as a general claim about windows security, not saying that this particular feature is invisible to malware. They have to break the security of your particular device, the data is nowhere else.


> I interpreted that as a general claim about windows security

Sure, because you understand that the other interpretation is nonsensical. All the publications that are popping up showing that the Recall DB is locally accessible are aimed at all the other Windows users.

Now, I would fully agree if you question what's the benefit of posting this on GH and not on FB, for example, and what's in there to surprise the HN crowd.


> Now, I would fully agree if you question what's the benefit of posting this on GH and not on FB, for example, and what's in there to surprise the HN crowd.

The author posted Wired's article about the tool on LinkedIn. Does Facebook host code and render Markdown? Does the author have a Facebook account? Would you bet your Facebook account they wouldn't consider it distributing a hacking tool and lock the account?

Must it be surprising? Some in the HN crowd would want to explore their own databases I think. Some will have family and friends ask them about Recall security.


> All the publications that are popping up showing that the Recall DB is locally accessible are aimed at all the other Windows users.

See, that's the thing. Proving it's locally accessible...

Microsoft never even implied it wasn't locally accessible.


> Microsoft never even implied it wasn't locally accessible.

BBC said Microsoft said a hacker would need physical access. You can think this meant to hack Recall or Windows.


Right. They specifically laid out a scenario where the data is locally accessible.

So code that proves the data is locally accessible doesn't contradict them.



? That comment thread is about a completely different thing at this point.


I assumed your meaning of local was consistent across comments. Was it not?


You interpreted a statement about saved screenshots in an article about Recall as a general claim about Windows security even the general public would know was false?


I guess I worded that badly. Let me try again:

I interpreted that line as analogous to normal Windows security.

As a general rule, a would-be hacker can't get to any of your on-device data, Recall included, without a local user giving them access.

So the intent of the statement is to say it's immune to anything else being hacked, like servers. Not to say they finally invented a completely hack-proof system... and only used it for this single program.


> As a general rule, a would-be hacker can't get to any of your on-device data, Recall included, without a local user giving them access.

Physical access means physical access to experts and the general public. Not physical access, social engineering, supply chain exploit, or remote code execution. Saying Windows can't be hacked without physical access would be false too.

> So the intent of the statement is to say it's immune to anything else being hacked, like servers.

Anything else would include Windows Update and Microsoft accounts.

They said Recall snapshots were stored on the PC itself and not available to Microsoft. Adding a misleading description of Windows security did nothing but confuse people.


Nothing in this FAQ is in any way surprising. If someone (“hackers”) gains remote access to your PC, they can also install a key logger and take screenshots. They just don’t have access to the historic data.


"this is wrong. Data can be accessed remotely" is not a super illuminating FAQ item. how can it be accessed remotely?


With any available information stealing malware: https://doublepulsar.com/recall-stealing-everything-youve-ev...


okay, so the only "pwning" of recall is that if your computer is already hacked, the hackers can get the recall data along with all the other data on your computer...


Yes. With the difference that now they have the DB, which they can interrogate with a prompt like "list all credit card numbers entered in the past 3 months". The post touches on your question/remark.


The data is supposed to be stored secure and encrypted.

It's not, so it's a selfpwn by MS


This is cool. What would be a good way to prevent this type of extraction? We just launched OpenRecall https://github.com/openrecall/openrecall with which we want to offer a fully open source/auditable and privacy/security focused alternative.


I have not taken the time to fully read your GitHub, but here is my view.

You making this is inherently different from Microsoft including this by default in all future versions of Windows. If someone downloads your tool they are making the conscious choice to give up some data protection for an admittedly cool feature. It also involves a more limited number of people with data stored in a particular way.

Having it on every Windows 11 machine paints a target on everyone's back, since it would be easy to assume that if a machine runs Windows 11, this is probably enabled. It is also not properly educating people on the risks.

Personally I don't have a problem with the tool, or necessarily how it is designed (it could be better, don't get me wrong). But it has to be opt in, properly educate on the risks, and probably shouldn't be built into the OS.


Don't collect the data.


From your link :

> Your data is stored locally on your device, and you have the option (soon to be implemented) to encrypt it with a password for added security.

Security focused my ass.


Encrypting the data with a key that's stored on, or only accessible using, a hardware token like a YubiKey would be a good start. That way the data can't be decrypted without explicit user action.


What privacy/security features make this meaningfully different than Microsoft's offering?


This was obviously bound to become a thing. I never got why Microsoft created this "feature" despite the self-evidently drastic risks, though I suppose I'm not too surprised.

What's more surprising though is how I remember HN folk being curiously welcoming to the whole thing when it was first announced.


Recall sounds like a gigantic pile of "fuck no". It's crazy that Microsoft just lets some PMs scrunch up and throw out the company's reputation.


I really love how there are already ways to run this and people that took the time to find all this! Props to everyone


Pwned in what way? This is just a SQLite database, you can "pwn" it with any SQLite client of your choice.


Weren't we told it was encrypted and that this sort of access wouldn't be possible?


Can you quote anything? I don't remember microsoft saying you couldn't access your own data.

The fact that a different user can get to it so easily is bad though.

And the FAQ claims that remote access is possible but does not elaborate, so that's confusing.


> The fact that a different user can get to it so easily is bad though.

This is what I was referring to. The data this collects is of high sensitivity and value. It will, without question, be targeted aggressively. It needs to be handled accordingly.

While I think that this service is dangerous and misguided and shouldn't be used by most people, I would hope that Microsoft would at least be a whole lot more careful about protecting those who do.

About being encrypted, here are quotes from Microsoft docs (https://support.microsoft.com/en-au/windows/privacy-and-cont...):

> Recall processes your content locally on the Copilot+ PC and securely stores it on your device

While it doesn't use the word "encrypted" here, "stores securely" certainly implies that.

> Snapshots are encrypted by Device Encryption or BitLocker, which are enabled by default on Windows 11.

Here is where they say encrypted. They also say it's just from BitLocker, which means it's not really encrypted in the sense that security-minded people would assume (encrypted separately from the whole-disk encryption). I also think most laypeople won't really understand what this means.


The point is that the existence of this code proves that remote access is possible, you just use any one of the many proven malware vectors to get the user to install a binary that does the same thing as this repo does but ships it over the network to your servers.


That seems like an unreasonably broad definition of remote access to me. If installing a local program that proxies data counts, then the only true way to make "remote access" impossible is by installing it in a secure room with no networking and where no other electronics are allowed in. How many people interpreted that claim as SCIF-equivalency?


> How many people interpreted that claim as SCIF-equivalency?

Basically everyone who isn't employed in tech? This is what the BBC said [0]:

> And it said a would-be hacker would need to gain physical access to your device, unlock it and sign in before they could access saved screenshots.

Those of us here can readily see that this "physical access" claim is bunk, but that's what Microsoft represented to the BBC and what the BBC is telling the world.

[0] https://www.bbc.com/news/articles/cpwwqp6nx14o


If it's just to prove those two words wrong, then this repo seems extremely overblown. "It works like every other program in the world" isn't very exploity.

Also I don't think many people even saw or noticed that particular claim. They just saw the part about it saving everything you do to your computer and were rightfully worried.


> If it's just to prove those two words wrong, then this repo seems extremely overblown. "It works like every other program in the world" isn't very exploity.

The FAQ author explained it worked like every other program in the world. Some people doubted him because why wouldn't he show proof if it was so easy? The tool author called it a very simple tool and no rocket science whatsoever.

> Also I don't think many people even saw or noticed that particular claim.

Fewer people will see this repo. What is the correct number of people before misinformation should be corrected?


> The FAQ author explained it worked like every other program in the world. Some people doubted him because why wouldn't he show proof if it was so easy? The tool author called it a very simple tool and no rocket science whatsoever.

The repo overall makes it sound like it's a way bigger issue than that.


> The repo overall makes it sound like it's a way bigger issue than that.

The repo overall contains a tool the author said 3x was simple or not rocket science, an explanation of what the tool does, and someone else's FAQ about the context.


It comes across as a rebuttal to any and all claims of security, not just the phrase "physical access".


The idea the FAQ and tool were created to rebut the physical access claim only was a straw man you created. You said the claim remote access is possible confused you. lolinder explained it.

The repo demonstrated the FAQ's claims and gave people a tool to inspect the databases. Any other interpretation is your problem.


A different user could also get your browser cache, your cookies, and install all sorts of horrible programs running as Admin as well.

This isn't really different.


Encryption is not authentication.


Are we really surprised by this point?

Over the last 2 years (and really longer if you look at Google) time and time again privacy has been put in the background, properly vetting tools for reliability has been put in the background, all in the name of shoving AI into every single thing for investors or because they are scared.

In a few months we will have the next horrible privacy invasive AI thing from one of these companies that is being shoved at us if we want it or not.

On the topic of this actually existing, this is opt out and not opt in I assume? Which just makes this 100% worse.

Will be honest, when this was announced I did not even think of the ability for trojans to get onto your computer and get a lot of sensitive data. I was more concerned about this data being synced between devices and stored on a server somewhere. But that really is one hell of a problem that could cause a lot of issues.

I just think about how often for some game launchers I need to open 1Password to copy my password.

This just makes it more and more clear that Windows is only for gaming and I will never do anything serious on Windows again.

Not looking forward to this hitting the corporate world, I am sure some companies won't get the memo and just leave this feature enabled. The secrets that are going to be leaked. Screw message retention policies.


Well that's not ideal


was about to post url myself lol. good thing i switched to linux yesterday


The movie poster for Total Recall hit a core memory. I am gonna need to re-watch that classic tonight.


This comment got downvoted because you’re about to watch a racist and problematic movie.


Uhh. What? I've never heard that before in my life. What am I missing?


Never heard of that theory either. Enjoy the movie, it’s a good one.


After years of watching spyware make a ton of money on their platform, Microsoft took a page out of Apple's book and created their own spyware for Windows.


What's the Apple reference to?


Neat tool, kudos, but is it "pwning"? I don't think so. It's more like another viewer for the same data. You need to have the right permissions to access the SQLite and JPEG files anyway.



