Microsoft’s Recall feature recently made its way back to Windows Insiders after having been pulled from test builds back in June, due to security and privacy concerns. The new version of Recall encrypts the screens it captures and, by default, it has a “Filter sensitive information” setting enabled, which is supposed to prevent it from recording any app or website that is showing credit card numbers, social security numbers, or other important financial / personal info. In my tests, however, this filter only worked in some situations (on two e-commerce sites), leaving a gaping hole in the protection it promises.
↫ Avram Piltch at Tom’s Hardware
Recall might be one of the biggest own goals I have seen in recent technology history. In fact, it’s more of a series of own goals that just keep on coming, and I honestly have no idea why Microsoft keeps making them, other than the fact that they’re so high on their own “AI” supply that they’ve just lost all touch with reality at this point. There’s some serious Longhorn-esque tunnel vision here – Longhorn being a project during which the company also kind of forgot the outside world existed beyond the walls of its Redmond headquarters.
It’s clear by now that just like many other tech companies, Microsoft is so utterly convinced it needs to shove “AI” into every corner of its products, that it no longer seems to be asking the most important question during product development: do people actually want this? The response to Windows Recall has been particularly negative, yet Microsoft keeps pushing and pushing it, making all the mistakes along the way everybody has warned them about. It’s astonishing just how dedicated they are to a feature nobody seems to want, and everybody seems to warn them about. It’s like we’re all Kassandra.
The issue in question here is exactly as dumb as you expect it to be. The “Filter sensitive information” setting is so absurdly basic and dumb that it basically only seems to work on shopping sites, not anywhere else where credit card or other sensitive information might be shown. This shortcoming is obvious to anyone who thinks about what Recall does for more than one nanosecond, but Microsoft clearly didn’t take those few moments, because their response is to ask users to report it through the Feedback Hub any time Recall fails to detect sensitive information.
They’re basically asking you, the consumer, to be the filter. Unpaid, of course. After the damage has already been done. Wild.
If you can ditch Windows, you should. Windows is not a place of honour.
Today’s news. Pre-alpha software has bugs.
It’s Microsoft software. It is riddled with bugs as usual.
Eh…. Bad take. A software redesign to remove critical security bugs still has critical security bugs. It’s not faith-building. As terrible as the first version was, the next had to be bulletproof in order to gain trust and adoption. I understand the challenges here, believe me I do, but just because it’s really difficult to do does not mean a half-baked version with tons of security vulnerabilities is acceptable or commendable.
Having said all of that, the types of bugs they found are instructive. When there were labeled fields, as in the web form, it didn’t save them; but in Notepad, with no labeling (the text “visa” nearby doesn’t really count), it couldn’t tell. And in a PDF it couldn’t detect an SSN. The only perfect solution is to aggressively over-mask and have every possible form of sensitive data known: not just credit card numbers but bank account numbers worldwide, not just US social security numbers but government ID numbers from around the world. I think the only perfect-ish solution would be to mask them all – and then basically all numbers will be masked.
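For illustration, here’s roughly what a context-free credit card detector looks like – a digit-run regex plus the standard Luhn checksum. This is just my own sketch of the general technique, not Microsoft’s actual filter:

```python
import re

def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum that payment card numbers must satisfy."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:   # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

# runs of 13-19 digits, optionally separated by spaces or dashes
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")

def find_card_numbers(text: str):
    for m in CARD_RE.finditer(text):
        if luhn_ok(re.sub(r"[ -]", "", m.group())):
            yield m.group()

print(list(find_card_numbers("my visa: 4111 1111 1111 1111")))
```

Even this only catches the easy cases – a card number split across two lines or rendered inside an image sails right through – which is why aggressive over-masking ends up being the only safe default.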
Ah, but what about passwords? A text file full of sensitive passwords is sadly common, and those aren’t pure numbers. So… it seems it’s fatally flawed unless it just masks anything it doesn’t trust. Which brings me back to the conclusion that the feature shouldn’t exist as is, and would need to be reworked to simply provide context about what was previously done without capturing the actual work being done. Like: 2 hours ago you were editing a txt file named “passwords.txt”.
Bill Shooter of Bul,
A searchable metadata database could be genuinely useful for users, but several things would be important here: 1) the user knows it’s happening, 2) the owner is in full control (can view/edit/delete all data collected), and 3) it doesn’t grant any additional access to anyone who didn’t already have access to the original file.
I’m not sure, but it sounds like Microsoft Recall might be failing at all three of these and is sending information to Microsoft cloud servers without the user’s permission. If the user is already explicitly saving files to Microsoft servers, then arguably MS already has explicit access to the content anyway. But if this is done with a user’s local files, then it’s not only a privacy issue – I feel it should actually be illegal. I’m aware that what should be illegal and what is illegal are two different things.
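To make concrete what a metadata-only version meeting all three points could look like, here’s a rough, purely hypothetical sketch – a local SQLite database with full-text search that records that you worked on something, never the content itself (none of this reflects how Recall actually works):

```python
import sqlite3
import time

db = sqlite3.connect("activity.db")  # a local file; nothing leaves the machine
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS activity USING fts5(ts, app, title, path)")

def log_event(app: str, title: str, path: str) -> None:
    """Point 1: logging is explicit. Only metadata is stored, never file contents."""
    db.execute("INSERT INTO activity VALUES (?, ?, ?, ?)",
               (time.strftime("%Y-%m-%d %H:%M"), app, title, path))
    db.commit()

def search(query: str):
    """Point 3: results only tell you *when* you touched a file, not what's in it."""
    return db.execute("SELECT ts, app, title, path FROM activity "
                      "WHERE activity MATCH ? ORDER BY ts DESC", (query,)).fetchall()

def forget(query: str) -> None:
    """Point 2: the owner can view and delete anything that was collected."""
    db.execute("DELETE FROM activity WHERE rowid IN "
               "(SELECT rowid FROM activity WHERE activity MATCH ?)", (query,))
    db.commit()

log_event("notepad.exe", "passwords.txt - Notepad", r"C:\Users\me\passwords.txt")
print(search("passwords"))  # "2 hours ago you were editing passwords.txt"
```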
Save that same passwords.txt on a Mac and Spotlight will happily have indexed that file.
Search for the password and *boom* it’s returned. The search index tools that ship by default on Linux do much the same.
Recall doesn’t do anything much different in that regard, but because it’s trying to intelligently handle some results, it’s considered a security failure for even attempting to do so.
One major difference that should be repeatedly thrown in everyone’s face is that your typical Linux distribution does NOT have a license agreement saying Microsoft can hoover up any and all data at any time for any purpose they choose. MS has had these kinds of clauses in multiple products at various times and, as of last I checked, they still did. Not gonna speak for Apple. That said, it’s down to user education to decide if things like Baloo should be left enabled.
And last I checked it takes less than 3 minutes to uninstall, AND you have to have an NPU, which a good 95%+ of PCs don’t even have. Heck, now that MSFT has pulled the stupid Win 11 system reqs (after they pulled them I upgraded an E3 1230V3 desktop with no hacks), I would say it is going to be the better part of a decade before NPUs become a majority on PCs.
That’s true – but let’s add that this tech is based on LLMs, which are non-deterministic – or let’s say, probabilistic prediction engines. We don’t know how to scale that. We don’t know how to secure that. It’s an open question whether we even can scale it or secure it, or simply make it reliable – all questions advocates for the tech won’t ask.
So yeah, we can assume those problems can be solved, and say “pre-alpha software has bugs” – but if we don’t make that assumption (and I don’t), we can also say “these are potentially unsolvable bugs.”
CaptainN-,
This argument doesn’t make sense to me. The randomness is done on purpose so that we don’t get the same result every time (i.e. it’s a feature and not a bug). But if so desired, it would be trivial to make an LLM deterministic by replacing all random sources in the algorithm. Furthermore it would be trivial to make this secure using cryptographic hashes. It would also be trivial to scale that and make it reliable. Your questions as stated don’t make any sense because they are very easily answered. Can you clarify what you mean?
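As a minimal sketch of what I mean (using Hugging Face transformers and a small model purely for illustration): greedy decoding simply removes the sampling step, so the same prompt yields the same output every run.

```python
# Greedy decoding (do_sample=False) takes the argmax at every step, so there
# is no random source left (modulo low-level floating-point nondeterminism
# in some GPU kernels).
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # tiny model, just for illustration
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The capital of France is", return_tensors="pt").input_ids
out = model.generate(ids, max_new_tokens=20, do_sample=False)  # deterministic
print(tok.decode(out[0], skip_special_tokens=True))
```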
Accuracy and garbage in garbage out are much bigger challenges in my mind. You can have an LLM spew bad information, which is a legitimate concern and that’s harder to fix. Still we have to remember that humans are also notoriously fallible and a fair scoring system needs to reflect this.
BTW what do you think the turing test is these days? Does it require a test that an average human would fail? I think some of these models may already be passable today, at least when judged by an average human. I’ve been playing around with LLAMA 70B Q4 running on my computer and I find it absolutely fascinating. I try to corner it like this guy does…
“Gaslighting ChatGPT With Ethical Dilemmas”
https://www.youtube.com/watch?v=UsOLlhGA9zg
It’s incredible to experience this in action. LLAMA 7B Q4 seems to have a similar feel, but the overall discussion has more tells if you know what to look for, and it is more forgetful. I’d like to try out the 400B+ model, but it’s hard for me to download that, much less run it. The 70B model runs decently in real time on my 16C32T Ryzen CPU, but I’d at least need more RAM for bigger models, haha.
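For a rough sense of why, some back-of-the-envelope memory math – assuming roughly 4.5 bits per weight for a Q4-style quant plus ~10% overhead for the KV cache and buffers (actual figures vary by quant format and context length), and taking the 400B-class model as 405B:

```python
def q4_mem_gib(params_billion: float, bits_per_weight: float = 4.5,
               overhead: float = 1.10) -> float:
    """Approximate RAM/VRAM needed to hold a Q4-quantized model."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30 * overhead

for n in (7, 70, 405):
    print(f"{n}B ≈ {q4_mem_gib(n):.0f} GiB")
# 7B ≈ 4 GiB, 70B ≈ 40 GiB, 405B ≈ 233 GiB
```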
“BTW what do you think the turing test is these days?”
Wow this was awkwardly stated, haha…
Let me restate that: what is your opinion about how well LLMs fare in Turing tests these days? I’ve been conversing with LLAMA and feel I could identify it running the default template because I’m familiar with it. For example it tends to be compliant and take things a bit too literally. But I wonder how well I could identify it running a more unique template that I haven’t seen before. Maybe I’ll experiment with this more.
The turing test is a pretty cool idea, written by a guy in a different time, with a different set of expectations about what makes a human mind work, that has been significantly eclipsed by the progress of various social sciences since he wrote that stuff over 75 years ago. We just know a lot more about what “thinking” is than we did in 1950.
I get into the same territory when arguing with Marxists – a 175-year-old polemic that, while fascinating and foundational for entire kinds of science, isn’t exactly a fit in the modern age given how much progress the state of the art has made since it was written.
In short – it’s founder porn. I like to understand historical documents, but I’m not into founder porn.
CaptainN-,
I think it’s an important milestone. It’s more than just a thought experiment, there are broad implications for automating human jobs. Then it was only imaginable, whereas today’s technology can actually make that a reality (for better or for worse).
That’s putting the cart before the horse. The randomness is neither a feature nor a bug. It’s just how the text generation (or pixel generation, or whatever) works. It uses a very large vectorized data set, with tokenized language “learned” from basically all of the text ever produced, and then it uses a predictive algorithm to generate tokens based on an inquiry. It essentially simulates a response by predicting, token by token, what a response to the inquiry (plus all the tokens generated through each iteration) might look like, based on the history of responses in the dataset (trained model).
The randomness is just inherent in how that works. It’s neither a feature nor a bug, and if they could get rid of it, they would.
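To make that concrete, here’s a toy version of the loop (sketched with the small gpt2 model from Hugging Face, purely for illustration). The “randomness” is the single sampling call inside each iteration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

ids = tok("The cart goes before the", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(10):
        logits = model(ids).logits[:, -1, :]        # scores for the next token
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, 1)       # <- the only random step
        # next_id = probs.argmax(-1, keepdim=True)  # deterministic alternative
        ids = torch.cat([ids, next_id], dim=-1)
print(tok.decode(ids[0]))
```

You can swap the multinomial for the argmax, but the output quality degrades into the repetitive, degenerate text greedy decoding tends to produce – which is why, in practice, the sampling stays.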
People keep countering what I say with stuff like “it’s more accurate than humans” – but that’s really beside the point. I’ve often made it clear – there is a kernel or a core of usefulness here. I’m not saying it’s not useful – I’m saying it has a set of problems, and the hype is overblown. These are different statements.
I could list half a dozen or more extremely valuable problem domains for a text, image, video and audio generation platform that avoid the pitfalls of non-deterministic output, because those domains don’t need the output to be deterministic. But most of the hype-driven applications are completely inappropriate, and most of it will ultimately fail. That’s fine, I guess we need to try everything. But I don’t understand why we have to worship at this bizarre altar, rather than simply jumping to the part where we understand what the technology is and how it works.
CaptainN-,
Ok, that’s not the same as non-determinism in computer science terms. But regardless of terminology, I still find your criticism questionable. You clearly consider LLM models inferior because they “predict” responses to stimuli. That’s your opinion, but we shouldn’t shortcut objective testing with subjective rationale.
A turing test does NOT prescribe underlying mechanisms to be valid. There’s absolutely nothing wrong about a predictive model passing Turing tests. Random models are absolutely fair game. The whole point of a Turing test is to rate intelligence in a way that specifically eliminates these sources of observer bias. Your opinions about the inferiority of LLMs are precisely the kind of bias the Turing test aims to mitigate.
Let’s try out a thought experiment: you’ve got ten boxes.
Each box randomly contains either a human responding to stimuli, or an AI model that “predicts” a human’s response to stimuli. Now take a hundred random humans off the street acting as judges, guessing whether what’s inside each box is a human or an AI. The judges are blind to what’s inside the boxes and make their determination based on the responses alone. The fact that the AI uses a predictive model is 100% irrelevant. If you have to open the box and point your finger at the predictive AI models to declare they’re not human, well, congratulations, you will identify 100% of the LLMs this way. But in doing so you’ve cheated and corrupted the integrity of the test.
With my tin-foil hat firmly in place, I wouldn’t be surprised if — despite its obvious flaws and overwhelming ridicule — Microsoft is being pushed hard by the US government to continue implementing Recall. After all, they did it with the telecoms a couple of decades ago, and back then AT&T was as big as Microsoft is now. It wouldn’t even have to be a threat, it could be the carrot as easily as the stick.
Perhaps not completely off-topic and a hint to what is happening : https://x.com/WallStreetApes/status/1868138972200739148
Without getting into politics, let me just be the first one to say Marc is very good at making a profit, and does not care about reality, or about whether anything actually produces any real value. Take everything he says with an iceberg of salt.
🚨 BS Alert! BS Alert! 🚨
A nonsense scare claim like “they were going to classify knowledge that is already available in every country where someone wants it, and backed up, in implemented form, on the computer of everyone who’s installed a local copy of something like Stable Diffusion”… I’m not even sure what to say about what that does to someone’s credibility.
Yeah, not to mention he gave a few other reasons why he supported who he did. He honestly just comes off as wanting to make money any way he can, regardless of the social impacts. If he does care about them, it’s only insofar as addressing them could or could not make him money.
Could you give me your insightful thoughts about https://www.youtube.com/watch?v=KdMlNcSXHhE
Nah, it was Qualcomm trying to find a reason to push their new ARM laptops. With MSFT, never attribute to malice that which can be explained by incompetence. Heck, look at the regressions on Zen 5 and Core Ultra due to MSFT bugs.
On a positive note, MSFT is in full panic mode, as the number of Windows 10 users is going UP, not down, with just 10 months to go on support. So MSFT took the Win 11 TPM and Secure Boot requirements out back and had them shot, and all of us with systems in the family that didn’t support Win 11 can now upgrade just fine. Just did it with the E3 1230V3 I keep for the grandkids to play on: no Secure Boot or TPM, activated Win 11 hassle-free, and it purrs like a kitten once I got rid of the MSFT-included bloatware.
Do you have a script to “clean” Win 11 of its bloatware (Recall included)?
“It’s clear by now that just like many other tech companies, Microsoft is so utterly convinced it needs to shove “AI” into every corner of its products, that it no longer seems to be asking the most important question during product development: do people actually want this?”
I remember when everyone was desperate — major update desperate — to add XML support, even when it was not logical or needed and had to be creatively shoehorned in as a way to make files bigger and slower. In that case, a lot of people DID think they wanted it, until they had to find a use for it.
Happens constantly. Have to learn to laugh instead of bemoaning the lost time that could have produced something useful.
Yup, it is just a fad. Remember when “blockchain” was added to everything, or before that “On The Internet”? From stuffing WinXP and 7 onto netbooks to the Zune to WinPhone to making Win 8 look like a tablet, if there is one thing MSFT is famous for, it is trying to jump onto every bandwagon that gets buzz, failing miserably, then abandoning it faster than Google.
Once Qualcomm fails to sell $1600 ARM laptops and Nvidia uses SteamOS for its ARM-based RTX laptops, MSFT will dump the AI like it did every other half-baked, harebrained idea it has had in the past, and it’ll go back to kissing the x86 ring and jumping on the next craze.
Perhaps not so fast, AI is actually more useful than cryptoshit.
Kochise,
I agree with this. I could not tell if bassbeast was referring only to this Recall feature or to all of AI. But AI is making very big strides. Throughout history there have been many instances of the public adamantly refusing to believe in AI’s ability to pass some arbitrary skill test: playing chess, go, Jeopardy, captchas, driving, etc. Yet AI kept passing more milestones despite the naysayers. Every few years the naysayers have to move the goalposts. Even Turing tests might be defeated if we conduct the test against average humans. Of course I don’t want to imply there aren’t challenges, but these are being overcome at a rapid pace. More and more corporations will take interest in building/acquiring AI to handle roles that had been done by employees, and they are very unlikely to go back to paying employees to do what they can automate. This is what’s going to drive AI’s long-term staying power, rather than some gimmicky fad or feature.
This is what monopoly abuse comes down to. If “AI” and “Recall” were separate apps, then most likely only a small percentage of people would actually be using them, and not much news traction would be involved. Instead, Windows and its position on the market are being used to force such products on people who have no intention of using them. So what will likely happen is that for a while these products will have inflated user-share numbers, and after a while they will be phased out, as the realistic user base makes them niche products. On top of that, with each such failed abuse attempt, Windows’ reputation takes a hit, and it’s accumulating.
What if it’s a corporate goal to reach, and leaders can only get their bonuses if the goal is reached? Hence their incentive to shove it down our throats to tick the boxes for the annual reports.
For sure that is involved too: bonuses. And I guess Microsoft is rather desperate not to miss out on the next big thing, like they did, for example, on the whole mobile era. They are, I guess, prepared to bet on everything with some potential, but the problem here is that most of it failed, and this is starting to damage Windows’ reputation, as Windows is still the medium allowing them to do it. Without Windows we wouldn’t be talking about them now and in this context. Nobody, beyond some niche group, would care about Recall.
Yeah, but basically all that’s left of Microsoft is Windows (declining but still a strong presence due to legacy reasons), Office (same) and Azure (Drive, 365 and Teams). Still, the gateway to the latter two is the former, which they are spoiling into oblivion with their hardware requirements (TPM and stuff) and AI (Recall). Hence people will soon have had enough of this shit, even corporate/IT, and will look for realistic alternatives (Linux, BSD, …). Gaming is already slowly drifting through their fingers (Steam/Proton) and Win32 emulation is getting better too, Office equivalents are getting more mature (LibreOffice), CAD and rendering too (Blender), so Windows is no longer the absolute necessity to “get the job done” other than in a 90s hackerman’s way.
As Cathode Ray Dude said when talking about HP using a System Management Mode rootkit to draw Outlook previews on top of Windows’s boot screen, plus a UEFI-native application and an unencrypted FAT partition to implement a “boot quickly to peek at your Outlook stuff” feature:
“Because Bob needs a bonus”
As he said in reply to one comment, “I came so close to putting a screenshot up of an article from The Register about how AI is happening because ‘win 11 didn’t trigger a refresh cycle'”.
Thanks for the video, very insightful from beginning to end, not just the “Bob needs a bonus” part.
“It’s astonishing just how dedicated they are to a feature nobody seems to want, and everybody seems to warn them about.”
I wouldn’t say that nobody wants this. A text searchable database with what a user has been doing on their computer and an accompanying screenshot as proof? This might not be what the user of the machine wants, but I can come up with some scenarios and institutions who would be very interested in such a “feature”.
This is why we should move away from centralized corporatism. This has all the bad sides of strong communism, with the spying and all, but without the perks.
Tell us about the perks.
As a very happy user of timesnapper.com on Windows, I couldn’t disagree more. Recall seems like Time Snapper with integrated search.
Sure it was correct to get Microsoft to fix that first version of Recall to be more secure. But as long as it’s local and only available for my user account (which both seem to be the case), I’m really fine with it and even appreciate such a feature.
Just don’t use it. Problem solved.
Note: “it” in this case can be Windows, or Recall. There is even the option of not entering credit cards on your Windows computer at all. You have the power, assuming you are in a free country.
“I forgot my bank PIN, but I don’t have my reading glasses and can’t see what Recall found.”
Microsoft: Hey, Bob… do you have their PIN? … no… no… their bank PIN. Bob, I don’t need their full frontal as a backdrop, just send it to me using beautifully secure SMS text, ok?
The American political class (elites) realizes they lost the international battle in fintech and other commerce-platform tech to China (if this is news to you, I’m very sorry to be the one to tell you), and they are desperate for something else to give them a competitive edge. Seriously, you can’t watch anything with any of those “masters of the universe” types in any media without them gushing over how far ahead of the Chinese they are with AI. The thing is, AI is mostly overblown hype, with a core of something useful, but not revolutionary. Real advances in AI progress very, very slowly. We just had this LLM breakthrough, which has the appearance of a larger leap forward than it actually is. But they need it to be a game changer – so what we are seeing is politically motivated reasoning – there is no logic in this reasoning. (Economics and finance are just another kind of politics, like war.)
So that’s why Microsoft is rushing to build these useless features on top of their dead-end platform (Windows). Desperation for relevance is the name of the game here.
CaptainN-,
You’re looking at LLMs in the wrong way. Think outside the box and don’t limit yourself to looking at LLMs in isolation; LLMs are a critically important piece of a much bigger puzzle, specifically the human-interaction part. An LLM is only the beginning of high-level human-machine interaction. It’s the glue that allows humans to interact with other advanced AI systems, which were previously only accessible to specialists. For example, it doesn’t have to tell the robot how to walk or drive a car – we have other, highly optimized AI reinforcement-learning techniques for those – but the LLM plays a strong role in the human interaction. That’s the important thing to consider here. An LLM doesn’t necessarily have to know everything and do everything by itself. It’s ok for an LLM to defer to more specialized AI. This will be the future of specialized AI systems…
It’s kind of like the ship’s computer in Star Trek… it doesn’t have to reach the level of AGI like Commander Data (the android) to be incredibly useful.
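A hypothetical sketch of that “glue” role – the LLM translates the human’s request into a call to a specialized system rather than doing the specialist work itself (every name below is made up):

```python
from typing import Callable

# Specialized systems the LLM defers to; each would be its own trained model.
SPECIALISTS: dict[str, Callable[[str], str]] = {
    "navigate": lambda arg: f"[path planner] route computed to {arg}",
    "grasp":    lambda arg: f"[manipulation policy] grasping {arg}",
    "lookup":   lambda arg: f"[search index] results for {arg!r}",
}

def dispatch(llm_output: str) -> str:
    """The LLM is prompted to emit 'tool: argument'; we just route the call."""
    tool, _, arg = llm_output.partition(":")
    handler = SPECIALISTS.get(tool.strip())
    return handler(arg.strip()) if handler else "unknown tool"

# e.g. the LLM turned "take me to the kitchen" into "navigate: kitchen"
print(dispatch("navigate: kitchen"))
```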
That’s built in to what I said:
“AI is mostly overblown hype, with a core of something useful, but not revolutionary. Real advances in AI progress very, very slowly. We just had this LLM breakthrough, which has the appearance of a larger leap forward than it actually is.”
I’m not the one trying to write self-executing programs with LLMs. Yes, AI will be able to do that – eventually. But most of the progress in AI is really quite incremental. We just saw this advance with LLMs that looks amazingly quick. That’s not typically the pace at which AI technology moves forward.
I am not sure LLMs are critically important. Small language models (SLMs) are being made; SLM versions of an LLM keep 90+% of the functionality for about 1/20 of the processing. SLMs are more often used in AI customer support because they are more trustworthy not to give end users poor, made-up garbage answers.
Something that ChatGPT and the other major LLM vendors don’t talk about is how many name filters they have had to add that cause the AI to brick itself when a particular human name is in the input, so that the AI is not committing defamation by producing made-up garbage about a real human. Remember, this is all wasted processing power.
SLMs are more specialized, so the input data can be more heavily filtered, and they are less likely to be legal trouble.
Alfman: “It’s kind of like the ship’s computer in Star Trek…” The general ship AI in Star Trek is an SLM or weaker. If you watch the show closely, they tell the AI to search for something and give it a list of items to search for. This is SLM-level or lower: a compact, locally run home-assistant AI reaching out to a dedicated search when asked to.
The medical hologram from Voyager and Data are both examples of general AIs, with holodeck character AI being a mix of different levels; in Deep Space Nine they talk about having to optimize the AI for size and performance so it worked on the holodecks.
Personally I think LLMs have taken us down the incorrect path. Building bigger and bigger machines to process data from a very untrustworthy source, namely general humans, does not seem like an ultra-smart idea. Particularly if you want to avoid legal liability over defamation or over giving people incorrect instructions.
Remember, when you watch the Star Trek ship AI doing a search, it’s giving back real, unprocessed source documents. That AI is designed to assist with interfacing with the system, not designed to attempt any form of general intelligence. Your LLMs are attempting to become a general intelligence. Your SLMs are attempting to be specialists.
In a lot of the cases where you want an AI assistant, you are not after an LLM. The key is in the word “assistant”: you want the AI to assist you in your task by providing you with the real information so you can make the choices, not have it make the choices for you. LLMs like ChatGPT end up making a lot of choices for you instead of being unbiased data collectors.
Also remember that LLMs, being large models, are very heavy on processing power; even in the Star Trek universe their computers do not have unlimited processing power, which is the reason they have to optimize holodeck programs.
oiaohm,
That’s a valid point; smaller models might suffice for specific applications. Still, in my own testing of LLAMA, there’s a noticeable difference in quality between 70B and 7B even though they have similar personalities. I think that’s to be expected. But when it comes to more specific applications, large models could be overkill.
I haven’t witnessed this exactly, although I haven’t specifically tested it. I do notice some censorship though, and it’s probably accomplished using a similar “filter”. Conversations can take an abrupt and jarring turn as a result.
It’s funny because while star trek is meant to be futuristic, the computers in the original star trek series actually seem old and quaint compared to today’s technology, haha. They totally underestimated what computers would be capable of.
I wouldn’t say LLMs are an incorrect path, but obviously their necessity depends on task requirements. Some tasks might be effectively accomplished with smaller (and more specialized) models, which seems to be your overall point that I can agree with.
I think that clusters of smaller more specialized models working together will prove useful too. It’s not just large models that are important.
I’d expect the resources needed for a convincing holodeck simulation to eclipse the resources of an LLM.
It’s admittedly a leap to compare star trek holodeck technology to x86 computers with RTX GPUs; nevertheless, my computer uses more power running simulations like this “Fluid Flux” demo than running LLMs.
https://www.youtube.com/watch?v=mM9nEUUbtSo
I find myself chuckling that we’re here geeking out about star trek 🙂