Facebook has threatened to pack up its toys and go home if European regulators don’t back down and let the social network get its own way.
In a court filing in Dublin, Facebook said that a decision by Ireland’s Data Protection Commission (DPC) would force the company to pull up stakes and leave the 410 million people who use Facebook and photo-sharing service Instagram in the lurch.
[…]The decision Facebook’s referring to is a preliminary order handed down last month to stop the transfer of data about European customers to servers in the U.S., over concerns about U.S. government surveillance of the data.
…is this supposed to be a threat? Because it sounds more like a gift to me. Please, Zuck, go home! I think we here in Europe will do just fine without your criminal enterprise.
How do we replicate this European success back here in the states?
Sounds like the key is not backing down… which is really hard for most spineless politicians.
Bye, bye, Facebook! And please, do not come back (AT ALL)!
Ohh no, please don’t leave. How would I manage without someone spying on everything I do, listening to what I say to family or friends, and using that data to shove creepy ads and whatnot at me?
There’s plenty of vpn providers that’ll be happy to help you. 🙂
I won’t miss facebook one bit.
I have touched on this before, and it bears repeating. Technically Facebook has a point.
Without going into all the ads / tracking / etc issues, just looking as a generic social network.
If you have many-to-many connections in a geographically dispersed graph, it makes sense to keep local cached copies of frequently used nodes in other locations. There is no efficient way around that.
I know, since I had a bug in a very large data set, and believe me, copying multi-TB data across continents every day, for many, many days, until you catch the bug is not cheap. In Facebook’s case it would be a needless transfer of PBs (petabytes). That is wasteful, and very, very expensive.
They could focus on better GDPR, or accountability over trans-atlantic uses. However restricting user data strictly inside a geographical area, while data shape is a graph, with edges across multiple areas is not workable.
It’s always going to depend on the nature of the data…
If the data is the user’s private profile which is not shared with anyone else, then there is no reason for this data to ever be stored outside the location where that user registered their profile.
If the data is something the user has shared publicly, then there is no reason to keep it private. It can be cached and distributed anywhere, as the user has explicitly chosen to publish the data.
Unfortunately it is not that straightforward.
There are many other categories than “public” and “private”. For example, shared with friends, public follows, reverse follows, groups, app logins, and much, much, much more.
This would only work, if all your friends and family are in the EU, do not follow any public figures outside of EU, do not use any connected apps outside of EU, do not even have any “public” posts, do not join groups with any members outside EU, and make sure those groups also have private setup, you personally do not have any followers outside of EU, and so on.
Otherwise the profile needs to move across continents, and necessarily as a preemptive cache operation.
If the law says the profile can be copied when there is any outside use, then once again it becomes entirely toothless as soon as you follow @ElonMusk.
Let me phrase it this way.
These attempts to hide data focus on “nodes”.
What makes social networks tick are the edges.
You cannot have a black box of nodes, if they have edges going outside. In order to make those edges useful on the other hand at least some of the profile information (node) has to be copied.
There is no practical way around that.
No, it very much is that straightforward.
Sure, there’s categories between “public” and “private”, but that’s irrelevant because they’re only worried about the user’s private data and do not give a shit about any of the other categories.
Specifically; they’d be worried about the “user tracking” (used for targeted spam) and NOT any of the content that the user published (to share with friends, family, …).
The way this works is that you log into FaceBook once, and they get your IP address. Then you close FaceBook and visit 50 other web sites that have nothing to do with FaceBook at all (but happen to have a FaceBook logo or some other “innocent” looking scrap) which is fetched from FaceBook; which allows FaceBook to link the IP address you used on their site with the same IP address you used on 50 other sites, and track almost every web site you visit while you’re not using FaceBook at all.
That is the “user data” that EU doesn’t think should be sent to the US (complete tracking of almost every web site you visit). Note that this user data has absolutely nothing to do with social media whatsoever.
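As a rough illustration of the linking described above, here is a minimal sketch. The log entries, IP addresses, and function name are all hypothetical, and it follows the IP-only description in this comment; real trackers are generally understood to rely on cookies and other identifiers as well.

```python
from collections import defaultdict

# Hypothetical data: the IP address seen when a user logged in to the social network.
logins = {"203.0.113.7": "alice"}

# Hypothetical data: requests for an embedded logo/button, as (client IP, embedding page).
widget_requests = [
    ("203.0.113.7", "news.example/article-1"),
    ("203.0.113.7", "shop.example/cart"),
    ("198.51.100.9", "blog.example/post"),
]


def build_browsing_profiles(logins, widget_requests):
    """Attach each widget request to an account whenever the IP matches a known login."""
    profiles = defaultdict(list)
    for ip, page in widget_requests:
        account = logins.get(ip)
        if account is not None:
            profiles[account].append(page)
    return dict(profiles)


profiles = build_browsing_profiles(logins, widget_requests)
print(profiles)
```

The third request comes from an IP with no matching login, so in this sketch it stays unlinked.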
Circling back:
“Without going into all the ads / tracking / etc issues, just looking as a generic social network.”
User data has a specific meaning in the business, and that includes everything, even the email address or the real name.
If one of my friends in the US gives a thumbs-up to my post from France, how can you engineer it so that I can see who did that without storing a copy of the data on this end, and without transferring profiles on every single request?
sukru,
You’re talking about something you do when you are logged in, but recording up and down votes doesn’t technically need a full profile to be transferred, even just a single surrogate key would work and it wouldn’t matter where the user data resides. Now maybe the locality wouldn’t be optimal, but it would technically work.
I thought Brendan made a good point about facebook tracking you for their benefit even when you aren’t logged in. That’s extremely invasive and they’re not doing it for our sake.
Alfman,
I am not talking about full profile. Again even email address, user id, real name are PII data.
Second, caching is really, really, really necessary. This is one of the early interview questions we would ask, and if you cannot come up with good replication and caching, nothing else works.
Take Facebook. They have over 1.5 billion DAUs. If each request needs roughly 2 MB of data to be processed (a lowball estimate), and there are 10 requests per user per day, that averages out to roughly 2.8 Tb/s of sustained traffic (assuming no peaks).
So no, there is no good way to solve this without copying *some* of the user information over here.
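The arithmetic behind that estimate can be checked with a short sketch. The inputs are the figures from this comment; treating 2 MB as 2,000,000 bytes and spreading the load evenly over 24 hours are assumptions made here for the calculation.

```python
DAILY_ACTIVE_USERS = 1.5e9          # figure quoted in the comment
REQUESTS_PER_USER_PER_DAY = 10      # figure quoted in the comment
BYTES_PER_REQUEST = 2e6             # 2 MB, assumed to mean decimal megabytes
SECONDS_PER_DAY = 24 * 60 * 60


def average_terabits_per_second(users, requests_per_user, bytes_per_request):
    """Average sustained rate in Tb/s if all traffic is spread evenly over one day."""
    bytes_per_day = users * requests_per_user * bytes_per_request
    bits_per_second = bytes_per_day * 8 / SECONDS_PER_DAY
    return bits_per_second / 1e12


rate = average_terabits_per_second(
    DAILY_ACTIVE_USERS, REQUESTS_PER_USER_PER_DAY, BYTES_PER_REQUEST
)
print(f"{rate:.2f} Tb/s")  # prints "2.78 Tb/s"
```

That is about 30 PB per day in total. The result is an average, so any peak-hour traffic would sit above it.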
(edit: does not allow links)
sukru,
You don’t need any of those, a surrogate key is fine.
I’m not sure what european laws say about replication and caching.
However, on a strictly technical basis I disagree with your assertion. For example, say a user’s profile picture is private information. A site can link to their profile picture that is hosted anywhere on the internet. It doesn’t have to be stored in the US. The same goes for other information that can be fetched by the clients as needed. Facebook’s US servers don’t have to be involved in transferring private data and it’s technically possible to instruct the browsers to fetch it directly from overseas.
Now, you can criticize it for not being optimal, but it does work and a lot of people actually do use websites and ajax calls across the globe without major issues.
Additionally, you can fix the locality problem by having a content distribution network that only holds encrypted files. The decryption keys would only be available from European servers, so you’d only need to download the tiny decryption keys from Europe, while all the heavy lifting would be done by an encrypted CDN. This way Facebook’s US servers would never get unencrypted media.
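A minimal sketch of that split, under heavy assumptions: the class names and flow are hypothetical, and the cipher is a toy SHA-256 counter keystream used only so the example runs with the standard library. A real deployment would presumably use a vetted authenticated cipher from a maintained library.

```python
import hashlib
import secrets


def toy_stream_cipher(key, nonce, data):
    """XOR data with a SHA-256 counter keystream. Toy stand-in, NOT a vetted cipher."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + nonce + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class EUKeyServer:
    """Hypothetical EU-only service: encrypts media and keeps the keys."""

    def __init__(self):
        self._keys = {}

    def publish(self, media_id, plaintext):
        key, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
        self._keys[media_id] = (key, nonce)
        return toy_stream_cipher(key, nonce, plaintext)

    def key_for(self, media_id):
        return self._keys[media_id]


class GlobalCDN:
    """Hypothetical CDN anywhere in the world: only ever stores ciphertext."""

    def __init__(self):
        self._blobs = {}

    def store(self, media_id, ciphertext):
        self._blobs[media_id] = ciphertext

    def fetch(self, media_id):
        return self._blobs[media_id]


def client_view(media_id, cdn, key_server):
    """The client pulls the bulky blob from the CDN and the small key from the EU."""
    key, nonce = key_server.key_for(media_id)
    return toy_stream_cipher(key, nonce, cdn.fetch(media_id))


eu = EUKeyServer()
cdn = GlobalCDN()
photo = b"profile picture bytes" * 10
cdn.store("photo-1", eu.publish("photo-1", photo))
recovered = client_view("photo-1", cdn, eu)
```

Only the key lookup crosses to the EU server in this sketch; whether that arrangement would satisfy a regulator is a separate question from whether it works technically.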
A lot of the problems are technically solvable, you just need to think outside the box.
Alfman,
A key that uniquely identifies a user is PII by definition. After all those “advertising ids” are usually a simple random 64-bit number (or 128, or 256, etc).
From US official guidelines at Department of Labor:
https://www.dol.gov/general/ppii
Second, I would recommend doing the math exercise. The NA-EU undersea cable capacity is measured in 100 Tb/s:
https://en.wikipedia.org/wiki/Submarine_communications_cable
https://upload.wikimedia.org/wikipedia/commons/2/22/African_undersea_cables_v44.jpg
Consider peak times, and multiple services (Instagram, Twitter, etc). This would be literally breaking the Internet.
Third, any privacy team worth their salt would already store all data in encrypted form. They would be encrypted on disk, they would be encrypted during site-to-site transport, and they would be encrypted when being served to the user.
I wish there would be an easy answer. It requires really big effort by really smart people to have a workable thing. That is why Facebook would be willing to pause all EU operations until all this is sorted out.
sukru,
You are kind of using that out of context, it’s only deemed personally identifiable information when it gets linked to personally identifiable information. Here’s the relevant text:
Think about it carefully. A website could even go so far as to publish your session keys, there’s no risk to your personal information when no information is being logged or collected. It’s not until keys are otherwise linked to personal information that they become personally identifiable.
In the context of our discussion, the surrogate key is linked to personally identifying information in a european database, however arguably so long as that database remains in Europe, facebook USA would not have the personally identifying information, which would very likely satisfy european regulators.
I don’t know why you ignored the points I already made about scaling via encrypted CDN. I’ll await your response.
This feels disingenuous to me. Site to site VPNs only protect data in transit, client to server HTTPS only protects data in transit, disk encryption only hides information from those who don’t have the keys. In every one of these cases, the companies are in possession of the keys and data.
Consider backup service A) Use HTTPS, VPNs, disk encryption, etc, but have access to all your data. Now consider backup service B) Use client side encryption such that the company doesn’t have your key and cannot decrypt your files. There’s a big difference here and you’d miss it by just looking at things like VPNs and HTTPS.
Hey, I completely understand why facebook doesn’t want to do it…they would be investing in privacy measures that are clearly detrimental to their business model. I just wanted to point out that the scenarios you brought up could be done while doing more to protect user privacy if they really cared to.
Alfman,
I do not know how to convince you. Telling you that I do this for a living probably will not help. So let’s get technical 🙂
De-anonymization attacks are real. For a very newsworthy one, a lesbian woman could be identified from the Netflix dataset.
https://www.theregister.com/2009/12/21/netflix_privacy_flap/
The dataset was part of a competition for researchers, and it was anonymized to the best of their capabilities. Obviously that was not enough. There is a lot of scholarly research just on that dataset alone.
One such random paper:
http://www.cs.cornell.edu/~shmat/shmat_oak08netflix.pdf
And this is not a remote thing. It happens every day. When you go to a supermarket and they ask for your zip code at the checkout, it is not to do some statistics. They can uniquely identify you with a very high success ratio (just initial + surname + zip is enough most of the time).
https://iapp.org/news/a/2013-05-01-zip-codes-are-courts-set-to-protect-consumers-from-marketing/
They do this by joining with third-party datasets.
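To make the join concrete, here is a minimal sketch of such a linkage. Both datasets, all names, and the matching rule are made up for illustration; it only shows the general idea of matching on initial, surname, and zip code.

```python
from collections import defaultdict

# Hypothetical "anonymized" checkout records: no full name, but initial, surname and zip.
checkout_records = [
    {"initial": "J", "surname": "Doe", "zip": "12345", "purchase": "allergy medication"},
    {"initial": "A", "surname": "Roe", "zip": "54321", "purchase": "garden tools"},
]

# Hypothetical third-party directory with full names and addresses.
third_party_directory = [
    {"full_name": "Jane Doe", "zip": "12345", "address": "1 Example St"},
    {"full_name": "John Poe", "zip": "12345", "address": "2 Example St"},
    {"full_name": "Alex Roe", "zip": "54321", "address": "3 Example St"},
    {"full_name": "Anna Roe", "zip": "54321", "address": "4 Example St"},
]


def link_records(records, directory):
    """Re-identify a record when exactly one directory entry shares its quasi-identifiers."""
    index = defaultdict(list)
    for person in directory:
        first, _, last = person["full_name"].partition(" ")
        index[(first[0], last, person["zip"])].append(person)

    linked = []
    for record in records:
        key = (record["initial"], record["surname"], record["zip"])
        candidates = index.get(key, [])
        if len(candidates) == 1:  # unique match: the record is no longer anonymous
            linked.append((candidates[0]["full_name"], record["purchase"]))
    return linked


linked = link_records(checkout_records, third_party_directory)
print(linked)
```

In this toy data the first record matches exactly one person and is re-identified, while the second matches two people and stays ambiguous.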
That is why any unique id is a big “no-no”.
From the same paragraph:
(It is an -or- clause, any of those are PII).
That is why, when working with actual use data, the first pieces of information to be scrubbed are the ids. It could be an email, it could be a name, it could be a random GUID row key. It does not matter. It also does not help much to have a unique id for each request or session, as long as those ids can be mapped back to a single one in an offline process.
(Of course many other fields are scrubbed. I would recommend following the fascinating discussions on this research topic).
So, no, it is not even possible to store an external id in a database that is not vetted for privacy and security.
And a CDN is actually worse. In that case the system not only copies the data to another locale, it also hands custody to an external entity (like Cloudflare or Akamai). So if you cannot convince your privacy lawyers to let you store the data yourself, how will you convince them to let a third party store the same data?
Again, there could be ways around this with minimal damage. But that will take time to research, and I don’t know any existing system available today that can achieve that for a real Billion user system.
sukru,
You should also know that this is what I do for a living too, but you are right to point out that it’s not a good argument.
I recall the AOL snafu too. The output buckets didn’t have enough entropy to prevent identifying individuals. To be clear though they started with personal information and extracted data from it, this is quite a bit different from the scenario we’re talking about here where facebook USA wouldn’t even have personally identifying information in the first place.
I mean, sure they could easily head over to facebook europe and get all the information from themselves, but for the purposes of the law as long as the data stays in Europe I believe they’d be in compliance.
It’s not an apples to apples comparison. Facebook US wouldn’t even have the initial, surname, or zip code in the first place.
That’s exactly what I was talking about, it stops being a random number once there’s a database connecting it to personally identifying information. However in our scenario this database could be stored in europe, which probably means it would be compliant.
It does in fact matter though. There is a huge difference between distributing the names/emails/ssn/address/etc outright versus a surrogate key. 133688 – Here’s a surrogate key from one of my databases, it links to personally identifying information on the system it came from, but without access to the database you don’t know what or who it refers to. While obviously it would be trivial for facebook US to get access to facebook EU database, I think that for the purposes of this regulation it would be sufficient that the data remains stored in europe.
Having a surrogate key isn’t useful without a mapping. In this specific scenario facebook US would explicitly not have this database. Facebook US knows that user #8383 upvoted a post, but unless it chooses to violate EU directives it won’t know who that is. Obviously facebook EU would in fact know, but that’s allowed under the EU directives.
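A minimal sketch of the split being argued for here, with hypothetical class names and data. It only illustrates the mechanics; whether a bare surrogate key counts as personal data, and whether this would satisfy regulators, is exactly what is disputed in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class EUProfileStore:
    """Hypothetical store that stays in the EU: the only place keys map to identities."""
    _profiles: dict = field(default_factory=dict)

    def register(self, surrogate_key, name):
        self._profiles[surrogate_key] = name

    def resolve(self, surrogate_key):
        return self._profiles[surrogate_key]


@dataclass
class USVoteStore:
    """Hypothetical store in the US: holds post ids and surrogate keys, nothing else."""
    _votes: dict = field(default_factory=dict)

    def upvote(self, post_id, surrogate_key):
        self._votes.setdefault(post_id, set()).add(surrogate_key)

    def voters(self, post_id):
        return sorted(self._votes.get(post_id, set()))


eu = EUProfileStore()
us = USVoteStore()

eu.register(8383, "Example User")
us.upvote("post-1", 8383)

# The US side can count and list votes by key...
voter_keys = us.voters("post-1")
# ...but turning a key into a name requires a lookup against the EU store.
names = [eu.resolve(key) for key in voter_keys]
print(voter_keys, names)
```

The design choice in this sketch is that name resolution happens only against the EU store, so the US side never needs a local copy of the mapping.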
As a side note: why would you assume it has to be a 3rd party CDN? Surely Facebook runs its own CDN, no?
We went from “unworkable” to “there could be ways around this with minimal damage”. I’m happy with that.
sukru,
There is actually, peer to peer and federated technology can already solve the problem in various ways. You can even encrypt data before a provider sees it and make the clients responsible for sharing decryption keys. There’s no shortage of privacy increasing innovation, however the real problem is that advertising and social media company business models are wholly incompatible with privacy; I would go as far as to say that privacy is an existential threat to them. This is the reason I don’t see google or facebook ever improving privacy voluntarily.
And as for governments forcing them to…yeah, they’re usually not fans of strong cryptographic privacy either.
I once firmly believed that the future would be peer-to-peer. However, people have chosen centralized systems over federated ones.
Even at the campus, the university will no longer give you a dedicated IP and unrestricted access. They will have you behind a heavily restricted NAT. And even then there is not much you can do with a mobile device that will not be able to run 24/7.
Gone are the times when we could telnet to the SMTP port, just say HELO/FROM, and type our emails on the console. Now every email needs to go through TLS/DNSSEC or it would most likely be marked as spam. In fact, even getting IMAP access to your own inbox is no longer easy.
I am realistic enough to accept that when more than half of the users have no idea how the technical details work, security and convenience win out. But I don’t like it.
sukru,
I agree, this shift has happened. I believe a big reason why it happened is that companies stopped investing in P2P and centralized was more profitable. MPAA/RIAA lawsuits didn’t help at all. The future will remain centralized, not necessarily because it’s better, but because the companies at the top don’t want to give up their highly centralized control.
I’m fairly certain that if P2P hadn’t been replaced by centralized streaming services, everyone would be on IPv6 today because they would have demanded P2P. The proliferation of centralized services has promoted widespread tolerance of IPv4 with NAT 🙁
IMHO SMTP has grown unwieldy. I’d like to see it replaced by a far cleaner and more consistent standard. However I’d be very concerned that the giants might just ditch the federated elements altogether and we’d end up with centralized service providers becoming obligatory.
Why does Europe get all the perks? How do we get this to happen in the US? Haha.
This other article came up the sidebar:
https://www.vice.com/en_ca/article/k7e599/zoom-ios-app-sends-data-to-facebook-even-if-you-dont-have-a-facebook-account
This stuff bugs me because it removes consumers from the choice of where their data goes. Whether it’s Facebook, Google, or Microsoft, I think they’re all making it more difficult to protest their data practices.
Sometimes it’s done in secret, like bank transactions sold to Google without user consent. Other times, even when you know about it, you lack control anyway, like with Google’s damn captchas. Sites including osnews are hosted at Google. Just recently one of the companies I work for implemented a Watchguard “cloud VPN”, which it turns out is implemented with a hard dependency on Google services. Since I don’t have a Google account on my Lineage phone, I can no longer log in at work. Prior to this I’ve never been forced to have a Google account on my personal device. Seriously, I made a stink about it, but nobody cares and I have to borrow other people’s phones now to log into my own damn account; this is BS. They were sold on the “cloud VPN” and nobody cares that it’s actually more fragile and way more vendor-locked than before. I’m just tired of it.
[quote]
…is this supposed to be a threat? Because it sounds more like a gift to me. Please, Zuck, go home! I think we here in Europe will do just fine without your criminal enterprise.
[/quote]
I guess you don’t have an Oculus Quest, which will need a FB account.
Well, I just avoid Oculus because of that 🙂
Yes, please!!!! 🙂
And what about the European advertising business? With the current economic crisis, ad prices crashed hard. If there were no more Facebook ads, the European advertising market could crash even harder. That means that companies will have difficulty selling their junk/stuff to people, and many people will lose their jobs.
As much as I hate Facebook, this is a company so entrenched in the advertising business that it will be damn hard to get rid of it.
On my Facebook I only get ads from Chinese companies. 🙂
Facebook pulling out would put 10,000 direct jobs at risk, plus all the supporting jobs (coffee/sandwich shops and similar). Then there is the multi-billion investment that they have made in the multiple countries they operate in. Remember, there are real people who will lose their livelihood during a very difficult time.
Adurbe,
The thing about that is that many of these large companies actually displaced a lot of their competitors and consolidated a lot of the jobs in the first place. When these huge companies increase their market-share, the job growth is only going to be logarithmic, which is why huge companies tend to increase jobs on a microeconomic scale for themselves, but actually take away jobs on a macroeconomic scale.
Admittedly, a sudden change could create a lot of hardships, but in the medium and long term there would undoubtedly be new competitors that would be willing and able to take the reins from Facebook.
Your description only applies in a competitive market. If you take a social media monopoly such as FB, then there is no competitive market to generate the economic conditions you describe. FB only works because everyone is on it.
Adurbe,
Huge multinational corps like facebook have grown so big that there’s nothing left in the pie for anyone else. However If facebook disappeared one day, that leaves a whole lot of pie on the table and I’m sure a lot of companies would be willing and able to fill the void to get a piece of it. Incumbent companies, by virtue of monopolizing the market for themselves, tend to make the markets non-viable for competitors. I honestly believe there would be a lot more competition from new startups if facebook left.
If Facebook goes bye-bye, people will still drink coffee and eat sandwiches. Nobody’s career is going to come crashing down unless they choose to stop working. People will just find somewhere else to work, as is always the case. In the event that Facebook leaving could actually kill a town somewhere by significantly reducing the local economy, then the people can either reboot their town or they can move somewhere the money well hasn’t dried up. One way or another, life will go on.
Great news! Maybe the EU should consider donating to the development of Mastodon, Diaspora and similar open source platforms.
Matrix.org comes to mind as well.
Which is federated.
Development is partly funded by France who is trying to not depend on US companies.
So I guess Facebook wants to pay more taxes? By evading paying taxes in the US? Because isn’t that a big reason why they had offices in Ireland in the first place?
Too bad for them, I guess.
Well, that would be a shame wouldn’t it! /s
I hope they take Twitter and TikTok with them.
Oh, I wish they would go sooner rather than later. My parents believe all that sh** posted on Facebook, and it’s really hard for me to teach them to cross-check everything. I want them to go back to consuming serious media. And for secure messaging there are several alternatives to WhatsApp (if they pull the plug on that too).
“In the worst-case scenario, this could mean that a small tech start-up in Germany would no longer be able to use a US-based cloud provider.”
This is already the case in some countries. For Russia we had to find a DC provider in Russia, because it’s illegal to process and store user data of its citizens out of the country.
Not sure why so many are applauding this. Luckily no one is forced to use Facebook. So don’t use it if you don’t like it, no? I see internet business in the EU generally at risk with all those restrictions and laws. In my opinion the danger is that it gets more and more over-regulated; just think of the upload filters, for example.
aurora
I posted this already, but it’s relevant here in response to your point.
https://www.vice.com/en_ca/article/k7e599/zoom-ios-app-sends-data-to-facebook-even-if-you-dont-have-a-facebook-account
Because of 3rd party APIs, many people end up having facebook/google/etc track them regardless of having an account with them. And it’s usually done without the awareness or consent of users.
For example, my daughter uses Duolingo, but in doing so she gets tracked by both Facebook and Google.
Does she have a facebook or google account? No.
Did she (or I) consent to facebook or google tracking her? No.
Was she notified that facebook or google would be tracking her? No.
Was she given the opportunity to opt out of their tracking? No.
Is the tracking critical to the website’s functionality? No.
Is this data going to be used to create advertising profiles? Yes.
This is reality whether we like it or not. Some people don’t give this any thought and maybe they’re ok with it. But still these tracking companies have vast hidden tracking networks profiling users whether or not the users explicitly signed up for the company’s services.
I’ll concede there’s a risk of abusive and/or over regulation, however it comes on the back of many years of under regulation where many of our top companies are just abusive and have accountability to no one but themselves. Consumer rights have been ignored for too long. While I’m not in europe, I’m glad that at least someone is standing up for consumer interests.
I partly agree regarding the APIs. On the other hand, shouldn’t you blame the companies implementing the APIs in the first place? Regarding over-regulation, I am not sure that the EU really is standing up for consumer interests. There are way too many things going on that run contrary to this, for example upload filters.
aurora,
Both are involved and deserve blame.
I understand why many people hate regulation, but without it there’s no incentive for corporations to respect things like privacy. I do not generally stay on top of european regulations, I just know that in the US the lack of regulation has shifted all the balance of power to corporations. Combine this with mass consolidation and we’re left with not much choice in many markets.
Sounds reasonable. I am curious whether this is just an empty threat from Facebook, whether they can find some agreement, or whether they will indeed go this route. I still don’t think that would be that positive for us EU people, because which platform will be next to decide the same way, and what are the alternatives?
“This is reality whether we like it or not. Some people don’t give this any thought and maybe they’re ok with it. ”
Most are not aware that this is done/possible.
Including a Facebook share button on a site is enough.
A lot of people who are including the share button on their blog or whatever don’t even know this is possible, let alone the general public.
@Lennie yes, but not knowing this is a thing of the past in my opinion. Since the new privacy law, people should be sensitized to things like Facebook share buttons. It was discussed quite extensively in the media, and unless you have been living under a rock, you should have read at least one article about all that.
Also, not directed at you specifically: people should know that nothing in the world is free. So what’s Facebook’s business model if not personal data?