Executive Summary
- In June 2021, a hacker on an underground forum claimed to be in possession of the data for 700 million LinkedIn users, including name, gender, email address, job title, industry, and more.
- The data was accessed using a technique called “data scraping” which, while against LinkedIn’s rules, wasn’t technically a hack in the traditional sense.
- While nobody purchased the scraped data, it was eventually leaked online for anyone to access, emphasising the need for businesses to understand the risks of data scraping.
Introduction
Did you know that LinkedIn now has over 700 million members across the world?
That figure of 700 million is interesting not simply because it’s a staggering number of individuals engaging with what was once thought of as the ‘Facebook for professionals’, but also because it’s the same number of users whose LinkedIn data was compromised during a cyber-attack in mid-2021.
This “attack” resulted in the collection of personal details about these particular LinkedIn users, which were subsequently offered for sale before being intentionally leaked. The attack used a particular method known as “data scraping”, but whether it’s truly malicious or not may be up for debate.
Here’s what UK businesses need to know about this leak, data scraping, and what you can do to prevent it happening to you and your team.
What was the great LinkedIn data scrape of 2021?
On June 22nd 2021, a user going by the name TomLiner on RaidForums – a site known for sharing breached information – claimed to be in possession of 700 million records containing the personal details of LinkedIn users.
This collection of records included the following data:
- Full names
- LinkedIn IDs
- Date of birth
- Workplace address
- Facebook and Twitter IDs
- Job titles
- GPS data
This isn’t an exhaustive list, but you get the idea.
The user TomLiner initially attempted to sell these 700 million records, but it seems nobody took the bait, because he (or his group) then leaked all 187GB of the data later in September for anyone to download.
While this sounds like a tough situation, you might notice something about the data that was leaked – it’s all technically publicly facing. And, as LinkedIn themselves attested, this wasn’t actually a breach of LinkedIn’s systems.
What is data scraping? Is it a cyber-attack?
Data scraping is a methodology used to extract large amounts of data from websites in a short space of time. It uses automated programs to run through all of the page content from top to bottom – the ‘scrape’ – and then saves it to another location. Over time, this scraping process can create a large repository of almost any type of text content from a website – which is precisely what happened with the LinkedIn scrape.
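To make that a little more concrete, here’s a minimal sketch of the idea in Python. It isn’t the tool used in the LinkedIn scrape (nothing about that has been published here), and the page markup, class names, and field names are all made up for illustration. It simply shows the two steps described above: run through the page content, then save what you find somewhere else.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for one publicly visible profile page.
# A real scraper would download thousands of pages like this automatically.
SAMPLE_PAGE = """
<div class="profile">
  <h1 class="name">Jane Example</h1>
  <p class="job-title">IT Manager</p>
  <p class="industry">Information Technology</p>
</div>
"""


class ProfileScraper(HTMLParser):
    """Collects the text of any element whose class is listed in FIELDS."""

    FIELDS = {"name", "job-title", "industry"}  # assumed class names

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current = None  # the field we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class in self.FIELDS:
            self._current = css_class

    def handle_data(self, data):
        # Store the first non-empty text found inside a field of interest.
        if self._current and data.strip():
            self.record[self._current] = data.strip()
            self._current = None


def scrape(page_html):
    """Run through one page from top to bottom and return what was found."""
    parser = ProfileScraper()
    parser.feed(page_html)
    return parser.record


# The 'repository': one record per page scraped.
repository = [scrape(SAMPLE_PAGE)]
print(repository)
```

Repeat that loop across millions of public pages and you can see how a very large dataset builds up without anyone breaking into anything.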
Is data scraping legal? The surprising answer is generally yes, because scraping targets publicly available data; it simply does it at a scale that no single human being could match. Having said that, there may be specific scenarios where data scraping does break the law. For example, if an attacker were to compile data containing registered trademarks and then try to sell it, there could be a case for the data owner.
Is data scraping a cyber-attack? Well… not strictly.
It doesn’t require any level of ‘hacking’, whether that be brute force attacks or other forms of data breaching. All it’s really doing is taking data which is already online and compiling it. What people do with that data could constitute a breach in some scenarios, but the act of scraping isn’t against the law and doesn’t fit the definition of a cyber-attack.
All of that said, data scraping is almost always against the terms of service of the websites being scraped. Such policies aren’t generally legally enforceable, though, which limits a website’s defensive options to IP bans, user account bans, and other mitigation measures.
What can businesses and individuals do to protect against data scrapes?
There are a few different ways that your IT support team and your employees can help your organisation prevent data scrapes like that seen at LinkedIn.
Without legal recourse – or at least confronted with many legal grey areas – businesses must do what they can to prevent their publicly facing data being scraped and used for nefarious purposes.
Here are a few key ideas to get you started:
- Monitor your website traffic for any large jumps in activity. Data scraping naturally leaves a big footprint on your website, because the data has to be sent from your server to whoever is doing the scraping. If you or your IT support team can catch this activity before the scrape is complete, you may limit the potential risk.
- Use authenticated content gating. If you have data that you think may be a target of potential data scrapes – such as publicly exposed customer or user information – you can prevent a good percentage of scrape attempts by forcing users to sign up (ideally with a double opt-in) before they can see the content.
- Limit the activity users can carry out. Many scrapers rely on the ability to constantly search or request data from a website in order to carry out their task. By limiting the number of requests a single user can make in a certain space of time, you’ll effectively slow down – and perhaps even halt – the progress of a scrape attempt.
- Use a CAPTCHA for certain users. Scrapers are often hosted on certain web or cloud services, so one option (probably for your website developers to implement) is to show a CAPTCHA to prevent access by automated programs which are coming from these sources.
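For the developers in the room, the request-limiting idea above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the class name, the limit of 3 requests per 60 seconds, and the example IP address are all hypothetical, and a production site would more likely rely on its web server, CDN, or framework to do this job.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allows at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id, now=None):
        # 'now' can be passed in to make the example repeatable;
        # otherwise a monotonic clock is used.
        now = time.monotonic() if now is None else now
        hits = self._hits[client_id]

        # Forget requests that have dropped out of the time window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()

        if len(hits) >= self.max_requests:
            return False  # over the limit: block, slow down, or challenge

        hits.append(now)
        return True


# Hypothetical limit: 3 requests per minute for each client.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)

# One client making requests at 0, 1, 2, 3 and 61 seconds.
results = [limiter.allow("203.0.113.7", now=t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```

The fourth request is refused because it’s the client’s fourth within a minute, while the fifth goes through once the earliest requests have aged out of the window. The same count of recent requests per client could also feed the traffic monitoring mentioned in the first point.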
Of course, in addition to all of these, one of the most important things any business can do is to review the content they have on their site and whether it represents a scrape risk. That’s probably a good place to start if this is the first time you’re hearing about data scraping.
Keep your business cybersafe with Get Support
If anything we’ve covered here has been helpful, or even surprising, it’s worth knowing that this sort of best practice cyber security due diligence is all part of the service at Get Support.
If you’d like to know more about our IT support packages, including comprehensive cyber security consultation, our team is waiting to hear from you.
You can reach us now on the phone by calling the IT support experts on 01865 594 000 or simply entering your details in the form below.