What is Elasticsearch and why is it involved in so many data leaks?


Elasticsearch is never far from the news headlines, and usually for the wrong reasons. Seemingly every week brings a new story about an Elasticsearch server that has been breached, often exposing troves of data. But why do so many breaches originate from Elasticsearch servers, and how can businesses that rely on this technology use it to its fullest extent while still preventing a data leak?
To answer these questions, one must first understand what Elasticsearch is. Elasticsearch is an open source search and analytics engine, as well as a data store, developed by Elastic.
Regardless of whether an organization has a thousand or a billion discrete pieces of information, Elasticsearch gives it the capability to search through huge amounts of data and run calculations in the blink of an eye. Elasticsearch is available as a cloud-based service, but businesses can also run it locally or in tandem with another cloud offering.
Organizations then use the platform to store their information in repositories known as indices (loosely referred to as buckets), which can hold emails, spreadsheets, social media posts, files: basically any raw data in the form of text, numbers, or geospatial data. As convenient as this sounds, it can be disastrous when large amounts of data are left unprotected and exposed online. Unfortunately for Elastic, this has resulted in many high-profile breaches involving well-known brands from a variety of industries.
During 2020 alone, cosmetics giant Avon had 19 million records leaked from an Elasticsearch database. Another misconfigured server, involving Family Tree Maker, an online genealogy service, exposed over 25GB of sensitive data. The same happened with sports giant Decathlon, which saw 123 million records leaked. Then, more than five billion records were exposed after another Elasticsearch database was left unprotected. Surprisingly, it contained a massive database of previously breached user information from 2012 to 2019.
From what has been disclosed so far, it is clear that those who choose to use cloud-based databases must also perform the necessary due diligence to configure and secure every corner of the system. Just as clearly, this necessity is often overlooked or simply ignored. One security researcher even set out to discover how long it would take hackers to locate, attack, and exploit an unprotected Elasticsearch server that was purposely left exposed online. Eight hours was all it took.
Digital transformation has definitely changed the mindset of the modern business, with cloud seen as a novel technology that must be adopted. While cloud technologies certainly have their benefits, improper use of them has very negative consequences. Failing or refusing to understand the security ramifications of this technology can have a dangerous impact on business.
As such, it is important to realize that in the case of Elasticsearch, just because a product is freely available and highly scalable doesn’t mean you can skip the basic security recommendations and configurations. Furthermore, given that data is widely hailed as the new gold, demand for monetizing up-to-date data has never been greater. Evidently, for some organizations, data privacy and security have played second fiddle to profit as they do their utmost to capitalize on the data gold rush.
Is there only one way for a server to be breached? Not really. In truth, the contents of a server can leak in a variety of ways: a stolen password, hackers infiltrating systems, or even an insider acting from within the protected environment itself. The most common, however, occurs when a database is left online without any security (even lacking a password), leaving it open for anyone to access the data. Where this is the case, there is clearly a poor understanding of the Elasticsearch security features and of what is expected from organizations when protecting sensitive customer data.
This could derive from the common misconception that responsibility for security automatically transfers to the cloud service provider. That is a false assumption, and it often results in misconfigured or under-protected servers. Cloud security is a shared responsibility between the organization’s security team and the cloud service provider; at a minimum, however, the organization itself is responsible for performing the necessary due diligence to configure and secure every corner of the system properly and mitigate any potential risks.
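The most common scenario described above, a server that answers requests without asking for credentials, is straightforward to test for on infrastructure you own. The following is a minimal sketch in Python, not an official Elastic tool: the function names `classify_exposure` and `probe` are hypothetical, and it assumes the server is reached over Elasticsearch's HTTP interface, where a 200 response to an unauthenticated request suggests that no login is being enforced.

```python
import urllib.error
import urllib.request


def classify_exposure(status_code: int) -> str:
    """Interpret the HTTP status returned to a request sent without credentials."""
    if status_code == 200:
        return "exposed"      # the server answered without asking who we are
    if status_code in (401, 403):
        return "protected"    # the server demanded authentication or authorization
    return "unknown"          # anything else needs a human to look at it


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to base_url and classify the response."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return classify_exposure(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_exposure(err.code)
    except OSError:
        # Covers URLError, refused connections, and timeouts.
        return "unreachable"


# Example (assumed address; 9200 is Elasticsearch's default HTTP port):
# print(probe("http://localhost:9200"))
```

Run a check like this only against systems you are authorized to test. A result of "exposed" is a signal that authentication and network restrictions need to be put in place before any sensitive data is stored there.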
To effectively avoid Elasticsearch (or similar) data breaches, a different mindset toward data security is required: one in which data is protected a) wherever it may exist and b) regardless of who is managing it on the organization’s behalf. This is why a data-centric security model is more appropriate, as it allows a company to secure data and still use it, in its protected form, for analytics and data sharing on cloud-based resources.
Standard encryption-based security is one way to do this, but encryption comes with the sometimes-complicated administrative overhead of managing keys. Also, weak or outdated encryption algorithms can be cracked. Tokenization, on the other hand, is a data-centric security method that replaces sensitive information with innocuous representational tokens. This means that, even if the data falls into the wrong hands, no clear meaning can be derived from the tokens. Sensitive information remains protected, leaving threat actors unable to capitalize on the breach and data theft.
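To make the idea concrete, here is a minimal sketch of vault-based tokenization in Python. It is an illustration under stated assumptions rather than a production design or any vendor's implementation: the `TokenVault` class, the `protect_record` helper, and the `tok_` prefix are hypothetical names, and the in-memory dictionaries stand in for what would need to be a separately secured token store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping sensitive values to random tokens."""

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        """Return the token for value, creating a new random one if needed."""
        existing = self._token_by_value.get(value)
        if existing is not None:
            return existing
        # Tokens are random, so they carry no information about the value.
        token = "tok_" + secrets.token_hex(16)
        while token in self._value_by_token:  # regenerate on an unlikely collision
            token = "tok_" + secrets.token_hex(16)
        self._token_by_value[value] = token
        self._value_by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        """Recover the original value; only possible with access to the vault."""
        return self._value_by_token[token]


def protect_record(record: dict, sensitive_fields: set, vault: TokenVault) -> dict:
    """Return a copy of record with the named fields replaced by tokens."""
    return {
        key: vault.tokenize(str(val)) if key in sensitive_fields else val
        for key, val in record.items()
    }


vault = TokenVault()
customer = {"name": "Jane Doe", "email": "jane@example.com", "country": "FR"}
safe = protect_record(customer, {"name", "email"}, vault)
# `safe` is what would be stored in the search cluster; the vault lives elsewhere.
```

In this sketch, non-sensitive fields such as `country` stay usable for analytics, while anyone who obtains the stored record without the vault sees only random tokens in place of the name and email address.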
With GDPR and the new wave of similar data privacy and security laws, consumers are more aware of what to expect when they hand over their sensitive information to vendors and service providers, making data protection more important than ever before. Had techniques like tokenization been deployed to mask the information in many of these Elasticsearch server leaks, that data would have been indecipherable to criminal threat actors. The information itself would not have been compromised, and the organization at fault would have been better placed to remain compliant and avoid liability-based repercussions.
This is a lesson to all of us in the business of working with data: if anyone is actually daydreaming that their data is safe while “hidden in plain sight” on an “anonymous” cloud resource, the string of lapses around Elasticsearch and other cloud services should provide the necessary wake-up call to act now. Nobody wants to deal with the fallout when a real alarm bell goes off!