Cyberattack: how do hackers get into your systems and exploit data?

Quentin

Each year, IBM Security conducts a study of the initial vectors that allow intruders to penetrate a company’s internal network from the outside. Over the past three years, the top two vectors have not changed: 30% of attacks begin with data leaks or phishing.
The purpose of this first article is to present data leaks comprehensively, covering their acquisition through info-stealers or platform breaches and their dissemination in private groups and marketplaces. Once this foundational knowledge is established, a second article will explore the potential uses of leaked data, particularly focusing on the concept of profiling.
Phishing
Phishing is the most effective technique in 2025. Attackers manipulate victims into executing malicious software or performing actions that compromise a company’s security without their knowledge.
There are two classic techniques:
- Social engineering: Attackers deceive victims into providing their usernames and passwords, typically through fake websites designed to resemble legitimate sites the victims are familiar with.
- Code execution: Attackers trick victims into running malicious code. This code generally aims to steal data or gain remote access to the victim’s machine.
The second technique has become so prevalent that hackers have developed malicious software known as infostealers to automate data harvesting.
Infostealer
An infostealer is an executable that, once launched, steals everything it finds on a computer, takes a screenshot as proof, and self-deletes. There are several major families: LummaC2, Raccoon, Redline, Stealc, META, Vidar. Each family steals different information or relies on different platforms for data exfiltration. The most unusual of these, chosen to avoid detection, include Steam, Telegram, and Google Calendar.
The content of an infostealer
To illustrate the threat posed by infostealers, consider the following example. One of our clients downloaded and ran a program named “KMS Pico”, which purportedly allows users to activate Windows illegally, free of charge. Unbeknownst to the user, this program was actually an infostealer designed to compromise their system.

Screenshot of the victim, captured by the infostealer before it self-deleted
Infostealers mostly generate logs in plain text format. These logs typically include:
- Browser Identifiers: Information such as saved usernames, passwords, and autofill data stored within web browsers.
- Memory-Stored Identifiers: Data related to the workstation, including unique identifiers, Wi-Fi and VPN configurations, and other system-specific information.
- Cookies: Stored cookies from browsers, which can contain session information and tracking data.
- Cryptocurrency Wallets: Details of cryptocurrency wallets, including addresses and potentially sensitive transaction information for currencies like Bitcoin and Ethereum (ETH).
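As a rough illustration of how such plain-text logs can be processed, the sketch below parses a password file into structured records. The `URL:` / `Username:` / `Password:` layout, the `Credential` type, and the sample data are all assumptions made for this example; actual layouts differ from one infostealer family to another.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Credential:
    url: str
    username: str
    password: str


def parse_password_log(text: str) -> List[Credential]:
    """Parse blocks of 'Key: value' lines separated by blank lines.

    The block layout is hypothetical, not the format of a specific stealer.
    """
    credentials = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            # Split on the first colon only, so URLs keep their own colons.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip().lower()] = value.strip()
        # Keep only blocks that contain all three expected fields.
        if {"url", "username", "password"} <= fields.keys():
            credentials.append(
                Credential(fields["url"], fields["username"], fields["password"])
            )
    return credentials


# Invented sample data, for illustration only.
SAMPLE_LOG = """\
URL: https://intranet.example.com/login
Username: alice
Password: S3cret-Example!

URL: https://mail.example.com
Username: alice@example.com
Password: An0ther-Example
"""

if __name__ == "__main__":
    for cred in parse_password_log(SAMPLE_LOG):
        print(cred.url, cred.username)
```

Because the logs are plain text, a few lines of parsing are enough to turn a raw dump into a searchable list of credentials, which is part of what makes them easy to exploit at scale.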
In another article, we explained how we traced hackers by reverse-engineering their infostealer.
Exploitation of an infostealer
Infostealer logs are considered highly valuable for two main reasons:
- Quality
The extracted passwords are often robust, typically comprising at least nine characters, and many remain valid because the victims are usually unaware that their workstations have been compromised.
- Session cookies
When a user logs into a website, session information (including proof of authentication) is stored in a cookie. An attacker who obtains these cookies can log in without needing a password and bypass multi-factor authentication (MFA). Access to session cookies for services like Gmail or other messaging platforms enables an attacker to reset numerous accounts and retrieve even more sensitive data.
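To make the MFA bypass concrete, here is a minimal, self-contained sketch of a toy session mechanism. It is not the implementation of any real service: `USERS`, `SESSIONS`, `login`, and `handle_request` are hypothetical names, and the credentials are demo values. The point it illustrates is that the password and MFA code are checked only at login, while later requests are accepted on the strength of the cookie alone.

```python
import secrets
from typing import Dict, Optional

# Demo data only; a real service would never store plaintext passwords.
USERS = {"alice": {"password": "correct horse", "mfa_code": "123456"}}
SESSIONS: Dict[str, str] = {}  # session token -> username


def login(username: str, password: str, mfa_code: str) -> Optional[str]:
    """Check password and MFA code, then issue a session token."""
    user = USERS.get(username)
    if user is None or not secrets.compare_digest(user["password"], password):
        return None
    if not secrets.compare_digest(user["mfa_code"], mfa_code):
        return None
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = username
    return token


def handle_request(cookies: Dict[str, str]) -> str:
    """Serve a request: only the session cookie is checked, not the password or MFA."""
    username = SESSIONS.get(cookies.get("session", ""))
    if username is None:
        return "401 Unauthorized"
    return f"200 OK: inbox of {username}"


if __name__ == "__main__":
    # The legitimate user logs in with password and MFA code.
    victim_cookie = login("alice", "correct horse", "123456")
    # Anyone presenting the same cookie value is treated as the user.
    print(handle_request({"session": victim_cookie}))
```

In this toy model, whoever holds the token is authenticated until the session expires or is revoked, which is why stolen session cookies are so valuable.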
References in the field of infostealers
Hudson Rock is a well-known reference in the field of infostealers. The company purchases infostealer logs from hackers (i.e., cybercriminals) and makes them available to its clients. Individuals can check for free if their email has been compromised through an infostealer by visiting osint.rocks.



However, it’s important to note that the free version has its limitations:
- Incomplete Information: The free service does not provide all available data.
- Accuracy Concerns: The information may not always be correct or complete.
- Recency: Hudson Rock only includes relatively recent infostealers and does not acquire older ones.
Two lesser-known alternatives to Hudson Rock are Hacked List and White Intel.
Data leaks
In a professional environment, it’s common for users to maintain accounts on various external platforms such as LinkedIn, Canva, ChatGPT, Google, Trello, Dropbox, Notion, GitHub, and Zoom. Each account is defined by two key elements:
- Identifier: This can be an email address or a username.
- Password (or Equivalent): The secret credential that grants access to the account.
When hackers successfully compromise one of these platforms, they can obtain all user identifiers and passwords, typically stored in an unreadable format known as a “hash.” These stolen credentials are then sold on the dark web or potentially distributed for free, amplifying the risk to affected individuals and organizations.
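The sketch below illustrates what a “hash” is and why a leaked hash is still a risk. It assumes, purely for illustration, that the platform stores salted PBKDF2-SHA256 hashes; real platforms use a variety of schemes, some far weaker. The function names, the iteration count, and the sample password are hypothetical.

```python
import hashlib
import hmac
import os
from typing import List, Optional

ITERATIONS = 100_000  # illustrative value chosen for this sketch


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a one-way hash from the password and a per-user salt."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def crack(stored_hash: bytes, salt: bytes, wordlist: List[str]) -> Optional[str]:
    """Dictionary attack: hash each candidate and compare it to the leaked hash."""
    for candidate in wordlist:
        if hmac.compare_digest(hash_password(candidate, salt), stored_hash):
            return candidate
    return None


if __name__ == "__main__":
    salt = os.urandom(16)
    leaked_hash = hash_password("Summer2024!", salt)
    # The hash cannot be reversed directly, but a guessable password
    # is recovered as soon as it appears in the attacker's wordlist.
    print(crack(leaked_hash, salt, ["password", "123456", "Summer2024!"]))
```

A hash cannot be reversed directly, but an attacker who holds both the hash and the salt can test candidate passwords offline at their own pace. Common or predictable passwords therefore tend to fall quickly, while long random ones are far harder to recover.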
Major data leaks
Below is a list of significant data leaks involving French and other well-known companies:
- DeepSeek Leak (2025): A data breach involving DeepSeek, an intelligent assistant similar to ChatGPT.
- Telephone Operator Leaks (2025): Compromises affecting major French telecom operators, including Free, SFR, Orange, and Bouygues.
- Hospital Data Breaches:
  - HPL Saint-Etienne (2025)
  - Mediboard (2024)
  - CHU d’Armentières (2024)
  - Dedalus (2021)
  - AP-HP (2021)
- Boulanger Leak (2024): A breach involving Boulanger, a prominent French IT equipment supplier.
- LDLC Leak (2024): Another significant compromise affecting LDLC, a major French IT equipment provider.
While the compromise of large organizations is relatively rare, such incidents are particularly damaging. When they do occur, attackers usually obtain a substantial volume of high-quality data that can be exploited in subsequent attacks, including credential stuffing, identity theft, and further infiltration into organizational networks.
References and tools
Affected individuals are typically notified about data breaches by the compromised organization, the CNIL (Commission Nationale de l’Informatique et des Libertés), or through well-known services such as Have I Been Pwned (HIBP) if they choose to check.
HIBP is a user-friendly platform that allows individuals to verify if their email addresses have been involved in any data breaches. Here’s how it works:
- Enter your email address into the search bar.
- Click the “Verify” button.
The platform will display a list of breaches that include your email address among the 900+ data breaches indexed by HIBP as of 2025.
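The same lookup can be automated. The sketch below prepares a request for the HIBP v3 `breachedaccount` endpoint which, to the best of our knowledge, requires a paid API key sent in the `hibp-api-key` header together with a `user-agent` header. The function name, the user-agent string, and the placeholder key are assumptions made for this example; check the current HIBP API documentation before relying on it.

```python
import urllib.parse
import urllib.request

HIBP_BREACHED_ACCOUNT_URL = "https://haveibeenpwned.com/api/v3/breachedaccount/"


def build_hibp_request(email: str, api_key: str) -> urllib.request.Request:
    """Build (without sending) a breach lookup request for one email address."""
    # URL-encode the address so that '@' and other characters are safe in the path.
    url = HIBP_BREACHED_ACCOUNT_URL + urllib.parse.quote(email, safe="")
    return urllib.request.Request(
        url,
        headers={
            "hibp-api-key": api_key,  # placeholder, supplied by the caller
            "user-agent": "leak-check-example",  # hypothetical client name
        },
    )


if __name__ == "__main__":
    request = build_hibp_request("user@example.com", "YOUR-API-KEY")
    print(request.full_url)
    # Sending is left out of this sketch. With a valid key it would be:
    #   with urllib.request.urlopen(request) as response:
    #       breaches = response.read()
```

When sent, the service is expected to answer with a JSON list of breaches, or with an HTTP 404 when the address appears in none, which `urllib` raises as an `HTTPError`.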

As the owner of a domain (e.g., “algosecure.fr”), you can subscribe to receive notifications if any email addresses associated with your domain are affected by a breach. This proactive measure allows domain owners to monitor potential security incidents and respond promptly.
Distribution of data leaks
Marketplaces
In general, the majority of data breaches listed on HIBP and its alternatives originate from online marketplaces and forums where hackers buy or distribute compromised data. Notable examples include:
- Breach Forums
- xss.is
- exploit.in
- nulled.to
These marketplaces represent only the tip of the iceberg. Many data leaks circulate for weeks, months, or even years within private groups before appearing on public platforms. Consequently, HIBP and marketplace-based alternatives, valuable as they are, capture only a portion of the data breaches in circulation, and a significant amount of compromised data remains confined to private forums and underground networks, making comprehensive detection and notification challenging.
Private groups
Hackers typically refrain from immediately announcing that they have compromised a site, as doing so can render their stolen data less valuable. The common workflow they follow includes:
- The attacker discovers a vulnerability within the target site.
- The identified vulnerability is exploited to gain unauthorized access to the site.
- Once access is secured, the attacker retrieves user data and other sensitive information from the compromised site.
- The stolen data is first shared or sold within private groups that are accessible only by invitation.
- After an initial period in private groups, the attacker may distribute all or part of the data on public marketplaces.
Data may remain in these private groups for extended periods, ranging from weeks to years, before being introduced to broader marketplaces.
Telegram is one of the most widely used networks for distributing stolen data. Groups on Telegram often post portions of their loot either for free or at a low cost to encourage buyers to purchase access to the complete dataset. Additionally, they offer subscription services for recurring buyers, providing continuous access to new stolen information.
Some examples of known Telegram channels involved in these activities include:
- Moon Cloud
- Daisy Cloud
- Omega Cloud


References and tools
Besides Have I Been Pwned, several platforms offer a marketplace-based service: DeHashed, Leak-Lookup, Leaked Domains, Leak peek, Snusbase, Aura, and a few other unmentioned platforms used within AlgoSecure.
Access to private groups generally requires going through specialized companies such as DarkOwl, SOCRadar, Enzoic, IntelX or Kela. Among these, Kela stands out as the most powerful product available. However, Kela is designed to cover only the past two years of data breaches. In our experience, older data leaks remain valuable because they may still contain active passwords, particularly within small businesses or organizations with limited security awareness. To cover both recent and older leaks, it may therefore be necessary to use Kela in conjunction with other services or methods that provide more extensive historical data.
Each of the platforms or solutions listed above is generally insufficient when used alone. This is because each provider infiltrates different hacker groups, covers various platforms such as Telegram and Discord, integrates with distinct marketplaces, and has different policies regarding purchases from hackers based on their ethics and current regulations.
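The value of combining providers can be sketched as follows: merge the findings returned by each source, deduplicate them, and count what each source contributes on its own. The provider names and findings below are invented placeholders, not results from our comparative study, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Finding = Tuple[str, str]  # (email, breach name)


def merge_findings(by_provider: Dict[str, List[Finding]]) -> Dict[Finding, Set[str]]:
    """Map each deduplicated finding to the set of providers that reported it."""
    merged: Dict[Finding, Set[str]] = defaultdict(set)
    for provider, findings in by_provider.items():
        for email, breach in findings:
            # Normalise the email so the same finding is not counted twice.
            merged[(email.lower(), breach)].add(provider)
    return dict(merged)


def unique_contributions(merged: Dict[Finding, Set[str]]) -> Dict[str, int]:
    """Count the findings that only a single provider reported."""
    counts: Dict[str, int] = defaultdict(int)
    for providers in merged.values():
        if len(providers) == 1:
            counts[next(iter(providers))] += 1
    return dict(counts)


if __name__ == "__main__":
    # Placeholder data for illustration only.
    results = {
        "provider_a": [("alice@example.com", "breach-1"), ("bob@example.com", "breach-2")],
        "provider_b": [("Alice@example.com", "breach-1"), ("carol@example.com", "breach-3")],
    }
    merged = merge_findings(results)
    print(len(merged), unique_contributions(merged))
```

A provider with many unique contributions covers sources the others miss, which is one simple way to compare providers against each other.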
At AlgoSecure, we conducted a comparative study of 15 providers and have selected three that best meet our requirements. We perform this evaluation annually to ensure we continue to partner with the most effective and reliable providers in the field.
Conclusion
Data leaks account for 30% of the initial vectors used by attackers, either directly or indirectly through phishing. This compromised data is typically exchanged within private groups, primarily on Telegram, or through marketplaces such as xss.is, which was recently seized by authorities. Often, these data leaks remain concealed in private groups for weeks, months, or even years before surfacing publicly.
For individuals seeking to stay informed about potential compromises, platforms like Have I Been Pwned and Hudson Rock offer valuable services to check if their personal information has been exposed in any breaches.
At AlgoSecure, we provide continuous monitoring of a company’s data leaks through our AlgoLightHouse solution. Additionally, all external penetration tests can be supplemented with a comprehensive data leak report upon client request, ensuring that organizations remain vigilant and informed about any potential vulnerabilities.
In our upcoming articles, we will delve deeper into the nature of data leaks, the challenges involved in extracting meaningful information from them, and strategies to leverage this data effectively. A particular focus will be placed on the concept of profiling, exploring how organizations can capitalize on data leaks to enhance their security postures and proactively mitigate risks.