Internet Archive breached again through stolen access tokens

The Internet Archive was breached again, this time on their Zendesk email support platform after repeated warnings that threat actors stole exposed GitLab authentication tokens.

Since last night, BleepingComputer has received numerous messages from people who received replies to their old Internet Archive removal requests, warning that the organization has been breached because it did not correctly rotate its stolen authentication tokens.

"It's dispiriting to see that even after being made aware of the breach weeks ago, IA has still not done the due diligence of rotating many of the API keys that were exposed in their gitlab secrets," reads an email from the threat actor.

"As demonstrated by this message, this includes a Zendesk token with perms to access 800K+ support tickets sent to [email protected] since 2018."

"Whether you were trying to ask a general question, or requesting the removal of your site from the Wayback Machine your data is now in the hands of some random guy. If not me, it'd be someone else."

Internet Archive Zendesk emails sent by the threat actor (Source: BleepingComputer)

The headers of these emails also pass all DKIM, DMARC, and SPF authentication checks, showing that they were sent by an authorized Zendesk server at 192.161.151.10.

Internet Archive Zendesk email headers (Source: BleepingComputer)
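
For readers who want to check this kind of header themselves, the following is a minimal Python sketch that reads the DKIM, SPF, and DMARC verdicts from a message's Authentication-Results header. The sample message, domains, and IP address in it are hypothetical placeholders, not the headers of the actual emails, and the helper names are made up for illustration. A real check should only trust an Authentication-Results header added by your own mail server.

```python
import re
from email import message_from_string

# Hypothetical sample message for illustration only; these values are
# placeholders and are NOT copied from the emails described in the article.
SAMPLE_MESSAGE = """\
Authentication-Results: mx.example.net;
 dkim=pass header.i=@example.org header.s=sel1;
 spf=pass (sender IP is 192.0.2.10) smtp.mailfrom=bounce@example.org;
 dmarc=pass (p=NONE) header.from=example.org
From: support@example.org
Subject: Re: removal request

Message body.
"""


def auth_results(raw_message: str) -> dict[str, str]:
    """Return the first verdict found for dkim, spf, and dmarc."""
    msg = message_from_string(raw_message)
    results: dict[str, str] = {}
    for header in msg.get_all("Authentication-Results", []):
        # The header may be folded across lines; the regex scans all of it.
        for method, verdict in re.findall(
            r"\b(dkim|spf|dmarc)=(\w+)", header, flags=re.IGNORECASE
        ):
            results.setdefault(method.lower(), verdict.lower())
    return results


def passes_all(raw_message: str) -> bool:
    """True only if DKIM, SPF, and DMARC are all reported as 'pass'."""
    results = auth_results(raw_message)
    return all(results.get(m) == "pass" for m in ("dkim", "spf", "dmarc"))


if __name__ == "__main__":
    print(auth_results(SAMPLE_MESSAGE))
    print(passes_all(SAMPLE_MESSAGE))
```

Passing all three checks only shows that the message came through infrastructure the sending domain has authorized. It says nothing about who was holding the credentials used to send it.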

These emails come after BleepingComputer repeatedly tried to warn the Internet Archive that their source code was stolen through a GitLab authentication token that was exposed online for almost two years.

Exposed GitLab authentication tokens

On October 9th, BleepingComputer reported that Internet Archive was hit by two different attacks at once last week: a data breach in which data for 33 million of the site's users was stolen, and a DDoS attack by a pro-Palestinian group named SN_BlackMeta.

While both attacks occurred over the same period, they were conducted by different threat actors. However, many outlets incorrectly reported that SN_BlackMeta was behind the breach rather than just the DDoS attacks.

JavaScript alert on Internet Archive warning about the breach (Source: BleepingComputer)

This misreporting frustrated the threat actor behind the actual data breach, who contacted BleepingComputer through an intermediary to claim credit for the attack and explain how they breached the Internet Archive.

The threat actor told BleepingComputer that the initial breach of Internet Archive started with them finding an exposed GitLab configuration file on one of the organization's development servers, services-hls.dev.archive.org.

BleepingComputer was able to confirm that this token has been exposed since at least December 2022, with it rotating multiple times since then.

Exposed Internet Archive GitLab authentication token (Source: BleepingComputer)

The threat actor says this GitLab configuration file contained an authentication token allowing them to download the Internet Archive source code.
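
As a rough illustration of how a token in a configuration file can be spotted before an outsider finds it, the sketch below scans Git-style config text for credentials embedded in a remote URL or for strings carrying GitLab's default `glpat-` personal access token prefix. The article does not describe the format of the exposed file, so the sample config, hostnames, token value, and function name are all hypothetical.

```python
import re

# URLs of the form scheme://user:secret@host/... carry a credential inline.
URL_WITH_CREDS = re.compile(r"https?://[^/\s:@]+:([^/\s@]+)@[^\s/]+")
# GitLab personal access tokens use the "glpat-" prefix by default.
GITLAB_PAT = re.compile(r"\bglpat-[0-9A-Za-z_\-]{20,}")

# Hypothetical config text; the host and token are made-up placeholders.
SAMPLE_CONFIG = """\
[remote "origin"]
    url = https://deploy:glpat-XXXXXXXXXXXXXXXXXXXX@gitlab.example.org/team/repo.git
    fetch = +refs/heads/*:refs/remotes/origin/*
[remote "mirror"]
    url = https://gitlab.example.org/team/repo.git
"""


def find_exposed_secrets(config_text: str) -> list[tuple[int, str]]:
    """Return (line number, reason) for each line that looks like it holds a secret."""
    findings = []
    for lineno, line in enumerate(config_text.splitlines(), start=1):
        if URL_WITH_CREDS.search(line):
            findings.append((lineno, "credentials embedded in URL"))
        elif GITLAB_PAT.search(line):
            findings.append((lineno, "GitLab-style token"))
    return findings


if __name__ == "__main__":
    for lineno, reason in find_exposed_secrets(SAMPLE_CONFIG):
        print(f"line {lineno}: {reason}")
```

A scan like this only finds the secret. Any token that has been readable on a public server still needs to be revoked and reissued.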

The hacker says that this source code contained additional credentials and authentication tokens, including the credentials to Internet Archive's database management system. This allowed the threat actor to download the organization's user database, download further source code, and modify the site.

The threat actor claimed to have stolen 7TB of data from the Internet Archive but would not share any samples as proof.

However, now we know that the stolen data also included the API access tokens for Internet Archive's Zendesk support system.

BleepingComputer attempted to contact the Internet Archive numerous times, as recently as Friday, offering to share what we knew about how the breach occurred and why it was done, but we never received a response.

Breached for cyber street cred

After the Internet Archive was breached, conspiracy theories abounded about why they were attacked.

Some said Israel did it, while others blamed the United States government or corporations in their ongoing battle with the Internet Archive over copyright infringement.

However, the Internet Archive was not breached for political or monetary reasons but simply because the threat actor could.

There is a large community of people who traffic in stolen data, whether they do it for money by extorting the victim, selling it to other threat actors, or simply because they are collectors of data breaches.

This data is often released for free to gain cyber street cred, increasing their reputation among other threat actors in this community, as they all compete for who has the most significant and most publicized attacks.

In the case of the Internet Archive, there was no money to be made by trying to extort the organization. However, because it is a well-known and extremely popular website, breaching it definitely boosted a person's reputation amongst this community.

While no one has publicly claimed this breach, BleepingComputer was told it was done while the threat actor was in a group chat with others, with many receiving some of the stolen data.

This database is now likely being traded amongst other people in the data breach community, and we will likely see it leaked for free in the future on hacking forums like Breached.

