A Yandex source code repository allegedly stolen by a former employee of the Russian technology company has been leaked as a torrent on a popular hacking forum.
Yesterday, the leaker posted a magnet link to what they claim is ‘Yandex git sources,’ 44.7 GB of files stolen from the company in July 2022. These code repositories allegedly contain all of the company’s source code apart from its anti-spam rules.
Software engineer Arseniy Shestakov analyzed the leaked Yandex Git repository and said it contains technical data and code about the following products:
- Yandex search engine and indexing bot
- Yandex Maps
- Alice (AI assistant)
- Yandex Taxi
- Yandex Direct (ads service)
- Yandex Mail
- Yandex Disk (cloud storage service)
- Yandex Market
- Yandex Travel (travel booking platform)
- Yandex 360 (workspaces service)
- Yandex Cloud
- Yandex Pay (payment processing service)
- Yandex Metrika (internet analytics)
Shestakov also shared a directory listing of the leaked files on GitHub for those who want to see what source code was stolen.
“There are at least some API keys, but they are likely only been used for testing deployment only,” said Shestakov about the leaked data.
In a statement to BleepingComputer, Yandex said its systems were not hacked and that a former employee leaked the source code repository.
“Yandex was not hacked. Our security service found code fragments from an internal repository in the public domain, but the content differs from the current version of the repository used in Yandex services.
A repository is a tool for storing and working with code. Code is used in this way internally by most companies.
Repositories are needed to work with code and are not intended for the storage of personal user data. We are conducting an internal investigation into the reasons for the release of source code fragments to the public, but we do not see any threat to user data or platform performance.” – Yandex.
Exposure to hackers
BleepingComputer also discussed the leak with Grigory Bakunov, a former senior systems administrator, deputy chief of development, and director of spreading technologies at Yandex. Bakunov worked at the tech giant between 2002 and 2019 and is very familiar with the leaked code.
Bakunov explained that the motive for the leak was political and that the rogue Yandex employee responsible had not tried to sell the code to competitors.
The former senior executive added that the leak does not contain any customer data, so it does not pose a direct risk to the privacy or security of Yandex users, nor does it directly threaten to expose proprietary technology.
“Yandex uses a monorepo structure called ‘Arcadia,’ but not all of the company’s services use it. Also, even just to build a service, you need a lot of internal tools and special knowledge, as standard building procedures do not apply.
The leaked repository contains only code; the other important part is data. Key parts, like model weights for neural networks, etc., are absent, so it’s almost useless.
Still, there are a lot of interesting files with names like ‘blacklist.txt’ that could potentially expose working services.” – Grigory Bakunov.
However, Bakunov told BleepingComputer that the leaked code creates the potential for hackers to identify security gaps and create targeted exploits. Bakunov believes this is only a matter of time now.
The former executive also commented on Yandex’s response, saying that the leaked code may not be identical to the current code used in the firm’s working services but might be up to 90% similar.
Therefore, a thorough examination of the leaked code could reveal weak points in Yandex’s services to threat actors.