Microsoft AI Researchers Leak 38TB of Private Data


Microsoft accidentally revealed a huge trove of sensitive internal information dating back over three years via a public GitHub repository, it has emerged.

Cloud security firm Wiz discovered the privacy snafu when it found the GitHub repository “robust-models-transfer”, which belonged to Microsoft’s AI research division.

Although the repository was meant only to provide access to open source code and AI models for image recognition, the Azure Storage URL included in it was misconfigured to grant access to the entire storage account, Wiz said.

“Our scan shows that this account contained 38TB of additional data – including Microsoft employees’ personal computer backups. The backups contained sensitive personal data, including passwords to Microsoft services, secret keys, and over 30,000 internal Microsoft Teams messages from 359 Microsoft employees,” it continued.

“In addition to the overly permissive access scope, the token was also misconfigured to allow ‘full control’ permissions instead of read-only. Meaning, not only could an attacker view all the files in the storage account, but they could delete and overwrite existing files as well.”


The problem appears to stem from Microsoft’s use of a Shared Access Signature (SAS) token – a signed URL that grants users access to Azure Storage data.

SAS tokens are highly configurable: whoever creates one chooses its permissions, from read-only to full control, and an expiry time that can be set so far ahead that the token is effectively permanent. The SAS token in this incident was first committed to GitHub in July 2020, and in October 2021 its expiry date was updated to a date 30 years in the future.
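To make those settings concrete: a SAS URL carries its grants in ordinary query parameters, such as sp (permissions), se (expiry) and, for account-level tokens, ss and srt (services and resource types). The Python sketch below is illustrative only. The URL, account name and dates are made up, audit_sas_url is a hypothetical helper, and the seven-day lifetime limit is an arbitrary assumption rather than a Microsoft or Wiz recommendation.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL for illustration; not the token from the incident.
EXAMPLE_URL = (
    "https://exampleaccount.blob.core.windows.net/models"
    "?sv=2020-08-04&ss=b&srt=sco&sp=rwdlac"
    "&se=2051-10-01T00:00:00Z&sig=REDACTED"
)

READ_ONLY = set("rl")  # read and list


def audit_sas_url(url, now=None, max_lifetime=timedelta(days=7)):
    """Return a list of warnings about the grants encoded in a SAS URL."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlsplit(url).query)
    findings = []

    # sp: signed permissions, one letter per permission.
    extra = set(params.get("sp", [""])[0]) - READ_ONLY
    if extra:
        findings.append("more than read-only access: " + "".join(sorted(extra)))

    # ss and srt appear only on account SAS tokens, which are scoped to
    # the storage account rather than to a single container or blob.
    if "ss" in params and "srt" in params:
        findings.append("account-level token")

    # se: signed expiry, an ISO 8601 timestamp in UTC.
    expiry_raw = params.get("se", [""])[0]
    if expiry_raw:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > max_lifetime:
            findings.append("long-lived token, expires " + expiry_raw)

    return findings


if __name__ == "__main__":
    for finding in audit_sas_url(EXAMPLE_URL):
        print(finding)
```

A check like this only reads what the URL declares about itself. It cannot tell whether the token has already been revoked or what data sits behind it.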

After Wiz reported the incident, Microsoft invalidated the token and replaced it.

Microsoft said that although it scans public-facing repositories for its accounts, the specific SAS URL found by Wiz was incorrectly marked as a false positive.
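Microsoft has not described here how its repository scanning works, so the following Python sketch is only a rough illustration of the general idea: search text for Azure Storage URLs that carry a sig parameter. The regular expression and the find_sas_urls helper are assumptions made for this example, and a production secret scanner would need to be considerably more thorough.

```python
import re

# Characters that cannot be part of a URL embedded in source text.
_STOP = r"\s\"'<>"

# Storage account names are 3 to 24 lowercase letters and digits.
SAS_URL_PATTERN = re.compile(
    r"https://[a-z0-9]{3,24}\.(?:blob|file|queue|table|dfs)\.core\.windows\.net"
    r"(?:/[^" + _STOP + r"?]*)?"        # optional container or blob path
    r"\?[^" + _STOP + r"]*"             # start of the query string
    r"\bsig=[^" + _STOP + r"&]+"        # the signature marks it as a SAS URL
    r"[^" + _STOP + r"]*",              # any parameters after the signature
    re.IGNORECASE,
)


def find_sas_urls(text):
    """Return (line_number, url) pairs for every SAS-like URL in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SAS_URL_PATTERN.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits
```

Each hit still needs human or automated triage, which is the step where, by Microsoft’s account, this particular URL was wrongly dismissed.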

“No customer data was exposed, and no other internal services were put at risk because of this issue,” the tech giant said in a blog post explaining SAS best practices. “No customer action is required in response to this issue.”

However, Wiz warned that SAS tokens represent an ongoing risk.

“Due to a lack of monitoring and governance, SAS tokens pose a security risk, and their usage should be as limited as possible. These tokens are very hard to track, as Microsoft does not provide a centralized way to manage them within the Azure portal,” it added.

“In addition, these tokens can be configured to last effectively forever, with no upper limit on their expiry time. Therefore, using Account SAS tokens for external sharing is unsafe and should be avoided.”
