Microsoft's AI assistant, Copilot, has been inadvertently exposing the contents of more than 20,000 private GitHub repositories, belonging to organizations including Google, Intel, Huawei, PayPal, IBM, Tencent, and Microsoft itself.
The repositories, from more than 16,000 organizations, were originally public but were later set to private, often because they contained sensitive information, including authentication credentials that could enable unauthorized access. Although the switch to private happened months ago, the data remained fully accessible through Copilot to anyone who asked for it.
The issue was uncovered in the second half of 2024 by Lasso, an AI security company. In January, Lasso found that Copilot was still retaining and serving private repositories and set out to measure the magnitude of the problem.
Lasso researchers Ophir Dror and Bar Lanyado wrote in a post: “Realizing that any data on GitHub, even if it was public for a brief moment, could be indexed and potentially exposed by tools like Copilot, we were alarmed by how easily this information could be retrieved.”
They continued: “Motivated to grasp the full scope of the issue, we automated the identification of ‘zombie repositories’ - those that transitioned from public to private - and validated our findings.”
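The core of such an automated check is straightforward to picture. As a minimal sketch (not Lasso's actual tooling, whose details were not published), one could compare a list of repository names seen in a search engine's cache against those still publicly reachable on GitHub; anything in the first set but not the second is a "zombie":

```python
def find_zombie_repos(cached_repos, currently_public):
    """Repositories that appear in a search-engine cache but are no
    longer publicly reachable -- 'zombie repositories'."""
    return sorted(set(cached_repos) - set(currently_public))

# Hypothetical example data: repo names seen in Bing's cache versus
# repos that still serve a public page on GitHub today.
cached = ["org/alpha", "org/beta", "org/gamma"]
public_now = ["org/beta"]
print(find_zombie_repos(cached, public_now))  # → ['org/alpha', 'org/gamma']
```

In practice, the "currently public" check would mean querying GitHub for each repository and treating a not-found response as private or deleted, since GitHub does not distinguish the two for unauthenticated requests.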
The researchers traced the leak to Bing's caching mechanism after discovering that one of Lasso's own private repositories was exposed to the world. The search engine had indexed the pages while they were still publicly available and failed to purge those entries after the repositories were made private on GitHub.
Because Copilot relies on Bing for its search capabilities, the cached private data became accessible through the AI chatbot as well. Microsoft has since altered its systems to resolve the problem, following Lasso's report in November.