We live in a world where people can get attention by making noisy and spectacular claims, even if they aren’t true. The growing influence of social media has created a market where engagement is king. Unfortunately, that influence sometimes leaks over into the world of IT and security, which is roughly what happened in mid-November 2024, when a mostly ignored Office 365 feature was shoved into an unwelcome spotlight amid claims that Microsoft is secretly spying on us all.
Five Connected Experiences
Microsoft’s desktop Office applications have come a very long way since they were first introduced, from character-based DOS apps to GUI apps running on Windows and macOS to the cloud-integrated versions we use now. The days when multiple vendors fiercely competed for our dollars by adding ever more sophisticated features are over; instead, Microsoft has put its effort into integrating services across the Office applications (for example, providing a common grammar and spelling engine) and into adding new cloud services. Microsoft generically refers to these cloud integrations as “connected experiences.” For example, the Editor feature in Office 365 is a connected experience; when it’s enabled and you use it to check your grammar or spelling, your text is sent to a cloud service that returns grammar and spelling suggestions. That’s pretty innocuous. Microsoft does receive your document content, which means they may see something private or secret, but they don’t store it. More on that later.
The second type of connected experience is familiar and welcome to most of us: document co-editing obviously requires the service to be able to see your document contents and share them, in near-real-time, with other users who have legitimate access to your documents.
The third type of experience moves information in the opposite direction: your Office apps can download content from the cloud. For example, document templates and slide design layouts can be downloaded from Microsoft’s cloud and incorporated into your documents, and Outlook can fetch and display weather data for your current location. These download experiences require your explicit request; although Malwarebytes points out that they could become a vector for future vulnerabilities (and I think they’re right!), the privacy risk is low.
Fourth, Windows and Office have also long had opt-in crash reporting, where the crash of an Office application or Windows system component can generate a crash dump that is uploaded to Microsoft for analysis. This is a common, and valuable, debugging tool because the number of crash reports Microsoft gets, and their contents, make it much easier to pinpoint problems early in the release cycle. If you combine the various Insider early-access programs with crash reporting, what you get is a stable set of applications that only rarely crash. (Of course, they still have bugs, but the days when it was common to lose a document because of a crash are thankfully long gone!)
Finally, it’s also worth noting that there are many telemetry features embedded in the Microsoft 365 ecosystem. For example, Microsoft is able to track average page load times for SharePoint Online and average email open times for OWA. This data is aggregated across the millions of service users, it’s anonymized, and no one outside of Microsoft’s operations team can view it. It’s not worth further consideration.
The Beginning of the Flap
The connected experiences in Office applications have been around for a long time; as an example, here’s a 2019 question about them on the Microsoft Tech Community. Like many other Office features, they have mostly gone unnoticed, sitting among the many checkboxes in the Office settings dialogs that users leave untouched. However, a controversy erupted after a Linux-focused Twitter account (which I won’t link to here to avoid sending them any more engagement) made the breathless claim that Microsoft was training AI models on data from customer documents via the Connected Experiences feature. As often happens, people who didn’t understand what those features do, plus the large crowd of people online who are always happy to yell angrily at Microsoft, amplified the message, and it gained traction. It even made it to noted AI research scientist Gary Marcus’s blog; Marcus often writes about the privacy risks of unconstrained AI training, so this was right on brand for him.
Microsoft’s Response
Microsoft’s response was unequivocal: “In the M365 apps, we do not use customer data to train LLMs.”
Of course, this won’t surprise anyone who’s been paying attention to how the Copilot services in Microsoft 365 actually work. (A quick refresher: Copilot doesn’t train a model on your documents at all; it grounds its responses in your tenant’s data at query time, limited to content the requesting user already has access to. Prompts, responses, and data from one tenant are never used to train models accessible to any other tenant. See this link for a more detailed overview.)
However, the original false claim had already spread widely; I saw it on Twitter, Bluesky, Threads, and LinkedIn (all without trying; their various algorithms decided that I needed to see it). Microsoft’s rebuttal, sadly, doesn’t undo the original false information.
Controlling Connected Experiences
As it turns out, Microsoft has been quite transparent about what the connected experiences features do. It’s well documented in this summary article and there’s an even more comprehensive list here. A quick scan of the list will show that many of these experiences are useful to most users. For example, having Outlook query Bing to find the travel time required to get to a calendar appointment is a great feature if you often have to travel to meetings.
Microsoft’s access to the data used for these experiences is governed by their privacy agreement and their terms of service. Although both of these documents are long and complicated, neither of them grants Microsoft the right to use your data for AI training.
On top of the promises Microsoft makes about using that data, you have full control over access to those experiences in three ways. First, individual users can enable or disable connected experiences through the settings of individual Office applications on their devices. Second, as an administrator, you can use Office cloud policies or Group Policy objects to turn them off. Third, some connected experiences (as described in the documentation I linked earlier) are aware of Purview sensitivity labels and won’t ingest or process content whose labels restrict access.
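For the Group Policy route, the privacy policy settings land in the per-user Office policy hive in the registry. As a rough sketch (the value names below are taken from Microsoft’s privacy-controls documentation for Microsoft 365 Apps, where 1 means enabled and 2 means disabled; verify them against the current docs before deploying anything), a .reg file that turns off the optional, content-analyzing, and content-downloading connected experiences might look like this:

```
Windows Registry Editor Version 5.00

; Per-user policy hive for Microsoft 365 Apps privacy controls.
[HKEY_CURRENT_USER\Software\Policies\Microsoft\office\16.0\common\privacy]
; Disable optional connected experiences (2 = disabled, 1 = enabled).
"controllerconnectedservicesenabled"=dword:00000002
; Disable connected experiences that analyze your content (e.g., Editor).
"usercontentdisabled"=dword:00000002
; Disable connected experiences that download online content (templates, etc.).
"downloadcontentdisabled"=dword:00000002
```

In practice you’d normally set these through the Group Policy editor or the Office cloud policy service rather than by hand-editing the registry; the corresponding policy names there read like “Allow the use of connected experiences in Office that analyze content.”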
An Unfortunate Episode
It’s a shame that these fake claims turned into big news; the accumulated eyeballs and attention made the claimants some money, and the resulting controversy distracted from more important issues, including the Thanksgiving-week Office 365 outage that Microsoft inflicted on us. Your best defense against getting sucked into the vortex of misinformation is to read and understand Microsoft’s own documentation about the privacy and security features of their tools (an ongoing endeavor, because they’re introducing new things all the time). If you keep one basic principle in mind, namely that Microsoft wants to make money but realizes that a huge amount of regulatory and consumer attention is focused on what it does with customer data, you’ll be better able to distinguish truth from falsehood in the future.