And There’s Little You Can Do to Stop It
Throttling is complicated. In my TEC 2022 talk “Throttling Stinks (and What You Can Do About It),” on which this article is based, I compared the explanation “your tenant is being throttled” to explaining something to a toddler by saying “it’s magic!” The toddler can’t disprove what you’re saying, but it’s not a helpful or transparent explanation. Throttling tends to get explained the same way. In my session I tried to demystify things a bit, and I hope this article dispels a little more of the magic. Let’s dive in.
What is throttling?
My definition of throttling: “the intentional restriction of application or system performance to avoid performance or resource problems.” That’s not precise, but it captures the important parts; throttling happens on purpose. It’s something done to your application or workload to preserve capability for the service as a whole. SaaS applications need to provide throttling to prevent various kinds of problems, including DDoS attacks and cases where one busy user or application steals all the capacity.
For our purposes, when I talk about “throttling” I mostly mean that an application you’re using for backup, migration, or other large-scale operations is getting throttled. It’s possible for individual end users to get throttled, but it’s unusual unless they’re doing something shady, like sending lots of messages in a spam-like manner.
It’s important to understand that throttling is applied across all layers of the Microsoft 365 stack. When your application (or you) makes a request, it might be throttled at any or all of the following layers:
- An individual server, if that server’s disk, memory, or CPU resources are oversubscribed
- An individual site collection or mailbox
- An individual SharePoint list (e.g. lists that have more than 5,000 items may incur additional throttling when their contents are enumerated or retrieved)
- An API or protocol (e.g. CSOM may be throttled while Graph is unaffected)
- The individual tenant (where heavy usage of any one workload may lead to all workload requests for that tenant being throttled)
The good and bad news about this list is that it covers a lot of ground: Microsoft has a lot of ways to finely regulate the load on the service, but you also have a lot of potential root causes to worry about. Microsoft does not provide any information about why throttling happens in a particular case, and in general, it doesn’t tell anyone precisely what the limits are for a specific tenant or environment. It does publish the costs of some specific operations and information about the capacities of specific services. This is all useful, but it is not enough by itself to keep a tenant throttle-free.
How throttling works
Imagine that you have a small child living with you. Small children often ask for the same thing, over and over again, in hopes that doing so will get them what they want faster. Sometimes instead what that gets them is a parental response of the general form “IF YOU ASK ME ONE MORE TIME WHEN WE WILL GO TO THE ZOO, WE WILL NEVER, EVER GO.” That’s throttling; the parent is responding in a way that is intended to slow or stop the flow of requests.
In Microsoft 365, what happens is that a requester asks for something. This may be a user opening a SharePoint page or a backup tool reading the contents of a public folder. Whatever the origin, at some point that request is turned into an API request using one of Microsoft’s APIs (often the Microsoft Graph or Exchange Web Services). When throttling is in effect, the service could just sit on the request for a while (a tactic some of you may remember from Exchange connector “tarpitting”) or return an error code indicating that the request is throttled. For Graph, CSOM, and EWS, this happens via an HTTP 429 error response that contains a Retry-After header. This header tells the requester how long to wait before it may try again. (Small-child version: “don’t ask me again for one hour!”) Microsoft is also rolling out support for additional RateLimit headers that tell the requester more precisely how close it is to getting throttled, but these are offered only on a best-effort basis.
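To make the 429 and Retry-After handshake concrete, here is a minimal Python sketch of a client that honors the header. It is an illustration under stated assumptions, not how any particular Microsoft SDK or migration tool behaves: `send` is a hypothetical callable standing in for whatever issues the request and returns `(status, headers, body)`, the 30-second fallback is an arbitrary choice, and the header lookup assumes the key is spelled exactly `Retry-After`.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None, default=30.0):
    """Convert a Retry-After value into seconds to wait.

    The header may carry a number of seconds or an HTTP date.
    `default` is an assumed fallback for a missing or unparseable value.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it stops returning 429 or attempts run out.

    `send` is a hypothetical callable returning (status, headers, body).
    """
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            # Wait exactly as long as the service asked before retrying.
            sleep(parse_retry_after(headers.get("Retry-After")))
    raise RuntimeError(f"still throttled after {max_attempts} attempts")
```

The important design choice is that the wait comes from the service, not from a guess made by the client. Retrying sooner than Retry-After allows is the small child asking about the zoo again.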
Throttling happens because of resource consumption. To be more specific, there are two classes of things that are likely to get you throttled:
- Doing the same thing many times to different objects (like reading messages from 10,000 mailboxes concurrently)
- Doing the same thing many times to the same object (like sending a million messages from the same mailbox)
As an end user, you’re not likely to run into either of these cases, but as an administrator you might. There are three practical suggestions in my talk that you might find useful, given that mass operations involving migration or backup are the most common (by far!) causes of throttling.
First, if you’re migrating or backing up data, use only one tool at a time. Running two different SharePoint migration tools at the same time, for example, probably will get you throttled twice as fast as running either one alone.
Second, don’t try to migrate and back up or restore data to the same workload at the same time. This is sort of a subcase of the previous point, but it’s worth stating because people sometimes do it anyway.
Third, make sure the tools you’re using comply with Microsoft’s required best practices (like these for SharePoint) and use them concurrently and intelligently. Bonus points for tools that either implement auto-scaling or allow you to see enough performance data to adjust their scaling yourself to get the best possible performance without getting throttled.
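As a rough illustration of what "auto-scaling" can mean in practice, the Python sketch below adjusts a worker limit based on whether recent requests were throttled. It uses additive increase and multiplicative decrease, a general congestion-control idea that I am borrowing for the example; it is not a description of how Microsoft or any specific migration or backup tool scales. The class name, method name, and default limits are all hypothetical.

```python
class ConcurrencyGovernor:
    """Tracks how many parallel workers a tool should run.

    Hypothetical helper: grows the limit by one after each unthrottled
    request and halves it after each throttled one.
    """

    def __init__(self, start=4, minimum=1, maximum=16):
        self.minimum = minimum
        self.maximum = maximum
        self.limit = max(minimum, min(maximum, start))

    def record(self, throttled):
        """Update the limit from one request outcome and return it."""
        if throttled:
            # Back off sharply: throttling means the load is already too high.
            self.limit = max(self.minimum, self.limit // 2)
        else:
            # Probe upward slowly while the service keeps accepting requests.
            self.limit = min(self.maximum, self.limit + 1)
        return self.limit


governor = ConcurrencyGovernor(start=8, maximum=10)
for was_throttled in (False, False, True):
    governor.record(was_throttled)
print(governor.limit)  # 8 -> 9 -> 10 -> 5
```

A tool built this way settles just below the point where throttling starts, which is the "best possible performance without getting throttled" goal described above. Because Microsoft does not publish the limits, that point has to be found by observation.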
Throttling is a fact of life
Microsoft has to implement throttling to protect the service quality they deliver, which means we all benefit from it… except when we don’t. It’s best to think of throttling the same way Mark Twain is said to have thought about weather: “everyone talks about it, but no one does anything about it.”