As an admin, imagine a scenario with Outlook performance issues. These could be slow email send/receive, Outlook hanging when switching folders, or sluggishness when simply using shared calendars.
We’re in the middle of a global pandemic, and you may be reading this from the comfort of your home office, or perhaps you are required to be on-premises now.
This network troubleshooting guidance will span both scenarios. You may recall a previous article on this topic explained the many possible causes of network latency at a high level.
In this article, we take a deeper dive to uncover the exact construction of our networks without needing to unplug anything.
Since this article is about the practicalities of network troubleshooting, we will start with the command line.
On Windows, start up a CMD or a PowerShell session; if you are working on macOS or Linux, start up a terminal.
I will be using the macOS terminal to illustrate the output but will share both the Windows and the macOS/Linux commands.
The first thing we want to do is find our local Exchange Online Front Door. The concept of service front doors and why they matter is documented in the Office 365 network connectivity principles. However, the TL;DR version is as follows:
- Microsoft owns one of the largest privately held global networks in the world.
- This network is peered at many locations across the world and provides service front doors.
- Service front doors accept requests for Office 365 services in your country/province/state.
- These requests are routed at Microsoft’s cost to wherever the actual service is located, i.e. your mailbox, your OneDrive location, etc.
- Connecting to the service front door closest to you is the goal!
If you are following along on the command line, you are about to find your local Exchange Online Front Door. If the result does not look local, we will find out why.
You can follow along with the examples below and note the output. Later, we will compare the result we are about to generate with another example.
In the illustration below, we have numbered our steps and highlighted the output:
Step 1 – In a command line of your choice, type nslookup and hit Enter.
Step 2 – Next, type outlook.office365.com and hit Enter.
In my environment, using my ISP’s locally provided DNS, I receive an Exchange Online Front Door address of “cpt-efz.ms-acdc.office.com”. The first three letters of the returned FQDN reveal that this front door is situated in Cape Town, as I would expect.
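As a rough sketch of the "first three letters" check, the snippet below pulls the location prefix out of a returned front door name. The lookup table only holds the two codes mentioned in this article, and the helper names are hypothetical, not part of any Microsoft tooling.

```python
# Location prefixes seen in this article; extend with codes you observe yourself.
KNOWN_CODES = {"CPT": "Cape Town", "LHR": "London Heathrow"}


def front_door_code(fqdn):
    """Return the first three letters of the front door FQDN, upper-cased."""
    return fqdn.strip()[:3].upper()


def describe_front_door(fqdn):
    """Map the prefix to a place name, or flag it as something to look up."""
    code = front_door_code(fqdn)
    return f"{code}: {KNOWN_CODES.get(code, 'unknown location, look it up')}"


print(describe_front_door("cpt-efz.ms-acdc.office.com"))
```

Paste in whatever name nslookup hands back to you. An unfamiliar prefix is simply a prompt to look up the matching airport code.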
Next, I need to know how “far away” my Exchange Online Front Door is. As in the example above, we have numbered our steps and highlighted the output in the illustration below:
Step 1 – In Windows, use tracert; on macOS, use traceroute to reveal the number of network hops between ourselves and our chosen front door:
Notice that at hop three of our traceroute, we hit the Microsoft peering point in my local ISP’s Internet Exchange, which is relatively optimal, with hop four routing onto msn.net, the Microsoft network. This result is the combination of three factors:
- Local ISP breakout
- Local DNS resolution allowing me to find the closest service front door when queried
- ISP peering shortening the route to the Microsoft network.
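To make the hop counting concrete, here is a minimal sketch that picks the right trace command for the platform and finds the first hop that lands on msn.net. The sample text is made up to mirror the four-hop shape described above. Its host names and addresses are placeholders, not real output, and the function names are hypothetical.

```python
import platform
import re


def trace_command(host, system=None):
    """Build the trace command: tracert on Windows, traceroute on macOS/Linux."""
    system = system or platform.system()
    tool = "tracert" if system == "Windows" else "traceroute"
    return [tool, host]


def hop_lines(output):
    """Return (hop_number, rest_of_line) for each line that starts with a hop number."""
    hops = []
    for line in output.splitlines():
        match = re.match(r"\s*(\d+)\s+(.*)", line)
        if match:
            hops.append((int(match.group(1)), match.group(2)))
    return hops


def first_hop_on(output, marker="msn.net"):
    """Return the first hop number whose line mentions the marker, or None."""
    for number, rest in hop_lines(output):
        if marker.lower() in rest.lower():
            return number
    return None


# Made-up sample in traceroute's layout; names and addresses are placeholders.
SAMPLE = """traceroute to front-door.example (203.0.113.10), 64 hops max, 52 byte packets
 1  home-router.example (192.0.2.1)  2.1 ms  1.9 ms  1.8 ms
 2  isp-gateway.example (198.51.100.1)  6.4 ms  6.2 ms  6.5 ms
 3  peering.isp.example (198.51.100.9)  9.8 ms  9.9 ms  10.1 ms
 4  edge.example.msn.net (203.0.113.10)  11.2 ms  11.4 ms  11.6 ms
"""

print(len(hop_lines(SAMPLE)), first_hop_on(SAMPLE))
```

Feed it the text you capture from your own tracert or traceroute run. A low hop number for the first msn.net entry suggests your ISP hands traffic to the Microsoft network early.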
In our previous article, we discussed how local ISP breakout and local DNS resolution are required to satisfy the Office 365 network connectivity principles; ISP peering is a new concept. I’ll step away from the command line for a second to explain why we even care.
ISP Peering and why you should care
In the traceroute example above, we illustrated that my ISP’s peering with Microsoft enables me to connect to the Microsoft network (*.msn.net) quickly, over four hops. Once I’m connected, Microsoft uses a concept known as cold potato routing to accept my traffic and backhaul it to wherever it needs to go to reach my desired service. Network peering is a well-documented space for ISPs wishing to shorten the path to Microsoft services.
Microsoft publicly documents their AS Number – a numbering system allowing ISPs and Exchanges to exchange routing information – as “ASN 8075”. Since the internet is a well-documented space, we can see how Microsoft have peered using PeeringDB.com.
Searching for “8075” lifts the hood on how extensively Microsoft have invested in this space, showing both Public Peering (Internet Exchange points and ISPs) and Private Peering (direct physical connection) points:
Using the filter facility, I can filter using the name of the Private Peering Facility used by most ISPs in South Africa, “Teraco”:
Clicking on the Cape Town URL reveals the list of South African ISPs who are peering with Microsoft using a similar peering methodology:
Clicking on each ISP/Peer Name in turn documents who is peering with whom, therefore displaying inter-ISP peering agreements.
Peering – Why should we care?
Not peering can dramatically increase the network distance between ourselves and Microsoft, which we now know is not optimal. For example, a client in the securities industry experienced “slowness” when their call center agents used Exchange Online. Using traceroute, my team discovered that their ISP needed 15 network hops to connect to a Microsoft front door hosted in their own city. Addressing the issue with the ISP removed 150ms worth of network latency. Ideally, that number should be well under 20ms.
Using global DNS services
Many of us, including some of our ISPs or corporate networks, use third-party DNS services like Google (8.8.8.8, 8.8.4.4) or Cloudflare (1.1.1.1, 1.0.0.1). We need to understand that these services are often not local to us. Here’s an example using the same two tools we used earlier, nslookup and traceroute. You can follow along from your location and note the result.
Using the same convention used above, in the illustration below, we have numbered our steps and highlighted the output:
Step 1 – In a command line of your choice, type nslookup and hit Enter.
Step 2 – Type server 8.8.8.8 and hit Enter. This command changes the DNS server to Google’s public DNS.
Step 3 – Next type outlook.office365.com and hit Enter.
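The same lookup can be scripted without the interactive prompt, since nslookup accepts the DNS server as an optional second argument. The sketch below builds that command and extracts the answered name. It assumes the answer appears on a line starting with "Name:", which is how nslookup typically prints it. The sample text is a made-up placeholder, and the helper names are hypothetical.

```python
import re
import subprocess


def nslookup_command(name, server=None):
    """Non-interactive nslookup; an optional second argument selects the DNS server."""
    command = ["nslookup", name]
    if server:
        command.append(server)
    return command


def run_nslookup(name, server=None, timeout=15):
    """Run nslookup and return its text output (needs network access)."""
    result = subprocess.run(
        nslookup_command(name, server),
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout


def answered_name(output):
    """Return the first host name on a 'Name:' line, or None if there is none."""
    match = re.search(r"^Name:\s*(\S+)", output, re.MULTILINE)
    return match.group(1) if match else None


# Made-up sample; the server and address values are placeholders.
SAMPLE = """Server:\t\t192.0.2.53
Address:\t192.0.2.53#53

Non-authoritative answer:
Name:\tcpt-efz.ms-acdc.office.com
Address: 203.0.113.20
"""

print(nslookup_command("outlook.office365.com", "8.8.8.8"))
print(answered_name(SAMPLE))
```

Calling run_nslookup once with no server and once with "8.8.8.8", then comparing the two answered names, reproduces the local versus global DNS comparison from your own location.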
You may have spotted that the resulting location code “LHR” looks like the airport code for London Heathrow in the United Kingdom. If you confirm that suspicion by searching for where the four IP addresses listed above reside, you can conclude that this front door is not in Cape Town but in the United Kingdom. The three-letter prefix is usually the code of the closest airport. In this article, we have seen CPT (Cape Town) and LHR (London Heathrow).
Here’s an old trick: ping. We will ping each host in turn, first my local Exchange Online Front Door resolved using local DNS, then the remote front door returned using Google DNS (8.8.8.8), and examine the results:
Note that on Windows, your command line would not use the “-c3” parameter; the equivalent count option there is “-n 3”.
The picture here demonstrates an average latency of 11.5ms for my local front door, which is expected. The remote front door resolved using Google DNS averages just under 160ms. This demonstrates the effect of using non-location-aware DNS and the latency impact of hairpinning traffic to a central global location.
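If you want to compare the two averages without reading them off the screen, a small sketch like the one below can help. It builds the ping command with the platform's count flag and pulls the average out of the summary line. The two summary strings are made-up samples (only the 11.5 ms average comes from this article), and the parsing assumes English-language ping output.

```python
import platform
import re


def ping_command(host, count=3, system=None):
    """Build a ping command: -n sets the count on Windows, -c on macOS/Linux."""
    system = system or platform.system()
    flag = "-n" if system == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def average_ms(output):
    """Extract the average round-trip time in ms from ping's summary, or None."""
    # macOS/Linux style: "... min/avg/max/stddev = 10.9/11.5/12.4/0.6 ms"
    unix = re.search(r"=\s*[\d.]+/([\d.]+)/", output)
    if unix:
        return float(unix.group(1))
    # Windows style: "Minimum = 10ms, Maximum = 12ms, Average = 11ms"
    windows = re.search(r"Average\s*=\s*(\d+)\s*ms", output)
    if windows:
        return float(windows.group(1))
    return None


# Made-up summary lines for illustration.
UNIX_SUMMARY = "round-trip min/avg/max/stddev = 10.9/11.5/12.4/0.6 ms"
WINDOWS_SUMMARY = "    Minimum = 10ms, Maximum = 12ms, Average = 11ms"

print(average_ms(UNIX_SUMMARY), average_ms(WINDOWS_SUMMARY))
```

Running the generated command against both the locally resolved and the remotely resolved front door, then comparing the two averages, gives you a number you can put in front of your ISP or network team.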
Latency is the beginning but not the end
In this article, we cited a pleasing 20ms or so latency to a local front door, which is only a measure of the physical network. Keep in mind that your mailbox may not be hosted in the same location as your service front door. There is a good chance that it is in a different data center altogether, due to the high availability model which Exchange Online uses.
If you have ever opened the Connection Status dialog in Outlook, by holding Ctrl and clicking on the Outlook icon in the notification area, you will have noticed that the “Avg Resp” column is much higher than 20ms. This column is the average response time for a client request and includes transactions over the entire Layer 7 networking stack that makes up an Outlook request over HTTPS, which is much more complicated than a simple ICMP or ping request.
With a multi-geo tenant, your mailbox may be in a completely different part of the world to you; however, your front door connectivity will be local. Thus, your troubleshooting will start but not end with local network latency. Additional factors such as high item counts in critical folders, Outlook online or cached mode, etc., also weigh in on Outlook performance.
Many reading this are working from home, a home office or connected remotely to HQ via a VPN. You may be experiencing performance issues in Outlook, which are less than pleasing. In this article, we have documented:
- How to use NSLOOKUP to find the Exchange Online Front Door, which DNS hands to you as the closest location.
- How to use Traceroute to document how “far away” that Exchange Online Front Door is in relative network terms and document the ISP peering relationship that may be affecting your connection experience.
- What ISP peering is and how your ISP should be using it for your advantage, which is to connect you to the Microsoft network using the shortest possible routing.
You now have the tools to uncover why your experience connecting to Microsoft services may be sub-optimal and, if your ISP is not peering efficiently (think fewer than five hops using traceroute as the target), how to document why.
Working from home, or your office location, or even via VPN – you can demonstrate what needs to change.