Historical Search Jobs Retrieve Message Trace Data More than 10 Days Old
A seemingly simple question posted in a Facebook group asked how to create an email traffic report detailing the inbound messages received by a Microsoft 365 tenant over the last 90 days. The report should include information such as a timestamp, the email address of the sender, and whether the message had an attachment. As it turns out, no simple solution exists for this request. Let’s explore why.
Exchange Online only keeps message trace data online for ten days. It’s possible to retrieve message trace data for up to 90 days, but only by running a historical search through the Exchange admin center or PowerShell.
A historical search means that Exchange Online runs a background job to retrieve the data from its message trace repository. This process can take anything from ten minutes to several hours, depending on the current load on the service. A historical search can cover message data for up to 100 email addresses and return a maximum of 100,000 records. Usually, people search for messages sent from or received by mailboxes (user or shared). This article explains how to run a historical search for email sent from shared mailboxes.
Creating a report for an entire organization for the last 90 days likely means dividing the processing over several jobs so that each job covers no more than 100 addresses and returns no more than 100,000 records. An organization can run up to 250 historical searches daily.
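As a rough planning aid, the limits above reduce to simple arithmetic. This sketch assumes a made-up figure of 1,850 addresses and only accounts for the 100-address and 250-job limits. It can't predict whether a job will hit the 100,000 record ceiling, which depends on how much email the addresses receive.

```powershell
# Hypothetical figure for illustration: total SMTP addresses to search
[int]$AddressCount = 1850
[int]$MaxAddressesPerJob = 100   # address limit for one historical search
[int]$MaxJobsPerDay = 250        # daily historical search limit for an organization

# Round up because a partial batch still needs its own job
[int]$JobsNeeded = [math]::Ceiling($AddressCount / $MaxAddressesPerJob)
# Days needed to submit every job without exceeding the daily limit
[int]$DaysNeeded = [math]::Ceiling($JobsNeeded / $MaxJobsPerDay)

"{0} addresses need at least {1} jobs, submitted over {2} day(s)" -f $AddressCount, $JobsNeeded, $DaysNeeded
```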
Creating Historical Search Jobs
Any solution depends on running enough historical search jobs to retrieve message trace data for all SMTP addresses within scope of the search. If you want to create a report over the last 90 days for all inbound email, you need to find all the recipient addresses that external people might use and divide the addresses into batches of 100 that are then submitted for processing.
For instance, to check all mailboxes, the first step is to find the mailboxes. If more than 100 exist, you then separate them into sets of 100 or less. For each set, extract their primary SMTP addresses and store them in an array. As an example, these commands find user and shared mailboxes and store the primary SMTP address for each mailbox in an array:
[array]$Mailboxes = Get-ExoMailbox -RecipientTypeDetails UserMailbox, SharedMailbox -ResultSize Unlimited
[array]$RecipientAddresses = $Mailboxes.PrimarySMTPAddress
Exchange Online supports multiple proxy addresses for mail-enabled objects, which can receive email using any SMTP proxy address. If you want to check inbound messages for every possible address, you must extract the set of proxy addresses for each object and store all the addresses in the array. Something like this code extracts all the SMTP proxy addresses for the set of mailboxes (found using the code above) and stores them in an array:
[array]$MailboxProxyAddresses = $Mailboxes.EmailAddresses
[array]$MailboxAddresses = $Null
ForEach ($Address in $MailboxProxyAddresses) {
    If (($Address.Split(':')[0]) -in 'smtp', 'SMTP') {
        $SMTPAddress = $Address.SubString(5,$Address.Length-5)
        $MailboxAddresses += $SMTPAddress
    }
}
Proxy addresses include the MOERA (Microsoft Online Email Routing Address) that each recipient gets for the tenant service domain and any plus addresses assigned by administrators to mailboxes. It’s reasonable to expect that the number of proxy addresses will be between two and three times the number of primary SMTP addresses. Searching for all SMTP proxy addresses rather than primary SMTP addresses increases the number of historical search jobs.
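Another way to express the same filter is with the -like operator, which is case-insensitive by default and so matches both the smtp: and SMTP: prefixes. Treat this as a sketch only: the Get-SmtpAddress function name and the sample addresses are invented for illustration, and the input is assumed to be plain strings like those in the EmailAddresses property.

```powershell
function Get-SmtpAddress {
    param(
        [string[]]$ProxyAddress
    )
    # Keep only SMTP proxies (dropping SIP, X500, SPO, and so on),
    # strip the 5-character 'smtp:' prefix, and return a unique sorted list
    $ProxyAddress |
        Where-Object { $_ -like 'smtp:*' } |
        ForEach-Object { $_.Substring(5).ToLower() } |
        Sort-Object -Unique
}

# Hypothetical sample data standing in for $Mailboxes.EmailAddresses
$Sample = @(
    'SMTP:Kim.Akers@contoso.com',
    'smtp:kim.akers@contoso.onmicrosoft.com',
    'SIP:kim.akers@contoso.com',
    'smtp:kim.akers+sales@contoso.com'
)
[array]$Addresses = Get-SmtpAddress -ProxyAddress $Sample
$Addresses
```

With real data, the call would look something like Get-SmtpAddress -ProxyAddress $Mailboxes.EmailAddresses.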
Now that you know which addresses to search for, you can submit the historical search jobs to retrieve data for those addresses. This code submits a historical search job to find inbound email for all addresses in the $RecipientAddresses variable (an array) for the last 90 days.
[int]$i = 1
$StartDate = (Get-Date).AddDays(-90)
$ReportName = ("Historical Search from {0} Number {1} Submitted {2}" -f $StartDate, $i, (Get-Date -format g))
$Status = Start-HistoricalSearch -RecipientAddress $RecipientAddresses -StartDate $StartDate -EndDate (Get-Date) -ReportType MessageTrace -ReportTitle $ReportName -Direction Received -NotifyAddress Admin@office365itpros.com
You can track the progress of the job with the Get-HistoricalSearch cmdlet:
Get-HistoricalSearch -JobId $Status.JobId | Format-Table JobId, Status, ReportTitle

JobId                                Status     ReportTitle
-----                                ------     -----------
3b9847c0-b1c2-4603-b344-3095b2d6c044 NotStarted Historical Search from 28/07/2023 21:44:51 Number 1 Submitted 26/10/20…
If you add a notification address when submitting a job, Exchange Online sends email to that address when the job finishes. Obviously, you must break up the set of searchable addresses into batches of 100 or less and submit a historical search job for each batch.
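The batching step might look something like the following sketch. The Split-AddressBatch and Submit-TraceSearch function names are invented for illustration. The Start-HistoricalSearch parameters are the same ones used in the single-job example above, and the function only contacts Exchange Online when you call it from a connected session.

```powershell
function Split-AddressBatch {
    param(
        [string[]]$Address,
        [int]$BatchSize = 100
    )
    for ($Start = 0; $Start -lt $Address.Count; $Start += $BatchSize) {
        $End = [math]::Min($Start + $BatchSize, $Address.Count) - 1
        # The leading comma emits each batch as a single array instead of unrolling it
        , @($Address[$Start..$End])
    }
}

function Submit-TraceSearch {
    param(
        [string[]]$Address,
        [string]$NotifyAddress
    )
    # @() keeps a lone batch wrapped so the loop iterates batches, not addresses
    $Batches = @(Split-AddressBatch -Address $Address)
    if ($Batches.Count -gt 250) {
        Write-Warning "More than 250 jobs needed; the daily limit will stop some submissions"
    }
    $StartDate = (Get-Date).AddDays(-90)
    [int]$i = 0
    foreach ($Batch in $Batches) {
        $i++
        $Params = @{
            RecipientAddress = $Batch
            StartDate        = $StartDate
            EndDate          = (Get-Date)
            ReportType       = 'MessageTrace'
            ReportTitle      = ("Historical Search from {0} Number {1} Submitted {2}" -f $StartDate, $i, (Get-Date -Format g))
            Direction        = 'Received'
            NotifyAddress    = $NotifyAddress
        }
        # Emit the job object so the caller can keep the JobId for tracking
        Start-HistoricalSearch @Params
    }
}
```

A call such as $Jobs = Submit-TraceSearch -Address $RecipientAddresses -NotifyAddress Admin@office365itpros.com would then return one job object per batch, which can be passed to Get-HistoricalSearch to check progress.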
Downloading the Data for the Email Traffic Report
Eventually, all the historical search jobs will finish and the message trace data extracted by the jobs will be ready. Before you can use the data, you must download it from the Message Trace section of the Exchange admin center. Under the Downloadable reports tab, you’ll find a listing of the historical search jobs and can check details of each (Figure 1). When the job status is Complete, an option appears to download the report. It can take some time to connect to Azure to fetch the data, which downloads as a CSV file in Unicode format.
The notification message sent upon the completion of a job also includes a link to download the data file (Figure 2).
Creating an Email Traffic Report from the Historical Message Trace Data
If you’ve had to split processing over multiple jobs, you must download the file for each job. To make it more convenient to process the files, I moved them to a specific folder. The task is then to write a PowerShell script to loop through the files, extract the message trace data from each file, and combine the data into a single set for analysis.
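The loop could be sketched along these lines. This is not the script from GitHub, just a minimal illustration. The column names origin_timestamp_utc, sender_address, and total_bytes are assumptions about the layout of the downloaded files, so check the header row of a real file and adjust as needed. The Import-MessageTraceFile function name is also made up.

```powershell
function Import-MessageTraceFile {
    param(
        [Parameter(Mandatory)][string]$Folder
    )
    $Report = [System.Collections.Generic.List[Object]]::new()
    foreach ($File in Get-ChildItem -Path $Folder -Filter *.csv) {
        # The downloaded files are CSV files in Unicode format
        $Rows = Import-Csv -Path $File.FullName -Encoding Unicode
        foreach ($Row in $Rows) {
            # Column names below are assumed; verify them against a real file
            $Sender = $Row.sender_address
            $Report.Add([PSCustomObject]@{
                Timestamp     = $Row.origin_timestamp_utc
                Sender        = $Sender
                Sender_Domain = ($Sender -split '@')[-1]
                Bytes         = [long]$Row.total_bytes
                SourceFile    = $File.Name
            })
        }
    }
    # PowerShell unrolls the list on output, so callers receive the rows
    return $Report
}
```

With the files in a folder such as C:\Temp\MessageTraces (a hypothetical path), [array]$Report = Import-MessageTraceFile -Folder C:\Temp\MessageTraces collects the rows from every file into one set.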
The script I wrote to process the message trace files is available from GitHub. After processing is complete, a PowerShell list object (called $Report) containing the data extracted from the historical trace files is available for analysis. The original request was to create a report listing the timestamp, sender, and whether messages have attachments. Message trace information doesn’t indicate whether emails have attachments. It might be possible to assume that any message with a byte size over 100,000 has an attachment, but given the size of embedded graphics that can be in email, that’s a big assumption.
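If you want to try the size heuristic anyway, something like this sketch flags the larger messages. The sample rows and the Bytes property name are invented for illustration, and the flag means "possibly has an attachment" at best.

```powershell
# Hypothetical sample rows standing in for the combined message trace data
$SampleReport = @(
    [PSCustomObject]@{ Sender = 'a@contoso.com'; Bytes = 24576 }
    [PSCustomObject]@{ Sender = 'b@fabrikam.com'; Bytes = 1843200 }
)

# Size threshold from the text; messages above it are only guessed to carry attachments
$Threshold = 100000

$Flagged = $SampleReport | Select-Object Sender, Bytes,
    @{ Name = 'PossibleAttachment'; Expression = { $_.Bytes -gt $Threshold } }
$Flagged | Format-Table Sender, Bytes, PossibleAttachment
```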
Apart from attachments, the script can generate a report containing the requested information. Figure 3 shows the information piped through the Out-GridView cmdlet as an example of the script output.
The point is that once the script generates the data, it can be sliced and diced in whatever way you want using whatever tool you think is best. Some would import the data into Power BI to use its visualization capabilities. Others will be happy with simple PowerShell commands to create different statistics. For example, these commands group the sender email addresses and sender domains from the file to report the most common senders and sender domains:
$Report | Group-Object Sender -NoElement | Sort-Object Count -Descending | Select-Object -First 10 | Format-Table Name, count
$Report | Group-Object Sender_Domain -NoElement | Sort-Object Count -Descending | Select-Object -First 10 | Format-Table Name, count

Name                      Count
----                      -----
gmail.com                   730
microsoft.com               620
yandex.com                  508
practical365.com            272
linkedin.com                234
yahoo.com                   224
yammer.com                  205
lists.irishtimes.com        182
quest.com                   174
email.teams.microsoft.com   147
Leveraging the Power of PowerShell
What’s been proven in this journey is that despite Microsoft restrictions, it’s possible to retrieve and analyze large amounts of historical message trace data. All it takes is planning. After that, PowerShell will submit the historical message trace jobs and process the information found by those jobs. What you do with the results is up to you.
To emphasize how useful PowerShell is when dealing with message trace data, here are some other articles to read.
- How to run online message traces.
- How to run historical message traces.
- Use message trace data to figure out user sending patterns.
- Use message trace data to report email sent to external recipients.
- Find messages from specific domains delivered to recipients in a tenant.
Could you please help me run a historical search for all DLs and fetch the last time each DL was expanded or received an email?
Like this? https://office365itpros.com/2023/12/05/distribution-list-check-90-days/