Microsoft Aims to Increase Limit for Purgeable Items to 100
Earlier this year, Microsoft finally retired the Search-Mailbox cmdlet and, with it, the ability to remove large numbers of messages from mailboxes. The replacement, compliance search actions, can only remove 10 items per mailbox in a single purge action. Microsoft is now revamping Purview eDiscovery, and as part of that effort, they promise to increase the limit to 100 items per mailbox.
The increased purge item limit hasn’t reached my tenant yet, but the change now happening around eDiscovery makes it appropriate to consider the current state of compliance search actions and what might happen when Microsoft transitions to the new eDiscovery over the coming months.
Compliance Search Basics
First, let’s define what we are dealing with.
- A compliance search is otherwise known as a content search. In this context, we only discuss searches against mailboxes. The obvious searchable items are emails stored for user mailboxes. However, mailbox items also include data in inactive mailboxes and the special cloud mailboxes created for guest accounts. Apart from normal email, searchable items include the compliance records captured by the Microsoft 365 substrate for Teams, Viva Engage, and Planner and stored in hidden folders in user mailboxes.
- A content search can currently be created as a standalone operation or as part of an eDiscovery case. An individual eDiscovery case can include multiple content searches. A content search runs against the indexes maintained by Microsoft Search and can return mailbox items and files stored in SharePoint Online and OneDrive for Business.
- Content searches are estimate searches. In other words, the results returned by the searches are high-quality estimates of the items that match the search criteria. However, the definitive set of items found by a search is only known after an administrator exports the search results. At that point, Purview fetches copies of matching items from the searched mailboxes and moves them into a secure location in Azure. It is possible that the items found by the estimate search and those exported to Azure differ, especially if the search criteria are not very precise.
- Search actions, like previewing a selected set of found items to ensure that the search criteria are accurate, are performed against estimate results.
- The Purge search action can only process mailbox items and comes in two variants. The SoftDelete option moves purged items into the Deletions folder within Recoverable Items. It is the equivalent of deleting items from the Deleted Items folder. The HardDelete option permanently removes items from the mailbox. However, Single Item Recovery (SIR) or in-place retention (litigation hold, Microsoft 365 retention policies and labels, or Exchange Online mailbox tags) prevent permanent removal if items are required for retention. In this case, the items are moved to the Purges folder in Recoverable Items.
More information can be found in Microsoft’s documentation covering how to remove mailbox items with compliance search actions.
The Rules of Purging
Administrators need good reasons to remove items from user mailboxes. The usual reasons include a malware infestation or because someone has circulated confidential or inappropriate information accidentally via email. In the first instance, you might have to remove just one or two items per mailbox and the 10-item threshold won’t apply. In the second, it’s possible that the search will find more than 10 matching items and therefore multiple purge actions are necessary to remove everything. Either way, two basic rules should be remembered:
- The most important rule, which applies to every search, is to make the search criteria as precise as possible by including details of the message subject, suitable keyword text (like a unique phrase) that occurs in the message body, a limited period, and the sender’s email address.
- When several purge runs are needed to remove items, it’s important to use a targeted collection to exclude the Recoverable Items folders from the search. A targeted collection instructs Purview to focus on or ignore selected folders by including folder identifiers in the search criteria. In this instance, the criteria should exclude the folder identifiers for the Deletions, Purges, and Versions folders in Recoverable Items for both the primary and archive (if enabled) mailboxes for the selected accounts. If you don’t use a targeted collection, the risk exists that the purge action will attempt to process items that are held for retention. These attempts fail because retention takes precedence.
One complicating factor is that folder identifiers differ from mailbox to mailbox. To exclude the Recoverable Items folders, it’s necessary to retrieve the folder identifiers for each mailbox covered by a search.
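Microsoft’s documentation for targeted collections describes converting the FolderId value reported by Get-MailboxFolderStatistics into the hexadecimal form that the folderid: property expects. The sketch below follows that approach. The function names ConvertTo-SearchFolderId and New-FolderExclusionClause are my own inventions for illustration, and the NOT (... OR ...) clause is one plausible way to express the exclusion rather than the only one.

```powershell
# Hypothetical helper: convert the base64 FolderId reported by
# Get-MailboxFolderStatistics into the 48-character hex value used with folderid:
function ConvertTo-SearchFolderId {
    param([Parameter(Mandatory)][string]$FolderId)
    $Bytes = [Convert]::FromBase64String($FolderId)
    if ($Bytes.Length -lt 47) {
        throw "FolderId is shorter than expected ($($Bytes.Length) bytes)"
    }
    # Skip the first 23 bytes and hex-encode the next 24
    $Hex = foreach ($Byte in $Bytes[23..46]) { $Byte.ToString('X2') }
    return ($Hex -join '')
}

# Hypothetical helper: build a KQL clause that excludes the given folders
function New-FolderExclusionClause {
    param([Parameter(Mandatory)][string[]]$FolderIds)
    $Terms = foreach ($Id in $FolderIds) {
        "folderid:$(ConvertTo-SearchFolderId -FolderId $Id)"
    }
    return "NOT ($($Terms -join ' OR '))"
}

# Usage against a live tenant (needs an Exchange Online session, so left as comments):
# $Folders = Get-MailboxFolderStatistics -Identity $Mailbox -FolderScope RecoverableItems |
#     Where-Object { $_.Name -in 'Deletions', 'Purges', 'Versions' }
# $Clause = New-FolderExclusionClause -FolderIds $Folders.FolderId
```

For an archive-enabled mailbox, the same lookup would be repeated with the -Archive switch and the extra identifiers added to the clause.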
With those thoughts in mind, let’s discuss creating a compliance search that leads to purging items.
Creating a Compliance Search to Purge Items
Many example scripts exist on the internet to demonstrate how to purge mailbox items. The mechanics of creating a new compliance search action don’t need to be explained here. However, I wrote a script to demonstrate some of the finer points of creating and running a compliance search that follows through to purging items. You can download the script from GitHub. The script is not intended for production use. Its purpose is to illustrate principles and concepts.
The script processes either a single mailbox or all mailboxes. After prompting for the search target and the subject of the message to find, the script proceeds as follows:
If a single mailbox is searched, the option is given to exclude or include the Recoverable Items folder. If the folders are excluded, the script finds the identifiers of the folders (from both primary and archive mailboxes) and adds them to the KQL query. The script searches for items sent over the last 30 days. The period is configurable.
After finalizing the KQL query with date range, subject, and folder identifiers, the script removes any previous search with the same name. It then creates a new compliance search with the parameters as defined.
```powershell
New-ComplianceSearch -Name $SearchName -ExchangeLocation $MailboxesToSearch -ContentMatchQuery $KQLQuery -Description 'Compliance Search Test' | Out-Null
Start-ComplianceSearch -Identity $SearchName
```
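To make the precise-criteria rule concrete, here is a minimal sketch of how a query combining subject, date range, and sender might be assembled before it is passed to New-ComplianceSearch. The function name New-PurgeSearchQuery and its parameters are hypothetical, and the address in the usage comment is a placeholder. The subject, sent, and from properties are standard KQL properties for mailbox items.

```powershell
# Hypothetical helper: assemble a KQL query from subject, look-back period, and sender
function New-PurgeSearchQuery {
    param(
        [Parameter(Mandatory)][string]$Subject,
        [string]$Sender,
        [int]$DaysBack = 30,
        [datetime]$Now = (Get-Date)
    )
    $Culture = [cultureinfo]::InvariantCulture
    $StartDate = $Now.AddDays(-$DaysBack).ToString('yyyy-MM-dd', $Culture)
    $EndDate = $Now.ToString('yyyy-MM-dd', $Culture)
    # Quote the subject so that it is treated as a phrase
    $Clauses = @("subject:`"$Subject`"", "sent>=$StartDate", "sent<=$EndDate")
    if ($Sender) { $Clauses += "from:$Sender" }
    return ($Clauses -join ' AND ')
}

# Example (placeholder values):
# $KQLQuery = New-PurgeSearchQuery -Subject 'Quarterly Bonus Plan' -Sender 'kim@contoso.com'
```

Every extra condition narrows the estimate results, which matters because the purge action removes whatever the search finds.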
Once the search completes, the script analyzes the search results to find the set of mailboxes where matches occur. It also identifies the mailbox with the highest number of matches.
```powershell
# Use regex to find all instances of the search results where item count is greater than zero
$LocationsWithItemCount = [regex]::Matches($ComplianceSearch.SuccessResults, "Item count: (\d+)")
[array]$Locations = $LocationsWithItemCount | Where-Object { $_.Groups[1].Value -gt 0 }

# Use regex to extract email addresses, item counts, and total sizes from the search results
$LocationsWithEmail = [regex]::Matches($ComplianceSearch.SuccessResults, "Location: (\S+@\S+\.\S+), Item count: (\d+), Total size: (\d+)")

# Extract email addresses and item counts where item count is greater than 0
[array]$LocationsWithItems = $null
foreach ($Match in $LocationsWithEmail) {
    $Email = $match.Groups[1].Value
    $ItemCount = [int]$match.Groups[2].Value
    if ($ItemCount -gt 0) {
        $LocationsWithItems += [PSCustomObject]@{
            Email     = $email
            ItemCount = $itemCount
        }
    }
}

# Figure out how many loops might be needed to remove all items
[int]$LocationsGT10 = 0
[int]$HighestValue = 0
If ($LocationsWithItems.Count -eq 1) {
    $HighestLocation = $LocationsWithItems[0].Email
    $HighestValue = $LocationsWithItems[0].ItemCount
} Else {
    ForEach ($Item in $LocationsWithItems) {
        $ItemCount = $Item.ItemCount
        If ($ItemCount -gt 10) {
            $LocationsGT10++
        }
        If ($ItemCount -gt $HighestValue) {
            $HighestValue = $ItemCount
            $HighestLocation = $Item.Email
        }
    }
}
$LoopsNeeded = [math]::ceiling($HighestValue/10)
$TotalResults = 0

# Let the administrator know what we have found
Write-Host ("Compliance search completed. {0} items found in {1} locations. {2} iterations are required to remove these items. The mailbox with most items is {3} with {4}." -f $ItemsFound, $Locations.Count, $LoopsNeeded, $HighestLocation, $HighestValue) -ForeGroundColor Yellow
```
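The loop calculation in the script divides the highest per-mailbox count by the current limit of 10. If the promised increase to 100 arrives, only the divisor changes. A small sketch with the limit as a parameter shows the effect (Get-PurgeIterationCount is a hypothetical name, not something from the script):

```powershell
# Hypothetical helper: number of purge runs needed for the busiest mailbox
function Get-PurgeIterationCount {
    param(
        [Parameter(Mandatory)][int]$HighestItemCount,
        [int]$ItemsPerPurge = 10
    )
    if ($HighestItemCount -le 0) { return 0 }
    return [int][math]::Ceiling($HighestItemCount / $ItemsPerPurge)
}

# A mailbox with 22 matching items
Get-PurgeIterationCount -HighestItemCount 22                      # 3 runs at 10 items per purge
Get-PurgeIterationCount -HighestItemCount 22 -ItemsPerPurge 100   # 1 run at 100 items per purge
```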
If any items are found, the script runs a preview search action to find some sample items for the administrator to check if the search is targeting the correct items. The script displays ten preview items and asks the administrator to approve going ahead to purge the items.
The script uses a SoftDelete purge action. Change the value to HardDelete if you want to permanently remove items from mailboxes.
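For readers who want to see the shape of a repeated purge, the following is a rough sketch and not the code from my script. It assumes that the purge action for a search is named after the search with a _Purge suffix and that an earlier action must be removed before a new one can be created. Invoke-PurgeIterations and its parameters are hypothetical names. Error handling is minimal.

```powershell
# Hypothetical wrapper: run repeated purge actions against an existing, completed search
function Invoke-PurgeIterations {
    param(
        [Parameter(Mandatory)][string]$SearchName,
        [Parameter(Mandatory)][int]$Iterations,
        [ValidateSet('SoftDelete', 'HardDelete')][string]$PurgeType = 'SoftDelete',
        [int]$PollSeconds = 15,
        [int]$MaxPolls = 40
    )
    # Assumption: purge actions are named <search name>_Purge
    $ActionName = "{0}_Purge" -f $SearchName
    for ($i = 1; $i -le $Iterations; $i++) {
        # Remove any earlier purge action for this search so that a new one can be created
        if (Get-ComplianceSearchAction -Identity $ActionName -ErrorAction SilentlyContinue) {
            Remove-ComplianceSearchAction -Identity $ActionName -Confirm:$false
        }
        New-ComplianceSearchAction -SearchName $SearchName -Purge -PurgeType $PurgeType -Confirm:$false | Out-Null
        # Wait for the purge action to finish, giving up after $MaxPolls checks
        $Polls = 0
        do {
            Start-Sleep -Seconds $PollSeconds
            $Action = Get-ComplianceSearchAction -Identity $ActionName
            $Polls++
        } while ($Action.Status -ne 'Completed' -and $Polls -lt $MaxPolls)
        if ($Action.Status -ne 'Completed') {
            throw "Purge action $ActionName did not complete after $Polls checks"
        }
        Write-Host ("Purge iteration {0} of {1} completed" -f $i, $Iterations)
    }
}

# Example (requires a Security & Compliance PowerShell session):
# Invoke-PurgeIterations -SearchName $SearchName -Iterations $LoopsNeeded
```

Keeping SoftDelete as the default means that a mistake in the search criteria leaves items recoverable from the Deletions folder until the retention period for deleted items expires.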
Figure 1 shows the script in action to find and remove items from all mailboxes in a tenant.
Purging into the Future
From the above, it’s easy to understand why administrators see content searches as an important tool. In their blog post, Microsoft explicitly called out content searches with purge actions as “our search and purge tool for mailbox items” for “data spillage scenario[s].”
After publishing the article about eDiscovery modernization, I received some questions about whether content searches would be affected if eDiscovery became all about cases. Well, an eDiscovery case can include several searches, and these searches can be found and used just like content searches. Here’s an example of examining the search results for one of the two searches in a standard eDiscovery case:
```
Get-ComplianceSearch -case 'Case 2024-Oct-99'

Name            RunBy         JobEndTime           Status
----            -----         ----------           ------
Search Case 99  Tony Redmond  14/10/2024 17:56:55  Completed
Tony Redmond    Tony Redmond  14/10/2024 20:03:43  Completed

$SearchCase = Get-ComplianceSearch -Case 'Case 2024-Oct-99' | Sort-Object Name | Select-Object -First 1
(Get-ComplianceSearch -Identity $SearchCase.Identity).SuccessResults

{Location: Tony.Redmond@office365itpros.com, Item count: 22, Total size: 5761428,,,,
```
Some scripts might need adjustments to target searches from eDiscovery cases instead of standalone content searches, but those changes are straightforward.
I’m assuming that Microsoft won’t shoot themselves in the foot by compromising the effectiveness of their self-proclaimed search and purge tool as the eDiscovery modernization rolls out. Then again, I’ve been known to be wrong in the past, so keep a wary eye on how search and purge develops over the next year. And if all else fails, you can always write a bespoke script to remove mailbox items using whatever criteria you choose. The nice thing about a bespoke script is that it can remove hundreds or thousands of items. The downside is exactly the same: if you make a mistake with the search criteria, for instance, the script can remove hundreds or thousands of the wrong items. Good luck!
If I need to purge lots of content from a mailbox like anything older than 30 days how can the purge process those? It will be lots and lots of items. I thought about a retention policy but the date needs to be static. Thoughts?
Use something like the script in https://practical365.com/mailbox-clean-up-script/