Microsoft Aims to Increase Limit for Purgeable Items to 100

Earlier this year, Microsoft finally retired the Search-Mailbox cmdlet and removed the ability to remove large numbers of messages from mailboxes. The replacement, compliance search actions, can only remove 10 items per mailbox. Microsoft is now revamping Purview eDiscovery, and as part of that effort, they promise to increase the limit to 100 items per mailbox.

The increased purge item limit hasn’t reached my tenant yet, but the change now happening around eDiscovery makes it appropriate to consider the current state of compliance search actions and what might happen when Microsoft transitions to the new eDiscovery over the coming months.

Compliance Search Basics

First, let’s define what we are dealing with.

  • A compliance search is otherwise known as a content search. In this context, we only discuss searches against mailboxes. The obvious searchable items are emails stored for user mailboxes. However, mailbox items also include data in inactive mailboxes and the special cloud mailboxes created for guest accounts. Apart from normal email, searchable items include the compliance records captured by the Microsoft 365 substrate for Teams, Viva Engage, and Planner and stored in hidden folders in user mailboxes.
  • A content search can currently be created as a standalone operation or as part of an eDiscovery case. An individual eDiscovery case can include multiple content searches. A content search runs against the indexes maintained by Microsoft Search and can return mailbox items and files stored in SharePoint Online and OneDrive for Business.
  • Content searches are estimate searches. In other words, the results returned by the searches are high-quality estimates of the items that match the search criteria. However, the definitive set of items recovered by a search is only known after an administrator exports the search results. At that point, Purview fetches copies of matching items from the searched mailboxes and moves them into a secure location in Azure. It is possible that the items found by the estimate search and those exported to Azure differ, especially if the search criteria are not very precise.
  • Search actions, like previewing a selected set of found items to ensure that the search criteria are accurate, are performed against estimate results.
  • The Purge search action can only process mailbox items and comes in two variants. The SoftDelete option moves purged items into the Deletions folder within Recoverable Items. It is the equivalent of deleting items from the Deleted Items folder. The HardDelete option permanently removes items from the mailbox. However, Single Item Recovery (SIR) or in-place retention (litigation hold, Microsoft 365 retention policies and labels, or Exchange Online mailbox tags) prevents permanent removal if items are required for retention. In this case, the items are moved to the Purges folder in Recoverable Items.

More information can be found in Microsoft’s documentation covering how to remove mailbox items with compliance search actions.

The Rules of Purging

Administrators need good reasons to remove items from user mailboxes. The usual reasons include a malware infestation or because someone has circulated confidential or inappropriate information accidentally via email. In the first instance, you might have to remove just one or two items per mailbox and the 10-item threshold won’t apply. In the second, it’s possible that the search will find more than 10 matching items and therefore multiple purge actions are necessary to remove everything. Either way, two basic rules should be remembered:

The most important rule, which applies to every search, is to make the search criteria as precise as possible by including details of the message subject, suitable keyword text (like a unique phrase) that occurs in the message body, a limited period, and the sender’s email address.

When several purge runs are needed to remove items, it’s important to use a targeted collection to exclude the Recoverable Items folders from the search. A targeted collection instructs Purview to focus on or ignore selected folders by including folder identifiers in the search criteria. In this instance, the criteria should exclude the folder identifiers for the Deletions, Purges, and Versions folders in Recoverable Items for both the primary and archive (if enabled) mailboxes for the selected accounts. If you don’t use a targeted collection, the risk exists that the purge action will attempt to process items that are held for retention. These attempts will fail because retention takes precedence, which is why the Recoverable Items folders should be left out of the search.

One complicating factor is that folder identifiers differ from mailbox to mailbox. To exclude the Recoverable Items folders, it’s necessary to retrieve the folder identifiers for each mailbox covered by a search.
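As a rough sketch of that retrieval step: Microsoft's documented approach for targeted collections converts the Base64 FolderId value returned by Get-MailboxFolderStatistics into the 48-character hexadecimal value used with the folderid search property, taking 24 bytes of the decoded identifier starting at offset 23. The function name below is hypothetical, and the commented usage assumes a connected Exchange Online PowerShell session with the mailbox held in a $Mailbox variable.

```powershell
# Hypothetical helper: convert a Base64 FolderId from Get-MailboxFolderStatistics
# into the 48-character hex value used by the folderid search property.
function Convert-FolderIdToQueryId {
    param([Parameter(Mandatory)][string]$FolderId)

    $Bytes = [Convert]::FromBase64String($FolderId)
    if ($Bytes.Length -lt 47) {
        throw "FolderId is too short to contain an index identifier."
    }
    # Bytes 23..46 (24 bytes) hold the identifier; render each as two hex digits
    $Hex = foreach ($Byte in $Bytes[23..46]) { '{0:X2}' -f $Byte }
    return (-join $Hex)
}

# Assumed usage against a real mailbox (not run here):
# $Folders = Get-MailboxFolderStatistics -Identity $Mailbox -FolderScope RecoverableItems |
#     Where-Object { $_.Name -in 'Deletions', 'Purges', 'Versions' }
# $FolderQueryIds = $Folders | ForEach-Object { Convert-FolderIdToQueryId -FolderId $_.FolderId }
```

For an archive-enabled mailbox, the same lookup would be repeated with the -Archive switch of Get-MailboxFolderStatistics so that both sets of identifiers can be excluded.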

With those thoughts in mind, let’s discuss creating a compliance search that leads to purging items.

Creating a Compliance Search to Purge Items

Many example scripts exist on the internet to demonstrate how to purge mailbox items. The mechanics of creating a new compliance search action don’t need to be explained here. However, I wrote a script to demonstrate some of the finer points of creating and running a compliance search that follows through to purging items. You can download the script from GitHub. The script is not intended for production use. Its purpose is to illustrate principles and concepts.

The script processes either a single mailbox or all mailboxes. After prompting for the search target and the subject of the message to find, the script proceeds as follows:

If a single mailbox is searched, the option is given to exclude or include the Recoverable Items folder. If the folders are excluded, the script finds the identifiers of the folders (from both primary and archive mailboxes) and adds them to the KQL query. The script searches for items sent over the last 30 days. The period is configurable.
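A minimal sketch of how such a query could be assembled follows. It is not the code from the script: the function name, its parameters, and the exact form of the exclusion clause (NOT wrapped around an OR list of folderid terms) are assumptions made for illustration. The subject and sent properties are standard searchable email properties.

```powershell
# Hypothetical helper: assemble a KQL query from a subject, a look-back window,
# and an optional list of folderid values to exclude.
function New-PurgeSearchQuery {
    param(
        [Parameter(Mandatory)][string]$Subject,
        [int]$DaysBack = 30,
        [string[]]$ExcludedFolderIds = @(),
        [datetime]$Now = (Get-Date)
    )

    $StartDate = $Now.AddDays(-$DaysBack).ToString('yyyy-MM-dd')
    $EndDate   = $Now.ToString('yyyy-MM-dd')
    $Query = 'subject:"{0}" AND sent>={1} AND sent<={2}' -f $Subject, $StartDate, $EndDate

    if ($ExcludedFolderIds.Count -gt 0) {
        # Assumed exclusion syntax: one NOT clause covering every excluded folder
        $Folders = ($ExcludedFolderIds | ForEach-Object { "folderid:$_" }) -join ' OR '
        $Query = "$Query AND NOT ($Folders)"
    }
    return $Query
}

# Example: a 30-day window with no folder exclusions
New-PurgeSearchQuery -Subject 'Quarterly numbers'
```

The sketch does not escape quotation marks inside the subject, so a subject that contains double quotes would need extra handling before being passed in.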

After finalizing the KQL query with date range, subject, and folder identifiers, the script removes any previous search with the same name. It then creates a new compliance search with the parameters as defined.

New-ComplianceSearch -Name $SearchName -ExchangeLocation $MailboxesToSearch -ContentMatchQuery $KQLQuery -Description 'Compliance Search Test' | Out-Null
Start-ComplianceSearch -Identity $SearchName
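Start-ComplianceSearch queues the search and returns straight away, so a script has to wait for the search to finish before it can look at results. One way to do that, sketched here with a hypothetical helper and arbitrary attempt and delay values, is to poll a status check until it reports Completed:

```powershell
# Hypothetical helper: poll a status-returning script block until it reports
# the wanted status or the attempt limit is reached.
function Wait-ForStatus {
    param(
        [Parameter(Mandatory)][scriptblock]$GetStatus,
        [string]$TargetStatus = 'Completed',
        [int]$MaxAttempts = 60,
        [int]$DelaySeconds = 5
    )

    for ($Attempt = 1; $Attempt -le $MaxAttempts; $Attempt++) {
        $Status = & $GetStatus
        if ($Status -eq $TargetStatus) { return $true }
        Start-Sleep -Seconds $DelaySeconds
    }
    return $false   # Gave up before the target status appeared
}

# Assumed usage in a connected session (not run here):
# $Done = Wait-ForStatus -GetStatus { (Get-ComplianceSearch -Identity $SearchName).Status }
```

Passing the status check as a script block keeps the waiting logic separate from the cmdlet call and means that a search that never completes cannot hang the script indefinitely.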

Once the search completes, the script analyzes the search results to find the set of mailboxes where matches occur. It also identifies the mailbox with the highest number of matches.

# Use regex to find all instances of the search results where item count is greater than zero
$LocationsWithItemCount = [regex]::Matches($ComplianceSearch.SuccessResults, "Item count: (\d+)")
# Cast the captured text to an integer so that the comparison is numeric rather than string-based
[array]$Locations = $LocationsWithItemCount | Where-Object { [int]$_.Groups[1].Value -gt 0 }

# Use regex to extract email addresses, item counts, and total sizes from the search results
$LocationsWithEmail = [regex]::Matches($ComplianceSearch.SuccessResults, "Location: (\S+@\S+\.\S+), Item count: (\d+), Total size: (\d+)")
# Extract email addresses and item counts where item count is greater than 0
[array]$LocationsWithItems = $null
foreach ($Match in $LocationsWithEmail) {
    $Email = $match.Groups[1].Value
    $ItemCount = [int]$match.Groups[2].Value
    if ($ItemCount -gt 0) {
        $LocationsWithItems += [PSCustomObject]@{
            Email = $email
            ItemCount = $itemCount
        }
    }
}

# Figure out how many loops might be needed to remove all items
[int]$LocationsGT10 = 0; [int]$HighestValue = 0; $HighestLocation = $null
# A single loop handles one or many locations, so mailboxes over the 10-item limit are always counted
ForEach ($Item in $LocationsWithItems) {
    $ItemCount = $Item.ItemCount
    If ($ItemCount -gt 10) {
        $LocationsGT10++
    }
    If ($ItemCount -gt $HighestValue) {
        $HighestValue = $ItemCount
        $HighestLocation = $Item.Email
    }
}
$LoopsNeeded = [math]::ceiling($HighestValue/10)
$TotalResults = 0
# Let the administrator know what we have found 
Write-Host ("Compliance search completed. {0} items found in {1} locations. {2} iterations are required to remove these items. The mailbox with most items is {3} with {4}." -f $ItemsFound, $Locations.Count, $LoopsNeeded, $HighestLocation, $HighestValue) -ForeGroundColor Yellow
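The calculation above hard-codes the current limit of 10 items per mailbox. If the limit does rise to 100 as Microsoft promises, treating it as a parameter makes the arithmetic easy to adjust. The helper name below is hypothetical and the 100-item value is the promised figure, not something observed in a tenant.

```powershell
# Hypothetical helper: number of purge runs needed when each run removes at most
# $PurgeLimit items per mailbox.
function Get-PurgeIterationCount {
    param(
        [Parameter(Mandatory)][int]$HighestItemCount,
        [int]$PurgeLimit = 10
    )
    if ($PurgeLimit -lt 1) { throw 'PurgeLimit must be at least 1.' }
    return [int][math]::Ceiling($HighestItemCount / $PurgeLimit)
}

Get-PurgeIterationCount -HighestItemCount 22                  # 3 runs at 10 items per run
Get-PurgeIterationCount -HighestItemCount 22 -PurgeLimit 100  # 1 run at 100 items per run
```

Only the mailbox with the most matches matters for the count because every purge run processes all of the searched mailboxes at once.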

If any items are found, the script runs a preview search action to find some sample items for the administrator to check if the search is targeting the correct items. The script displays ten preview items and asks the administrator to approve going ahead to purge the items.

The script uses a SoftDelete purge action. Change the value to HardDelete if you want to permanently remove items from mailboxes.
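The repeated purge runs might be structured along the following lines. This is a sketch under stated assumptions, not the code from the script: the wrapper name and its parameters are hypothetical, it relies on a purge action being named after the search with a _Purge suffix, and it assumes that each completed action must be removed before the next one is created. It needs a connected Security & Compliance PowerShell session and a completed search.

```powershell
# Hypothetical wrapper around the purge cmdlets. Assumes a connected
# Security & Compliance PowerShell session and a completed compliance search.
function Invoke-PurgeRuns {
    param(
        [Parameter(Mandatory)][string]$SearchName,
        [Parameter(Mandatory)][int]$Iterations,
        [ValidateSet('SoftDelete', 'HardDelete')][string]$PurgeType = 'SoftDelete',
        [int]$DelaySeconds = 10
    )

    $ActionName = "{0}_Purge" -f $SearchName   # Assumed name given to the purge action
    for ($Run = 1; $Run -le $Iterations; $Run++) {
        New-ComplianceSearchAction -SearchName $SearchName -Purge -PurgeType $PurgeType -Confirm:$false | Out-Null
        # Wait for this purge action to finish before starting the next one
        while ((Get-ComplianceSearchAction -Identity $ActionName).Status -ne 'Completed') {
            Start-Sleep -Seconds $DelaySeconds
        }
        # Remove the finished action so that the next run can create a new one
        Remove-ComplianceSearchAction -Identity $ActionName -Confirm:$false
    }
}
```

The wait loop has no timeout, so anything beyond a demonstration would want an upper bound on how long it waits for each action.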

Figure 1 shows the script in action to find and remove items from all mailboxes in a tenant.

Figure 1: The compliance search purge script in action

Purging into the Future

From the above, it’s easy to understand why administrators see content searches as an important tool. In their blog post, Microsoft explicitly called out content searches with purge actions as “our search and purge tool for mailbox items” for “data spillage scenario[s].”

After publishing the article about eDiscovery modernization, I received some questions about whether content searches would be affected if eDiscovery became all about cases. Well, an eDiscovery case can include several searches, and these searches can be found and used just like content searches. Here’s an example of examining the search results for one of the two searches in a standard eDiscovery case:

Get-ComplianceSearch -case 'Case 2024-Oct-99'

Name           RunBy        JobEndTime          Status
----           -----        ----------          ------
Search Case 99 Tony Redmond 14/10/2024 17:56:55 Completed
Tony Redmond   Tony Redmond 14/10/2024 20:03:43 Completed

$SearchCase = Get-ComplianceSearch -Case 'Case 2024-Oct-99' | Sort-Object Name | Select-Object -First 1
(Get-ComplianceSearch -Identity $SearchCase.Identity).SuccessResults
{Location: Tony.Redmond@office365itpros.com, Item count: 22, Total size: 5761428,,,,

Some scripts might need adjustments to target searches from eDiscovery cases instead of standalone content searches, but those changes are straightforward.

I’m assuming that Microsoft won’t shoot themselves in the foot by compromising the effectiveness of their self-proclaimed search and purge tool as the eDiscovery modernization rolls out. Then again, I’ve been known to be wrong in the past, so keep a wary eye on how search and purge develops over the next year.

About the Author

Tony Redmond

Tony Redmond has written thousands of articles about Microsoft technology since 1996. He is the lead author for the Office 365 for IT Pros eBook, the only book covering Office 365 that is updated monthly to keep pace with change in the cloud. Apart from contributing to Practical365.com, Tony also writes at Office365itpros.com to support the development of the eBook. He has been a Microsoft MVP since 2004.
