The Biggest Change to SharePoint Versioning Ever
Originally planned for release in November 2023, the new SharePoint version controls are approaching general availability in Microsoft 365 tenants worldwide. The update is described in MC789209 (last updated 9 August 2024, Microsoft 365 roadmap item 145802). Microsoft expects to complete deployment by mid-October 2024, so now’s a good time to review whether your tenant can take advantage of the new capability.
The new controls affect how SharePoint Online manages versions for files. File versioning underpins the ability of users to recover previous versions of documents, including the Restore this library feature. Versioning is also used by the Office AutoSave feature to ensure that edits made to Office files stored in SharePoint Online and OneDrive for Business are captured automatically. Without frequent capture of auto-saved versions, co-authoring wouldn’t work.
The Big Problem with Simple Versioning
Versioning has worked well for many years, but it has one big problem: AutoSave creates many versions of files during edit sessions. Editing the document containing this article created 25 versions. SharePoint Online sets the default version limit for files to 500 for newly-created document libraries, and keeping up to 500 versions of a large file can rapidly consume a lot of storage.
Given the cost of SharePoint Online storage, allowing file versions to consume gigabytes of storage without restraint isn’t sustainable. A more sophisticated approach is needed, and that’s what intelligent versioning is all about.
According to Microsoft, the new functionality delivers “time-based and intelligent automatic trimming capabilities to give administrators and content owners a way to reduce the storage footprint consumed by low value file versions while retaining appropriate recoverability.”
Based on the design principle that the restore value of a version degrades over time, the intelligent versioning algorithm used by SharePoint Online “thins out intermittent” older versions that are less likely to be restored while preserving sufficient “high-value” versions to enable file recovery if necessary. Intelligent versioning works for files created by Office applications. It doesn’t support file types like PDF where completely new versions of files are uploaded to SharePoint to update a file in a library.
Coming back to the versions created for this article, it’s likely that some of the 25 versions store just a few changes to the file while others contain more fundamental changes, like moving large chunks of text from one place to another. SharePoint Online created still more versions when I updated file properties, including the custom properties I use to track articles.
The idea is that SharePoint can remove the less important versions while keeping those that are more important. If I ever wanted to recover a version of the article, I could select from the versions that mark real change in the file rather than those where just a few characters might have been altered.
Intelligent Versioning
Versions are controllable at the tenant, site, and document library level for SharePoint Online and OneDrive for Business. To configure the setting for the tenant, go to Settings in the SharePoint admin center and select Version history limits. To take advantage of intelligent versioning, select Automatic (Figure 1). This means that SharePoint Online will manage file versions dynamically based on activity (how often a file is updated) and creation date.
No time limit remains the default, meaning that SharePoint Online removes file versions after they exceed the maximum number of versions permitted for the site. For instance, if 500 versions are allowed, SharePoint removes the oldest version when a user creates version 501. The Manually option allows organizations to choose to remove versions after a version limit is reached or a time limit is exceeded. For example, you could opt to remove versions older than 730 days (two years).
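To make the count-limit behavior concrete, here is a minimal sketch that models oldest-first trimming. It is an illustration only and not how SharePoint Online implements trimming; the Limit-VersionCount function name is hypothetical.

```powershell
# Illustrative model of count-limit trimming (not SharePoint's actual code).
# Limit-VersionCount is a hypothetical helper name.
function Limit-VersionCount {
    param(
        [int[]]$Versions,          # version numbers, in any order
        [int]$MaxVersions = 500    # the version limit for the library
    )
    # Keep the newest $MaxVersions versions; anything older is trimmed.
    $Versions | Sort-Object | Select-Object -Last $MaxVersions
}

# A user creates version 501 in a library limited to 500 versions
$Kept = Limit-VersionCount -Versions (1..501) -MaxVersions 500
"{0} versions kept, oldest is version {1}" -f $Kept.Count, $Kept[0]
# Output: 500 versions kept, oldest is version 2
```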
The selected setting for tenant-wide version time limits applies to new sites and OneDrive for Business accounts but is not retrospectively applied to existing sites and accounts. However, site owners and administrators can apply the new version controls to existing sites (using PowerShell) and document libraries (by editing library settings as shown in Figure 2).
Like any SharePoint setting, it can take a little while before the new version settings are operational. You’ll know that intelligent versioning is in use when the version history for a file shows different date-based expiration periods for versions. For instance, the versions at the bottom of the version history shown in Figure 3 are marked “never expires.” These versions were created before switching the site to use intelligent versioning. The versions created afterward have varying expiration dates depending on when the algorithm thinks it will be safe to remove these versions. The current version never expires because it is the file that users access in the document library.
The older versions marked to never expire won’t be trimmed by SharePoint Online until the maximum number of versions defined for the tenant is reached. When a tenant opts to use intelligent versioning, SharePoint Online sets the maximum number of versions to 500 and the tenant cannot change this value. Once the 500-version threshold is reached, SharePoint will trim starting from the oldest version to keep the version count to 500.
Updating Older Sites for Intelligent Versioning
The tenant version time limit settings apply to new sites, but Microsoft doesn’t provide a GUI method to apply new version settings to existing sites. However, it’s possible to use PowerShell to apply intelligent versioning to multiple sites. Here’s an example that fetches all the sites used by Microsoft 365 groups that are not archived by Microsoft 365 Archive. The Set-SPOSite cmdlet is run for each site to set EnableAutoExpirationVersionTrim to true (by default, the property is not set).
[array]$Sites = Get-SPOSite -Limit All -Template 'GROUP#0' -Filter {ArchiveStatus -eq 'NotArchived'}
ForEach ($Site in $Sites) {
    If ($Site.EnableAutoExpirationVersionTrim -ne $true) {
        Write-Host ("Updating {0}…" -f $Site.Url)
        Set-SPOSite -Identity $Site.Url -EnableAutoExpirationVersionTrim $true -Confirm:$false
    }
}
In contrast, if you want to use version and date limits, the Set-SPOSite command sets EnableAutoExpirationVersionTrim to false and provides values for the version and date limits:
Set-SPOSite -Identity $siteUrl -EnableAutoExpirationVersionTrim $false -MajorVersionLimit 300 -ExpireVersionsAfterDays 730 -Confirm:$false
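After updating a site, it can be useful to check which mode is in effect. The sketch below summarizes the version settings of a site object. It assumes that the object returned by Get-SPOSite exposes MajorVersionLimit and ExpireVersionsAfterDays properties alongside EnableAutoExpirationVersionTrim; those two property names and the Get-VersionPolicySummary helper are assumptions for illustration, so check them against your tenant. A mock object stands in for a real site so that the code runs without a connection.

```powershell
# Sketch only. EnableAutoExpirationVersionTrim is the property used above;
# MajorVersionLimit and ExpireVersionsAfterDays are ASSUMED property names
# on the site object. Get-VersionPolicySummary is a hypothetical helper.
function Get-VersionPolicySummary {
    param([Parameter(Mandatory)] $Site)

    if ($null -eq $Site.EnableAutoExpirationVersionTrim) {
        'Not set'      # the default state for a site
    }
    elseif ($Site.EnableAutoExpirationVersionTrim) {
        'Automatic'    # intelligent versioning
    }
    else {
        'Manual: keep {0} versions, expire after {1} days' -f $Site.MajorVersionLimit, $Site.ExpireVersionsAfterDays
    }
}

# Mock object standing in for the output of Get-SPOSite -Identity $siteUrl
$Mock = [PSCustomObject]@{
    EnableAutoExpirationVersionTrim = $false
    MajorVersionLimit               = 300
    ExpireVersionsAfterDays         = 730
}
$Summary = Get-VersionPolicySummary -Site $Mock
$Summary
# Output: Manual: keep 300 versions, expire after 730 days
```

Against a live tenant, pass the object returned by Get-SPOSite -Identity $siteUrl instead of the mock.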
The Effect of Intelligent Versioning
Apart from saying that it’s worth reading, I don’t want to repeat Microsoft’s documentation covering planning version storage for document libraries. The important takeaway for intelligent versioning is summarized in this claim:
Automatic limits: Under automatic settings intermittent older versions are trimmed over time, resulting in a 96% version storage reduction over a six-month period compared to count limits.
Such a result is possible, providing certain conditions exist. Other organizations will have different results. In any case, two things are obvious here. First, Microsoft believes that intelligent versioning can have a very substantial impact on storage consumed by file versions. Second, the full impact can only be measured over time because intelligent versioning can only kick in as people work with files to generate new versions.
The Retention Issue
But then we come to the retention issue. When retention policies or retention labels apply in-place holds on sites or individual documents, the site preservation hold library can swell to consume a large amount of storage. For example, the preservation hold library of the site used to store the files for the Office 365 for IT Pros eBook consumes 21.6% (23.7 GB) of the site storage (Figure 4).
The preservation hold library serves an essential function. Without it, file recovery and eDiscovery wouldn’t work. As seen above, the preservation hold library for a site holding many large and complex files can soon fill lots of storage. However, because the site comes within the scope of a retention policy, SharePoint Online cannot remove versions until the retention period defined by the policy expires. Indeed, as Microsoft observes:
“For items that are subject to a retention policy (or an eDiscovery hold), the versioning limits for the document library are ignored.”
In other words, retention policies force SharePoint Online to hold all versions. See this article for more information about how the trimming process copes with retention policies and labels.
Running an Auto-Trim Job
To confirm that intelligent versioning couldn’t remove versions from sites controlled by retention policies, I ran a trim job. This is a background process to check and remove versions using the criteria of the chosen option (manual or automatic). Trim jobs can only be submitted using the PowerShell New-SPOSiteFileVersionBatchDeleteJob cmdlet:
$Site = 'https://office365itpros.sharepoint.com/sites/O365ExchPro'
New-SPOSiteFileVersionBatchDeleteJob -Identity $Site -Automatic

Are you sure you want to perform this action?
By executing this command, versions specified will be permanently deleted from site https://office365itpros.sharepoint.com/sites/O365ExchPro in the upcoming days. These file versions cannot be restored from the recycle bin.
[Y] Yes [A] Yes to All [N] No [L] No to All [S] Suspend [?] Help (default is "Y"): y

Get-SPOSiteFileVersionBatchDeleteJobProgress -Identity $Site

Url                         : https://office365itpros.sharepoint.com/sites/O365ExchPro
WorkItemId                  : a8fa54d3-5bfa-405f-ac6c-8155781c4db9
Status                      : CompleteWithFailure
ErrorMessage                : This job had failed. Please check if there is a legal hold in place.
RequestTimeInUTC            : 29/09/2024 00:38:43
LastProcessTimeInUTC        : 29/09/2024 00:45:11
CompleteTimeInUTC           :
BatchDeleteMode             : AutomaticTrim
DeleteOlderThanInUTC        :
MajorVersionLimit           :
MajorWithMinorVersionsLimit :
FilesProcessed              : 4365
VersionsProcessed           : 137
VersionsDeleted             : 0
VersionsFailed              : 1
StorageReleasedInBytes      : 0
Note the warning when scheduling the trim job that versions will be permanently removed and the “complete with failure” status reported when the job is completed. No versions could be removed because a legal hold (retention hold) was in place.
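If you run trim jobs against several sites, a short helper can flag the jobs that a hold appears to have blocked. This is a minimal sketch based only on the property names visible in the output above; the Get-TrimJobSummary name is hypothetical, and matching the word “hold” in the error message is my own heuristic rather than anything documented. The mock object reuses the values reported for my job.

```powershell
# Sketch only. Get-TrimJobSummary is a hypothetical helper. It reads the
# properties shown in the job progress output above; treating an error
# message that mentions "hold" as a blocked job is an assumption.
function Get-TrimJobSummary {
    param([Parameter(Mandatory)] $Progress)

    [PSCustomObject]@{
        Status            = $Progress.Status
        VersionsDeleted   = [int]$Progress.VersionsDeleted
        VersionsFailed    = [int]$Progress.VersionsFailed
        StorageReleasedMB = [math]::Round($Progress.StorageReleasedInBytes / 1MB, 2)
        BlockedByHold     = ($Progress.Status -eq 'CompleteWithFailure' -and
                             $Progress.ErrorMessage -match 'hold')
    }
}

# Mock object with the values from the job shown above. Against a tenant, use
# Get-SPOSiteFileVersionBatchDeleteJobProgress -Identity $Site instead.
$Progress = [PSCustomObject]@{
    Status                 = 'CompleteWithFailure'
    ErrorMessage           = 'This job had failed. Please check if there is a legal hold in place.'
    VersionsDeleted        = 0
    VersionsFailed         = 1
    StorageReleasedInBytes = 0
}
$JobSummary = Get-TrimJobSummary -Progress $Progress
$JobSummary | Format-List
```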
Reporting Versions
Before testing trim jobs, you might like to understand what versions exist for files within a site. Microsoft includes the facility to generate a report, again using PowerShell. For example:
$Site = 'https://office365itpros.sharepoint.com/sites/O365ExchPro'
$Report = 'https://office365itpros.sharepoint.com/sites/O365ExchPro/Shared%20Documents/Files/Report.CSV'
New-SPOSiteFileVersionExpirationReportJob -Identity $Site -ReportUrl $Report
The output is a CSV file that possibly works well for sites holding a few documents. It is terribly difficult to follow for large sites. Microsoft warns that the report can take over 24 hours to run. I guess that I was lucky as the report run was far shorter, but the 31,449 lines of information generated in the CSV file (Figure 5) are very hard to make sense of. It would be good if Microsoft generated a summary of what’s contained in the report to help people understand the content.
Being a PowerShell kind of guy, I wrote a script to interpret the CSV data and turn it into something that made more sense to me. Figure 6 shows the information generated from the file, which the script also outputs as an Excel worksheet or CSV file. Interestingly, 3,546 versions of the _siteicon_.jpg file occupy 393.61 MB of valuable SharePoint quota. I think this is an old icon file used by SharePoint Online, and it’s an example of the kind of crud that clutters up a system and only comes to light when you examine the detail. You can download the script that I used from GitHub.
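To give a flavor of the kind of processing involved, here is a minimal sketch that groups report rows by file and totals the storage used by versions. It is not the script from GitHub. The FileUrl and Size column names are assumptions about the report layout, and the sample rows are invented for illustration, so adjust both to match the CSV that your tenant generates.

```powershell
# Sketch only, not the script referenced above. The FileUrl and Size
# column names are ASSUMED; the rows below are invented sample data.
$Rows = @'
FileUrl,MajorVersion,Size
/sites/Demo/Shared Documents/Plan.docx,1,1048576
/sites/Demo/Shared Documents/Plan.docx,2,2097152
/sites/Demo/Shared Documents/Notes.docx,1,524288
'@ | ConvertFrom-Csv
# For a real report: $Rows = Import-Csv -Path <downloaded report file>

# One output row per file: number of versions and total size in MB
$FileSummary = $Rows | Group-Object -Property FileUrl | ForEach-Object {
    $Bytes = ($_.Group | ForEach-Object { [long]$_.Size } | Measure-Object -Sum).Sum
    [PSCustomObject]@{
        File     = $_.Name
        Versions = $_.Count
        SizeMB   = [math]::Round($Bytes / 1MB, 2)
    }
} | Sort-Object -Property SizeMB -Descending

$FileSummary | Format-Table -AutoSize
```

Sorting by size puts the files whose versions consume the most storage at the top of the list, which is how oddities like the icon file stand out.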
To be fair, Microsoft has an Excel template to read in the file versions CSV data to make more sense of the information. I didn’t get great results from the template, but this might be due to my incompetence with Excel. In any case, the template is worth checking out.
Great Promise Until Retention Got in the Way
It’s good to see intelligent versioning in SharePoint Online. Site owners and tenant administrators will appreciate the extra flexibility that the new options bring for managing the versions generated by day-to-day user activity. The promise of saving on expensive SharePoint storage quota is compelling.
Alas, I fear that the reality is that retention policies will stop many large enterprises from benefiting from the savings. For retention to work, versions must be kept, and if you want to save on storage costs, moving inactive sites to Microsoft 365 Archive is likely the only practical course to take.
I am a Microsoft employee who is familiar with the feature. Thanks for writing it up!
“It doesn’t support file types like PDF where completely new versions of files are uploaded to SharePoint to update a file in a library.”
Most Office files aren’t special. When you overwrite any file, the version expiration date is set. Except there is something special about PST (Outlook archive files). Those don’t respect the automatic/manual/count setting and only keep the last 30.
You’re right that policies which prevent version deletion are a problem. But the version expiration stuff reconciles things as best it can.
With legal hold and retention policies, the expiration date is set according to the library setting. But when files subject to the legal hold or retention policy are supposed to expire, they are not deleted. Instead, the date gets moved back a few days. Once the hold is released, the versions due to expire will get cleaned up.
It doesn’t help customers who are forced to keep legal hold in place indefinitely. (New lawsuits create holds faster than old holds are lifted.) I don’t see a technical solution to that; maybe they need to ask a judge for permission to clean up irrelevant content?
For the record, the text of the article was reviewed by the Microsoft development group before publication. I wanted to be very sure about the impact that retention has on the current implementation. You’re right that when retention holds are released, versions due to be removed because of expiry will be deleted. The point is that many people might expect to gain back storage quota beforehand.
P.S. Great article – thanks so much!
People should know that AvePoint have an archiving solution also. Microsoft’s archiving solution wasn’t mature enough 12 months ago so a client went with them. When we asked Microsoft they denied any knowledge of any other 3rd party solutions out there on the market, but of course they would say that, wouldn’t they!
Is the AvePoint solution just nicer UI built on top of the Microsoft archiving framework?