The Biggest Change to SharePoint Versioning Ever

Originally planned for release in November 2023, the new SharePoint version controls are approaching general availability in Microsoft 365 tenants worldwide. The update is described in MC789209 (last updated 9 August 2024, Microsoft 365 roadmap item 145802). Full deployment is expected to complete by mid-October 2024, so now’s a good time to review whether your tenant can take advantage of the new capability.

The new controls affect how SharePoint Online manages versions for files. File versioning underpins the ability of users to recover previous versions of documents, including through the Restore this library feature. Versioning is also used by the Office AutoSave feature to ensure that edits made to Office files stored in SharePoint Online and OneDrive for Business are captured automatically. Without frequent capturing of auto-saved versions, co-authoring wouldn’t work.

The Big Problem with Simple Versioning

Versioning has worked well for many years, but it has one big problem: AutoSave creates many versions of files during edit sessions. Editing the document containing this article created 25 versions. SharePoint Online sets the default version limit for files to 500 for newly-created document libraries, and keeping up to 500 versions of a large file can rapidly consume a lot of storage.

Given the cost of SharePoint Online storage, letting file versions consume gigabytes without restraint isn’t sustainable. A more sophisticated approach was needed, and that’s what intelligent versioning is all about.

According to Microsoft, the new functionality delivers “time-based and intelligent automatic trimming capabilities to give administrators and content owners a way to reduce the storage footprint consumed by low value file versions while retaining appropriate recoverability.”

Based on the design principle that the restore value of a version degrades over time, the intelligent versioning algorithm used by SharePoint Online “thins out intermittent” older versions that are less likely to be restored while preserving sufficient “high-value” versions to enable file recovery if necessary. Intelligent versioning works for files created by Office applications. It doesn’t support file types like PDF where completely new versions of files are uploaded to SharePoint to update a file in a library.

Coming back to the versions created for this article, it’s likely that some of the 25 versions store just a few changes to the file while others contain more fundamental changes, like moving large chunks of text from one place to another. SharePoint Online created still more versions when I updated file properties, including the custom properties I use to track articles.

The idea is that SharePoint can remove the less important versions while keeping those that are more important. If I ever wanted to recover a version of the article, I could select from the versions that mark real change in the file rather than those where just a few characters might have been altered.
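To make the thinning idea concrete, here’s a minimal Python sketch. It is not Microsoft’s algorithm: the age bands and spacing values in ILLUSTRATIVE_BANDS are assumptions I chose for the example, as is the thin_versions function name. It only illustrates the general principle of keeping every recent version while spacing out older ones.

```python
from datetime import datetime, timedelta

# Hypothetical age bands for illustration only (NOT Microsoft's actual rules):
# (maximum age of a version, minimum spacing between versions kept in that band)
ILLUSTRATIVE_BANDS = [
    (timedelta(days=30), timedelta(0)),        # recent: keep every version
    (timedelta(days=180), timedelta(days=1)),  # older: keep at most one per day
    (timedelta.max, timedelta(days=7)),        # oldest: keep at most one per week
]

def thin_versions(timestamps, now, bands=ILLUSTRATIVE_BANDS):
    """Return the version timestamps to keep, newest first."""
    kept = []
    for ts in sorted(timestamps, reverse=True):
        age = now - ts
        # Pick the spacing for the first band whose age limit covers this version
        spacing = next(gap for limit, gap in bands if age <= limit)
        # Always keep the newest version; otherwise keep a version only if it is
        # far enough away from the last version we decided to keep.
        if not kept or kept[-1] - ts >= spacing:
            kept.append(ts)
    return kept
```

With these assumed bands, ten versions saved an hour apart yesterday all survive, while ten versions saved an hour apart four months ago collapse to one.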

Intelligent Versioning

Versions are controllable at the tenant, site, and document library level for SharePoint Online and OneDrive for Business. To configure the setting for the tenant, go to Settings in the SharePoint admin center and select Version history limits. To take advantage of intelligent versioning, select Automatic (Figure 1). This means that SharePoint Online will manage file versions dynamically based on activity (how often a file is updated) and creation date.

Figure 1: Updating versioning settings for a SharePoint Online tenant

No time limit remains the default, meaning that SharePoint Online removes file versions after they exceed the maximum number of versions permitted for the site. For instance, if 500 versions are allowed, SharePoint removes the oldest version when a user creates version 501. The Manually option allows organizations to choose to remove versions after a version limit is reached or a time limit is exceeded. For example, you could opt to remove versions older than 730 days (two years).
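The two non-automatic behaviours are simple enough to sketch in a few lines of Python. This is only a model of the rules described above, not SharePoint code: the function names, the (version number, created date) tuples, and the choice to always keep the current version in trim_by_age are my assumptions.

```python
from datetime import datetime, timedelta

def trim_by_count(versions, max_versions=500):
    """Keep only the newest max_versions entries; versions is ordered oldest first."""
    if len(versions) <= max_versions:
        return list(versions)
    return versions[-max_versions:]

def trim_by_age(versions, now, max_age_days=730):
    """Drop (number, created) pairs older than max_age_days.

    The last entry is treated as the current version and is always kept.
    """
    if not versions:
        return []
    cutoff = now - timedelta(days=max_age_days)
    *older, current = versions
    return [v for v in older if v[1] >= cutoff] + [current]

# Model a file that has just reached version 501 under a 500-version limit
history = [(n, datetime(2024, 1, 1) + timedelta(hours=n)) for n in range(1, 502)]
remaining = trim_by_count(history)
print(len(remaining), remaining[0][0])  # version 1 is gone, version 2 is now the oldest
```

Calling trim_by_count on 501 versions returns versions 2 to 501, matching the count-limit behaviour described above.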

The selected setting for tenant-wide version time limits applies to new sites and OneDrive for Business accounts but is not retrospectively applied to existing sites and accounts. However, site owners and administrators can apply the new version controls to existing sites (using PowerShell) and document libraries (by editing library settings as shown in Figure 2).

Figure 2: Updating the version settings for a document library

Like any SharePoint setting, it can take a little while before the new version settings are operational. You’ll know that intelligent versioning is in use when the version history for a file shows different date-based expiration periods for versions. For instance, the versions at the bottom of the version history shown in Figure 3 are marked “never expires.” These versions were created before switching the site to use intelligent versioning. The versions created afterward have varying expiration dates depending on when the algorithm thinks it will be safe to remove these versions. The current version never expires because it is the file that users access in the document library.

Figure 3: Version history for a file with intelligent versioning

The older versions marked to never expire won’t be trimmed by SharePoint Online until the maximum number of versions defined for the tenant is reached. When a tenant opts to use intelligent versioning, SharePoint Online sets the maximum number of versions to 500 and the tenant cannot change this value. Once the 500-version threshold is reached, SharePoint will trim starting from the oldest version to keep the version count to 500.

Updating Older Sites for Intelligent Versioning

The tenant version time limit settings apply to new sites, but Microsoft doesn’t provide a GUI method to apply new version settings to existing sites. However, it’s possible to use PowerShell to apply intelligent versioning to multiple sites. Here’s an example of fetching all the sites used by Microsoft 365 groups that are not archived by Microsoft 365 Archive. The Set-SPOSite cmdlet is run for each site to set EnableAutoExpirationVersionTrim to true (by default, the property is not set).

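# Fetch group-connected sites that are not archived in Microsoft 365 Archive
[array]$Sites = Get-SPOSite -Limit All -Template 'GROUP#0' -Filter {ArchiveStatus -eq 'NotArchived'}
ForEach ($Site in $Sites) {
   # Only update sites that don't already use automatic (intelligent) version trimming
   If ($Site.EnableAutoExpirationVersionTrim -ne $true) {
     Write-Host ("Updating {0}…" -f $Site.Url)
     Set-SPOSite -Identity $Site.Url -EnableAutoExpirationVersionTrim $true -Confirm:$false
   }
}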

In contrast, if you want to use version and date limits, the Set-SPOSite command sets EnableAutoExpirationVersionTrim to false and provides values for the version and date limits:

Set-SPOSite -Identity $siteUrl -EnableAutoExpirationVersionTrim $false -MajorVersionLimit 300 -ExpireVersionsAfterDays 730 -Confirm:$false

The Effect of Intelligent Versioning

Apart from saying that it’s worth reading, I don’t want to repeat Microsoft’s documentation covering planning version storage for document libraries. The important takeaway for intelligent versioning is summarized in this claim:

Automatic limits: Under automatic settings intermittent older versions are trimmed over time, resulting in a 96% version storage reduction over a six-month period compared to count limits.

Such a result is possible, providing certain conditions exist. Other organizations will have different results. In any case, two things are obvious here. First, Microsoft believes that intelligent versioning can have a very substantial impact on storage consumed by file versions. Second, the full impact can only be measured over time because intelligent versioning can only kick in as people work with files to generate new versions.

The Retention Issue

But then we come to the retention issue. When retention policies or retention labels apply in-place holds on sites or individual documents, the site preservation hold library can swell to consume a large amount of storage. For example, the preservation hold library of the site used to store the files for the Office 365 for IT Pros eBook consumes 21.6% (23.7 GB) of the site storage (Figure 4).

Figure 4: The preservation hold library can consume a large amount of storage

The preservation hold library serves an essential function. Without it, file recovery and eDiscovery wouldn’t work. As seen above, the preservation hold library for a site holding many large and complex files can soon fill lots of storage. However, because the site comes within the scope of a retention policy, SharePoint Online cannot remove versions until the retention period defined by the policy expires. Indeed, as Microsoft observes:

“For items that are subject to a retention policy (or an eDiscovery hold), the versioning limits for the document library are ignored.”

In other words, retention policies force SharePoint Online to hold all versions.

Running an Auto-Trim Job

To confirm that intelligent versioning couldn’t remove versions from sites controlled by retention policies, I ran a trim job. This is a background process to check and remove versions using the criteria of the chosen option (manual or automatic). Trim jobs can only be submitted using the PowerShell New-SPOSiteFileVersionBatchDeleteJob cmdlet:

$Site = 'https://office365itpros.sharepoint.com/sites/O365ExchPro'
New-SPOSiteFileVersionBatchDeleteJob -Identity $Site -Automatic

Are you sure you want to perform this action?
By executing this command, versions specified will be permanently deleted from site https://office365itpros.sharepoint.com/sites/O365ExchPro in the upcoming days.
These file versions cannot be restored from the recycle bin.
[Y] Yes  [A] Yes to All  [N] No  [L] No to All  [S] Suspend  [?] Help (default is "Y"): y

Get-SPOSiteFileVersionBatchDeleteJobProgress -Identity $Site

Url                         : https://office365itpros.sharepoint.com/sites/O365ExchPro
WorkItemId                  : a8fa54d3-5bfa-405f-ac6c-8155781c4db9
Status                      : CompleteWithFailure
ErrorMessage                : This job had failed. Please check if there is a legal hold in place.
RequestTimeInUTC            : 29/09/2024 00:38:43
LastProcessTimeInUTC        : 29/09/2024 00:45:11
CompleteTimeInUTC           :
BatchDeleteMode             : AutomaticTrim
DeleteOlderThanInUTC        :
MajorVersionLimit           :
MajorWithMinorVersionsLimit :
FilesProcessed              : 4365
VersionsProcessed           : 137
VersionsDeleted             : 0
VersionsFailed              : 1
StorageReleasedInBytes      : 0

Note the warning when scheduling the trim job that versions will be permanently removed and the “complete with failure” status reported when the job is completed. No versions could be removed because a legal hold (retention hold) was in place.

Reporting Versions

Before testing trim jobs, you might like to understand what versions exist for files within a site. Microsoft includes the facility to generate a report, again using PowerShell. For example:

$Site = 'https://office365itpros.sharepoint.com/sites/O365ExchPro'
$Report = 'https://office365itpros.sharepoint.com/sites/O365ExchPro/Shared%20Documents/Files/Report.CSV'
New-SPOSiteFileVersionExpirationReportJob -Identity $Site -ReportUrl $Report

The output is a CSV file that possibly works well for sites holding a few documents. It is terribly difficult to follow for large sites. Microsoft warns that the report can take over 24 hours to run. I guess that I was lucky as the report run was far shorter, but the 31,449 lines of information generated in the CSV file (Figure 5) are very hard to make sense of. It would be good if Microsoft generated a summary of what’s contained in the report to help people understand the content.

Figure 5: Report of file versions

Being a PowerShell kind of guy, I wrote a script to interpret the CSV data and turn it into something that made more sense to me. Figure 6 shows the information generated from the file, which the script also outputs as an Excel worksheet or CSV file. Interestingly, 3,546 versions of the _siteicon_.jpg file occupy 393.61 MB of valuable SharePoint quota. I think this is an old icon file used by SharePoint Online, and it’s an example of the kind of crud cluttering up a system that only comes to light when you examine the detail. You can download the script that I used from GitHub.

Figure 6: Reporting versions for SharePoint Online files
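If PowerShell isn’t your thing, the same kind of per-file summary can be sketched in Python. This is not the script mentioned above, just a minimal illustration. I’m assuming that the report has one row per version, with a FileUrl column naming the file and a Size column giving the version size in bytes; check the headers in your own report and adjust the column names if they differ. The sample rows are made up.

```python
import csv
import io
from collections import defaultdict

def summarize_versions(csv_text, file_column="FileUrl", size_column="Size"):
    """Return [(file, version_count, total_bytes)] sorted by total bytes, largest first."""
    totals = defaultdict(lambda: [0, 0])  # file -> [version count, total bytes]
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row[file_column]]
        entry[0] += 1
        entry[1] += int(row[size_column])
    return sorted(
        ((name, count, size) for name, (count, size) in totals.items()),
        key=lambda item: item[2],
        reverse=True,
    )

# Made-up sample rows using the assumed column names
sample = """FileUrl,Size
Shared Documents/Chapter1.docx,2000
Shared Documents/Chapter1.docx,2500
Shared Documents/Notes.docx,300
"""

for name, count, size in summarize_versions(sample):
    print(f"{name}: {count} versions, {size / 1024:.1f} KB")
```

Sorting by total bytes puts the files whose versions consume the most storage at the top, which is how something like the site icon file would stand out.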

To be fair, Microsoft has an Excel template to read in the file versions CSV data to make more sense of the information. I didn’t get great results from the template, but this might be due to my incompetence with Excel. In any case, the template is worth checking out.

Great Promise Until Retention Got in the Way

It’s good to see intelligent versioning in SharePoint Online. Site owners and tenant administrators will appreciate the extra flexibility that the new options bring for managing the versions generated by day-to-day user activity. The promise of saving on expensive SharePoint storage quota is compelling.

Alas, I fear that the reality is that retention policies will stop many large enterprises from benefiting from the savings. For retention to work, versions must be kept, and if you want to save on storage costs, moving inactive sites to Microsoft 365 Archive is likely the only practical course to take.

About the Author

Tony Redmond

Tony Redmond has written thousands of articles about Microsoft technology since 1996. He is the lead author for the Office 365 for IT Pros eBook, the only book covering Office 365 that is updated monthly to keep pace with change in the cloud. Apart from contributing to Practical365.com, Tony also writes at Office365itpros.com to support the development of the eBook. He has been a Microsoft MVP since 2004.

Comments

  1. Jason B

    P.S. Great article – thanks so much!

  2. Jason B.

    People should know that Avepoint have an archiving solution also. Microsoft’s archiving solution wasn’t mature enough 12 months ago so a client went with them. When we asked Microsoft they denied any knowledge of any other 3rd party solutions out there on the market – but of course they would say that wouldn’t they!

    1. Tony Redmond

      Is the AvePoint solution just nicer UI built on top of the Microsoft archiving framework?
