OneDrive is the Great Dumping Ground for Microsoft 365 Information
OneDrive for Business has become the great dumping ground for Microsoft 365. Its original mission to provide personal storage for users has evolved over the last few years as apps like Stream, Clipchamp, and Whiteboard move away from app-specific storage to embrace the concept of OneDrive as the user storage location for Microsoft 365. These apps join others like Loop (components) and Lists (personal lists) in using OneDrive storage.
I don’t have any difficulty with the concept of establishing a common storage repository for Microsoft 365 apps. However, Microsoft development groups are very enthusiastic about using OneDrive, sometimes without good reason. Installing PowerShell modules into OneDrive is an example. From PowerShell 6 onward, the default location for PowerShell module installation is $HOME\Documents\PowerShell\Modules, which ends up in OneDrive if Windows Known Folders are redirected to OneDrive. I don’t know why Microsoft thinks storing PowerShell modules in OneDrive is a good idea, so I always specify AllUsers as the scope for module installations.
In any case, the net result of the activity over the last few years is that user OneDrive for Business accounts accumulate many kinds of files, from Teams meeting recordings to the Office documents created by users.
You can ignore the problem and let OneDrive for Business accounts accrue vast quantities of potentially useless data or take control of the situation and make sure that your OneDrive for Business account contains useful and relevant information. OneDrive accounts are an important source of information to ground AI searches, even if the organization uses Copilot for Microsoft 365 with Restricted SharePoint Search, so cleaning out obsolete debris from OneDrive is a good idea.
Reporting OneDrive Files
That is, if you have some guidance for the removal of unwanted, obsolete, or erroneous information, which brings me to the need for some method to analyze the contents of a user’s OneDrive for Business account. With this thought in mind, I decided that it would be a good idea to create a report (including some basic statistics) about what’s lurking in a OneDrive account. Not only does a report help to ensure that generative AI doesn’t hallucinate based on bad data, but being able to report what’s in a OneDrive account also makes it easier to deal with the OneDrive accounts of people who leave the organization.
Unlike SharePoint Online, OneDrive for Business doesn’t include the option to export item and folder information to Excel. At least, I can’t find one. Writing a script to generate the information is the only option.
I started with a PowerShell script that I wrote to create a report about the files in a SharePoint Online document library. The script uses cmdlets from the Microsoft Graph PowerShell SDK like Get-MgDrive and Get-MgDriveItemChild to retrieve information about drives, files, and folders.
OneDrive for Business and SharePoint Online share many characteristics. From a Graph API perspective, both store files in drives. A SharePoint Online site can have many drives, each being a document library. By comparison, a OneDrive for Business account has one drive called OneDrive.
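As a rough illustration of how the drive for an account might be located, the sketch below wraps the Get-MgUserDefaultDrive cmdlet from the Microsoft Graph PowerShell SDK. It assumes a Graph session already exists (Connect-MgGraph) with consent to read the account. The function name Get-OneDriveForAccount is hypothetical, and using the default drive cmdlet is an assumption rather than necessarily what the script does.

```powershell
# Sketch only: assumes the Microsoft.Graph.Files module is installed and that
# Connect-MgGraph has already been run with permission to read the account.
function Get-OneDriveForAccount {
    param(
        [Parameter(Mandatory)]
        [string]$Account   # user principal name or object identifier
    )
    # Get-MgUserDefaultDrive returns the drive behind /users/{id}/drive
    $Drive = Get-MgUserDefaultDrive -UserId $Account
    [PSCustomObject]@{
        Account   = $Account
        DriveId   = $Drive.Id
        DriveName = $Drive.Name
        DriveType = $Drive.DriveType
    }
}

# Example call (hypothetical usage):
# Get-OneDriveForAccount -Account (Get-MgContext).Account
```

The drive identifier returned here is the value that later requests for files and folders need.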
Adapting Script Code for OneDrive
The script to report SharePoint Online files stored in a document library is a good starting point. Much of the code to deal with selecting a site and selecting a document library from the chosen site can be eliminated. However, I want the script to generate some basic analysis of the content of a OneDrive for Business account, so there’s some extra work to be done to produce that data.
The first challenge is how to list items in a OneDrive account. The documentation covering how to retrieve items from a drive seems straightforward, but it’s focused on SharePoint Online rather than OneDrive for Business. Discovering the correct root for the drive took an hour or so of trial and error before I figured out that I could pass a string variable containing ‘root’ to the function I use to retrieve information about the items and folders found in a folder. The function then goes on to enumerate content from the nested folders. It’s a logical but undocumented feature.
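A minimal sketch of this kind of enumeration follows. It is not the code from the script: Get-FolderFiles is a hypothetical function name, the output property names are illustrative, and a connected Graph session with the Microsoft.Graph.Files module is assumed.

```powershell
# Sketch only: assumes a connected Graph session (Connect-MgGraph) and the
# Microsoft.Graph.Files module. Get-FolderFiles is a hypothetical name.
function Get-FolderFiles {
    param(
        [Parameter(Mandatory)][string]$DriveId,
        [string]$FolderId = 'root'   # 'root' addresses the top of the drive
    )
    # Fetch every child (files and folders) of the current folder
    [array]$Children = Get-MgDriveItemChild -DriveId $DriveId -DriveItemId $FolderId -All
    foreach ($Item in $Children) {
        if ($Item.Folder.ChildCount -gt 0) {
            # A folder that holds something: recurse into it
            Get-FolderFiles -DriveId $DriveId -FolderId $Item.Id
        } elseif ($null -ne $Item.File.MimeType) {
            # A file: emit a flat record with size and extension for later analysis
            [PSCustomObject]@{
                Name      = $Item.Name
                Extension = [System.IO.Path]::GetExtension($Item.Name).TrimStart('.').ToLower()
                SizeBytes = [long]$Item.Size
                Id        = $Item.Id
            }
        }
    }
}

# Example call (hypothetical usage, $Drive being the account's drive):
# [array]$Files = Get-FolderFiles -DriveId $Drive.Id
```

Empty folders are skipped because they have neither children nor a MIME type. Recursion keeps the sketch short, although very deep folder trees might be better served by an explicit queue.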
After discovering how to enumerate over files and folders, the process of fetching information about individual items stored in a OneDrive for Business account is straightforward. I updated the script code to store details about the byte size and extension for each file to make it easier to generate statistics and had to change the format of the Graph request to the extractSensitivityLabels API to use the Drives method for the User resource type rather than the Site resource type. This is because OneDrive is connected to a user instead of being a standalone entity.
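To make the change of resource type concrete, here is a hedged sketch of what a request addressed through the user rather than a site could look like. Get-FileSensitivityLabelId is a hypothetical helper name, the use of the v1.0 endpoint is an assumption, and the response handling reflects the labels collection that the API is documented to return.

```powershell
# Sketch only: assumes a connected Graph session with consent to read the file.
# Get-FileSensitivityLabelId is a hypothetical helper name.
function Get-FileSensitivityLabelId {
    param(
        [Parameter(Mandatory)][string]$Account,  # user principal name or object id
        [Parameter(Mandatory)][string]$ItemId    # drive item identifier of the file
    )
    # Address the item through the user's drive rather than through a site
    $Uri = "https://graph.microsoft.com/v1.0/users/{0}/drive/items/{1}/extractSensitivityLabels" -f $Account, $ItemId
    try {
        $Response = Invoke-MgGraphRequest -Method POST -Uri $Uri
        # The response carries a labels collection; it is empty for unlabeled files
        $Response.labels.sensitivityLabelId
    } catch {
        # The request can fail for some files (for example, unsupported file types)
        Write-Verbose ("No label information for item {0}: {1}" -f $ItemId, $_.Exception.Message)
    }
}
```

The function returns the label identifier (a GUID) rather than a display name, so a lookup against the labels available in the tenant is still needed to report a readable name.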
Permissions
In terms of permissions, the Files.Read delegated permission is sufficient to read information from drives and files for the OneDrive for Business account belonging to the signed-in user. I wanted to make the script capable of reading a OneDrive account belonging to another user (subject to permissions), so I ended up with:
- Files.Read.All: Access files in any OneDrive account that the signed-in user is allowed to view.
- RecordsManagement.Read.All: Read information about retention labels.
- InformationProtectionPolicy.Read: Read information about sensitivity labels.
You could remove the need for the latter two permissions if you used the Exchange Online management module to fetch details of retention and sensitivity labels. While that’s another module to sign into, it does offer the advantage that the labels fetched are all those available in the tenant rather than just those available to the signed-in user. This could be important if reporting files from someone else’s OneDrive account.
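As a sketch of how the three delegated permissions listed above might be requested in an interactive session, assuming the Microsoft Graph PowerShell SDK is installed and that consent for the scopes can be granted in the tenant:

```powershell
# Sketch only: requests the three delegated permissions listed above.
$Scopes = @(
    'Files.Read.All',
    'RecordsManagement.Read.All',
    'InformationProtectionPolicy.Read'
)
Connect-MgGraph -Scopes $Scopes -NoWelcome

# Confirm which account and scopes the session ended up with
Get-MgContext | Select-Object Account, Scopes

# Alternative for label details (assumption: Security & Compliance PowerShell,
# reached through the ExchangeOnlineManagement module):
# Connect-IPPSSession
# Get-Label | Select-Object ImmutableId, DisplayName
# Get-ComplianceTag | Select-Object Name
```

Checking the context after connecting is a cheap way to spot a missing scope before the report runs.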
Essentially, all the data about files ends up in a PowerShell list that we can query to slice and dice the information to discover different aspects of the content held in OneDrive.
Analyzing the Data
The basic information I want to know about a OneDrive account is:
- How many files are in the account.
- What kind of files are in the account.
- How much quota is used and how much remains.
- What file types occupy most space.
- The number of files that have retention and sensitivity labels.
I’m sure that your imagination will come up with other insights that could be generated from the data. That’s the joy of PowerShell: create data and then decide how to use it. Figure 1 shows the analysis for the points listed above. My OneDrive for Business account has existed since 2013, so it’s not hard to understand how some of the debris accrued.
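To show the kind of slicing involved, the sketch below summarizes a small, made-up list of file records by file type. The property names (Extension, SizeBytes) and the sample values are illustrative assumptions, not output from the script.

```powershell
# Sketch only: $Files stands in for the list of file records built by the report.
# The property names (Extension, SizeBytes) and sample values are illustrative.
$Files = @(
    [PSCustomObject]@{ Name = 'Plan.docx';  Extension = 'docx'; SizeBytes = 120000 }
    [PSCustomObject]@{ Name = 'Notes.docx'; Extension = 'docx'; SizeBytes = 80000 }
    [PSCustomObject]@{ Name = 'Deck.pptx';  Extension = 'pptx'; SizeBytes = 900000 }
    [PSCustomObject]@{ Name = 'Report.pdf'; Extension = 'pdf';  SizeBytes = 300000 }
)

$TotalBytes = ($Files | Measure-Object -Property SizeBytes -Sum).Sum

# One summary row per file type: count, size, and share of the total
$TypeSummary = $Files | Group-Object -Property Extension | ForEach-Object {
    $Bytes = ($_.Group | Measure-Object -Property SizeBytes -Sum).Sum
    [PSCustomObject]@{
        Extension = $_.Name
        Files     = $_.Count
        SizeMB    = [math]::Round($Bytes / 1MB, 2)
        Percent   = [math]::Round(100 * $Bytes / $TotalBytes, 1)
    }
} | Sort-Object -Property SizeMB -Descending

"Total files: {0}" -f $Files.Count
$TypeSummary | Format-Table -AutoSize
```

Sorting by size puts the file types that occupy the most space at the top, which is usually where a cleanup should start.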
Unsurprisingly, the account holds a lot of Word, PDF, and PowerPoint files. There are also lots of PowerShell files, probably the remnants of some badly managed installations.
Figure 3 shows the information generated about the usage of retention labels and sensitivity labels for files within the account. As might be expected, a lower percentage of OneDrive files receive retention and sensitivity labels than files in SharePoint Online sites do.
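Label usage percentages of this kind could be computed along the following lines. This is a sketch under assumptions: the RetentionLabel and SensitivityLabel property names, the Get-LabelCoverage function, and the sample records are all hypothetical, with an empty value meaning that a file has no label of that kind.

```powershell
# Sketch only: sample records with hypothetical RetentionLabel and
# SensitivityLabel properties; an empty value means no label of that kind.
$Files = @(
    [PSCustomObject]@{ Name = 'Plan.docx';  RetentionLabel = 'Archive'; SensitivityLabel = 'Confidential' }
    [PSCustomObject]@{ Name = 'Deck.pptx';  RetentionLabel = $null;     SensitivityLabel = 'Public' }
    [PSCustomObject]@{ Name = 'Notes.txt';  RetentionLabel = $null;     SensitivityLabel = $null }
    [PSCustomObject]@{ Name = 'Report.pdf'; RetentionLabel = $null;     SensitivityLabel = $null }
)

function Get-LabelCoverage {
    param(
        [Parameter(Mandatory)][object[]]$Items,   # must contain at least one record
        [Parameter(Mandatory)][string]$Property   # name of the label property to test
    )
    # Keep only the records where the label property holds a value
    [array]$Labeled = $Items | Where-Object { -not [string]::IsNullOrWhiteSpace($_.$Property) }
    [PSCustomObject]@{
        LabelType = $Property
        Labeled   = $Labeled.Count
        Percent   = [math]::Round(100 * $Labeled.Count / $Items.Count, 1)
    }
}

$Coverage = 'RetentionLabel', 'SensitivityLabel' | ForEach-Object {
    Get-LabelCoverage -Items $Files -Property $_
}
$Coverage | Format-Table -AutoSize
```

Running the same calculation against a SharePoint Online report would make the comparison between the two workloads easy to quantify.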
The reported files don’t include items in the recycle bin. A beta Graph API is available to list items in the recycle bin, but it doesn’t seem to extend to OneDrive for Business.
You can download the complete script from GitHub.
Contemplating File Statistics
Overall, I struggle to use the 5 TB quota assigned to my OneDrive account. The availability of massive storage is certainly a justification for not bothering too much about the digital debris that accumulates over time alongside the valuable content in a OneDrive for Business account. After all, if Microsoft makes the storage available, why not fill it?
Seriously, it’s a good idea to perform regular housekeeping to keep debris under control. The report helps by exposing items that might remain invisible through the OneDrive GUI unless you go looking for specific information. I like that and have removed a thousand files since I started to develop the script. Maybe it can help you to clean up too.
If I wanted to run this for another user, should setting $Account = Read-Host "enter account" work? Or how can I run this for another account?
If you have access to the OneDrive account then you should be able to do so by replacing the assignment of the user account principal name to the $Account variable. Right now, it’s coded as follows:
$Account = (Get-MgContext).Account
This gets the account of the currently signed in user. Assign whatever account you want to report on to $Account and everything should work. The important thing is that the script uses delegated permissions, so if you don’t have access to the account, you can’t report its contents. To get past that restriction, you’d need to use application permissions and authenticate using an Entra ID registered app that has consent to use the required permissions.
Hi Tony,
I can’t seem to locate the script for “Report for a OneDrive for Business Account” here. While following the article, I did find the script for “Practical Graph: Report SharePoint Online Files in a Document Library,” but nothing for OneDrive. Could you please let me know if I’m missing something?
There’s a GitHub link to the script at the bottom of the article… hopefully it’s visible!