Get a Handle on What’s Being Shared from OneDrive Accounts

Over the last few years, OneDrive for Business has evolved from personal storage for files created by Microsoft 365 users to become the default location for apps from Stream to Teams to Whiteboard to store files. More documents, spreadsheets, presentations, PDFs, and other types of files are being stored in OneDrive for Business accounts. The advantage gained through the approach is that users have a single file repository, but Microsoft’s enthusiasm to exploit OneDrive for Business also creates some issues for tenants to manage.

Much to the chagrin of some organizations, Microsoft 365 apps encourage the creation of valuable information in OneDrive for Business. For instance, co-authoring allows users to collaborate in Office documents. An even more extreme example is the almost instant collaboration enabled through Loop components on Teams chats and Outlook messages. Documents and Loop components remain in OneDrive instead of being safely stored in a shared location, like a SharePoint site. Cue problems that emerge when someone leaves the organization, and their OneDrive account disappears.

At the same time, the advent of generative AI has heightened awareness about the potential for inadvertent exposure of confidential or sensitive information due to oversharing. By this, I mean that users (or the owners of SharePoint sites) assign overly generous permissions to files or folders, which makes the information available to Microsoft 365 Copilot to include in its responses to user prompts.

Reporting OneDrive for Business

In June 2024, I wrote about using the Microsoft Graph PowerShell SDK to create a report about files in a OneDrive for Business account. The report helps to understand what files exist in an account. It’s often easier to look through a report than to navigate through multiple pages in the OneDrive browser GUI.

However, I didn’t include anything about sharing in the report. Five years ago, Vasil Michev wrote about using the Graph API to report shared OneDrive for Business files. Given the concerns about oversharing, it seemed like a good idea to create a new version of a script to report files shared from OneDrive for Business accounts using the Microsoft Graph PowerShell SDK. The process of building the new script is explained in this article.

Scripting the Search for Shared Files

The script (downloadable from GitHub) is not an off-the-shelf solution. The intention is to demonstrate the principles behind retrieving sharing information from files stored in OneDrive for Business. Many improvements could be made, such as adding logging to the script or making it parameter-driven so that the script processes selected OneDrive accounts instead of all accounts.

The script uses the Sites.Read.All application permission to read information for all sites in the tenant. The other permissions used are User.Read.All, Group.Read.All, and GroupMember.Read.All, which allow the script to read user accounts, groups, and group members. To gain these permissions, the script authenticates in app-only mode using an Entra ID registered app that has consent for the permissions.

Connect-MgGraph -AppId $AppId -TenantId $TenantId -CertificateThumbprint $Thumbprint -NoWelcome

After connecting, the script finds all sites in the tenant and uses a client-side filter to reduce the set to the sites used by OneDrive for Business. I tried hard to find a way to use a server-side filter to find the OneDrive sites but failed. Here’s what the script does:

[array]$Sites = Get-MgSite -All -PageSize 500 -Property DisplayName, WebUrl, IsPersonalSite, CreatedByUser, CreatedDateTime, Description, Name, id
[array]$OneDriveSites = $Sites | Where-Object {$_.IsPersonalSite -eq $true}

The set of OneDrive sites includes sites for unlicensed or deleted accounts. There can be many of these sites accumulated since 2014 or thereabouts, and the swelling amount of storage consumed by unlicensed sites is probably the reason why Microsoft is moving to charge for this storage from January 2025. To reduce the set to the sites belonging to current users, the script runs the Get-MgUser cmdlet to find licensed accounts and builds a hash table of the display names and user principal names.

[array]$Users = Get-MgUser -Filter "assignedLicenses/`$count ne 0 and userType eq 'Member'" -ConsistencyLevel eventual -CountVariable Records -All -PageSize 999 | Sort-Object displayName
$UserHash = @{}
ForEach ($User in $Users) {
    $UserHash.Add($User.DisplayName, $User.UserPrincipalName)
}
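
Display names are not guaranteed to be unique, and the Add method throws an error when a key already exists in a hash table. One alternative is to key the table on the user principal name converted to the form that appears in OneDrive site URLs. The following is a minimal sketch under stated assumptions rather than the code in the script: ConvertTo-OneDriveUrlName is a hypothetical helper, the users and the contoso URL are sample data, and the rule that every character other than a letter or digit becomes an underscore is an assumption based on URLs like /personal/peter_roche_office365itpros_com.

```powershell
# Hypothetical helper: convert a UPN to the name used in a OneDrive personal site URL
function ConvertTo-OneDriveUrlName {
    param([Parameter(Mandatory)][string]$UserPrincipalName)
    # Assumption: every character that is not a letter or digit becomes an underscore
    return ($UserPrincipalName.ToLower() -replace '[^a-z0-9]', '_')
}

# Sample data standing in for the output of Get-MgUser
$SampleUsers = @(
    [PSCustomObject]@{ DisplayName = 'Peter Roche'; UserPrincipalName = 'Peter.Roche@office365itpros.com' }
    [PSCustomObject]@{ DisplayName = 'Peter Roche'; UserPrincipalName = 'Peter.Roche2@office365itpros.com' }
)

# Key the hash table on the converted UPN, which is unique per account
$UserHash = @{}
ForEach ($User in $SampleUsers) {
    $Key = ConvertTo-OneDriveUrlName -UserPrincipalName $User.UserPrincipalName
    $UserHash[$Key] = $User.UserPrincipalName
}

# Sample WebUrl standing in for a site returned by Get-MgSite
$WebUrl = 'https://contoso-my.sharepoint.com/personal/peter_roche_office365itpros_com'
# The last segment of the URL is the value to look up
$SiteKey = ($WebUrl.TrimEnd('/') -split '/')[-1].ToLower()
$Owner = $UserHash[$SiteKey]
```

With this approach, two accounts that share a display name get separate entries, and the lookup for a site uses the last segment of its WebUrl property instead of its name.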

The script then loops through the OneDrive sites to check for shared files, but only for sites owned by current users. By looking up the name of the site against the user hash table, the script knows if it should check the site. If so, the Get-MgSiteDrive cmdlet fetches the drives (document libraries) for the site. Usually a single document library is present for a personal site, but to be sure, the script fetches the drive whose name is like “OneDrive*.” Recent OneDrive document libraries seem to be named “OneDrive” but some older OneDrive accounts have document libraries with a name created from “OneDrive” and the tenant name. After selecting the document library to process, the script passes its identifier to the Get-DriveItems function to find and interrogate the individual files:

ForEach ($Site in $OneDriveSites) {
    If ($UserHash[$Site.name]) {
        $Global:TotalFiles = 0
        $i++
        Write-Host ("Processing OneDrive site for {0} {1}/{2}" -f $Site.DisplayName, $i, $OneDriveSites.Count) -ForegroundColor Yellow
        Try {
            [array]$Drives = Get-MgSiteDrive -SiteId $Site.id
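            # Select the document library whose name starts with OneDrive rather than relying on position
            $Drive = $Drives | Where-Object {$_.Name -like "OneDrive*"} | Select-Object -First 1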
            Get-DriveItems -Drive $Drive.Id -FolderId "root"
        }
        Catch {
            Write-Host ("Error processing OneDrive site {0}. The account might be locked or the user might never have used OneDrive." -f $Site.DisplayName)
            Continue
        }
        # Brief pause before we process the next account
        Start-Sleep -Seconds 2
    }
}
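
A personal site can return more than one drive, so selecting the document library by name is safer than selecting it by position. Here is a minimal sketch of that selection using sample objects in place of the output of Get-MgSiteDrive. The drive names and identifiers are illustrative assumptions.

```powershell
# Sample data standing in for the drives returned by Get-MgSiteDrive
$SampleDrives = @(
    [PSCustomObject]@{ Name = 'PersonalCacheLibrary'; Id = 'drive-1' }
    [PSCustomObject]@{ Name = 'Preservation Hold Library'; Id = 'drive-2' }
    [PSCustomObject]@{ Name = 'OneDrive - Office365forITPros'; Id = 'drive-3' }
)

# Pick the first drive whose name starts with OneDrive (covers old and new naming)
$Drive = $SampleDrives | Where-Object { $_.Name -like 'OneDrive*' } | Select-Object -First 1

If ($null -eq $Drive) {
    Write-Host "No OneDrive document library found for this site"
} Else {
    Write-Host ("Selected drive {0} ({1})" -f $Drive.Name, $Drive.Id)
}
```

The wildcard match handles both the simple “OneDrive” name and older names that include the tenant name, and the null check avoids passing an empty identifier to later calls.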

Finding Files and Their Permissions

The Get-DriveItems function first calls the Get-MgDriveItemChild cmdlet to find the set of objects in the site starting at the root.

[array]$Data = Get-MgDriveItemChild -DriveId $Drive -DriveItemId $FolderId -All
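
The returned set mixes files and folders. As a rough illustration of how the set might be split, the sketch below uses sample objects in place of real drive items. It assumes that folders carry a populated folder facet and that shared items carry a populated shared facet. The script itself might apply different tests.

```powershell
# Sample data standing in for the items returned by Get-MgDriveItemChild
$SampleItems = @(
    [PSCustomObject]@{ Name = 'Budget.xlsx'; Folder = $null; Shared = [PSCustomObject]@{ Scope = 'users' } }
    [PSCustomObject]@{ Name = 'Notes.docx'; Folder = $null; Shared = $null }
    [PSCustomObject]@{ Name = 'Projects'; Folder = [PSCustomObject]@{ ChildCount = 4 }; Shared = $null }
)

# Assumption: only folders have a child count in the folder facet
[array]$Folders = $SampleItems | Where-Object { $null -ne $_.Folder.ChildCount }
[array]$Files = $SampleItems | Where-Object { $null -eq $_.Folder.ChildCount }

# Assumption: a populated shared facet marks an item as shared
[array]$SharedFiles = $Files | Where-Object { $null -ne $_.Shared.Scope }

Write-Host ("{0} folders, {1} files, {2} shared files" -f $Folders.Count, $Files.Count, $SharedFiles.Count)
```

Folders that hold files would need to be processed in turn by calling the function again with the folder identifier, even when the current folder contains no shared files.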

After separating the files from folders (currently, the script only processes files), the script checks each file to determine whether it is shared. If so, the script extracts the sharing permissions from the file by running the Get-MgDriveItemPermission cmdlet (based on the Permissions API):

[array]$Permissions = Get-MgDriveItemPermission -DriveId $Drive -DriveItemId $File.Id -Property Roles, GrantedTo, HasPassword, ExpirationDateTime, Invitation, InheritedFrom, Id, Link, GrantedToV2 | Sort-Object Roles

The script determines the kind of sharing permission (edit or view) and the scope of the permission, such as an Anyone, Organization, or direct access link. If the permission is granted to a group, the script extracts the group membership. Permissions might be present for users no longer known to the tenant. The identifiers for these entries are represented by numbers, and the script reports these permissions as belonging to a “user account removed from tenant.” If the permission is given to a guest account, the script extracts the account’s email address and reports that rather than its user principal name.
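
To make the classification concrete, here is a minimal sketch that works on sample objects shaped like Graph permission objects. Get-SharingSummary is a hypothetical function, the labels it returns are illustrative, and the logic is a simplification under those assumptions rather than the code used in the script.

```powershell
# Hypothetical helper: reduce a permission object to an access level and a scope label
function Get-SharingSummary {
    param([Parameter(Mandatory)][object]$Permission)

    # Map the roles collection to a simple access level
    $Access = 'View'
    If ($Permission.Roles -contains 'write') { $Access = 'Edit' }
    If ($Permission.Roles -contains 'owner') { $Access = 'Owner' }

    # A populated link scope means a sharing link; otherwise treat it as direct access
    $LinkScope = $Permission.Link.Scope
    If ($LinkScope -eq 'anonymous') { $Scope = 'Anyone link' }
    ElseIf ($LinkScope -eq 'organization') { $Scope = 'Organization link' }
    ElseIf ($LinkScope -eq 'users') { $Scope = 'Specific people link' }
    Else { $Scope = 'Direct access' }

    return [PSCustomObject]@{ Access = $Access; Scope = $Scope }
}

# Sample permissions standing in for the output of Get-MgDriveItemPermission
$AnyoneEdit = [PSCustomObject]@{ Roles = @('write'); Link = [PSCustomObject]@{ Scope = 'anonymous'; Type = 'edit' } }
$DirectRead = [PSCustomObject]@{ Roles = @('read'); Link = $null }

$Result1 = Get-SharingSummary -Permission $AnyoneEdit
$Result2 = Get-SharingSummary -Permission $DirectRead
```

In this sample, the first permission resolves to an Anyone link with edit access and the second to direct access with view access. Resolving group members, removed accounts, and guest email addresses would sit on top of this basic classification.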

Apart from some false starts, coding progressed quite quickly. The only problem I was not able to overcome is how to retrieve information about people who use a sharing link (sent by email or in a Teams message) to access a file. The permissions API doesn’t reveal this detail. SharePoint Online obviously knows how to find and interpret the data, but it’s not available in the public API.

After extracting all the relevant information, the script updates a PowerShell list object that later serves as the source for reporting.
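
The pattern of collecting report lines in a list can be sketched as follows. The property names and values are hypothetical and chosen for illustration only. The actual script may record different fields.

```powershell
# Generic list to hold one entry per sharing permission found
$Report = [System.Collections.Generic.List[Object]]::new()

# Build a report line from the details gathered for a file (sample values)
$ReportLine = [PSCustomObject][Ordered]@{
    Owner      = 'Adele Vance'
    FileName   = 'Budget.xlsx'
    Access     = 'Edit'
    Scope      = 'Organization link'
    SharedWith = 'Everyone in the organization'
}
$Report.Add($ReportLine)

# A second sample entry for a different file
$Report.Add([PSCustomObject][Ordered]@{
    Owner      = 'Adele Vance'
    FileName   = 'Plan.docx'
    Access     = 'View'
    Scope      = 'Direct access'
    SharedWith = 'guest@example.com'
})

# The list can then be piped to an export cmdlet, for example:
# $Report | Export-Csv -Path .\OneDriveSharing.csv -NoTypeInformation
```

A generic list avoids the cost of rebuilding an array each time an entry is added, which matters when a tenant has many shared files.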

Reporting Shared Files

After processing all the sites, we have a set of data about shared files found in OneDrive for Business accounts. Figure 1 shows a sample of the kind of data generated by the script and output as an Excel worksheet using the ImportExcel module.

Figure 1: Extract from the OneDrive for Business file sharing report

It’s interesting to delve into the details of who’s sharing what with whom, especially when Anyone or Organization links are used to share information (which automatically make files available to Microsoft 365 Copilot), but analyzing the data in aggregate is what helps to understand overall sharing behavior. You could import the information into Power BI to generate reports and visualize the content, but it’s also possible to do basic analysis with PowerShell.

For example, Figure 2 shows a summary of sharing behavior within my test site generated using a couple of lines of code. Naturally, I am the major sharer.

Figure 2: Summary of OneDrive for Business sharing
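
A summary of this kind can be produced with Group-Object. The sketch below runs against sample report data with hypothetical property names (Owner and Scope), so treat it as one possible approach rather than the exact lines used to create the figure.

```powershell
# Sample report data standing in for the list built by the script
$Report = @(
    [PSCustomObject]@{ Owner = 'Adele Vance'; FileName = 'Budget.xlsx'; Scope = 'Organization link' }
    [PSCustomObject]@{ Owner = 'Adele Vance'; FileName = 'Plan.docx'; Scope = 'Direct access' }
    [PSCustomObject]@{ Owner = 'Adele Vance'; FileName = 'Notes.docx'; Scope = 'Anyone link' }
    [PSCustomObject]@{ Owner = 'Lee Gu'; FileName = 'Roadmap.pptx'; Scope = 'Anyone link' }
)

# Count sharing entries per owner, biggest sharer first
[array]$ByOwner = $Report | Group-Object -Property Owner |
    Sort-Object -Property Count -Descending | Select-Object Name, Count

# Count sharing entries that use Anyone or Organization links
[array]$BroadLinks = $Report | Where-Object { $_.Scope -in 'Anyone link', 'Organization link' }

$ByOwner | Format-Table -AutoSize
Write-Host ("{0} of {1} entries use Anyone or Organization links" -f $BroadLinks.Count, $Report.Count)
```

Grouping on the scope property instead of the owner gives a quick view of how often each kind of link is used.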

The Responsibility for Handling Potentially Confidential Information

The SharePoint Online admin center includes some reports to help identify oversharing. However, the reports are focused on SharePoint sites rather than OneDrive accounts. Hopefully, the concepts explained here will help administrators to understand how sharing happens in OneDrive accounts in their tenant, especially the use of Anyone and Organization-wide links.

One last point. Although no content is extracted from files, the reported data could still be confidential or reveal information that its owners would prefer not to be shared. Using high-profile Graph application permissions like Sites.Read.All allows access to every site in the tenant. That’s a big responsibility and the reason not to use permissions like this without a solid justification.

About the Author

Tony Redmond

Tony Redmond has written thousands of articles about Microsoft technology since 1996. He is the lead author for the Office 365 for IT Pros eBook, the only book covering Office 365 that is updated monthly to keep pace with change in the cloud. Apart from contributing to Practical365.com, Tony also writes at Office365itpros.com to support the development of the eBook. He has been a Microsoft MVP since 2004.

Comments

  1. Nuno Mota

    Great script as always, thank you!

    I had the same problem when filtering for OneDrive sites, it’s annoying there is no server-side filter, but anyway…

    I have a few suggestions if I may:

    #1 I would use the UPN as the key when building the hash table $UserHash as in most medium-large organisations there will be users with the same DisplayName, which will cause the script to skip/fail those users.

    #2 You said that “There should only be one drive for a personal site (…)”, but this is not always the case. For example, for my OneDrive site I have 3 drives:
    PersonalCacheLibrary
    Preservation Hold Library
    OneDrive

    So the script wouldn’t work for me because “$Drive = $Drives[0]” would retrieve the incorrect one. Instead, I would do something like “$Drive = $Drives | ? {$_.Name -eq “OneDrive”}”

    #3 If there are no shared files in the root folder, the Get-DriveItems function won’t process any other folders and subfolders because of the code:
    If (!($SharedItems)) {
    Write-Host (“No shared files found in {0}” -f $FolderName)
    Return
    }

    I would remove these lines as they are not doing much other than writing to the host that there are no files. The “ForEach ($File in $SharedItems) {” code will be skipped if there are no shared files anyway.

    #4 For me, the part where you only process OneDrive sites for existing users doesn’t work, namely “If ($UserHash[$Site.name]) {…}”. In my case, it’s checking if any of the hash table keys (which are the users’ DisplayName, such as ‘Nuno Mota’) match the site name, which for me is nuno_mota_domain_com, therefore never matching…

    Once I updated these, the script worked brilliantly, thank you again!

    1. Tony Redmond

      Hi Nuno,

      Thank you for your suggestions. The big upside with PowerShell is that anyone can change the code to match their needs. In any case:

      #1. I chose the display name from the account as I could match against OneDrive. But as you say, there could be several people in an organization with the same name. The UPN is certainly unique for an account, but which property do you suggest matching against for OneDrive? I don’t see a UPN in the set of properties returned for a OneDrive account. With some formatting, it is possible to match against the WebURL (/personal/peter_roche_office365itpros_com). What did you do?
      #2. It’s true that several drives are reported for OneDrive accounts. The code now looks for the drive with a name like “OneDrive” because the name is not always just “OneDrive.” I have some that are named “OneDrive – Office365forITPros.” I suspect that this naming convention is old and was simplified several years ago.
      #3. Your suggestion is now in the code.
      #4 is related to #1. You obviously need to have a reliable match to check the hashtable. The display name of an account is checked against the name of the OneDrive site, which works. The name property for all the OneDrive sites in my tenant has synchronized with the display name of the user account.

    2. Tony Redmond

      The current version of the script (in GitHub) now uses the UPN to match against OneDrive accounts. I had to add some code to convert the UPN into the format used for OneDrive URLs…
