The Basics of Standard Collection Types

Using PowerShell, you can store objects inside variables. As mentioned in earlier articles, everything in PowerShell is considered an object. The type of an object determines which properties it has and which operations you can perform on it through methods. But what if you need to work with a set of objects instead of a single object? For example, you might want to create a collection of specific mailbox data. You may already do this implicitly when fetching multiple objects using cmdlets such as Get-ExoMailbox.

In this article, I will give a brief overview of the standard collection types. Then, I will walk you through additional useful options that leverage other object types available through the .NET Framework, which can significantly improve performance.

Common Collections

A collection is a set of zero, one, or more objects. Within PowerShell, there are two common collection types: arrays and hash tables. However, you can leverage other collection types because PowerShell is built on the .NET Framework. In some cases, this offers significant advantages over the PowerShell built-in types, which I will discuss later. First, I will cover the basics of the two most common collection types: arrays and hash tables.

An array is a collection of objects with a fixed size, which sets arrays apart from the other collection types. You can create an array by defining its elements, e.g., $arr=@('francis','philip'). You can inspect the type of a variable using the getType() method, e.g., $arr.getType(). You can refer to the elements of the array variable by putting the index number between brackets. Note that the first element has index 0, e.g., $arr[0]. Reassigning an element works the same way, e.g., $arr[0]='Olrik'. While I stated that an array is of fixed size, you will find code using the addition operator to add elements to arrays, e.g., $arr+= 'nasir'. However, as mentioned in an earlier article in the series, this essentially copies the array to a new one with room for the element(s) to add. Copying is significantly slower than the add methods offered through other collection types. More on that in a bit.
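To make the fixed-size behavior visible, here is a minimal sketch. The variable name $before is only there for illustration; it keeps a reference to the original array so we can see that += produces a new array rather than extending the existing one.

```powershell
# Create an array and check whether it can grow in place
$arr = @('francis', 'philip')
$arr.IsFixedSize                            # True

# Keep a reference to the original array, then "add" an element
$before = $arr
$arr += 'nasir'

# += assigned a new, larger array to $arr; the original is untouched
[object]::ReferenceEquals($before, $arr)    # False
$before.Count                               # 2
$arr.Count                                  # 3
```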

When creating arrays, you could use an alternative syntax that lists just the elements, separated by commas. However, this might lead to unexpected results when there is only a single element, and readers might overlook a comma. Consider the output of the following examples:

$arr1 = 'francis', 'philip'
$arr1.getType()

> $arr1.getType() | Select-Object Name, BaseType

Name     BaseType
----     --------
Object[] System.Array

# But this does not work with single elements. Compare:
$arr2= 'francis'
$arr2.getType()

> $arr2.getType() | Select-Object Name, BaseType

Name     BaseType
----     --------
String   System.Object

$arr3= @('francis')
$arr3.getType()

> $arr3.getType() | Select-Object Name, BaseType

Name     BaseType
----     --------
Object[] System.Array

# Alternative syntax
$arr4= ,'francis'

When you run this, you will see that $arr1 and $arr3 are arrays, while $arr2 is a single object (a string). $arr4 also creates an array, using the unary comma operator to wrap the single element. Still, the notation is not very readable, and a reader unaware of this syntax might mistake the comma for a typo. So, for readability and clarity, I suggest sticking with the at-notation.

Arrays are created when cmdlets return multiple objects, e.g., Get-ExoMailbox. When the result set is empty or a single object, it may lead to unexpected situations, as indicated earlier, for example when you want to check the number of items returned. As one might expect, for multiple items, you can consult the Count property:

$Mailboxes= Get-ExoMailbox
$Mailboxes.Count

Now, when zero or one result is returned, the availability of the Count property depends on the PowerShell version you are using. As a convenience feature, PowerShell v6 and later will provide a Count property of 1 for single object types. However, when using PowerShell v5 on a desktop or within an Azure runbook, this convenience is not available, which may lead to unexpected results. Consider the following code fragment:

$Mailboxes= Get-ExoMailbox -Filter `
  "userPrincipalName -eq 'fred@contoso.com'"

If( $Mailboxes.Count -eq 1) {
    Write-Host ('We found a match')
}
Else {
    Write-Error ('We did not find a matching mailbox')
}

The code fetches mailboxes matching the filter. PowerShell v7 will provide a Count property of 1 for a single object, causing the first branch of the If to execute. However, PowerShell v5 does not provide a Count property for single objects, effectively comparing nothing and 1. This comparison fails, causing the second branch (Else) to execute even though a matching mailbox was found. Which branch runs therefore depends on the PowerShell version as well as on the number of mailboxes returned, which you might not have anticipated.

A straightforward workaround is to cast the result to an array. This way, single objects will also be converted to an array, avoiding the compatibility issue. It also provides consistency in the result type:

[array]$Mailboxes= Get-ExoMailbox

Fellow author Tony uses the same trick in many of his code samples as well. It is not a requirement when using PowerShell v7, but it makes code more compatible with v5. Since v5 is still commonly used for administrative purposes or runbooks, I recommend adopting the same habit and casting results to an array.
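The effect of the cast can be tried without a connection to Exchange Online. In the sketch below, Get-SampleMailbox is a hypothetical stand-in I made up for a cmdlet that may return one or more objects; it is not a real cmdlet.

```powershell
# Hypothetical stand-in for a cmdlet such as Get-ExoMailbox
function Get-SampleMailbox {
    param([int]$Number)
    for ($i = 1; $i -le $Number; $i++) {
        [pscustomobject]@{ UserPrincipalName = "user$i@contoso.com" }
    }
}

# Without the cast, a single result is stored as a single object
$single = Get-SampleMailbox -Number 1
$single -is [array]        # False

# With the cast, the same result is stored as a one-element array
[array]$casted = Get-SampleMailbox -Number 1
$casted -is [array]        # True
$casted.Count              # 1
```

Wrapping the call in the array subexpression operator, as in @(Get-SampleMailbox -Number 1), is an alternative that also yields an array.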

Hash Tables

A hash table, sometimes also written as hashtable, is the other common collection type; it is a dictionary-type object. Where array items are referenced using their respective index number, each element in a hash table consists of a unique key, commonly a string, and a value. Hence the dictionary reference. Take this example:

$htab= @{}
$htab['Philip']='philip@contoso.com'
# Alternative methods to set key/values
$htab.Add( 'Francis', 'francis@contoso.com')
$htab.Olrik= 'olrik@contoso.com'

PS> $htab['Philip']
philip@contoso.com

PS> $htab.remove( 'Philip')

The first line initializes an empty hash table; the next three statements set key/value pairs in the hash table using unique keys with e-mail addresses as values. The values in a hash table are not strongly typed: they can be of any object type, and types can be mixed within one hash table.
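As a small illustration of mixed value types, the following sketch stores a string, a number, and an array in the same hash table. The keys Logons and Aliases are made up for this example.

```powershell
$htab = @{}
$htab['Philip']  = 'philip@contoso.com'    # string value
$htab['Logons']  = 42                      # integer value
$htab['Aliases'] = @('phil', 'pm')         # array value

$htab.Count                        # 3
$htab.ContainsKey('Philip')        # True
$htab.ContainsKey('Olrik')         # False
$htab['Aliases'].GetType().Name    # Object[]
```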

Iterating over elements of a hash table is a bit different than iterating over an array. You might try manually iterating over the keys provided by the Keys property and retrieving every element using the key. While this works, using the GetEnumerator() method to iterate over hash tables is preferred. GetEnumerator() processes every item in a hash table, providing its key and value pairs:

# Set multiple items in one go
$htab= @{
    'Philip'= 'philip@contoso.com'
    'Francis'= 'francis@contoso.com'
    'Olrik'= 'olrik@contoso.com'
}
ForEach( $Elem in $htab.getEnumerator()) {
    Write( 'Key={0}, Value={1}' -f $Elem.Key, $Elem.Value)
}

Be advised that you cannot make changes while iterating over a hash table. Doing so will cause PowerShell to complain that you modified the collection during enumeration. The following code will not work:

$htab.getEnumerator() | ForEach-Object {
    Write( 'Key={0}, Value={1}' -f $_.Key, $_.Value)
    $htab['Olrik']= 'olrik@fabrikam.com'
}
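If you do need to change values while looping, one common workaround is to iterate over a copy of the keys instead of over the hash table itself. The sketch below assumes you only want to update existing values; the domain swap is just an example.

```powershell
$htab = @{
    'Philip'  = 'philip@contoso.com'
    'Francis' = 'francis@contoso.com'
    'Olrik'   = 'olrik@contoso.com'
}

# @(...) copies the keys into a new array first, so the loop
# enumerates that array and not the hash table being changed
ForEach( $Key in @($htab.Keys)) {
    $htab[ $Key]= $htab[ $Key] -replace 'contoso\.com$', 'fabrikam.com'
}

$htab['Olrik']    # olrik@fabrikam.com
```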

Now, as mentioned, the value can be any object type. This means we can also create a hashtable with mailbox information, using the userPrincipalName as the key:

$MailboxData= @{}
Get-ExoMailbox -ResultSize Unlimited | ForEach-Object {
    $MailboxData[ $_.userPrincipalName]= $_
}

This code iterates over all mailbox objects returned by Get-ExoMailbox and, for each mailbox, sets one item in the MailboxData hash table, using the userPrincipalName as the key. For simplicity, we set the value to the complete mailbox object with all its properties. We have now created a lookup table that allows us to determine if a mailbox is associated with a specific userPrincipalName. If so, the item’s value contains the mailbox properties we have set.
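Consulting such a lookup table could look like the sketch below. To keep it runnable without a connection, I use two made-up objects in place of Get-ExoMailbox output; the property values are assumptions for illustration only.

```powershell
# Sample objects standing in for Get-ExoMailbox output
$MailboxData= @{}
@(
    [pscustomobject]@{ UserPrincipalName= 'philip@contoso.com';  DisplayName= 'Philip' }
    [pscustomobject]@{ UserPrincipalName= 'francis@contoso.com'; DisplayName= 'Francis' }
) | ForEach-Object {
    $MailboxData[ $_.UserPrincipalName]= $_
}

# Check if a mailbox is known and, if so, use its cached properties
$Upn= 'philip@contoso.com'
If( $MailboxData.ContainsKey( $Upn)) {
    $MailboxData[ $Upn].DisplayName              # Philip
}
$MailboxData.ContainsKey( 'olrik@contoso.com')   # False
```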

I frequently use lookup tables when working with Microsoft 365 to cache data, avoiding redundant calls to Exchange Online and reducing the likelihood of being throttled. Another practical example is caching recipient information returned by Get-ExoRecipient, for example, when you want to report on group members. As users or groups might be members of more than one group, it suffices to fetch the recipient object once and reuse its information from the lookup table when necessary. Take a look at this simplified example using a function to fetch user data from Exchange Online or cache:

$Lookup= @{}

Function Get-MyUser( $Id) {
  If(-not $Lookup[ $Id]) {
    $Lookup[ $Id]= Get-ExoRecipient -Identity $Id
  }
  $Lookup[ $Id]
}

Get-Group -RecipientTypeDetails MailUniversalDistributionGroup | ForEach-Object {
    $_.Members | ForEach-Object { Get-MyUser -Id $_ }
}

Here is a quick walkthrough of this code sample:

  • First, initialize a (global) variable we call $Lookup as a hash table; this variable will function as a lookup table.
  • Then, define a simplified function to fetch objects, taking one identity parameter. The function tries to look up the global $Lookup hash table value using Id as the key. If an item is not found (the key does not exist), we have not fetched this recipient before, and we perform a Get-ExoRecipient using the Id. The result is stored in $Lookup using the identity as the key. Finally, we return the value from $Lookup; this can then either be from the lookup table or the result we just added to this table.
  • The main code fetches all distribution groups (recipient type MailUniversalDistributionGroup). For each entry in the Members property of each group, we call our function using that member’s identity.

This example demonstrates the principle, but for tenants at scale, there is more to consider. You may want to make it less memory-intensive, for example, by storing only the needed recipient object properties.
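One way to do that is to trim each object with Select-Object before caching it. The sketch below assumes a report that only needs DisplayName and PrimarySmtpAddress; Get-SampleRecipient is a hypothetical stand-in for Get-ExoRecipient so the code runs without a connection.

```powershell
$Lookup= @{}

# Hypothetical stand-in for Get-ExoRecipient
Function Get-SampleRecipient( $Identity) {
    [pscustomobject]@{
        Identity=           $Identity
        DisplayName=        ($Identity -split '@')[0]
        PrimarySmtpAddress= $Identity
        Notes=              'x' * 1000   # stands in for properties we do not need
    }
}

Function Get-MyUser( $Id) {
    If(-not $Lookup.ContainsKey( $Id)) {
        # Cache only the properties the report needs
        $Lookup[ $Id]= Get-SampleRecipient -Identity $Id |
            Select-Object -Property DisplayName, PrimarySmtpAddress
    }
    $Lookup[ $Id]
}

(Get-MyUser -Id 'philip@contoso.com').DisplayName   # philip
(Get-MyUser -Id 'philip@contoso.com').DisplayName   # philip, now served from cache
$Lookup.Count                                       # 1
```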

Bringing Order to Chaos

Where arrays are ordered, that is not the case with standard hash tables. The order in which elements are enumerated is not necessarily the order in which they are defined. You can, however, force hash tables to be ordered. The simplest way is to prefix the @ sign with [ordered] when creating a hashtable, which instructs PowerShell to create an ordered dictionary instead of a regular hashtable.

Using the hashtable we defined in one of the earlier samples:

$htabOrdered= [ordered]@{
    'Philip'= 'philip@contoso.com'
    'Francis'= 'francis@contoso.com'
    'Olrik'= 'olrik@contoso.com'
}

PS> $htabOrdered

Name                           Value
----                           -----
Philip                         philip@contoso.com
Francis                        francis@contoso.com
Olrik                          olrik@contoso.com

When inspecting the object type using getType(), you will see that the type is OrderedDictionary instead of Hashtable.
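A short sketch to verify this, which also shows that items in an ordered dictionary can be addressed by position as well as by key:

```powershell
$htabOrdered= [ordered]@{
    'Philip'=  'philip@contoso.com'
    'Francis'= 'francis@contoso.com'
    'Olrik'=   'olrik@contoso.com'
}

$htabOrdered.GetType().Name       # OrderedDictionary
$htabOrdered[0]                   # philip@contoso.com (first item by position)
$htabOrdered['Olrik']             # olrik@contoso.com (lookup by key)
@($htabOrdered.Keys) -join ','    # Philip,Francis,Olrik
```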

Dynamic Arrays

We saw earlier that arrays have their limitations. Luckily, the .NET Framework, and thus PowerShell, offers a wide range of collection types that provide more flexibility and performance, albeit at the cost of some readability and requiring code changes.

A collection type often used is ArrayList, or System.Collections.ArrayList when referenced by its full name. Using this type, however, is no longer recommended. Therefore, I advise you to use superseding types, such as System.Collections.Generic.List.

Because Generic.List does not have a type accelerator, you need to reference it by its full name. Creating a variable of this type might look strange at first, but it is a matter of getting used to. Also note that unlike Array and ArrayList, Generic.List is strongly typed, meaning you need to specify the object type you want to store. While this may sound inconvenient, it comes with a performance benefit. And since everything in PowerShell is an object, when needed, you can always specify object as the type:

$GL= [System.Collections.Generic.List[object]]::new()
$GL.add( 'One')
$GL.add( 'Two')
# Remove item, suppress output indicating successful removal
$GL.remove( 'One') | Out-Null
$GL
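To see what strongly typed means in practice, here is a minimal sketch using a list of integers. The variable name $Numbers is just for illustration.

```powershell
$Numbers= [System.Collections.Generic.List[int]]::new()
$Numbers.Add( 1)
$Numbers.Add( '2')          # PowerShell converts the string '2' to the integer 2

Try {
    $Numbers.Add( 'three')  # Cannot be converted to an integer
}
Catch {
    Write-Output 'Rejected: not an integer'
}

$Numbers.Count              # 2
```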

If you are wondering about the performance when using Generic.List instead of a good old Array, Microsoft published test results comparing adding items to an Array with adding items to a Generic.List. You can find them in the Array Addition section of the Scripting Performance Considerations document on Learn. To repeat the finding: Array addition is dramatically slower than adding to List collections.
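If you want to get a feel for the difference on your own system, a rough comparison could look like the sketch below. The item count is an arbitrary choice of mine, and the timings will vary with the PowerShell version and hardware, so treat the outcome as an indication only.

```powershell
$Count= 5000

# Add items using the addition operator on an array
$ArrayTime= Measure-Command {
    $Array= @()
    ForEach( $i in 1..$Count) { $Array+= $i }
}

# Add items using the Add method of Generic.List
$ListTime= Measure-Command {
    $List= [System.Collections.Generic.List[int]]::new()
    ForEach( $i in 1..$Count) { $List.Add( $i) }
}

'Array +=  : {0:N0} ms' -f $ArrayTime.TotalMilliseconds
'List.Add(): {0:N0} ms' -f $ListTime.TotalMilliseconds
```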

Other noteworthy types are:

  • System.Collections.SortedList creates a permanently sorted list. Sort order is based on keys.
    Example: $SortedList.Add( 'key', 'secret')
  • System.Collections.Stack creates a stack of objects following the Last In – First Out principle, e.g.
    $Stack.Push( 'Fred')
    $Stack.Pop()
  • System.Collections.Queue creates a queue of objects following the First In – First Out principle, e.g.
    $Queue.Enqueue( 'Work Item 1')
    $Queue.Enqueue( 'Work Item 2')
    $Queue.Dequeue()
  • System.Collections.Generic.Dictionary[string, object] is a more strongly typed hashtable, offering better performance. Creating it with the ::new() syntax requires PowerShell v5 or later, e.g.
    $dictionary.Add( 'key', 'value')
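The fragments in the list above assume the variables already exist. A minimal sketch of creating each type and using it, with sample keys and values I made up for the occasion:

```powershell
# SortedList: items are kept sorted by key
$SortedList= [System.Collections.SortedList]::new()
$SortedList.Add( 'b', 2)
$SortedList.Add( 'a', 1)
@($SortedList.Keys) -join ','    # a,b

# Stack: Last In, First Out
$Stack= [System.Collections.Stack]::new()
$Stack.Push( 'Fred')
$Stack.Push( 'Wilma')
$Stack.Pop()                     # Wilma

# Queue: First In, First Out
$Queue= [System.Collections.Queue]::new()
$Queue.Enqueue( 'Work Item 1')
$Queue.Enqueue( 'Work Item 2')
$Queue.Dequeue()                 # Work Item 1

# Dictionary: keys must be strings, values can be any object
$dictionary= [System.Collections.Generic.Dictionary[string,object]]::new()
$dictionary.Add( 'key', 'value')
$dictionary['key']               # value
```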

Finally, I have stored all the code that is mentioned throughout the series in a GitHub repository for your convenience. You can find it here.

About the Author

Michel De Rooij

Michel de Rooij, with over 25 years of mixed consulting and automation experience with Exchange and related technologies, is a consultant for Rapid Circle. He assists organizations in their journey to and using Microsoft 365, primarily focusing on Exchange and associated technologies and automating processes using PowerShell or Graph. Michel's authorship of several Exchange books and role in the Office 365 for IT Pros author team are a testament to his knowledge. Besides writing for Practical365.com, he maintains a blog on eightwone.com with supporting scripts on GitHub. Michel has been a Microsoft MVP since 2013.
