3 ways to sort a list unique

Today I would like to show you a performance comparison of different ways to sort a list or array so that it contains only unique elements.

Sometimes it is necessary to sort a list or an array and get rid of its duplicates. With large lists, this can be a time-consuming task.


In this post we will have a look at 3 ways to sort a list unique.

  • Sort-Object -Unique
  • Get-Unique
  • HashSet-Class
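
To make the three approaches concrete before measuring them, here is a minimal sketch on a tiny hand-written list. The variable names ($Colors, $ViaSortObject, $ViaGetUnique, $ViaHashSet) are only illustrative and are not part of the benchmark script.

```powershell
# Tiny sample list with duplicates (illustrative data only)
$Colors = "Red","Blue","Red","Green","Blue"

# 1) Sort-Object -Unique: sorts and removes duplicates in one step
$ViaSortObject = $Colors | Sort-Object -Unique

# 2) Get-Unique: only removes adjacent duplicates, so the input must be sorted first
$ViaGetUnique = $Colors | Sort-Object | Get-Unique

# 3) HashSet: keeps each value only once; it removes duplicates but does not sort
$ViaHashSet = New-Object System.Collections.Generic.HashSet[string]
foreach ($Color in $Colors) {
    $null = $ViaHashSet.Add($Color)   # Add() returns $true/$false, which is discarded here
}

$ViaSortObject -join ","   # Blue,Green,Red
$ViaGetUnique -join ","    # Blue,Green,Red
$ViaHashSet.Count          # 3
```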

First, we will create three lists of random strings in different sizes (small, medium, large):

# List elements
$ListOptionA = "Blue","Red","Green"
$ListOptionB = "Dog","Horse","Cat"

# Note: Get-Random treats -Maximum as exclusive, so only the first two entries of
# each list are ever picked (at most 2 x 2 x 10 = 40 distinct strings per list).

# Create a small set of strings based on the list elements and a random number
$ListSmall = (0..100).ForEach({
 "$($ListOptionA[$(Get-Random -Minimum 0 -Maximum ($ListOptionA.Count-1))])_$($ListOptionB[$(Get-Random -Minimum 0 -Maximum ($ListOptionB.Count-1))])_$(Get-Random -Maximum 10 -Minimum 0)"
})

# Create a medium set of strings based on the list elements and a random number
$ListMedium = (0..10000).ForEach({
 "$($ListOptionA[$(Get-Random -Minimum 0 -Maximum ($ListOptionA.Count-1))])_$($ListOptionB[$(Get-Random -Minimum 0 -Maximum ($ListOptionB.Count-1))])_$(Get-Random -Maximum 10 -Minimum 0)"
})

# Create a large set of strings based on the list elements and a random number
$ListLarge = (0..1000000).ForEach({
 "$($ListOptionA[$(Get-Random -Minimum 0 -Maximum ($ListOptionA.Count-1))])_$($ListOptionB[$(Get-Random -Minimum 0 -Maximum ($ListOptionB.Count-1))])_$(Get-Random -Maximum 10 -Minimum 0)"
})
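
One detail of the generator is easy to miss: Get-Random never returns the value passed to -Maximum. The sketch below shows the effect on the index expression used above; $Indexes, $DistinctIndexes and $Pick are illustrative names. The last line shows Get-Random -InputObject as one possible alternative if all three entries should be eligible. That is an assumption about the intent, not something the benchmark itself does.

```powershell
$ListOptionA = "Blue","Red","Green"

# Draw 1000 indexes the same way the generator does
$Indexes = (1..1000).ForEach({ Get-Random -Minimum 0 -Maximum ($ListOptionA.Count - 1) })

# -Maximum is exclusive, so index 2 ("Green") never shows up
$DistinctIndexes = $Indexes | Sort-Object -Unique
$DistinctIndexes -join ","   # 0,1

# Alternative: let Get-Random pick one element directly from the whole list
$Pick = Get-Random -InputObject $ListOptionA
```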

Now we can start to fetch results:

# Collects one result row per method and list size
$Results = New-Object -TypeName System.Collections.Generic.List[PSCustomObject]

$ListOptions = "Small","Medium","Large"
# Method 1: Sort-Object -Unique sorts and removes duplicates in one step
$Method = "Sort-Object -Unique"
$Index = 0
($ListSmall,$ListMedium,$ListLarge).ForEach({
    $StopWatch = New-Object System.Diagnostics.Stopwatch
    $StopWatch.Start()
    $UniqueList = $($_ | Sort-Object -Unique)
    $StopWatch.Stop()
    $Results.Add([PSCustomObject]@{
        MethodName = $Method
        ListSize = "$($ListOptions[$Index]) $($_.Count)"
        Result = $UniqueList.count
        TimeElapsed = $StopWatch.Elapsed
        TimeElapsedMS = $StopWatch.ElapsedMilliseconds
    })
    $Index++
})

# Method 2: Get-Unique only removes adjacent duplicates, so the list is sorted first
$Method = "get-unique"
$Index = 0
($ListSmall,$ListMedium,$ListLarge).ForEach({
    $StopWatch = New-Object System.Diagnostics.Stopwatch
    $StopWatch.Start()
    $UniqueList = $($_ | Sort-Object | Get-Unique)
    $StopWatch.Stop()
    $Results.Add([PSCustomObject]@{
        MethodName = $Method
        ListSize = "$($ListOptions[$Index]) $($_.Count)"
        Result = $UniqueList.count
        TimeElapsed = $StopWatch.Elapsed
        TimeElapsedMS = $StopWatch.ElapsedMilliseconds
    })
    $Index++
})

# Method 3: a HashSet keeps each value only once (the result is unique, but not sorted)
$Method = "Hashset"
$Index = 0
($ListSmall,$ListMedium,$ListLarge).ForEach({
    $StopWatch = New-Object System.Diagnostics.Stopwatch
    $StopWatch.Start()
    $HashSet = New-Object System.Collections.Generic.HashSet[string]
    foreach($Listelement in $_){
        $HashSet.Add($Listelement) | Out-Null
    }
    $StopWatch.Stop()
    $Results.Add([PSCustomObject]@{
        MethodName = $Method
        ListSize = "$($ListOptions[$Index]) $($_.Count)"
        Result = $HashSet.count
        TimeElapsed = $StopWatch.Elapsed
        TimeElapsedMS = $StopWatch.ElapsedMilliseconds
    })
    $Index++
})
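
The three measurement loops above differ only in the deduplication step. As a sketch of how the repetition could be factored out, the hypothetical helper below (Measure-UniqueMethod is an invented name, not an existing cmdlet) takes the step as a script block and returns one result row. Note that the timing then also includes the script block invocation, so the numbers would not be directly comparable with the ones from the loops above.

```powershell
# Hypothetical helper: times one deduplication script block against one list
function Measure-UniqueMethod {
    param(
        [string]$MethodName,
        [string]$ListName,
        [object[]]$List,
        [scriptblock]$Action   # receives the list as its first argument
    )
    $StopWatch = [System.Diagnostics.Stopwatch]::StartNew()
    $UniqueList = @(& $Action $List)
    $StopWatch.Stop()
    [PSCustomObject]@{
        MethodName    = $MethodName
        ListSize      = "$ListName $($List.Count)"
        Result        = $UniqueList.Count
        TimeElapsedMS = $StopWatch.ElapsedMilliseconds
    }
}

# Usage with a small hand-written list (illustrative data only)
$Sample = "Red_Dog_1","Blue_Cat_2","Red_Dog_1"
$Row = Measure-UniqueMethod -MethodName "Sort-Object -Unique" -ListName "Sample" -List $Sample -Action {
    param($Items)
    $Items | Sort-Object -Unique
}
$Row.Result   # 2
```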

On my machine, the result of this run looks like this:

MethodName|ListSize|Result|TimeElapsed|TimeElapsedMS
----------|--------|------|-----------|-------------
Sort-Object -Unique|Small 101|34|00:00:00.0003934|0
Sort-Object -Unique|Medium 10001|40|00:00:00.0582319|58
Sort-Object -Unique|Large 1000001|40|00:00:12.6371431|12637
get-unique|Small 101|34|00:00:00.0005651|0
get-unique|Medium 10001|40|00:00:00.0877467|87
get-unique|Large 1000001|40|00:00:15.0103995|15010
Hashset|Small 101|34|00:00:00.0050367|5
Hashset|Medium 10001|40|00:00:00.0995172|99
Hashset|Large 1000001|40|00:00:07.8959100|7895


What can we conclude from the table above? First, no single approach is the best solution for every situation. In this run, Sort-Object -Unique was the fastest for the small and medium lists (101 and 10,001 elements), so it is a good choice for lists of that size. If the list grows dramatically, as with the large list of 1,000,001 elements, the HashSet approach is the better choice. Get-Unique should be avoided: it only works on a sorted list, so the list has to be sorted first, and this combination was slower than plain Sort-Object -Unique for every list size, as the result table shows.
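
One caveat when switching to the HashSet approach: a HashSet only removes duplicates. It does not return the elements in sorted order, and its default string comparison is case-sensitive, whereas Sort-Object -Unique compares case-insensitively by default. The sketch below illustrates both points with made-up data and illustrative variable names. System.Collections.Generic.SortedSet is shown as one option when a sorted result is needed; it was not part of the measurements above, so its timing is an open question.

```powershell
$Items = "pear","Apple","apple","banana","pear"

# Default comparer: "Apple" and "apple" count as different values
$Default = [System.Collections.Generic.HashSet[string]]::new([string[]]$Items)
$Default.Count      # 4

# Case-insensitive comparer, closer to the default behavior of Sort-Object -Unique
$IgnoreCase = [System.Collections.Generic.HashSet[string]]::new([string[]]$Items, [System.StringComparer]::OrdinalIgnoreCase)
$IgnoreCase.Count   # 3

# SortedSet removes duplicates and keeps the elements in sorted order
$Fruits = "pear","apple","banana","pear"
$Sorted = [System.Collections.Generic.SortedSet[string]]::new([string[]]$Fruits)
$Sorted -join ","   # apple,banana,pear
```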

Best regards, Christian
