PowerShell Performance-Test: File Reading - Which method reads a file the fastest?
PowerShell offers several ways to read the content of a file, the most commonly used being Get-Content.
Note that all of the tests here read a file line by line. Thanks to nohwnd for that hint.
However, there are faster ways to read files that are not as widely known. In this article, we compare six different methods for reading files in PowerShell and determine which one is the fastest. The following methods were tested:
- Get-Content
- [System.IO.File]::ReadAllLines()
- [System.IO.StreamReader]::ReadLine() - Classic Way
- [System.IO.StreamReader]::ReadLine() - Peek
- [System.IO.StreamReader]::ReadLine() - End Of Stream
- Switch-Statement -File
All tests were performed using a file containing roughly 500,000 lines.
Get-Content
The most typical PowerShell way to read a file is to use Get-Content:
$File = Get-Content $LargeFile
foreach ($Line in $File) {
    $lines++
}
In this method, we first have to store the file content into a variable, and then we can iterate through it.
[System.IO.File]::ReadAllLines()
The second most common way is to use the [System.IO.File] class with the ReadAllLines() method:
$File = [System.IO.File]::ReadAllLines($LargeFile)
foreach ($Line in $File) {
    $lines++
}
Similarly to Get-Content, we have to read all lines first and then iterate through them.
[System.IO.StreamReader]::ReadLine() - Classic Way
The following three methods are variations of the System.IO.StreamReader class with the ReadLine() method. The most common way to use it is:
$sread = [System.IO.StreamReader]::new($LargeFile)
# The loop ends when ReadLine() returns $null, but also at the first empty line
while ($sread.ReadLine()) {
    $lines++
}
In this method, we initialize the StreamReader object and iterate right through, which should be quite fast. However, this method has a pitfall: ReadLine() returns an empty string for an empty line, PowerShell treats an empty string as false, and so the loop stops at the first empty line instead of at the end of the file. This is demonstrated in Evgenij Smirnov's blog post. To solve this problem, Evgenij provides two other methods using the StreamReader class.
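To make the empty-line problem concrete, here is a minimal sketch. The sample file name and the variable names are made up for illustration, and the null-checked loop is a common idiom rather than one of the methods measured in this article.

```powershell
# Hypothetical sample file: three lines, the second one is empty.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'readline-demo.txt'
[System.IO.File]::WriteAllLines($sample, [string[]]@('first', '', 'third'))

# Classic loop: '' converts to $false, so the loop ends at the empty line.
$classicCount = 0
$reader = [System.IO.StreamReader]::new($sample)
try {
    while ($reader.ReadLine()) { $classicCount++ }
}
finally {
    $reader.Dispose()
}

# Null-checked loop: only $null (end of file) ends the loop.
$nullCheckCount = 0
$reader = [System.IO.StreamReader]::new($sample)
try {
    while ($null -ne ($line = $reader.ReadLine())) { $nullCheckCount++ }
}
finally {
    $reader.Dispose()
}

"Classic: $classicCount, null check: $nullCheckCount"   # Classic: 1, null check: 3
Remove-Item $sample
```

The classic loop only counts the first line here, while the null-checked loop counts all three.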
[System.IO.StreamReader]::ReadLine() - Peek
The first of these methods is the Peek() method:
$sread = [System.IO.StreamReader]::new($LargeFile)
while ($sread.Peek() -gt -1) {
    $sread.ReadLine() | Out-Null
    $lines++
}
This method uses the Peek() method to check if there are any characters left to read, and then reads the line and increments the line count.
[System.IO.StreamReader]::ReadLine() - End Of Stream
The second method uses the EndOfStream property:
$sread = [System.IO.StreamReader]::new($LargeFile)
while ($sread.EndOfStream -eq $false) {
    $line = $sread.ReadLine()
    $lines++
}
This method checks if the EndOfStream property is false and then reads the line and increments the line count.
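The snippets in this article never close the StreamReader, which keeps the file handle open until the object is cleaned up. As a sketch, the EndOfStream loop could be wrapped like this; the function name Get-LineCountEndOfStream and the sample file are hypothetical and were not part of the measured tests.

```powershell
function Get-LineCountEndOfStream {
    param([Parameter(Mandatory)][string]$Path)

    $count = 0
    # Pass a full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $reader = [System.IO.StreamReader]::new($Path)
    try {
        while (-not $reader.EndOfStream) {
            $null = $reader.ReadLine()   # discard the line, we only count
            $count++
        }
    }
    finally {
        $reader.Dispose()                # always release the file handle
    }
    $count
}

# Usage with a hypothetical three-line sample file
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'eos-demo.txt'
[System.IO.File]::WriteAllLines($sample, [string[]]@('a', '', 'c'))
$eosCount = Get-LineCountEndOfStream -Path $sample
$eosCount   # 3
Remove-Item $sample
```

Because the reader is disposed in the finally block, the file can be deleted or rewritten right after the call, even if reading fails partway through.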
Switch-Statement -File
One uncommon but useful way to read a file in PowerShell is to use a switch statement with the -File parameter. The switch statement can then be used to perform different actions based on the contents of the file.
switch -File ($LargeFile) {
    Default {
        $lines++
    }
}
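As an illustration of performing different actions based on the contents of the file, the following sketch combines -File with -Regex. The log file and its contents are invented for this example.

```powershell
# Hypothetical log file, created only for this example.
$sample = Join-Path ([System.IO.Path]::GetTempPath()) 'switch-demo.log'
[System.IO.File]::WriteAllLines($sample, [string[]]@(
    'INFO service started',
    'ERROR disk full',
    'WARN low memory',
    'INFO service stopped',
    'ERROR timeout'
))

$errors = 0; $warnings = 0; $other = 0
# Each line of the file is tested against the patterns in turn.
switch -Regex -File $sample {
    '^ERROR' { $errors++ }
    '^WARN'  { $warnings++ }
    default  { $other++ }      # runs only when no pattern matched
}

"errors=$errors warnings=$warnings other=$other"   # errors=2 warnings=1 other=2
Remove-Item $sample
```

Without a continue statement in a clause, PowerShell keeps testing the remaining patterns for the same line, so overlapping patterns can each fire once per line.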
To compare the performance of this method with other popular ways of reading files, we conducted a test on a file with 500,001 lines. Here are the results:
| Lines | Method | Time |
|---|---|---|
| 500001 | Get-Content | 10815 |
| 500001 | [System.IO.File]::ReadAllLines() | 3374 |
| 500001 | [System.IO.StreamReader]::ReadLine() - Classic Way | 3644 |
| 500001 | [System.IO.StreamReader]::ReadLine() - Peek | 10278 |
| 500001 | [System.IO.StreamReader]::ReadLine() - End Of Stream | 4619 |
| 500001 | Switch-Statement -File | 1356 |
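Surprisingly, the switch statement with the -File parameter was the fastest method at 1356, roughly eight times faster than Get-Content at 10815. The three StreamReader variants differed considerably: the classic loop took 3644 and the EndOfStream loop 4619, while the Peek loop took 10278. One likely contributor to the Peek result is that its loop pipes every line to Out-Null, which the other two variants do not do.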
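If you want to run a comparison like this yourself, a minimal harness could look like the following. This is only a sketch and not the script behind the numbers above: the Stopwatch timing, the 50,000-line sample file, and the selection of three methods are assumptions made for the example.

```powershell
# Hypothetical test file; the file used for the results above is not published here.
$LargeFile = Join-Path ([System.IO.Path]::GetTempPath()) 'perf-demo.txt'
[System.IO.File]::WriteAllLines($LargeFile, [string[]](1..50000 | ForEach-Object { "line $_" }))

# Each script block counts the lines with one method and returns the count.
$tests = [ordered]@{
    'Get-Content'    = { $n = 0; foreach ($l in Get-Content $LargeFile) { $n++ }; $n }
    'ReadAllLines()' = { $n = 0; foreach ($l in [System.IO.File]::ReadAllLines($LargeFile)) { $n++ }; $n }
    'switch -File'   = { $n = 0; switch -File $LargeFile { default { $n++ } }; $n }
}

$results = foreach ($name in $tests.Keys) {
    $block = $tests[$name]
    $sw = [System.Diagnostics.Stopwatch]::StartNew()
    $count = & $block
    $sw.Stop()
    [pscustomobject]@{
        Lines        = $count
        Method       = $name
        Milliseconds = $sw.ElapsedMilliseconds
    }
}

$results | Format-Table -AutoSize
Remove-Item $LargeFile
```

Absolute timings depend on the machine, the PowerShell version, and file system caching, so it is worth running each method several times before drawing conclusions.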
It's always good to keep different methods in mind when working with files and choosing the right one for the task at hand. Who knows, in the future we might discover even faster ways to read files!
That's all for now.
If you have any thoughts or feedback on this topic, feel free to share them with me on Twitter at Christian Ritter.
Best regards,
Christian.