PowerShell: Quickly Reading the Contents of Large Binary Files

PSTip Reading file content as a byte array

Note: This tip requires PowerShell 2.0 or above.

A few days ago I had to write a script to upload files to a remote FTP server. I needed to read the file (0.7 MB) and store it as a byte array. My first attempt was to use the Get-Content cmdlet.

Get-Content c:\test.log -Encoding Byte

It works, but with one downside: it is painfully slow. I quickly resorted to an alternative method using a .NET class:

 
[System.IO.File]::ReadAllBytes('c:\test.log')

ReadAllBytes() was incredibly fast compared to the cmdlet. I measured how long each command took to finish: Get-Content took 18.308045 seconds, while ReadAllBytes() took only 0.2811065 seconds!

I had a deadline for the script, so I kept the .NET method and decided to check later what could be done to make Get-Content faster. When I came back to it, I checked the help for Get-Content and found the answer in the ReadCount parameter. By default the cmdlet sends one line at a time through the pipeline; in my case, that meant one byte at a time.

PS> Get-Help Get-Content -Parameter ReadCount
 
-ReadCount
     Specifies how many lines of content are sent through the pipeline at a time. The default value is 1. A value of 0
     (zero) sends all of the content at one time.
 
     This parameter does not change the content displayed, but it does affect the time it takes to display the content.
     As the value of ReadCount increases, the time it takes to return the first line increases, but the total time for
     the operation decreases. This can make a perceptible difference in very large items.
 
     Required?                    false
     Position?                    named
     Default value                1
     Accept pipeline input?       true (ByPropertyName)
     Accept wildcard characters?  false
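
To make the effect of ReadCount concrete, here is a minimal sketch that counts how many objects travel down the pipeline for different values. The file name `readcount-demo.txt` and its ten lines of content are made up for illustration; they are not from the original tip.

```powershell
# Hypothetical demo file: 10 short lines of text in the temp directory.
$demoFile = Join-Path ([System.IO.Path]::GetTempPath()) 'readcount-demo.txt'
1..10 | ForEach-Object { "line $_" } | Set-Content -Path $demoFile

# Emit one marker per pipeline object, then count the markers.
$perLine   = @(Get-Content $demoFile              | ForEach-Object { 1 }).Count  # default ReadCount (1)
$perFour   = @(Get-Content $demoFile -ReadCount 4 | ForEach-Object { 1 }).Count  # arrays of up to 4 lines
$allAtOnce = @(Get-Content $demoFile -ReadCount 0 | ForEach-Object { 1 }).Count  # one array with every line

"ReadCount 1 (default): $perLine pipeline objects"
"ReadCount 4:           $perFour pipeline objects"
"ReadCount 0:           $allAtOnce pipeline object"

Remove-Item $demoFile
```

With ten lines, the three counts come out as 10, 3 and 1. Fewer, larger pipeline objects mean less per-object overhead, which is why the total time drops as ReadCount grows.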

I changed it to 0 so that all content is read in a single operation, and then measured the execution time again.

Get-Content c:\test.log -Encoding Byte -ReadCount 0

At first glance the result looked very similar to the .NET method, but to my great surprise it completed even faster: only 0.2384541 seconds!
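
The comparison can be reproduced with Measure-Command. The sketch below is one possible way to do it, not the original measurement script: the file `bytes-demo.bin` is a hypothetical 200,000-byte file of random data (smaller than the 0.7 MB file in the text, to keep the slow case short), and the version check is there because PowerShell 6 and later replaced `-Encoding Byte` with `-AsByteStream`. Timings will differ from the figures quoted above depending on machine and PowerShell version.

```powershell
# Hypothetical test file: 200,000 random bytes in the temp directory.
$testFile = Join-Path ([System.IO.Path]::GetTempPath()) 'bytes-demo.bin'
$data = New-Object byte[] 200000
(New-Object System.Random).NextBytes($data)
[System.IO.File]::WriteAllBytes($testFile, $data)

# Windows PowerShell (5.1 and earlier) uses -Encoding Byte;
# PowerShell 6+ uses -AsByteStream instead.
if ($PSVersionTable.PSVersion.Major -ge 6) { $byteArgs = @{ AsByteStream = $true } }
else                                       { $byteArgs = @{ Encoding = 'Byte' } }

# Time the three approaches discussed in the text.
$tDefault = Measure-Command { $null = Get-Content $testFile @byteArgs }
$tAll     = Measure-Command { $null = Get-Content $testFile @byteArgs -ReadCount 0 }
$tDotNet  = Measure-Command { $null = [System.IO.File]::ReadAllBytes($testFile) }

# Sanity check: the fast Get-Content variant returns every byte of the file.
$byteCount = (Get-Content $testFile @byteArgs -ReadCount 0).Length

"Get-Content (default ReadCount): {0:N3} s" -f $tDefault.TotalSeconds
"Get-Content -ReadCount 0:        {0:N3} s" -f $tAll.TotalSeconds
"[IO.File]::ReadAllBytes():       {0:N3} s" -f $tDotNet.TotalSeconds
"Bytes read with -ReadCount 0:    $byteCount"

Remove-Item $testFile
```

A single run is noisy, so repeating each measurement a few times gives a fairer comparison.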


# Read the entire file to an array of bytes.
$bytes = [System.IO.File]::ReadAllBytes("C:/lmt816.exe.td")

# Write the first 16 bytes to a file.
[System.IO.File]::WriteAllBytes("Desktop/1.dat",$bytes[0..15])


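# Print every index at which a run of three consecutive 0xFF bytes begins.
# Uses the $bytes array read above; starts at index 0 and stops early enough
# that $i+2 never runs past the end of the array.
for($i=0;$i -le ($bytes.Length - 3);$i++)
{
    if(($bytes[$i] -eq 0xFF) -and ($bytes[$i+1] -eq 0xFF) -and ($bytes[$i+2] -eq 0xFF)){
        $i
    }
}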




# Split $inFile into chunks of at most $bufSize bytes, written to
# "<outPrefix>1", "<outPrefix>2", and so on.
function split($inFile, $outPrefix, [Int32] $bufSize){
 $stream = [System.IO.File]::OpenRead($inFile)
 $chunkNum = 1
 $barr = New-Object byte[] $bufSize

 # Read() advances the stream position by itself, so no explicit Seek() is needed.
 while( $bytesRead = $stream.Read($barr,0,$bufSize)){
 $outFile = "$outPrefix$chunkNum"
 # Create() truncates an existing file; OpenWrite() would leave stale trailing bytes.
 $ostream = [System.IO.File]::Create($outFile)
 $ostream.Write($barr,0,$bytesRead)
 $ostream.Close()
 echo "wrote $outFile"
 $chunkNum += 1
 }
 $stream.Close()
}




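 # Read a large file in 1,024,000-byte chunks, counting chunks and total bytes read.
 $stream = [System.IO.File]::OpenRead("F:/software/ubuntu-16.04.1-desktop-amd64.iso")
 $chunkNum = 0
 [Int64]$curOffset = 0
 $barr = New-Object byte[] 1024000

 # Read() moves the stream position forward itself; no Seek() call is required.
 while($bytesRead = $stream.Read($barr,0,$barr.Length)){
   $chunkNum += 1
   $curOffset += $bytesRead
 }
 $stream.Close()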
