What is the fastest way to create a checksum for large files in C#?

2023-09-02 20:44:56 Author: 暮

I have to sync large files across several machines. The files can be up to 6 GB in size. The sync will be done manually every few weeks. I can't take the filenames into consideration because they can change at any time.

My plan is to create checksums on the destination PC and on the source PC, and then copy to the destination every file whose checksum is not already present there. My first attempt was something like this:

using System;
using System.IO;
using System.Security.Cryptography;

private static string GetChecksum(string file)
{
    using (FileStream stream = File.OpenRead(file))
    using (SHA256Managed sha = new SHA256Managed()) // hash objects are IDisposable
    {
        // Hash the whole file, then format the digest as an uppercase hex string
        byte[] checksum = sha.ComputeHash(stream);
        return BitConverter.ToString(checksum).Replace("-", String.Empty);
    }
}

The problem was the runtime: with SHA256, a 1.6 GB file took 20 minutes; with MD5, a 1.6 GB file took 6.15 minutes.

Is there a better, faster way to get the checksum (maybe with a better hash function)?

Recommended answer

The problem here is that SHA256Managed reads 4096 bytes at a time (inherit from FileStream and override Read(byte[], int, int) to see how much it reads from the file stream on each call), which is too small a buffer for disk I/O.
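The measurement suggested in the parentheses could be sketched roughly as follows. The class name LoggingFileStream and the Probe helper are hypothetical names made up for this illustration, and the chunk size you observe may differ between .NET versions, so treat the printed numbers as something to check on your own machine rather than a guaranteed result.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper (not part of .NET): a FileStream that records
// how many bytes each Read call asks for.
public class LoggingFileStream : FileStream
{
    public int ReadCalls { get; private set; }
    public int MaxRequested { get; private set; }

    public LoggingFileStream(string path)
        : base(path, FileMode.Open, FileAccess.Read) { }

    public override int Read(byte[] buffer, int offset, int count)
    {
        // 'count' is the size the caller requested for this read
        ReadCalls++;
        MaxRequested = Math.Max(MaxRequested, count);
        return base.Read(buffer, offset, count);
    }

    // Hashes the file and reports how the hash algorithm read from the stream.
    public static void Probe(string path)
    {
        using (var stream = new LoggingFileStream(path))
        using (var sha = SHA256.Create())
        {
            sha.ComputeHash(stream);
            Console.WriteLine("Read calls: " + stream.ReadCalls +
                              ", largest request: " + stream.MaxRequested + " bytes");
        }
    }
}
```

Calling LoggingFileStream.Probe on a large file should show many small requests if the hash algorithm really reads in small chunks.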

To speed things up (2 minutes to hash a 2 GB file on my machine with SHA256, 1 minute with MD5), wrap the FileStream in a BufferedStream and set a reasonably large buffer size (I tried a buffer of about 1 MB):

// Disposing the BufferedStream also closes the underlying FileStream,
// so wrapping it in a single using block is enough.
using (var stream = new BufferedStream(File.OpenRead(filePath), 1200000))
{
    // The rest remains the same
}
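Put together, the buffered version might look something like the sketch below. The class name FileChecksum and the exact buffer size (1024 * 1024 bytes, in line with the roughly 1 MB mentioned above) are assumptions for illustration, and SHA256.Create() is used here in place of SHA256Managed; the best buffer size probably depends on the disk and should be measured.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class FileChecksum
{
    // Assumed buffer size of about 1 MB; tune this for your hardware.
    private const int BufferSize = 1024 * 1024;

    public static string GetChecksum(string filePath)
    {
        using (var file = File.OpenRead(filePath))
        using (var stream = new BufferedStream(file, BufferSize))
        using (var sha = SHA256.Create())
        {
            // The hash algorithm still asks for small chunks, but they are
            // served from the large in-memory buffer instead of the disk.
            byte[] checksum = sha.ComputeHash(stream);
            return BitConverter.ToString(checksum).Replace("-", string.Empty);
        }
    }
}
```

Swapping SHA256.Create() for MD5.Create() gives the faster MD5 variant, which may be acceptable when the checksum is only used to detect identical files and not for security.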