Best way to find the position in a Stream where a given byte sequence starts

2023-09-11 22:37:09 Author: 先森,借过一下

What do you think is the best way to find the position in a System.Stream where a given byte sequence starts (first occurrence):

public static long FindPosition(Stream stream, byte[] byteSequence)
{
    long position = -1;

    /// ???
    return position;
}

P.S. The simplest yet fastest solution is preferred. :)

Recommended answer

I arrived at this solution.

I ran some benchmarks with an ASCII file of 3.050 KB and 38803 lines. With a 22-byte search array located in the last line of the file, I got the result in about 2.28 seconds (on a slow/old machine).

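// Requires: using System.IO; using System.Linq; (SequenceEqual is a LINQ extension).
// The stream must be seekable, because Length and Position are used.
public static long FindPosition(Stream stream, byte[] byteSequence)
{
    if (byteSequence.Length > stream.Length)
        return -1;

    // Sliding window of exactly byteSequence.Length bytes.
    byte[] buffer = new byte[byteSequence.Length];

    // Note: disposing the BufferedStream also closes the underlying stream.
    using (BufferedStream bufStream = new BufferedStream(stream, byteSequence.Length))
    {
        int i;
        // Stop as soon as a full window can no longer be read.
        while ((i = bufStream.Read(buffer, 0, byteSequence.Length)) == byteSequence.Length)
        {
            if (byteSequence.SequenceEqual(buffer))
                return bufStream.Position - byteSequence.Length;
            else
                // Rewind so the next window starts at the earliest possible match.
                bufStream.Position -= byteSequence.Length - PadLeftSequence(buffer, byteSequence);
        }
    }

    return -1;
}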

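// Returns how far the window can shift: the smallest i >= 1 such that the last
// (Length - i) bytes of 'bytes' equal the first (Length - i) bytes of 'seqBytes',
// or bytes.Length if no suffix of the window is a prefix of the sequence.
private static int PadLeftSequence(byte[] bytes, byte[] seqBytes)
{
    int i = 1;
    while (i < bytes.Length)
    {
        int n = bytes.Length - i;
        byte[] aux1 = new byte[n];
        byte[] aux2 = new byte[n];
        Array.Copy(bytes, i, aux1, 0, n);   // suffix of the window starting at i
        Array.Copy(seqBytes, aux2, n);      // prefix of the sequence of the same length
        if (aux1.SequenceEqual(aux2))
            return i;
        i++;
    }
    return i;
}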
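
For comparison, here is a minimal sketch of a different technique, not the method used in the answer above: a Knuth-Morris-Pratt style scan that reads the stream forward only, in large chunks. The class name StreamSearch, the method name FindPositionKmp, and the 81920-byte chunk size are assumptions made for illustration. Unlike the code above, it does not seek, does not dispose the caller's stream, and returns the offset relative to where the stream was positioned when the call started (the same as the absolute position if the stream starts at 0).

```csharp
using System;
using System.IO;

public static class StreamSearch
{
    // Hypothetical helper, not part of the original answer.
    // Returns the offset of the first occurrence of pattern, counted from the
    // stream's position at the time of the call, or -1 if it is not found.
    public static long FindPositionKmp(Stream stream, byte[] pattern)
    {
        if (stream == null) throw new ArgumentNullException(nameof(stream));
        if (pattern == null) throw new ArgumentNullException(nameof(pattern));
        if (pattern.Length == 0) return 0; // assumption: an empty pattern matches immediately

        // failure[i] = length of the longest proper prefix of pattern[0..i]
        // that is also a suffix of pattern[0..i].
        int[] failure = new int[pattern.Length];
        for (int i = 1, k = 0; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k]) k = failure[k - 1];
            if (pattern[i] == pattern[k]) k++;
            failure[i] = k;
        }

        byte[] buffer = new byte[81920]; // chunk size is an arbitrary choice
        long consumed = 0;               // bytes read in earlier chunks
        int matched = 0;                 // pattern bytes matched so far
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                byte b = buffer[i];
                // On a mismatch, fall back to the longest prefix that still fits.
                while (matched > 0 && b != pattern[matched]) matched = failure[matched - 1];
                if (b == pattern[matched]) matched++;
                if (matched == pattern.Length)
                    return consumed + i + 1 - pattern.Length;
            }
            consumed += read;
        }
        return -1;
    }
}
```

As a quick check, searching a MemoryStream over the ASCII bytes of "abcabcabd" for the ASCII bytes of "abcabd" should return 3. Because the match state carries over between chunks, a short or partial Read does not affect the result, and each byte is examined once rather than being re-read after a rewind.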
 