String type in .NET vs. char array (arrays, strings, chars, types)

2023-09-06 06:00:28 Author: ッ过期不候

I've been working with some programs here at work for about a month now that have a lot of string parsing and such going on. I've been advised to use a char array for this stuff as opposed to a string because the char array is faster. I understand why a char array is fast, but what is it about the string type that makes it slower? What data structure is it implementing and is there any way to make it as fast as a char array?

Recommended answer

The most obvious difference is that string is immutable. So you can't modify parts of it and need to create a completely new copy on each modification.
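
As a minimal sketch of what immutability means in practice: replacing a single character cannot be done in place, so a helper has to copy the characters and build a new string. The class and method names (ImmutabilityDemo, ReplaceAt) are made up for illustration and are not part of .NET.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Hypothetical helper: returns a new string with the character at
    // 'index' replaced. The input string itself is never modified.
    public static string ReplaceAt(string text, int index, char value)
    {
        char[] buffer = text.ToCharArray(); // copies every character
        buffer[index] = value;              // the array is mutable
        return new string(buffer);          // allocates a second string
    }

    public static void Main()
    {
        string original = "parse";
        string changed = ReplaceAt(original, 0, 'P');

        Console.WriteLine(original); // the original is untouched
        Console.WriteLine(changed);  // the modified copy
    }
}
```

Doing this once is cheap; doing it inside a parsing loop is where the repeated copying starts to cost something.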

String itself has a very special implementation (it's a variable size class) and is not backed by an array. I see no reason why read-only access to chars in a string should be slow.

So if you want to change small parts of a string, you need to use either StringBuilder or char[]. Of these two char[] is/was faster since StringBuilder has additional verifications and indirections. But since this is an implementation detail it might have changed since I last tested it.
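
The two options can be sketched side by side. Both helpers below overwrite a range of characters in a mutable buffer and only create a string at the end; the names (InPlaceEditDemo, MaskWithCharArray, MaskWithStringBuilder) are hypothetical, and the only library members used are String.ToCharArray, the string(char[]) constructor, and the StringBuilder indexer.

```csharp
using System;
using System.Text;

public static class InPlaceEditDemo
{
    // Overwrites 'count' characters starting at 'start' using a char[].
    public static string MaskWithCharArray(string text, int start, int count)
    {
        char[] buffer = text.ToCharArray();
        for (int i = start; i < start + count; i++)
        {
            buffer[i] = '*'; // plain array store
        }
        return new string(buffer);
    }

    // Same edit through the StringBuilder indexer.
    public static string MaskWithStringBuilder(string text, int start, int count)
    {
        var sb = new StringBuilder(text);
        for (int i = start; i < start + count; i++)
        {
            sb[i] = '*'; // indexer call, with extra checks inside StringBuilder
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(MaskWithCharArray("secret-token", 0, 6));
        Console.WriteLine(MaskWithStringBuilder("secret-token", 0, 6));
    }
}
```

The char[] version suits fixed-length edits; StringBuilder is the more convenient choice when the text also needs to grow or shrink.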

Just benchmarked it, and as of .NET 4 setting an element of a char[] is about four times as fast as setting one through a StringBuilder. But both can do more than 200 million assignments per second, so it rarely matters in practice.

Reading from a char[] is slightly faster (25% for my test code) than reading from a string. Reading from a StringBuilder, on the other hand, is slower (by a factor of 3) than reading from a char[].

In all benchmarks I neglected the overhead of my other code. This means that my test underestimates the differences a bit.

My conclusion is that while char[] is faster than the alternatives, it only matters if you're processing hundreds of megabytes per second.

// Write StringBuilder
StringBuilder sb = new StringBuilder();
sb.Length = 256;
for (int i = 0; i < 1000000000; i++)
{
    int j = i & 255; // wrap the index into 0..255
    sb[j] = 'A';
}

// Write char[]
char[] cs = new char[256];
for (int i = 0; i < 1000000000; i++)
{
    int j = i & 255;
    cs[j] = 'A';
}

// Read string
string str = new String('A', 256);
int stringSum = 0;
for (int i = 0; i < 1000000000; i++)
{
    int j = i & 255;
    stringSum += str[j];
}

// Read char[]
char[] chars = new String('A', 256).ToCharArray();
int charSum = 0;
for (int i = 0; i < 1000000000; i++)
{
    int j = i & 255;
    charSum += chars[j];
}

// Read StringBuilder
StringBuilder builder = new StringBuilder(new String('A', 256));
int builderSum = 0;
for (int i = 0; i < 1000000000; i++)
{
    int j = i & 255;
    builderSum += builder[j];
}

(Yes, I know my benchmark code isn't very good, but I don't think it makes a big difference.)
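
For anyone wanting to repeat the measurement, here is one way the read loops could be wrapped in a small timing harness. This is a rough sketch, not the code used for the numbers above: the class and method names are hypothetical, the iteration count is reduced, and the results will vary with the runtime version and hardware. It uses System.Diagnostics.Stopwatch and returns the sums so the loops are not optimized away.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class IndexerBenchmark
{
    public static int SumString(string s, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += s[i & 255]; // assumes s has at least 256 characters
        }
        return sum;
    }

    public static int SumCharArray(char[] s, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += s[i & 255];
        }
        return sum;
    }

    public static int SumStringBuilder(StringBuilder s, int iterations)
    {
        int sum = 0;
        for (int i = 0; i < iterations; i++)
        {
            sum += s[i & 255];
        }
        return sum;
    }

    // Runs the body once as a warm-up (so JIT compilation is not timed),
    // then times a second run.
    public static long TimeMs(Func<int> body)
    {
        body();
        Stopwatch sw = Stopwatch.StartNew();
        int result = body();
        sw.Stop();
        GC.KeepAlive(result);
        return sw.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 100000000;
        string text = new string('A', 256);
        char[] chars = text.ToCharArray();
        StringBuilder builder = new StringBuilder(text);

        Console.WriteLine("string:        " + TimeMs(() => SumString(text, iterations)) + " ms");
        Console.WriteLine("char[]:        " + TimeMs(() => SumCharArray(chars, iterations)) + " ms");
        Console.WriteLine("StringBuilder: " + TimeMs(() => SumStringBuilder(builder, iterations)) + " ms");
    }
}
```

Build in Release mode and run without a debugger attached; otherwise the timings mostly reflect disabled optimizations rather than the types being compared.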