Does X.ToCharArray().Length equal GetBytes(X).Length? Keywords: ToCharArray, Length, GetBytes

2023-09-04 04:41:01 Author: 风向决定发型

string s = "test";
int charCount = s.ToCharArray().Length;
int byteCount = System.Text.Encoding.Default.GetBytes(s).Length;

When can charCount != byteCount happen? I believe it can happen with Unicode characters, but not in the general case.

.NET supports Unicode characters, but is Unicode the default (System.Text.Encoding.Default) encoding for .NET? On my machine, System.Text.Encoding.Default shows System.Text.SBCSCodePageEncoding as the encoding, which is single-byte.

Recommended answer

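What System.Text.Encoding.Default returns depends on the runtime. On .NET Framework it is the system's active ANSI code page, which is why it shows up as the single-byte SBCSCodePageEncoding here. On .NET Core and .NET 5+ it is UTF-8, which uses 1-4 bytes per code point.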

charCount and byteCount will not be equal when any character in string s is encoded with more than 1 byte.
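As a rough illustration of when the two counts diverge, here is a minimal sketch that explicitly uses Encoding.UTF8 rather than Encoding.Default, so the result does not depend on the machine's code page. The variable names and the sample strings are made up for this example.

```csharp
using System;
using System.Text;

// ASCII-only text: every char maps to exactly one UTF-8 byte.
string ascii = "test";
// "t", U+00E9 (e with acute, 2 bytes in UTF-8), U+20AC (euro sign, 3 bytes in UTF-8).
string mixed = "t\u00E9\u20AC";

int asciiChars = ascii.ToCharArray().Length;
int asciiBytes = Encoding.UTF8.GetBytes(ascii).Length;
int mixedChars = mixed.ToCharArray().Length;
int mixedBytes = Encoding.UTF8.GetBytes(mixed).Length; // 1 + 2 + 3 bytes

Console.WriteLine($"ascii: {asciiChars} chars, {asciiBytes} bytes");
Console.WriteLine($"mixed: {mixedChars} chars, {mixedBytes} bytes");
```

For the ASCII string the two counts match (4 and 4); for the mixed string they do not (3 chars, 6 bytes).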

To force a fixed 2 bytes per char, you can check using the Unicode encoding (which is UTF-16 in .NET); for "test", byteCount will then be 8.

int byteCount = System.Text.Encoding.Unicode.GetBytes(s).Length;
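For comparison, the sketch below puts a few encodings side by side for the same string. It assumes nothing beyond the standard System.Text.Encoding members; the variable names are illustrative only. GetByteCount is used instead of GetBytes(...).Length simply to avoid allocating the byte array.

```csharp
using System;
using System.Text;

string s = "test";

// Encoding.Unicode is UTF-16 little-endian: each char is one 2-byte code unit.
int utf16Bytes = Encoding.Unicode.GetByteCount(s);  // 4 chars * 2 bytes
// Encoding.UTF32 is the fixed 4-bytes-per-code-point encoding.
int utf32Bytes = Encoding.UTF32.GetByteCount(s);    // 4 code points * 4 bytes

Console.WriteLine($"UTF-16: {utf16Bytes} bytes, UTF-32: {utf32Bytes} bytes");

// What Encoding.Default is depends on the runtime and machine, so this line only reports it.
Console.WriteLine($"Default: {Encoding.Default.EncodingName}, single-byte: {Encoding.Default.IsSingleByte}");
```

If Encoding.Default reports itself as single-byte, charCount and byteCount from the question will match for every string.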