.NET regular expressions: what is the word character \w?

2023-09-02 01:35:47 Author: 被撕碎了的回忆

Simple question: What is the pattern for the word character \w in C#/.NET?

My first thought was that it matches [A-Za-z0-9_], and the documentation tells me:


Character class    Description                    Pattern    Matches
\w                 Matches any word character.    \w         "I", "D", "A", "1", "3" in "ID A1.3"

which is not very helpful. And \w seems to match äöü, too. What else does it match? Is there a better (exact) definition available?

Recommended answer

From the documentation:

Word Character: \w

\w matches any word character. A word character is a member of any of the Unicode categories listed in the following table.

Category    Description
Ll          Letter, Lowercase
Lu          Letter, Uppercase
Lt          Letter, Titlecase
Lo          Letter, Other
Lm          Letter, Modifier
Nd          Number, Decimal Digit
Pc          Punctuation, Connector. This category includes ten characters, the most commonly used of which is the LOWLINE character (_), U+005F.
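To see how the category list plays out in practice, here is a minimal sketch that prints the Unicode category of a few sample characters next to whether \w matches them. It assumes a .NET version that supports top-level statements (C# 9 or later). The helper name IsWordChar and the choice of sample characters are made up for this illustration and are not part of the quoted documentation.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper (not a framework API): true when the single
// character c is matched by \w under the default regex options.
static bool IsWordChar(char c) => Regex.IsMatch(c.ToString(), @"\w");

// Sample characters chosen for illustration: ASCII letters, umlauts,
// an ASCII digit, ARABIC-INDIC DIGIT THREE (U+0663, category Nd),
// LOW LINE, UNDERTIE (U+203F, category Pc), and three non-word characters.
char[] samples = { 'a', 'Z', 'ä', 'ö', 'ü', '7', '\u0663', '_', '\u203F', '-', '.', ' ' };

foreach (char c in samples)
{
    // CharUnicodeInfo reports the general category the table refers to.
    UnicodeCategory category = CharUnicodeInfo.GetUnicodeCategory(c);
    Console.WriteLine($"U+{(int)c:X4}  {category,-22}  \\w matches: {IsWordChar(c)}");
}
```

The umlauts from the question are in category Ll (reported by .NET as LowercaseLetter), which is why \w matches them. The hyphen, period, and space belong to none of the listed categories, so they are not matched.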

If ECMAScript-compliant behavior is specified, \w is equivalent to [a-zA-Z_0-9].
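The difference between the two modes can be checked directly by passing RegexOptions.ECMAScript. The following is a small sketch, again assuming top-level statements (C# 9 or later). The helper name IsWordCharWith and the sample characters are hypothetical and only serve the comparison.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not a framework API): tests a single character
// against \w under the given regex options.
static bool IsWordCharWith(char c, RegexOptions options) =>
    Regex.IsMatch(c.ToString(), @"\w", options);

// 'ä' is a non-ASCII letter; U+0663 is ARABIC-INDIC DIGIT THREE (category Nd).
char[] samples = { 'a', '_', '7', 'ä', '\u0663' };

foreach (char c in samples)
{
    bool unicodeMode = IsWordCharWith(c, RegexOptions.None);
    bool ecmaMode = IsWordCharWith(c, RegexOptions.ECMAScript);
    Console.WriteLine($"U+{(int)c:X4}  default: {unicodeMode}  ECMAScript: {ecmaMode}");
}
```

With the default options all five samples match. With RegexOptions.ECMAScript only 'a', '_' and '7' match, because \w is then restricted to [a-zA-Z_0-9].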

See also

Unicode Character Database
Unicode Characters in the 'Punctuation, Connector' Category

 