Regex to match a pattern only when it is not preceded by a different pattern

2023-09-04 00:48:05 Author: 不开外挂

I need a regex that is to be used for text substitution. Example: text to be matched is ABC (which could be surrounded by square brackets), substitution text is DEF. This is basic enough. The complication is that I don't want to match the ABC text when it is preceded by the pattern \[[\d ]+\]\. - in other words, when it is preceded by a word or set of words in brackets, followed by a period.

Here are some examples of source text to be matched, and the result, after the regex substitution would be made:

1. [xxx xxx].[ABC] > [xxx xxx].[ABC] (does not match - first part fits the pattern)
2. [xxx xxx].ABC   > [xxx xxx].ABC   (does not match - first part fits the pattern)
3. [xxx.ABC        > [xxx.DEF        (matches - first part has no closing bracket)
4. [ABC]           > [DEF]           (matches - no first part)
5. ABC             > DEF             (matches - no first part)
6. [xxx][ABC]      > [xxx][DEF]      (matches - no period in between)
7. [xxx]. [ABC]    > [xxx]. [DEF]    (matches - space in between)

What it comes down to is: how can I specify the preceding pattern that when present as described will prevent a match? What would the pattern be in this case? (C# flavor of regex)

Recommended answer

You want a negative look-behind expression. These look like (?<!pattern), so:

(?<!\[[\d ]+\]\.)\[?ABC\]?

Note that this does not force a matching pair of square brackets around ABC; it just allows for an optional open bracket before and an optional close bracket after. If you wanted to force a matching pair or none, you'd have to use alternation:

(?<!\[[\d ]+\]\.)(?:ABC|\[ABC\])

This uses non-capturing parentheses to delimit the alternation. If you want to actually capture ABC, you can of course turn that into a capture group.
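As a rough illustration of that last remark, the sketch below swaps the non-capturing `(?:...)` for a plain capturing group and reads it back through `Groups[1]`. The class name `CaptureSketch` and the helper `Captured` are made up for this example, and the sample inputs use digits inside the brackets because the look-behind is written with `[\d ]`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureSketch {
  // Same look-behind as above; the alternation is now capture group 1.
  public const string Pattern = @"(?<!\[[\d ]+\]\.)(ABC|\[ABC\])";

  // Hypothetical helper: returns the text captured by group 1, or null when nothing matches.
  public static string Captured(string text) {
    Match m = Regex.Match(text, Pattern);
    return m.Success ? m.Groups[1].Value : null;
  }

  public static void Main() {
    foreach (string text in new[] {"ABC", "[ABC]", "[123].ABC"}) {
      Console.WriteLine("{0}: {1}", text, Captured(text) ?? "(no match)");
    }
  }
}
```

With these inputs the group holds `ABC` for the bare form and `[ABC]` for the bracketed form, and `[123].ABC` produces no match because the look-behind sees `[123].` directly before it.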

ETA: The reason the first expression seems to fail is that it is matching on ABC], which is not preceded by the prohibited text. The open bracket [ is optional, so it just doesn't match that. The way around this is to shift the optional open bracket [ into the negative look-behind assertion, like so:

(?<!\[[\d ]+\]\.\[?)ABC\]?

An example of what it matches and doesn't:

[123].[ABC]: fail (expected: fail)
[123 456].[ABC]: fail (expected: fail)
[123.ABC: match (expected: match)
    matched: ABC
ABC: match (expected: match)
    matched: ABC
[ABC]: match (expected: match)
    matched: ABC]
[ABC[: match (expected: fail)
    matched: ABC
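Since the question is ultimately about substitution, here is a minimal sketch of how the look-behind could be used with `Regex.Replace`. It is an assumption on my part, not something stated in the answer, that the trailing `\]?` is best dropped for replacement: only `ABC` is then consumed, so any surrounding brackets stay in place. `ReplaceSketch` and `Substitute` are hypothetical names, and the inputs mirror the question's examples with digits substituted for `xxx` to fit `[\d ]`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReplaceSketch {
  // The optional "[" sits inside the look-behind, so only "ABC" itself is matched and replaced.
  public const string Pattern = @"(?<!\[[\d ]+\]\.\[?)ABC";

  // Hypothetical helper: replaces every permitted ABC with DEF.
  public static string Substitute(string text) {
    return Regex.Replace(text, Pattern, "DEF");
  }

  public static void Main() {
    string[] inputs = {
      "[123 456].[ABC]", "[123 456].ABC", "[123.ABC", "[ABC]", "ABC", "[123][ABC]", "[123]. [ABC]"
    };
    foreach (string text in inputs) {
      Console.WriteLine("{0} > {1}", text, Substitute(text));
    }
  }
}
```

The first two inputs should come back unchanged, and the remaining five should have `ABC` replaced by `DEF` with their brackets, periods, and spaces intact.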

Trying to make the presence of an open bracket [ force a matching close bracket ], as the second pattern intended, is trickier, but this seems to work:

(?:(?<!\[[\d ]+\]\.\[)ABC\]|(?<!\[[\d ]+\]\.)(?<!\[)ABC(?!\]))

An example of what it matches and doesn't:

[123].[ABC]: fail (expected: fail)
[123 456].[ABC]: fail (expected: fail)
[123.ABC: match (expected: match)
    matched: ABC
ABC: match (expected: match)
    matched: ABC
[ABC]: match (expected: match)
    matched: ABC]
[ABC[: fail (expected: fail)
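The balanced pattern can include the closing `]` in the match, so a plain replacement string would swallow that bracket. One possible way around this, sketched here under that assumption, is to pass a `MatchEvaluator` that rewrites only the `ABC` part of each match. `BalancedReplaceSketch` and `Substitute` are names invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BalancedReplaceSketch {
  // The balanced-bracket pattern from the answer, unchanged.
  public const string Pattern =
      @"(?:(?<!\[[\d ]+\]\.\[)ABC\]|(?<!\[[\d ]+\]\.)(?<!\[)ABC(?!\]))";

  // Hypothetical helper: the evaluator swaps ABC for DEF inside the matched text,
  // so a matched trailing "]" is kept.
  public static string Substitute(string text) {
    return Regex.Replace(text, Pattern, m => m.Value.Replace("ABC", "DEF"));
  }

  public static void Main() {
    string[] inputs = {"[123].[ABC]", "[123.ABC", "ABC", "[ABC]", "[ABC["};
    foreach (string text in inputs) {
      Console.WriteLine("{0} > {1}", text, Substitute(text));
    }
  }
}
```

On these inputs, `[123].[ABC]` and `[ABC[` should be left alone, while `[123.ABC`, `ABC`, and `[ABC]` become `[123.DEF`, `DEF`, and `[DEF]`.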

The examples were generated using this code:

// Compile and run with: mcs so_regex.cs && mono so_regex.exe
using System;
using System.Text.RegularExpressions;

public class SORegex {
  public static void Main() {
    // Test inputs and, index for index, the outcome each one should produce.
    string[] values = {"[123].[ABC]", "[123 456].[ABC]", "[123.ABC", "ABC", "[ABC]", "[ABC["};
    string[] expected = {"fail", "fail", "match", "match", "match", "fail"};
    // Swap the commented-out line in to test the balanced-bracket variant instead.
    string pattern = @"(?<!\[[\d ]+\]\.\[?)ABC\]?";  // Don't force [ to match ].
    //string pattern = @"(?:(?<!\[[\d ]+\]\.\[)ABC\]|(?<!\[[\d ]+\]\.)(?<!\[)ABC(?!\]))";  // Force balanced brackets.
    Console.WriteLine("pattern: {0}", pattern);
    int i = 0;
    foreach (string text in values) {
      Match m = Regex.Match(text, pattern);
      bool isMatch = m.Success;
      Console.WriteLine("{0}: {1} (expected: {2})", text, isMatch? "match" : "fail", expected[i++]);
      if (isMatch) Console.WriteLine("\tmatched: {0}", m.Value);
    }
  }
}