Regular expression to match a specific tag

2023-09-06 10:11:03 Author: -/不管怎样活着就好

Hello!

I'm working on a regular expression in a .NET project to get a specific tag. I would like to match the entire DIV tag and its contents:

<html>
   <head><title>Test</title></head>
   <body>
     <p>The first paragraph.</p>
     <div id='super_special'>
        <p>The Store paragraph</p>
     </div>
   </body>
</html>

Code:

    Regex re = new Regex("(<div id='super_special'>.*?</div>)", RegexOptions.Multiline);


    if (re.IsMatch(test))
        Console.WriteLine("it matches");
    else
        Console.WriteLine("no match");

I want to match this:

<div id="super_special">
   <p>Anything could go in here...doesn't matter.  Let's get it all</p>
</div>

I thought . was supposed to match all characters, but it seems to be having trouble with the carriage returns. What is my regex missing?

Thanks.

Recommended answer

Out of the box, without special modifiers, most regex implementations don't let . match past the end of a line. You should look in the documentation of the regex engine you're using for a modifier that changes this.
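
In .NET, the modifier in question appears to be RegexOptions.Singleline, which makes . match line breaks as well; RegexOptions.Multiline only changes how ^ and $ behave. The following is a minimal sketch of that difference, not the asker's actual program: the class name SinglelineDemo, the helper MatchesWith, and the shortened test string are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Same pattern as in the question; only the options differ.
    private const string Pattern = "(<div id='super_special'>.*?</div>)";

    // Hypothetical helper: reports whether the pattern matches under the given options.
    public static bool MatchesWith(RegexOptions options, string html)
    {
        return new Regex(Pattern, options).IsMatch(html);
    }

    public static void Main()
    {
        // Shortened stand-in for the HTML in the question.
        string test =
            "<body>\n" +
            "  <div id='super_special'>\n" +
            "    <p>The Store paragraph</p>\n" +
            "  </div>\n" +
            "</body>";

        // Multiline only affects ^ and $; '.' still stops at '\n'.
        Console.WriteLine(MatchesWith(RegexOptions.Multiline, test));   // False

        // Singleline lets '.' match '\n' too, so the div is found.
        Console.WriteLine(MatchesWith(RegexOptions.Singleline, test));  // True
    }
}
```

Note that the pattern uses single quotes around the id, so it only matches markup written the same way.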

One more piece of advice: beware of greed! Regex quantifiers are traditionally greedy, which means that a greedy pattern such as .* would probably match all of this:

<div id="super_special">
  I'm the wanted div!
</div>
<div id="not_special">
  I'm not wanted, but I've been caught too :(
</div>

You should check for a "non-greedy" modifier, so that your regex stops matching at the first occurrence of </div>, not at the last one.
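
In .NET the non-greedy form is written by adding ? after the quantifier (.*? instead of .*), which the pattern in the question already does. A small sketch comparing the two on the example above; GreedyVsLazyDemo and FirstMatch are illustrative names, and Singleline is assumed so that . can cross line breaks:

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Hypothetical helper: returns the text of the first match, or "" if none.
    public static string FirstMatch(string pattern, string html)
    {
        // Singleline so that '.' also matches line breaks.
        return Regex.Match(html, pattern, RegexOptions.Singleline).Value;
    }

    public static void Main()
    {
        string html =
            "<div id=\"super_special\">\n  I'm the wanted div!\n</div>\n" +
            "<div id=\"not_special\">\n  I'm not wanted, but I've been caught too :(\n</div>";

        // Greedy: '.*' runs to the end and backtracks to the LAST </div>.
        string greedy = FirstMatch("<div id=\"super_special\">.*</div>", html);

        // Lazy: '.*?' expands one character at a time and stops at the FIRST </div>.
        string lazy = FirstMatch("<div id=\"super_special\">.*?</div>", html);

        Console.WriteLine(greedy.Contains("not_special"));  // True
        Console.WriteLine(lazy.Contains("not_special"));    // False
    }
}
```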

Also, as others have said, consider using an HTML parser instead of regexes. It will save you a lot of headaches.
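
As a rough illustration of the parser approach, the sketch below uses XDocument from System.Xml.Linq. This is an assumption on my part, not something the answer prescribes: XDocument is an XML parser, so it only works when the markup happens to be well-formed XML, and real-world HTML would generally need a dedicated HTML parsing library. ParserSketch and FindDivById are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class ParserSketch
{
    // Returns the markup of the first <div> with the given id, or null if none exists.
    // Assumes the input is well-formed XML; XDocument.Parse throws otherwise.
    public static string FindDivById(string xhtml, string id)
    {
        XDocument doc = XDocument.Parse(xhtml);
        XElement div = doc.Descendants("div")
            .FirstOrDefault(d => (string)d.Attribute("id") == id);
        return div?.ToString();
    }

    public static void Main()
    {
        // Well-formed variant of the question's HTML, with a nested div added.
        string xhtml =
            "<html><head><title>Test</title></head><body>" +
            "<p>The first paragraph.</p>" +
            "<div id='super_special'><div>nested</div><p>The Store paragraph</p></div>" +
            "</body></html>";

        // The parser tracks nesting, so the whole outer div is returned.
        Console.WriteLine(FindDivById(xhtml, "super_special"));
    }
}
```

Because the lookup works on the parsed element tree, quoting style and line breaks inside the tag no longer matter.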

Edit: even a non-greedy regex won't work as expected if <div>s are nested! Another reason to consider using an HTML parser.
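
A short sketch of that nesting problem, using a made-up input string and the illustrative names NestedDivDemo and LazyMatch:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NestedDivDemo
{
    // Hypothetical helper: applies the non-greedy pattern and returns the matched text.
    public static string LazyMatch(string html)
    {
        return Regex.Match(
            html,
            "<div id='super_special'>.*?</div>",
            RegexOptions.Singleline).Value;
    }

    public static void Main()
    {
        string html =
            "<div id='super_special'>" +
            "<div>inner</div>" +
            "<p>still inside super_special</p>" +
            "</div>";

        string match = LazyMatch(html);

        // The lazy match ends at the inner </div>, so the outer div is cut short.
        Console.WriteLine(match);
        // prints: <div id='super_special'><div>inner</div>

        Console.WriteLine(match.Contains("still inside"));  // False
    }
}
```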
