Flex / AS3: removing HTML tags with regular expressions

2023-09-09 21:52:03 Author: 雨中哭泣

I'm writing an HTML parser in Flex (AS3) and need to remove some HTML tags that aren't needed.

For example, I want to remove the redundant nested divs from this code:

            <div>
              <div>
                <div>
                  <div>
                    <div>
                      <div>
                        <div>
                          <p style="padding-left: 18px; padding-right: 20px; text-align: center;">
                            <span></span>
                            <span style=" font-size: 48px; color: #666666; font-style: normal; font-weight: bold; text-decoration: none; font-family: Arial;">20% OFF.</span>
                            <span> </span>
                            <span style=" font-size: 48px; color: #666666; font-style: normal; font-weight: normal; text-decoration: none; font-family: Arial;">Do it NOW!</span>
                            <span> </span>
                          </p>
                        </div>
                      </div>
                    </div>
                  </div>
                </div>
              </div>
            </div>

and end up with this:

                        <div>
                          <p style="padding-left: 18px; padding-right: 20px; text-align: center;">
                            <span></span>
                            <span style=" font-size: 48px; color: #666666; font-style: normal; font-weight: bold; text-decoration: none; font-family: Arial;">20% OFF.</span>
                            <span> </span>
                            <span style=" font-size: 48px; color: #666666; font-style: normal; font-weight: normal; text-decoration: none; font-family: Arial;">Do it NOW!</span>
                            <span> </span>
                          </p>
                        </div>

My question is: how can I write a regular expression to remove these unwanted divs? Or is there a better way to do it?

Thanks in advance.

Recommended answer

Assuming that your target HTML is actually valid XML, you can skip the regular expression and use a recursive function to pull out the non-div content.

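// Recursively collects every node that is not a <div>, unwrapping each <div> it meets.
// Note: this strips all nested divs; wrap the result in a single <div> if one should remain.
static function grabNonDivContents(xml:XML):XMLList {
    var out:XMLList = new XMLList();
    var kids:XMLList = xml.children();
    for each (var kid:XML in kids) {
        // name() is null for text nodes, so guard before comparing.
        if (kid.name() && kid.name() == "div") {
            // Unwrap the div and keep whatever non-div content it holds.
            var grandkids:XMLList = grabNonDivContents(kid);
            for each (var grandkid:XML in grandkids) {
                out += grandkid;
            }
        } else {
            out += kid;
        }
    }
    return out;
}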
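
Since the function strips every div while the question wants one to remain, the result can be wrapped in a fresh `<div>`. Here is a rough sketch of how that might look. The class name `DivCollapseExample` and the shortened sample markup are made up for illustration, and it assumes the input parses as XML with no namespaces on the div elements.

```actionscript
package {
    import flash.display.Sprite;

    // Hypothetical document class, used only to host the example.
    public class DivCollapseExample extends Sprite {
        public function DivCollapseExample() {
            // Shortened stand-in for the markup in the question (assumed to be valid XML).
            var source:XML =
                <div><div><div>
                    <p style="text-align: center;"><span>20% OFF.</span><span>Do it NOW!</span></p>
                </div></div></div>;

            // Strip every div, then put the remaining nodes back inside one new div.
            var wrapper:XML = <div/>;
            wrapper.setChildren(grabNonDivContents(source));

            trace(wrapper.toXMLString());
        }

        // Same approach as the function in the answer: unwrap divs, keep everything else.
        private static function grabNonDivContents(xml:XML):XMLList {
            var out:XMLList = new XMLList();
            for each (var kid:XML in xml.children()) {
                if (kid.name() && kid.name() == "div") {
                    out += grabNonDivContents(kid);
                } else {
                    out += kid;
                }
            }
            return out;
        }
    }
}
```

The traced markup should be the `<p>` element with its spans inside a single `<div>`. Note that divs nested inside a non-div element (for example a div inside a `<p>`) are left alone, because only div children are recursed into.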