How do I use the HTML Agility Pack to get img/src or a/href values?

2023-09-03 00:36:54 Author: 枝头花几许

I want to use the HTML Agility Pack to parse the image and href links from an HTML page, but I don't know much about XML or XPath. Although I have looked through the help documents on many websites, I still can't solve the problem. In addition, I am using C# in Visual Studio 2005. I can't speak English fluently, so I would sincerely thank anyone who can write some helpful code.



The first example on the home page does something very similar, but consider:

 HtmlDocument doc = new HtmlDocument();
 doc.Load("file.htm"); // would need doc.LoadHtml(htmlSource) if it is not a file
 // newer releases of the library expose the root as doc.DocumentNode
 foreach(HtmlNode link in doc.DocumentElement.SelectNodes("//a[@href]"))
 {
    string href = link.Attributes["href"].Value;
    // store href somewhere
 }

So you can imagine that for img@src, just replace each a with img, and href with src. You might even be able to simplify to:

 foreach(HtmlNode node in doc.DocumentElement
              .SelectNodes("//a/@href | //img/@src"))
 {
    // read the href or src value from node here
 }


For relative url handling, look at the Uri class.
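
As a rough sketch of what that could look like: the snippet below resolves href/src values against the address of the page they came from, using `Uri.TryCreate` from `System`. The `Resolve` helper, the class name, and the example.com addresses are made up for illustration; substitute the URL you actually downloaded the HTML from. It avoids newer language features so it should also compile under C# 2.0 (Visual Studio 2005).

```csharp
using System;

class RelativeUrlDemo
{
    // Hypothetical helper: returns the absolute form of an href/src value,
    // or null when the value cannot be combined with the page address.
    public static string Resolve(Uri pageUri, string raw)
    {
        Uri absolute;
        if (Uri.TryCreate(pageUri, raw, out absolute))
        {
            return absolute.AbsoluteUri;
        }
        return null;
    }

    static void Main()
    {
        // Assumed address the HTML was loaded from.
        Uri pageUri = new Uri("http://example.com/articles/index.html");

        // Values as they might appear in href or src attributes.
        string[] rawLinks = { "images/logo.png", "/about.html", "http://other.example.org/x.html" };

        foreach (string raw in rawLinks)
        {
            // Relative values are resolved against pageUri;
            // values that are already absolute pass through unchanged.
            Console.WriteLine(Resolve(pageUri, raw));
        }
    }
}
```

Inside the loops above you would pass each attribute value to `Resolve` before storing it, so every stored link is absolute regardless of how it was written in the page.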