TagSoup vs. Jsoup vs. HTML Parser vs. HotSax

2023-09-05 06:22:20 Author: 说爱太烫嘴

The abundance of HTML parsers to choose from (and stick with) is mind-boggling:

http://java-source.net/open-source/html-parsers

How do I choose the one that best suits the following requirements?

- Mature (fewer bugs than the rest)
- Live and breathing (i.e. being maintained)
- Fast and resource-efficient (intended to run on Android)

Recommended answer

Well, I found the answer, which was given by @BalusC on a different thread:

- If you just want to use an XML-based tool to traverse it: JTidy
- If you'd like to unit test the HTML: HtmlUnit
- If you'd like to extract specific data from the HTML: Jsoup

Thank you, @BalusC.
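
For the data-extraction case, here is a minimal sketch of what using Jsoup might look like. It assumes the jsoup library is on the classpath; the class name `JsoupLinkExtractor`, the helper `extractLinks`, and the sample HTML are made up for illustration and are not from the original thread.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class JsoupLinkExtractor {

    // Hypothetical helper: collects the href value of every <a> element
    // that actually carries an href attribute.
    static List<String> extractLinks(String html) {
        // Jsoup.parse tolerates malformed markup and builds a DOM-like tree.
        Document doc = Jsoup.parse(html);
        List<String> links = new ArrayList<>();
        // CSS selector: anchors that have an href attribute.
        for (Element a : doc.select("a[href]")) {
            links.add(a.attr("href"));
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample input, assumed for illustration only.
        String html = "<html><body><p>Parsers: "
                + "<a href='https://jsoup.org/'>jsoup</a> "
                + "<a>no link here</a></p></body></html>";
        System.out.println(extractLinks(html));
    }
}
```

The CSS-style selector is the main appeal here: you describe which elements you want instead of walking the tree by hand. Whether this is fast and light enough for Android is something you would have to measure on your own documents.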