Complete HTML stripping function

2023-09-04 06:56:53 Author: 鱼和胸罩不可兼得

I have an HTML string like this:

<p>First Sentence is this.&#160;Second sentence is this.</p>

I can remove the <p> tags from the above string using a regex.

But how do I remove &#160;-encoded characters from the above string in WinForms?

I don't want &#160; to be present in the output.

Recommended answer

You can use XElement.Parse to get the node value like this:

 var htmlString = "<p>First Sentence is this.&#160;Second sentence is this.</p>";
 var result = System.Xml.Linq.XElement.Parse(htmlString).Value;

If some of the strings are not well-formed XML on their own, or contain no tags at all, you can wrap them in a dummy root element like this:

 var htmlString = "<p>First Sentence is this.&#160;Second sentence is this.</p>";
 var result = System.Xml.Linq.XElement.Parse("<root>" + htmlString + "</root>").Value;

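Result: First Sentence is this. Second sentence is this. (the &#160; entity is decoded to a non-breaking space character, U+00A0)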

You might want to add error handling, since XElement.Parse throws an XmlException on input that is not well-formed XML, but this is clearly better than using a regex.
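The error handling mentioned above could look roughly like the following. This is only a sketch: the class name HtmlText and the method TryStripTags are hypothetical names chosen for illustration, and it assumes that returning false is an acceptable way to report malformed input.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

public static class HtmlText
{
    // Hypothetical helper: returns true and the tag-free text when the
    // fragment parses as XML, false when it is not well-formed.
    public static bool TryStripTags(string fragment, out string text)
    {
        try
        {
            // Wrap in a dummy root so bare text and several top-level tags still parse.
            text = XElement.Parse("<root>" + fragment + "</root>").Value;
            return true;
        }
        catch (XmlException)
        {
            // Malformed input, e.g. an unclosed tag or an HTML-only entity such as &nbsp;
            text = string.Empty;
            return false;
        }
    }
}
```

A caller can then fall back to another strategy (for example, decoding entities only) when the method returns false.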

Edit:

If this still does not work and you only want to handle the entities, you can use the System.Web.HttpUtility.HtmlDecode method to replace HTML entities with their literal characters:

var final_result = System.Web.HttpUtility.HtmlDecode(result);
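Note that decoding &#160; produces the non-breaking space character U+00A0 rather than an ordinary space. If that character should not appear in the output either, a sketch along these lines may help. It uses System.Net.WebUtility.HtmlDecode, which is a swapped-in alternative to the HttpUtility call above that does not need a System.Web reference; the class and method names are hypothetical, and replacing U+00A0 with a plain space is an assumption about what the output should contain.

```csharp
using System;
using System.Net;

public static class EntityCleaner
{
    // Hypothetical helper: decodes HTML entities, then swaps non-breaking
    // spaces for ordinary spaces.
    public static string DecodeAndNormalize(string input)
    {
        if (string.IsNullOrEmpty(input))
        {
            return input;
        }

        // WebUtility.HtmlDecode turns &#160; into the character U+00A0.
        string decoded = WebUtility.HtmlDecode(input);

        // Assumption: a plain space is wanted in place of the non-breaking space.
        return decoded.Replace('\u00A0', ' ');
    }
}
```

This only handles entities; any tags in the input are left untouched, so it is meant to run on text whose tags have already been stripped.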
