Getting HTML from a frame using the WebBrowser control - UnauthorizedAccessException

2023-09-04 03:44:43 Author: 安于时光

I'm looking for a free tool or DLLs that I can use to write my own .NET code for processing some web requests. Say I have a URL with a query string parameter, such as http://www.example.com?param=1. When I open it in a browser, several redirects occur, and the HTML that is eventually rendered contains a frameset; the inner HTML of one frame contains a table with the data I need. I want to store this data in an external file in CSV format. The data differs depending on the value of the query string parameter param. Say I want to run the application and generate 1000 CSV files, for param values from 1 to 1000.

I have good knowledge of .NET, JavaScript, and HTML, but the main problem is how to get the final HTML in the server code.

What I tried: I created a new Windows Forms application, added a WebBrowser control, and used code like this:

    private void FormMain_Shown(object sender, EventArgs e)
    {
        var param = 1; //test
        var url = string.Format(Constants.URL_PATTERN, param);

        WebBrowserMain.Navigated += WebBrowserMain_Navigated;
        WebBrowserMain.Navigate(url);
    }

    void WebBrowserMain_Navigated(object sender, WebBrowserNavigatedEventArgs e)
    {
        if (e.Url.OriginalString == Constants.FINAL_URL)
        {
            var document = WebBrowserMain.Document.Window.Frames[0].Document;
        }
    }

Unfortunately, I receive an UnauthorizedAccessException, probably because the frame and the document are in different domains. Does anybody have an idea of how to work around this, or maybe a completely different approach to implementing functionality like this?

Recommended answer

Thanks to Noseratio's comments, I managed to do this with the WebBrowser control. Here are some major points that might help others who have similar questions:

1) The DocumentCompleted event should be used. In the Navigated event, the body of the document is null.

2) The following answer helped a lot: WebBrowserControl: UnauthorizedAccessException when accessing property of a Frame

3) I was not aware of IHTMLWindow2 and similar interfaces. For them to work correctly, I added references to the following COM libraries: Microsoft Internet Controls (SHDocVw) and Microsoft HTML Object Library (MSHTML).
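
The CrossFrameIE helper called in the code for point 4 is not shown in this post. What follows is only a sketch of how such a helper is commonly written, assuming it follows the pattern of the answer linked in point 2: try the frame's document directly, and on an access-denied error ask the frame window for its own browser object through the COM service provider. The interface name IOleServiceProvider and the overall structure are assumptions made for illustration; both COM references from point 3 are required.

```csharp
using System;
using System.Runtime.InteropServices;
using mshtml; // COM reference: Microsoft HTML Object Library

// Hand-written declaration of the COM IServiceProvider interface.
// The name IOleServiceProvider is an assumption, chosen to avoid
// clashing with System.IServiceProvider; only the GUID matters to COM.
[ComImport, Guid("6D5140C1-7436-11CE-8034-00AA006009FA"),
 InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
public interface IOleServiceProvider
{
    [PreserveSig]
    int QueryService(ref Guid guidService, ref Guid riid,
                     [MarshalAs(UnmanagedType.Interface)] out object ppvObject);
}

public static class CrossFrameIE
{
    private const int E_ACCESSDENIED = unchecked((int)0x80070005);
    private static Guid IID_IWebBrowserApp = new Guid("0002DF05-0000-0000-C000-000000000046");
    private static Guid IID_IWebBrowser2 = new Guid("D30C1661-CDAF-11D0-8A3E-00C04FC9E26E");

    // Returns the frame's document, or null if it cannot be obtained.
    public static IHTMLDocument2 GetDocumentFromWindow(IHTMLWindow2 window)
    {
        if (window == null) return null;

        try
        {
            // Succeeds when the frame is in the same domain as its parent.
            return window.document;
        }
        catch (UnauthorizedAccessException) { /* fall through to the COM route */ }
        catch (COMException ex)
        {
            if (ex.ErrorCode != E_ACCESSDENIED) return null;
        }

        try
        {
            // Ask the frame window for the browser object that hosts it,
            // then read the document from that browser instead.
            var provider = (IOleServiceProvider)window;
            object browser;
            provider.QueryService(ref IID_IWebBrowserApp, ref IID_IWebBrowser2, out browser);
            return (IHTMLDocument2)((SHDocVw.IWebBrowser2)browser).Document;
        }
        catch (Exception)
        {
            return null;
        }
    }
}
```

Because the helper returns null on failure, callers should check the result before touching document.body.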

4) I grabbed the HTML of the frame with the following code:

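    void WebBrowserMain_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        if (e.Url.OriginalString == Constants.FINAL_URL)
        {
            try
            {
                // Go through the raw COM interfaces rather than the managed HtmlWindow wrapper.
                var doc = (IHTMLDocument2) WebBrowserMain.Document.DomDocument;
                var frame = (IHTMLWindow2) doc.frames.item(0);
                var document = CrossFrameIE.GetDocumentFromWindow(frame);
                var html = document.body.outerHTML;

                var dataParser = new DataParser(html);
                //my logic here
            }
            catch (Exception ex)
            {
                // The snippet is cut off after the try block in the source; this handler is an assumed placeholder.
                System.Diagnostics.Debug.WriteLine(ex);
            }
        }
    }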

5) For working with the HTML I used the HTML Agility Pack, which is pretty nice and has pretty good XPath search: http://htmlagilitypack.codeplex.com/
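
The DataParser class itself is not shown above. As a rough illustration of point 5 and of the CSV goal from the question, here is a minimal sketch that uses the HTML Agility Pack to pull the first table out of the frame's HTML with XPath and turn it into CSV text. The class name TableToCsv, its method names, and the assumption that the data sits in the first table element are all hypothetical; the real page layout may need a more specific XPath.

```csharp
using System;
using System.Linq;
using System.Text;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class TableToCsv
{
    // Quotes a field if it contains a comma, a double quote, or a line break.
    public static string EscapeCsvField(string value)
    {
        if (value.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return value;
        return "\"" + value.Replace("\"", "\"\"") + "\"";
    }

    // Converts the first <table> found in the HTML into CSV text.
    // Returns an empty string when no table or no rows are found.
    public static string FromHtml(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var table = doc.DocumentNode.SelectSingleNode("//table");
        if (table == null) return string.Empty;

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var rows = table.SelectNodes(".//tr");
        if (rows == null) return string.Empty;

        var csv = new StringBuilder();
        foreach (var row in rows)
        {
            var cells = row.SelectNodes("th|td");
            if (cells == null) continue;

            // Decode HTML entities and trim whitespace before escaping.
            var fields = cells.Select(
                c => EscapeCsvField(HtmlEntity.DeEntitize(c.InnerText).Trim()));
            csv.AppendLine(string.Join(",", fields));
        }
        return csv.ToString();
    }
}
```

Inside the DocumentCompleted handler this could be used as System.IO.File.WriteAllText(string.Format("data_{0}.csv", param), TableToCsv.FromHtml(html)), where the file name pattern is again just an example.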