Reading large XML data from a socket and parsing it on the fly

2023-09-05 04:31:04 Author: 人穷脸丑农村户口

I am working on an Android client that reads a continuous stream of XML data from my Java server over a TCP socket. The server sends a '\n' character as the delimiter between consecutive responses. A model of the stream is given below.

<response1>
   <datas>
      <data>
           .....
           .....
      </data>
      <data>
           .....
           .....
      </data>
      ........
      ........
   </datas>
</response1>\n    <--- \n acts as delimiter ---/> 
<response2>

   <datas>
      <data>
           .....
           .....
      </data>
      <data>
           .....
           .....
      </data>
      ........
      ........
   </datas>
</response2>\n

Well, I hope the structure is clear now. The server transmits this response zlib-compressed. So I first have to inflate whatever I read from the server, separate the responses using the delimiter, and then parse them. I am using SAX to parse my XML.
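For experimenting away from the device, the wire format described above can be reproduced in memory. This is only a sketch under assumptions: it treats the whole connection as one zlib stream (which the single InflaterInputStream in the code further down suggests), and the names `WireFormatDemo`, `encode` and `countResponses` are made up for illustration. It inflates in fixed-size chunks and looks for the delimiter without ever holding a full response in memory.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical helper, not part of the original client or server.
class WireFormatDemo {

    /** Compresses the responses as one zlib stream, each followed by '\n'. */
    static byte[] encode(String... responses) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(wire)) {
            for (String response : responses) {
                out.write(response.getBytes(StandardCharsets.UTF_8));
                out.write('\n'); // delimiter between consecutive responses
            }
        }
        return wire.toByteArray();
    }

    /** Inflates the wire bytes chunk by chunk and counts the '\n' delimiters. */
    static int countResponses(byte[] wire) throws IOException {
        int count = 0;
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(wire))) {
            byte[] chunk = new byte[4096]; // memory use stays constant regardless of response size
            int n;
            while ((n = in.read(chunk)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (chunk[i] == '\n') {
                        count++;
                    }
                }
            }
        }
        return count;
    }
}
```

The point of the second method is that finding the delimiter does not by itself require a StringBuilder; only the bytes of the current chunk are in memory at any time.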

Now my main problem is that the XML response coming from the server can be very large (in the range of 3 to 4 MB). So, to separate responses based on the delimiter (\n), I have to use a StringBuilder to store response blocks as they are read from the socket, and on some phones a StringBuilder cannot hold strings in the megabyte range. It throws an OutOfMemory exception, and from threads like this I learned that keeping large strings around (even temporarily) is not a good idea.

Next I tried passing the inflatorReadStream (which in turn takes its data from the socket input stream) directly as the input stream of the SAX parser, without separating the XML myself and relying on SAX's ability to find the end of a document from its tags. This time one response gets parsed successfully, but on reaching the '\n' delimiter SAX throws an ExpatParserParseException saying "junk after document element".

After catching that ExpatParserParseException I tried to read again, but the SAX parser closes the stream after throwing the exception, so when I try to read/parse again I get an IOException saying the input stream is closed.
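The "stream is closed" part of the problem can be isolated with a wrapper whose close() does nothing, which is the same idea as Commons IO's CloseShieldInputStream. The sketch below uses only the JDK; `NonClosingInputStream`, `TrackingStream` and `CloseShieldDemo` are hypothetical names. Note that this alone does not fix the "junk after document element" error, and a parser may already have read past the delimiter into its own buffer, so it is only one piece of a workaround.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical wrapper: forwards reads but ignores close(). */
class NonClosingInputStream extends FilterInputStream {
    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public void close() {
        // Intentionally empty: the wrapped stream (e.g. the socket) stays open.
    }
}

/** Test double that records whether close() was called on it. */
class TrackingStream extends ByteArrayInputStream {
    boolean closed = false;

    TrackingStream(byte[] data) {
        super(data);
    }

    @Override
    public void close() {
        closed = true;
    }
}

class CloseShieldDemo {
    /** Parses one document through the shield and reports whether the source got closed. */
    static boolean underlyingClosedAfterParse(String xml) throws Exception {
        TrackingStream source = new TrackingStream(xml.getBytes(StandardCharsets.UTF_8));
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new NonClosingInputStream(source), new DefaultHandler());
        return source.closed;
    }
}
```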

A code snippet of what I have done is given below (all unrelated try/catch blocks removed for clarity).

private Socket clientSocket = null;
DataInputStream readStream = null;
DataOutputStream writeStream = null;
private StringBuilder incompleteResponse = null;
private AppContext context = null;


public boolean connectToHost(String ipAddress, int port, AppContext myContext) {
    context = myContext;
    // NOTE: `website` and `site` are not declared anywhere in this excerpt.
    website = site;
    InetAddress serverAddr = null;

    serverAddr = InetAddress.getByName(website.mIpAddress);

    clientSocket = new Socket(serverAddr, port);

    // If connected, create the read and write stream objects.
    // The read stream inflates the zlib-compressed data as it arrives.
    readStream = new DataInputStream(new InflaterInputStream(clientSocket.getInputStream()));
    writeStream = new DataOutputStream(clientSocket.getOutputStream());

    // Read and parse on a background thread.
    Thread readThread = new Thread() {
        @Override
        public void run() {
            ReadFromSocket();
        }
    };
    readThread.start();
    return true;
}


public void ReadFromSocket() {
    while (true) {
        // A fresh parser is created for every response on the same stream.
        InputSource xmlInputSource = new InputSource(readStream);
        SAXParserFactory spf = SAXParserFactory.newInstance();
        SAXParser sp = null;
        XMLReader xr = null;
        try {
            sp = spf.newSAXParser();
            xr = sp.getXMLReader();
            ParseHandler xmlHandler = new ParseHandler(context.getSiteListArray().indexOf(website), context);
            xr.setContentHandler(xmlHandler);
            xr.parse(xmlInputSource);
            // postSuccessfullParsingNotification();
        } catch (SAXException e) {
            // Thrown when the parser reaches the '\n' after the root element
            // ("junk after document element"); treated here as the end of one response.
            e.printStackTrace();
            postSuccessfullParsingNotification();
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
            postSocketDisconnectionBroadcast();
            break;
        } catch (IOException e) {
            // This is where the "input stream is closed" error shows up on the second pass.
            postSocketDisconnectionBroadcast();
            e.printStackTrace();
            break;
        } catch (Exception e) {
            postSocketDisconnectionBroadcast();
            e.printStackTrace();
            break;
        }
    }
}

And now my questions are:

Is there any way to make the SAX parser ignore junk characters after an XML response, instead of throwing an exception and closing the stream? If not, is there any way to avoid the out-of-memory error with StringBuilder? To be frank, I am not expecting a positive answer on this. Any workaround?

Solution

You might be able to use a filtering wrapper around the reader or stream you pass to the parser: one that detects the newline, ends the current parse, and then launches a new parser that continues with the same stream. Your stream as a whole is NOT valid XML, so you won't be able to parse it the way you currently have it implemented. Take a look at http://commons.apache.org/io/api-release/org/apache/commons/io/input/CloseShieldInputStream.html. No.
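One way the wrapper idea could look, as a rough sketch rather than a tested Android solution: a stream that reports end-of-file whenever it sees the '\n' delimiter, ignores close(), and can be told to move on to the next response. `DelimitedInputStream`, `nextSegment` and `ResponseReader` are hypothetical names. It assumes a response never contains a raw '\n' byte itself, and it uses the JDK's default SAX parser, so behaviour with Android's Expat-based parser is an assumption.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical wrapper: exposes one '\n'-terminated response at a time. */
class DelimitedInputStream extends InputStream {
    private final PushbackInputStream in;
    private boolean segmentEnded = true; // no segment is open until nextSegment() says so

    DelimitedInputStream(InputStream source) {
        // Buffering keeps the byte-at-a-time delimiter scan cheap.
        this.in = new PushbackInputStream(new BufferedInputStream(source));
    }

    /** Opens the next response; returns false once the underlying stream has ended. */
    boolean nextSegment() throws IOException {
        while (!segmentEnded) {
            read(); // discard anything the parser left unread
        }
        int b;
        do {
            b = in.read(); // skip empty segments (consecutive delimiters)
        } while (b == '\n');
        if (b == -1) {
            return false;
        }
        in.unread(b);
        segmentEnded = false;
        return true;
    }

    @Override
    public int read() throws IOException {
        if (segmentEnded) {
            return -1;
        }
        int b = in.read();
        if (b == -1 || b == '\n') {
            segmentEnded = true; // the parser sees end-of-file instead of the delimiter
            return -1;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int first = read(); // may block until data arrives
        if (first == -1) {
            return -1;
        }
        buf[off] = (byte) first;
        int n = 1;
        while (n < len && in.available() > 0) {
            int b = read();
            if (b == -1) {
                break;
            }
            buf[off + n++] = (byte) b;
        }
        return n;
    }

    @Override
    public void close() {
        // Ignored so that the parser cannot close the socket stream.
    }
}

class ResponseReader {
    /** Parses each response with a fresh parser; returns the root element names in order. */
    static List<String> parseAll(InputStream source) throws Exception {
        final List<String> roots = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        DelimitedInputStream segments = new DelimitedInputStream(source);
        while (segments.nextSegment()) {
            SAXParser parser = factory.newSAXParser();
            parser.parse(segments, new DefaultHandler() {
                private int depth = 0;

                @Override
                public void startElement(String uri, String localName, String qName, Attributes atts) {
                    if (depth++ == 0) {
                        roots.add(qName);
                    }
                }

                @Override
                public void endElement(String uri, String localName, String qName) {
                    depth--;
                }
            });
        }
        return roots;
    }
}
```

In the question's code, the inflating `readStream` would take the place of `source`, and the question's own ParseHandler would replace the DefaultHandler used here. Because each response is handed to the parser as it streams in, no StringBuilder is needed to hold a whole response.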