How to get the XML declaration header? (Android)

2023-09-07 04:27:54 Author: who you

I'm writing an RSS reader app for Android, and I need to know the encoding of the XML (windows-1251 or UTF-8) before I start parsing it. The encoding is given in the XML declaration header, i.e. <?xml version="1.0" encoding="UTF-8"?>. How can I get this header before parsing? I use the android.sax implementation of the SAX parser and pass the encoding as a string parameter to InputStreamReader. I found a related question, "SAX Parser doesn't recognize windows-1255 encoding", but the solution there is to convert cp-1251 to UTF-8, which is too cumbersome and resource-intensive. I think there must be a better solution, since I only need the encoding value from the header. But I can't get this header from the XML: the parser starts from the <rss> tag. How should I get it?

Recommended answer

Well, the answer turned out to be pretty obvious :) Here is the code that worked, based on Squonk's comment:

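// Peek at the start of the stream for the encoding named in the XML declaration.
// bs is a BufferedInputStream; this fragment sits inside a method returning the charset name.
byte[] data = new byte[50];
try {
    bs.mark(60);                            // remember the start so bs can be reset afterwards
    int n = bs.read(data, 0, data.length);  // may return fewer than 50 bytes, or -1 at end of stream
    // The declaration itself is plain ASCII, so it decodes the same way in either encoding.
    String value = new String(data, 0, Math.max(n, 0), "US-ASCII").toLowerCase();
    if (value.contains("utf-8"))
        return "UTF-8";
    else if (value.contains("1251"))
        return "windows-1251";
    return "UTF-8";                         // assumption: fall back to the XML default if neither matches
} catch (IOException e) {
    Log.d("debug", "Exception: " + e);
    return "XML not found";
}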

Then just call bs.reset() on bs (the BufferedInputStream) and read it with whichever charset was detected.
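For completeness, the same peek-and-reset idea can be written a bit more generally. This is only a sketch under stated assumptions, not the code from the answer: the class name XmlEncodingSniffer, the methods detectEncoding and openReader, the 128-byte PEEK_LIMIT, and the fallback to UTF-8 are all made up for illustration. It pulls the encoding name out of the declaration with a regular expression instead of checking for two hard-coded names, and it does not handle UTF-16 documents.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlEncodingSniffer {
    // Hypothetical limit: how many leading bytes to inspect for the declaration.
    private static final int PEEK_LIMIT = 128;

    // Captures the value of the encoding pseudo-attribute in <?xml ... ?>.
    private static final Pattern ENCODING = Pattern.compile(
            "<\\?xml[^>]*\\bencoding\\s*=\\s*[\"']([A-Za-z0-9._-]+)[\"']");

    /** Returns the declared encoding, or "UTF-8" (assumed default) if none is found. */
    static String detectEncoding(BufferedInputStream in) throws IOException {
        in.mark(PEEK_LIMIT);                 // remember the start of the stream
        byte[] head = new byte[PEEK_LIMIT];
        int total = 0;
        while (total < head.length) {        // read() may return fewer bytes than requested
            int n = in.read(head, total, head.length - total);
            if (n < 0) break;                // end of stream
            total += n;
        }
        in.reset();                          // rewind so the parser sees the whole document
        // ISO-8859-1 maps every byte to one char, so the ASCII declaration survives intact.
        String text = new String(head, 0, total, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING.matcher(text);
        return m.find() ? m.group(1) : "UTF-8";
    }

    /** Wraps the stream in a Reader that uses the detected charset. */
    static Reader openReader(BufferedInputStream in) throws IOException {
        return new InputStreamReader(in, detectEncoding(in));
    }
}
```

The Reader returned by openReader could then be wrapped in an org.xml.sax.InputSource and handed to the SAX parser. If the declared charset is not supported on the device, the InputStreamReader constructor throws UnsupportedEncodingException, which is an IOException.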