XMLHttpRequest and Chrome Developer Tools don't say the same thing

2023-09-10 16:44:38 作者:孤堡恋情 Dark

I'm downloading a ~50MB file in 5 MB chunks using XMLHttpRequest and the Range header. Things work great, except for detecting when I've downloaded the last chunk.

Here's a screenshot of the request and response for the first chunk. Notice the Content-Length is 1024 * 1024 * 5 (5 MB). Also notice that the server responds correctly with the first 5 MB, and in the Content-Range header, properly specifies the size of the entire file (after the /):

When I copy the response body into a text editor (Sublime), I only get 5,242,736 characters instead of the expected 5,242,880 as indicated by Content-Length:

Why are 144 characters missing? This is true of every chunk that gets downloaded, though the exact difference varies a little bit.

However, what's especially strange is the last chunk. The server responds with the last ~2.9 MB of the file (instead of a whole 5 MB) and apparently properly indicates this in the response:

Notice that I am requesting the next 5 MB (even though it goes beyond the total file size). No biggie, the server responds with the last part of the file and the headers indicate the actual byte range returned.

But do they really?

When I call xhr.getResponseHeader("Content-Length") with JavaScript, I see a different story in Chrome:

The XMLHttpRequest object is telling me that another 5 MB was downloaded, beyond the end of the file. Is there something I don't understand about the xhr object?

What's even weirder is that it works in Firefox 30 as expected:

So between the xhr.responseText.length not matching the Content-Length and these headers not agreeing between the xhr object and the Network tools, I don't know what to do to fix this.

What is causing these discrepancies?

Update: I have confirmed that the server itself is sending the correct response, despite the overshot Range header in the request for the last chunk. This is the output from the raw HTTP request, thanks to good ol' telnet:

HTTP/1.1 206 Partial Content
Server: nginx/1.4.5
Date: Mon, 14 Jul 2014 21:50:06 GMT
Content-Type: application/octet-stream
Content-Length: 2987360
Last-Modified: Sun, 13 Jul 2014 22:05:10 GMT
Connection: keep-alive
ETag: "53c30296-2fd9560"
Content-Range: bytes 47185920-50173279/50173280

So it looks like Chrome is malfunctioning. Should this be filed as a bug? Where?

Recommended answer

The main issue is that you are reading binary data as text. Note that the server responds with Content-Type: application/octet-stream which doesn't specify the encoding explicitly - in that case the browser will typically assume that the data is encoded in UTF-8. While the length will mostly be unchanged (bytes with values 0 to 127 are interpreted as a single character in UTF-8 and bytes with higher values will usually be replaced by the replacement character �), your binary file will certainly contain a few valid multi-byte UTF-8 sequences - and these will be combined into one character. That explains why responseText.length doesn't match the number of bytes received from the server.
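To make the length mismatch concrete, here is a minimal sketch that decodes a handful of bytes as UTF-8 and compares the byte count with the resulting string length. The sample bytes are made up for illustration and are not taken from the file in the question; TextDecoder is assumed to be available (modern browsers and Node.js).

```javascript
// Hypothetical sample bytes, chosen only to illustrate the effect.
const bytes = new Uint8Array([
  0x41,             // "A": one byte, one character
  0xC3, 0xA9,       // valid 2-byte UTF-8 sequence: two bytes, one character
  0xE2, 0x82, 0xAC, // valid 3-byte UTF-8 sequence: three bytes, one character
  0xFF,             // never valid in UTF-8: becomes U+FFFD, still one character
]);

// Decode the bytes the way a text response would be decoded.
const text = new TextDecoder("utf-8").decode(bytes);

console.log(bytes.length); // 7
console.log(text.length);  // 4
```

Every valid multi-byte sequence that happens to occur in a binary chunk shrinks the string a little in the same way, which would account for a small difference that varies from chunk to chunk.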

Now you could of course force a specific encoding using the request.overrideMimeType() method. ISO 8859-1 would make sense in particular, because the first 256 Unicode code points are identical to ISO 8859-1:

request.overrideMimeType("application/octet-stream; charset=iso-8859-1");

That should make sure that one byte will always be interpreted as one character. Still, a better approach would be storing the server response in an ArrayBuffer which is explicitly meant to deal with binary data.

var request = new XMLHttpRequest();
request.open(...);
request.responseType = "arraybuffer";
request.send();

...

var array = new Uint8Array(request.response);
alert("First byte has value " + array[0]);
alert("Array length is " + array.length);
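Once each chunk arrives as an ArrayBuffer, its byteLength is the exact number of bytes received, so the chunks can be joined without any text decoding. The following is only a sketch: concatChunks is a hypothetical helper, and the two small buffers stand in for successive request.response values.

```javascript
// Hypothetical helper: joins ArrayBuffer chunks into one Uint8Array.
function concatChunks(chunks) {
  // Sum the exact byte counts of all chunks.
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const joined = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    // Copy each chunk's bytes at the current offset.
    joined.set(new Uint8Array(chunk), offset);
    offset += chunk.byteLength;
  }
  return joined;
}

// Stand-ins for two downloaded chunks (a full one and a shorter last one).
const first = new Uint8Array([1, 2, 3]).buffer;
const last = new Uint8Array([4, 5]).buffer;

const whole = concatChunks([first, last]);
console.log(whole.length); // 5
```

A chunk whose byteLength is smaller than the requested chunk size is one possible signal that the end of the file has been reached.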

According to MDN, responseType = "arraybuffer" is supported starting with Chrome 10, Firefox 6 and Internet Explorer 10. See also: Typed arrays.

Side-note: Firefox also supports responseType = "moz-chunked-text" and responseType = "moz-chunked-arraybuffer" starting with Firefox 9 which allow receiving data in chunks without resorting to ranged requests. It seems that Chrome doesn't plan to implement it, instead they are working on implementing the Streams API.

Edit: I was unable to reproduce your issue with Chrome lying to you about the response headers, at least not without your code. However, the code responsible should be this function in partial_data.cc:

// We are making multiple requests to complete the range requested by the user.
// Just assume that everything is fine and say that we are returning what was
// requested.
void PartialData::FixResponseHeaders(HttpResponseHeaders* headers,
                                     bool success) {
  if (truncated_)
    return;

  if (byte_range_.IsValid() && success) {
    headers->UpdateWithNewRange(byte_range_, resource_size_, !sparse_entry_);
    return;
  }

This code will remove the Content-Length and Content-Range headers returned by the server and replace them with ones generated from your request parameters. Given that I cannot reproduce the issue myself, the following are only guesses:

This code path seems to be used only for requests that can be satisfied from cache, so I guess that things will work correctly if you clear your cache. The resource_size_ variable must have a wrong value in your case, larger than the actual size of the requested file. This variable is determined from the Content-Range header in the first chunk requested; maybe you have a server response cached there which indicates a larger file.
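For reference, the total resource size is the number after the slash in Content-Range. Below is a minimal sketch of reading it in JavaScript, using the header value from the telnet output in the question. parseContentRange is a hypothetical helper, not a browser API, and it deliberately ignores the "*" forms the header can also take.

```javascript
// Hypothetical helper: parses a value such as "bytes 0-5242879/50173280".
function parseContentRange(value) {
  const match = /^bytes (\d+)-(\d+)\/(\d+)$/.exec(value.trim());
  if (!match) return null; // "bytes */N" or an unknown total ("*") is not handled here
  const [start, end, total] = match.slice(1).map(Number);
  return {
    start,
    end,
    total,
    length: end - start + 1,  // number of bytes in this chunk
    isLast: end + 1 >= total, // true when the chunk reaches the end of the file
  };
}

// Header value taken from the raw response shown in the question.
const info = parseContentRange("bytes 47185920-50173279/50173280");
console.log(info.length); // 2987360
console.log(info.isLast); // true
```

In the browser the value would come from xhr.getResponseHeader("Content-Range"). If the headers were rewritten as guessed above, the parsed numbers would reflect the rewritten values, so comparing them with the number of bytes actually received is a useful cross-check.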