Ajax URL accessed by Google / Googlebot

2023-09-11 01:25:33 Author: 思君朝与暮

We're having an issue with Googlebot trying to access a URL used by an Ajax function and failing due to a URL-encoding problem. First of all, we're a bit confused about why Googlebot is trying to access a URL that only appears inside a JS function in a JS script.

JS code:

 function ajaxFunction(siteid) {
   $.get(location.protocol + '//' + location.hostname + '/ajax/?ajaxscript=detail&siteid=' + siteid, function() { ... });
 }

The function above is in a JS script included on our web page and is called when a link/button is clicked. Googlebot is somehow trying to request the URL generated by this function directly and is getting errors because the "?" character is URL-encoded, so the siteid value is not passed.
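For reference, the URL construction above can be sketched as a standalone helper. `buildAjaxUrl` is a hypothetical name, and the protocol/hostname are passed in explicitly so the sketch runs outside a browser (the original reads them from `location`); encoding the siteid with `encodeURIComponent` keeps unusual values from breaking the query string:

```javascript
// Sketch of the page's URL construction as a standalone helper.
// buildAjaxUrl is a hypothetical name; protocol and hostname are passed in
// explicitly instead of read from location, so this runs outside a browser.
function buildAjaxUrl(protocol, hostname, siteid) {
  // encodeURIComponent guards against characters that would break the query string.
  return protocol + '//' + hostname +
    '/ajax/?ajaxscript=detail&siteid=' + encodeURIComponent(siteid);
}

console.log(buildAjaxUrl('http:', 'www.MYSITE.com', 1));
// → http://www.MYSITE.com/ajax/?ajaxscript=detail&siteid=1
```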

Example URL that Google is trying to access:

 http://www.google.com/url?sa=t&rct=j&q=duo%2Bboots&source=web&cd=4&ved=0CDQQFjAD&url=http%3A%2F%2Fwww.MYSITE.com%2Fajax%2F%253Fajaxscript%3Ddetail%26siteid%3D1 
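The `url=` parameter in that redirect is worth decoding: the `?` in the Ajax path appears as `%253F`, meaning it has been percent-encoded twice (`?` → `%3F` → `%253F`), so after the redirect decodes it once, the target server still sees a literal `%3F` in the path instead of a query-string separator. A quick check on the `url=` value from the example above:

```javascript
// The url= parameter taken from the example redirect above.
var target = 'http%3A%2F%2Fwww.MYSITE.com%2Fajax%2F%253Fajaxscript%3Ddetail%26siteid%3D1';

// One round of decoding (what the redirect performs) still leaves %3F
// where the "?" should be, so no query string reaches the server.
console.log(decodeURIComponent(target));
// → http://www.MYSITE.com/ajax/%3Fajaxscript=detail&siteid=1
```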

Do you have any idea why Googlebot is trying to access the URL generated by the JS function directly, and is it possible for Googlebot to access Ajax-based functions and URLs directly? Basically, the primary problem is that the ? is being URL-encoded (it arrives as %3F), so the required data is not passed to our script, and every such request is logged as an error in our server error log.

Recommended answer

Google has become very curious about these JavaScript-generated URLs; it can discover them through full-page rendering (including JS execution), Google Toolbar data, or Chrome data.

I always use a prefix for all my AJAX requests, e.g. http://domain.com/_ajax/xxxxx, and then forbid all bots from crawling URLs starting with /_ajax/ via robots.txt.
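A minimal robots.txt for that convention might look like this (assuming all Ajax endpoints live under the /_ajax/ prefix):

```
User-agent: *
Disallow: /_ajax/
```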

You could also add "noindex, nofollow" in an X-Robots-Tag HTTP header.