HTTP GET blocked on Craigslist

2023-09-11 09:18:52 Author: 穿透心灵的冰

I'm trying to do an HTTP GET on Craigslist. Here is my (Ruby) code, which is really simple:

require 'net/http'
result = Net::HTTP.get(URI.parse(''))


I end up getting the error "This IP has been automatically blocked."


This behaviour only happens when I run the request from Amazon EC2 or from Heroku. When I run the same code on my own computer (localhost), I get the correct result. Does this have to do with Amazon EC2?


I'm wondering if other people have had the same issue. What can I do to access Craigslist from EC2?


I can confirm that Craigslist is blocking requests from the major Amazon EC2 IP ranges, and that the block is by IP address (not by user agent). It works elsewhere, though I suspect any real volume would get other IPs blocked too.
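
If you want to check the IP-versus-user-agent point yourself, here is a rough sketch. It assumes the block page contains the message quoted in the question; `blocked?`, `fetch_with_agent` and `probe` are hypothetical helper names, and the URL and user agent strings are whatever you pass in.

```ruby
require 'net/http'
require 'uri'

# Message quoted in the question; assumed to appear in the block page body.
BLOCK_MESSAGE = 'This IP has been automatically blocked'

# True when a response body looks like the block page.
def blocked?(body)
  body.to_s.include?(BLOCK_MESSAGE)
end

# Hypothetical helper: GET url with the given User-Agent header
# and return the response body.
def fetch_with_agent(url, user_agent)
  uri = URI.parse(url)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = user_agent
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  response.body
end

# Hypothetical helper: map each user agent to whether it got the block page.
# If every value is true, the block does not depend on the user agent.
def probe(url, agents)
  agents.to_h { |agent| [agent, blocked?(fetch_with_agent(url, agent))] }
end
```

Running `probe` with a few different user agent strings from the EC2 box and again from a machine that is not blocked should show whether the header makes any difference.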


You could step around it with Tor. More significantly, this Stack Overflow question discusses data sources used by Craigslist mashups.
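
For the Tor route, a minimal sketch, with some assumptions: Net::HTTP only speaks to HTTP proxies, so this assumes you have a local HTTP proxy that forwards to Tor (for example Privoxy, whose default listen address is 127.0.0.1:8118). The host, port, and the helper names `proxied_client` and `get_via_proxy` are illustrative, not anything from the question.

```ruby
require 'net/http'
require 'uri'

# Assumed local HTTP proxy that forwards traffic to Tor.
# Adjust to match your own setup.
PROXY_HOST = '127.0.0.1'
PROXY_PORT = 8118

# Build a Net::HTTP client for url that sends its traffic through the proxy.
def proxied_client(url, proxy_host = PROXY_HOST, proxy_port = PROXY_PORT)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port, proxy_host, proxy_port)
  http.use_ssl = (uri.scheme == 'https')
  http
end

# Fetch url through the proxy and return the response body.
def get_via_proxy(url)
  uri = URI.parse(url)
  proxied_client(url).get(uri.request_uri).body
end
```

Nothing is sent until you call `get_via_proxy`, so you can build the client first and confirm the proxy settings with `http.proxy?` before making a request.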


I even tested from a Brazil EC2 instance, assuming they might not have all the CIDRs blocked. No bueno.