Hadoop data replication error

Hope you all had a wonderful vacation. I am trying to set up a Hadoop cluster on Amazon EC2. While copying a data file from the local disk to HDFS with the command hadoop fs -copyFromLocal d.txt /user/ubuntu/data, I am getting a data replication error. The error from the log is the following:

15/01/06 07:40:36 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null

15/01/06 07:40:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/ubuntu/data/d.txt" - Aborting... copyFromLocal: java.io.IOException: File /user/ubuntu/data/d.txt could only be replicated to 0 nodes, instead of 1

15/01/06 07:40:36 ERROR hdfs.DFSClient: Failed to close file /user/ubuntu/data/d.txt
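
An error like "could only be replicated to 0 nodes, instead of 1" generally means the NameNode does not see any live DataNodes at the moment of the write. A quick way to confirm this, before digging into configuration, is the dfsadmin report (shown here with the Hadoop 1.x command name; on Hadoop 2.x it is hdfs dfsadmin -report):

    # Ask the NameNode how many DataNodes have registered with it.
    hadoop dfsadmin -report
    # If the report shows "Datanodes available: 0", the DataNode processes
    # may be running (as jps shows) but failing to reach the NameNode.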

Now, I have been checking Stack Overflow and other forums about this problem, and I found that most of them mention the DataNode or TaskTracker not running as a probable cause, along with the relevant solutions. But these things are running fine in my setup. Here is a screenshot of the jps output: https://m.xsw88.com/allimgs/daicuo/20230912/334.png

From the Hadoop wiki, the other possible causes are the DataNode being unable to talk to the server due to networking or Hadoop configuration problems, or some configuration problem preventing effective two-way communication.
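
To rule out the networking cause, a simple check is to test whether each DataNode can actually reach the NameNode, and vice versa. The sketch below assumes the default Hadoop 1.x ports (9000 for the NameNode RPC address in fs.default.name, 50010 for the DataNode data-transfer port); substitute your own hostnames and ports:

    # From a DataNode host: can we reach the NameNode's RPC port?
    nc -zv namenode-host 9000
    # From the master: can we reach a DataNode's data-transfer port?
    nc -zv datanode-host 50010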

I have configured hadoop-env.sh, core-site.xml, hdfs-site.xml and mapred-site.xml following the tutorial at http://tinyurl.com/l2wv6y9. Could anyone please tell me where I am going wrong? I would be immensely grateful if anyone could help me resolve this problem.
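
One EC2-specific pitfall with these files is fs.default.name pointing at localhost or at a hostname that the slave nodes cannot resolve. A quick sanity check, assuming a Hadoop 1.x layout with the configuration under $HADOOP_HOME/conf, is to verify that every node carries the same value, ideally the master's EC2 private DNS name:

    # Print the configured NameNode address on each node; they must match.
    grep -A1 'fs.default.name' $HADOOP_HOME/conf/core-site.xml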

Thanks

Recommended answer

Well, the problem was in the security groups. When I created the EC2 instances, I created a new security group in which I had not configured the rules to allow the ports needed for connections.

When creating a group with the default options, we must add a rule for SSH at port 22. In order to have TCP and ICMP access, we need to add two additional security rules: add 'All TCP', 'All ICMP' and 'SSH (22)' under the inbound rules. This should work fine.
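
For reference, the same rules can be added with the AWS CLI instead of the console. In this sketch sg-xxxxxxxx is a placeholder for the group ID, and 0.0.0.0/0 opens the ports to the whole internet, which is acceptable for a quick test but too permissive for a long-running cluster:

    # SSH (22)
    aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
        --protocol tcp --port 22 --cidr 0.0.0.0/0
    # All TCP
    aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
        --protocol tcp --port 0-65535 --cidr 0.0.0.0/0
    # All ICMP
    aws ec2 authorize-security-group-ingress --group-id sg-xxxxxxxx \
        --protocol icmp --port -1 --cidr 0.0.0.0/0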

If we are using an existing security group, we should check its inbound and outbound rules.
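
The existing rules can also be inspected from the CLI (again with sg-xxxxxxxx as a placeholder):

    # Inbound rules appear under IpPermissions, outbound under IpPermissionsEgress.
    aws ec2 describe-security-groups --group-ids sg-xxxxxxxx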