Difference between Amazon S3 and S3N in Hadoop

2023-09-11 08:16:09 Author: 孤独成性只是看透人心

When I connected my Hadoop cluster to Amazon storage to download files into HDFS, I found that s3:// did not work. After looking for help online, I found that I could use s3n:// instead, and it worked. I do not understand the difference between using s3 and s3n with my Hadoop cluster. Can someone explain?

Recommended answer

I think your main problem was that Hadoop has two separate connection points for S3: s3:// and s3n://. s3n:// means "a regular file, readable from the outside world, at this S3 URL". s3:// refers to an HDFS-style file system mapped into an S3 bucket, which sits on the AWS storage cluster. Because you were reading an ordinary file from an Amazon storage bucket, you had to use s3n://, and that is why your problem was resolved. The information added by @Steffen is also great!
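To make the distinction concrete, here is a minimal Python sketch of the fix described above: rewrite an s3:// URI that points at an ordinary bucket object to s3n://, then build a `hadoop distcp` command to copy it into HDFS. The helper names, the bucket `example-bucket`, and both paths are hypothetical, and the sketch only assembles the command; it does not run it or configure credentials.

```python
from urllib.parse import urlsplit

# Scheme meanings as described in the answer above.
SCHEMES = {
    "s3n": "regular S3 objects, readable from the outside world",
    "s3": "HDFS-style file system mapped into an S3 bucket",
}


def to_native_uri(uri: str) -> str:
    """Return the s3n:// form of an s3:// or s3n:// URI (hypothetical helper)."""
    parts = urlsplit(uri)
    if parts.scheme not in SCHEMES:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Keep bucket and key unchanged; only the scheme is swapped.
    return f"s3n://{parts.netloc}{parts.path}"


def distcp_command(src: str, dst: str) -> list[str]:
    """Build (but do not run) a 'hadoop distcp' command line as an argument list."""
    return ["hadoop", "distcp", to_native_uri(src), dst]


if __name__ == "__main__":
    # Hypothetical bucket and paths, for illustration only.
    cmd = distcp_command("s3://example-bucket/logs/a.gz", "hdfs:///data/logs/")
    print(" ".join(cmd))
```

If the bucket really does hold a block-based s3:// file system written by Hadoop, this rewrite would be wrong; it only applies to the case in the question, where the bucket contains ordinary files.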