Unable to install R / rmr2 on AWS EMR

2023-09-11 12:15:34 Author: 萝莉小仙女

Having spent around a week trying to install R and rmr2 on AWS EMR, I am turning to you all for help. My bootstrap script successfully installs R 2.14.1-1~lennycran.0 (thanks to JD Long's blog). When I try to install rmr2, I run into the classic dependency problem: it seems I have to install packages such as Rcpp, RJSONIO, bitops, digest and five more. Because only an older Rcpp works with R 2.14.1, I am downloading a specific archived version and installing it. I don't know how old it needs to be; I tried a few versions at random and 0.8.9 worked. I will keep experimenting by trial and error.

sudo curl -o Rcpp.tar.gz http://cran.us.r-project.org/src/contrib/Archive/Rcpp/Rcpp_0.8.9.tar.gz
sudo R CMD INSTALL Rcpp.tar.gz

Now I am supposed to install the rest of the dependencies (how?).

Eventually rmr2 itself should be installed. I am using the following commands, which of course fail -

sudo wget --no-check-certificate -o rmr2.tar.qz -S -T 10 -t 5 http://goo.gl/dvBric
sudo R CMD INSTALL rmr2.tar.gz
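
A detail worth checking in the two commands above, independent of the dependency problem: in wget, lowercase -o names a log file while uppercase -O names the downloaded file, and the download target is spelled rmr2.tar.qz whereas the install step reads rmr2.tar.gz. The sketch below lines those two up. The run and fetch_and_install helpers and the DRY_RUN switch are hypothetical names added for illustration; by default the script only prints the commands it would run.

```shell
#!/bin/bash
# Sketch only: run, fetch_and_install and DRY_RUN are illustrative names,
# not part of the original bootstrap script.

# With DRY_RUN=1 (the default) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

fetch_and_install() {
  local url="$1"
  local tarball="$2"
  # -O sets the name of the saved file; -o would write wget's log there instead.
  run wget --no-check-certificate -O "$tarball" -S -T 10 -t 5 "$url" || return 1
  # Install from the same file name that was just downloaded.
  run sudo R CMD INSTALL "$tarball"
}

# URL and file name are the ones from the post.
fetch_and_install http://goo.gl/dvBric rmr2.tar.gz
```

Setting DRY_RUN=0 before calling the script would execute the commands for real; this does not address the missing dependencies, it only rules out the download step as a separate cause of failure.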

My question is -

What would a simple bootstrap script for installing the rest of the dependencies ("RJSONIO", "bitops", "digest", "functional", "stringr", "plyr", "reshape2", "caTools") look like? Do I have to worry about the compatibility of those packages as well?
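
For packages that need an older release, the Archive URL used for Rcpp earlier follows a regular pattern (package name, then name_version.tar.gz), so pinned versions could be listed in one place and installed in order. A minimal sketch under that assumption: archive_url and install_pinned are hypothetical helpers, and the only version listed is the Rcpp 0.8.9 that worked above. Versions for the other packages are not known here and would have to be found by checking each package's DESCRIPTION against R 2.14.1. The install function is defined but not called.

```shell
#!/bin/bash
# Sketch only: archive_url and install_pinned are illustrative helpers.

CRAN_ARCHIVE="http://cran.us.r-project.org/src/contrib/Archive"

# Build the archive URL for one package at one version.
archive_url() {
  local pkg="$1" ver="$2"
  echo "$CRAN_ARCHIVE/$pkg/${pkg}_${ver}.tar.gz"
}

# Download and install "name version" pairs read from stdin, in the order given.
# Dependencies must therefore be listed before the packages that need them.
install_pinned() {
  local pkg ver
  while read -r pkg ver; do
    [ -z "$pkg" ] && continue
    # </dev/null keeps the commands from consuming the list on stdin.
    curl -fsSL -o "${pkg}.tar.gz" "$(archive_url "$pkg" "$ver")" </dev/null || return 1
    sudo R CMD INSTALL "${pkg}.tar.gz" </dev/null || return 1
  done
}

# Show the URL for the one version the post reports as working.
archive_url Rcpp 0.8.9

# Usage (not executed here): one "name version" pair per line.
# install_pinned <<'EOF'
# Rcpp 0.8.9
# EOF
```

The Archive directory holds only superseded releases, so a package whose current release already works would be fetched from the regular src/contrib location instead.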

Here is my complete bootstrap.sh code -

#!/bin/bash

#debian R upgrade

gpg --keyserver pgpkeys.mit.edu --recv-key 06F90DE5381BA480
gpg -a --export 06F90DE5381BA480 | sudo apt-key add -
echo "deb http://streaming.stat.iastate.edu/CRAN/bin/linux/debian lenny-cran/" | sudo tee -a /etc/apt/sources.list
sudo apt-get update
sudo apt-get -t lenny-cran install --yes --force-yes r-base r-base-dev

sudo curl -o rmr2.tar.gz http://goo.gl/dvBric
sudo R CMD INSTALL rmr2.tar.gz  # <<<< Does not go beyond this.

set -e
bucket=muxxx-bisxxx-bucket
path=input.tar.gz
wget -S -T 10 -t 5 http://$bucket.s3.amazonaws.com/$path
mkdir -p /home/hadoop/contents
tar -C /home/hadoop/contents -xzf input.tar.gz

export HADOOP_CMD=/home/hadoop/bin/hadoop
export HADOOP_STREAMING=/home/hadoop/contrib/streaming/hadoop_streaming.jar

/home/hadoop/bin/hadoop fs -mkdir /home/hadoop/contents
/home/hadoop/bin/hadoop fs -put /home/hadoop/contents/* /home/hadoop/contents/
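
Since set -e appears only halfway through the script above, a failure in the R steps before it does not by itself stop the run, which can make it harder to see where things break. One way to make the failing step visible is to wrap each command in a small logging helper. This is a sketch under that assumption, not a fix for the dependency problem: step is a hypothetical name, and the commands in the usage comment are the ones from the script above.

```shell
#!/bin/bash
# Sketch only: step is an illustrative helper, not part of the original script.

# Log a label, run the command, and report which step failed.
step() {
  local label="$1"
  shift
  echo "[bootstrap] $label"
  if ! "$@"; then
    echo "[bootstrap] FAILED: $label" >&2
    return 1
  fi
}

# Harmless demonstration of a passing step.
step "demo" true

# Usage in the bootstrap script (not executed here):
# step "apt-get update" sudo apt-get update || exit 1
# step "install rmr2"   sudo R CMD INSTALL rmr2.tar.gz || exit 1
```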

Solution

I have not resolved the problem at hand, but I have a direction. I added the following line to the bootstrap script, after the R 2.14.1 installation and before the rmr2 installation -

sudo Rscript -e 'install.packages(c("rJava", "Rcpp", "RJSONIO", "bitops", "digest", "functional", "stringr", "plyr", "reshape2", "caTools"), repos="http://ftp.heanet.ie/mirrors/cran.r-project.org/")'

Currently the bootstrapping process breaks down at plyr, which I guess is due to the older version of Rcpp that I have.
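
To test that guess, the installed Rcpp version could be compared with the minimum version that plyr declares in its DESCRIPTION file. A minimal sketch: version_ge is a hypothetical helper that relies on GNU sort -V, and the required version is deliberately left as a variable because the post does not state it.

```shell
#!/bin/bash
# Sketch only: version_ge is an illustrative helper; it assumes GNU sort (-V).

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Demonstration with the Rcpp version from the post compared against itself.
if version_ge 0.8.9 0.8.9; then
  echo "0.8.9 satisfies >= 0.8.9"
fi

# Usage on the cluster (not executed here; needs R, and REQUIRED is whatever
# minimum the dependent package's DESCRIPTION declares):
# installed="$(Rscript -e 'cat(as.character(packageVersion("Rcpp")))')"
# version_ge "$installed" "$REQUIRED" || echo "Rcpp $installed is too old"
```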

I am closing this post.