I'm a newbie with Apache Drill.
Here is the scenario:
I have an S3 bucket where I placed a CSV file called test.csv. I installed Apache Drill following the instructions on the official website.
I followed this tutorial to create an S3 storage plugin: https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/
I start Drill and switch to the correct "workspace" (with: use my-s3;), but when I try to select records from the test.csv file, an error occurs:
Table 's3./test.csv' not found.
Can anyone help me? Thanks!
Use the name of your workspace (if you use one) and backticks in the USE command, as follows:
USE `my-s3`.`<workspace-name>`;
SHOW files; //should list test.csv file
SELECT * FROM `test.csv`;
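The backticks matter because both `my-s3` (hyphen) and `test.csv` (dot) contain characters that would otherwise be parsed as operators or name separators. As a minimal sketch, assuming you build the query string from a script, the helper below quotes each part separately and joins them into a fully qualified table name, so no USE statement is needed. The function names and the workspace name `root` are hypothetical and not part of Drill or of the original post.

```python
def quote_identifier(name: str) -> str:
    """Wrap a single identifier in backticks, as in the USE example above."""
    if "`" in name:
        # Escaping rules are not covered here, so such names are simply rejected.
        raise ValueError(f"backtick not allowed in identifier: {name!r}")
    return f"`{name}`"


def qualified_table(plugin: str, workspace: str, file_name: str) -> str:
    """Join plugin, workspace and file name, quoting each part on its own."""
    parts = (plugin, workspace, file_name)
    return ".".join(quote_identifier(part) for part in parts)


# "root" is a hypothetical workspace name; use the one defined in your plugin.
table = qualified_table("my-s3", "root", "test.csv")
query = f"SELECT * FROM {table}"
print(query)
```

Quoting each part on its own keeps the dot inside `test.csv` from being read as a separator between name parts.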
Query the CSV in the local file system using the dfs storage plugin configuration to rule out things like a header causing a problem. This page might help if you haven't seen it.
Storage plugin mentioned in the comment above:
{
  "type": "file",
  "enabled": true,
  "connection": "s3n://<accesskey>:<secret>@catpaws",
  "workspaces": {},
  "formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl"
      ],
      "delimiter": "|"
    },
    "csv": {
      "type": "text",
      "extensions": [
        "csv"
      ],
      "delimiter": ","
    },
    "tsv": {
      "type": "text",
      "extensions": [
        "tsv"
      ],
      "delimiter": "\t"
    },
    "parquet": {
      "type": "parquet"
    },
    "json": {
      "type": "json"
    }
  }
}
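The configuration above has an empty "workspaces" object, so there is no named workspace to pass to USE. As a rough sketch, assuming you want to add one before pasting the JSON back into the Drill web console, the script below inserts a workspace entry using the standard workspace fields (location, writable, defaultInputFormat). The helper name `add_workspace` and the workspace name `root` are hypothetical, and the "formats" section is omitted here only for brevity.

```python
import json

# Trimmed copy of the plugin configuration shown above ("formats" left out).
plugin_config = {
    "type": "file",
    "enabled": True,
    "connection": "s3n://<accesskey>:<secret>@catpaws",
    "workspaces": {},
}


def add_workspace(config, name, location, writable=False):
    """Return a copy of the config with one extra workspace entry."""
    updated = dict(config)
    workspaces = dict(config.get("workspaces", {}))
    workspaces[name] = {
        "location": location,        # path inside the bucket
        "writable": writable,        # read-only unless stated otherwise
        "defaultInputFormat": None,  # serialized as JSON null
    }
    updated["workspaces"] = workspaces
    return updated


# "root" is a hypothetical name; "/" points at the top of the bucket.
updated_config = add_workspace(plugin_config, "root", "/")
print(json.dumps(updated_config, indent=2))
```

With a workspace like this in place, the USE statement would name both the plugin and the workspace, each in backticks.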
This is probably not relevant, but here is an excerpt from the Amazon S3 help, which contains a lot more information:
<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>ID</value>
</property>
<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>SECRET</value>
</property>