How can I write single records to a Redshift database using Ruby?

2023-09-11 11:40:07


Currently, we have a script that parses data and uploads it one record at a time to a MySQL database. Recently, we decided to switch to AWS Redshift.

Is there a way I can use my Amazon login credentials and my Redshift cluster information to upload these records directly to the Redshift database?

All the guides I'm finding online recommend importing text or CSV files from an S3 bucket, but that is not very practical for my use case.

Thanks for any help.

I'm looking to do something like this:

require 'aws-sdk'
require 'pg'

AWS.config(access_key_id: 'my_access_key_id', secret_access_key: 'my_secret_access_key', region: 'us-west-2')

redshift = AWS::Redshift.new

credentials = {
    driver: "org.postgresql.Driver",
    url: "my_connect_url",
    username: "my_username",
    password: "my_password",
    database: "my_db"
}

db = redshift.connect(credentials) # **NOT A REAL LINE OF CODE, I WISH IT WAS**

sql_query = "INSERT INTO my_table (my_column) 
        VALUES ('hello world'); " 

db.query(sql_query)
db.close

Solution

Really what you should do here is write your records one at a time to a file in S3, then periodically do a load of that file. Redshift is much more efficient at loading a 100,000-line file than at, say, entering 100 rows of data one by one (a rough estimate from my experience...). If you really want to insert things record by record, you can do that with any standard PSQL connector for Ruby; Redshift can be connected to using JDBC/ODBC PSQL drivers, kinda like the sample program you wrote.

I don't recommend doing this... but here is the doc for INSERT: http://docs.aws.amazon.com/redshift/latest/dg/r_INSERT_30.html
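
If you do go the INSERT route anyway, Redshift's INSERT accepts a multi-row VALUES list, so you can at least batch several records per statement. A minimal sketch, using the conn connection from the sample at the end of this answer (the table and column names come from the question):

# Batch several records into one multi-row INSERT; my_table/my_column are the question's
rows   = ['hello', 'world', 'foo']
values = rows.map { |r| "(#{conn.escape_literal(r)})" }.join(', ')
conn.exec("INSERT INTO my_table (my_column) VALUES #{values};")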

I would check out this question about appending to an S3 file. This is REALLY what you want to do...

Ruby - Append content at the end of the existing s3 file using fog

EDIT: So I kinda jumped on that question without reading the answer.... So, correction: you need to create the file locally, and once it reaches a certain size, upload it to S3 and then run the Redshift load command.
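
A minimal sketch of that buffer-and-flush step, assuming the aws-sdk-s3 gem; the bucket name, buffer path, delimiter, and size threshold are all placeholders:

require 'aws-sdk-s3'

BUFFER_PATH = '/tmp/records.csv'
MAX_BYTES   = 100 * 1024 * 1024  # flush once the local buffer reaches ~100 MB

def append_record(fields)
  # Append one record as a pipe-delimited line to the local buffer file
  File.open(BUFFER_PATH, 'a') { |f| f.puts(fields.join('|')) }
  flush_to_s3 if File.size(BUFFER_PATH) >= MAX_BYTES
end

def flush_to_s3
  # Upload the buffer under a timestamped key, then start a fresh buffer
  s3  = Aws::S3::Resource.new(region: 'us-west-2')
  key = "loads/records-#{Time.now.to_i}.csv"
  s3.bucket('my-load-bucket').object(key).upload_file(BUFFER_PATH)
  File.truncate(BUFFER_PATH, 0)
  key  # this key is what the COPY command points at
end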

And here is the doc for loading into Redshift: http://docs.aws.amazon.com/redshift/latest/dg/t_Loading-data-from-S3.html
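
The load itself is a single COPY statement run over an ordinary PSQL connection (the same conn as the sample at the end of this answer); the table, bucket, key, and credentials string below are placeholders:

# COPY pulls the staged S3 file into Redshift in one shot
copy_sql = "COPY my_table " \
           "FROM 's3://my-load-bucket/loads/records-1234567890.csv' " \
           "CREDENTIALS 'aws_access_key_id=my_key;aws_secret_access_key=my_secret' " \
           "DELIMITER '|';"
conn.exec(copy_sql)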

OR.... you could try this: loading data from remote hosts... I have never done this before, but it basically skips the S3 load part; you still want a large file to move, though. http://docs.aws.amazon.com/redshift/latest/dg/loading-data-from-remote-hosts.html

And lastly, if you really want record-by-record inserts, you should probably use RDS instead of Redshift; you will get better performance unless your dataset is huge.

Okay, this is my try at Ruby, but to be honest I have never done Ruby before; really, though, it's just a connection to a PSQL database. You are trying to connect to Redshift through the AWS SDK, but that's used to launch, resize, and manage clusters. Connecting to Redshift for queries should be done via a JDBC/ODBC driver: SQL Workbench, the psql Linux CLI, etc...

require 'pg'
# Connection details come from the cluster you launched; values here are placeholders
host     = 'redshift-xxxx.aws.com'  # the cluster endpoint
port     = 5439                     # Redshift's default port
options  = ''                       # extra connection options (none needed)
tty      = ''                       # legacy parameter, ignored by modern servers
dbname   = 'myDB'
login    = 'master'
password = 'M@st3rP@ssw0rd'
# Legacy positional form of the pg gem's connection constructor
conn = PGconn.new(host, port, options, tty, dbname, login, password)

Where host, port, dbname, login, and password are all set up during the launch of Redshift. dbname is a PSQL thing; do you know much about PSQL?
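
From there, the record-by-record insert the question asks about is just an ordinary parameterized query against that connection. A minimal sketch (the table and column names are taken from the question):

# One record per statement, with the value passed as a bind parameter
conn.exec_params("INSERT INTO my_table (my_column) VALUES ($1);", ['hello world'])
conn.close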