Copying data from Amazon S3 to Redshift while avoiding duplicate rows

2023-09-11 08:57:55 Author: 時光迫使我們成長

I am copying data from Amazon S3 to Redshift. During this process I need to avoid the same files being loaded again. I don't have any unique constraint on my Redshift table. Is there a way to implement this using the COPY command?

http://docs.aws.amazon.com/redshift/latest/dg/r_COPY_command_examples.html
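For context, a basic COPY from S3 looks like the sketch below. The bucket path and IAM role are placeholders (the table name t_data is taken from the answer further down), and note that COPY by itself does not skip files that were already loaded:

-- Minimal COPY sketch; S3 path and IAM role are placeholders:
COPY t_data
FROM 's3://my-bucket/daily/snapshot.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CSV;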

I tried adding a unique constraint, and also tried setting the column as a primary key, with no luck. Redshift does not seem to enforce unique/primary key constraints.
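That matches Redshift's documented behavior: PRIMARY KEY and UNIQUE declarations are accepted, but they are informational only (used as hints by the query planner) and duplicate rows will still load. A minimal sketch, with a hypothetical column list:

-- The constraint is accepted but NOT enforced; duplicates still load.
CREATE TABLE t_data (
    id BIGINT PRIMARY KEY,  -- informational only in Redshift
    snapshot_day DATE
);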

Recommended Answer

My solution is to run a 'delete' command on the table before the 'copy'. In my use case, each time I need to copy the records of a daily snapshot to the Redshift table, so I can use the following 'delete' command to ensure duplicated records are deleted, then run the 'copy' command.

DELETE FROM t_data WHERE snapshot_day = 'xxxx-xx-xx';
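To make this pattern safer, the delete and the reload can be wrapped in a single transaction, so the table is never seen with the snapshot half-deleted and a failed COPY rolls the delete back too. A sketch under the same placeholder names as above:

-- Atomic delete-then-copy for one daily snapshot (path and role are placeholders):
BEGIN;
DELETE FROM t_data WHERE snapshot_day = 'xxxx-xx-xx';
COPY t_data
FROM 's3://my-bucket/daily/xxxx-xx-xx/'
IAM_ROLE 'arn:aws:iam::123456789012:role/MyRedshiftRole'
CSV;
COMMIT;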