使用 Amazon RedShift 透视表透视、Amazon、RedShift

2023-09-07 17:23:40 作者:不归路

我在 Amazon RedShift 中有几个表,它们遵循几个维度列和一对指标名称/值列的模式.

I have several tables in Amazon RedShift that follow the pattern of several dimension columns and a pair of metric name/value columns.

DimensionA  DimensionB  MetricName  MetricValue
----------  ----------  ----------  -----------
dimA1       dimB1       m1          v11
dimA1       dimB2       m1          v12
dimA1       dimB2       m2          v21
dimA2       dimB2       m1          v13
dimA3       dimB1       m2          v22        

我正在寻找一种将数据展开/透视为每个唯一维度集一行的形式的好方法,例如:

I am looking for a good way to unwind/pivot the data into a form of one row per each unique dimension set, e.g.:

DimensionA  DimensionB  m1   m2 
----------  ----------  ---  ---
dimA1       dimB1       v11
dimA1       dimB2       v12  v21
dimA2       dimB2       v13
dimA3       dimB1            v22        

生成执行这种展开的查询的好模式是什么?

What is a good pattern for generating queries that would perform this unwinding?

Amazon RedShift 基于 ParAccel 并支持 PostgreSQL 8.0.2,它没有 crosstabunnestpivot反透视.

Amazon RedShift is based on ParAccel and supports PostgreSQL 8.0.2, which does not have crosstab, unnest, pivot or unpivot.

推荐答案

您可以为每个 MetricName 创建一个 CASE 语句,但您还必须使用聚合来使 GROUP BY 工作.

You can just create a CASE statement per MetricName but you'll have to use an aggregate as well to make the GROUP BY work.

SELECT dimension_a
      ,dimension_b
      ,MAX(CASE WHEN metric_name = 'm1' THEN metric_value ELSE NULL END) m1
      ,MAX(CASE WHEN metric_name = 'm2' THEN metric_value ELSE NULL END) m2
  FROM my_table
 GROUP BY dimension_a
         ,dimension_b
;

值得注意的是,Redshift 对象名称从不区分大小写,但列内容始终区分大小写,这与 SQL Server 的默认设置相反.

Worth noting that Redshift object names are never case sensitive but column content always is, which is the opposite of SQL Server defaults.