我在 Amazon RedShift 中有几个表,它们遵循几个维度列和一对指标名称/值列的模式.
I have several tables in Amazon RedShift that follow the pattern of several dimension columns and a pair of metric name/value columns.
DimensionA DimensionB MetricName MetricValue
---------- ---------- ---------- -----------
dimA1 dimB1 m1 v11
dimA1 dimB2 m1 v12
dimA1 dimB2 m2 v21
dimA2 dimB2 m1 v13
dimA3 dimB1 m2 v22
我正在寻找一种将数据展开/透视为每个唯一维度集一行的形式的好方法,例如:
I am looking for a good way to unwind/pivot the data into a form of one row per each unique dimension set, e.g.:
DimensionA DimensionB m1 m2
---------- ---------- --- ---
dimA1 dimB1 v11
dimA1 dimB2 v12 v21
dimA2 dimB2 v13
dimA3 dimB1 v22
生成执行这种展开的查询的好模式是什么?
What is a good pattern for generating queries that would perform this unwinding?
Amazon RedShift 基于 ParAccel 并支持 PostgreSQL 8.0.2,它没有 crosstab
、unnest
、pivot
或 反透视
.
Amazon RedShift is based on ParAccel and supports PostgreSQL 8.0.2, which does not have crosstab
, unnest
, pivot
or unpivot
.
您可以为每个 MetricName 创建一个 CASE 语句,但您还必须使用聚合来使 GROUP BY 工作.
You can just create a CASE statement per MetricName but you'll have to use an aggregate as well to make the GROUP BY work.
SELECT dimension_a
,dimension_b
,MAX(CASE WHEN metric_name = 'm1' THEN metric_value ELSE NULL END) m1
,MAX(CASE WHEN metric_name = 'm2' THEN metric_value ELSE NULL END) m2
FROM my_table
GROUP BY dimension_a
,dimension_b
;
值得注意的是,Redshift 对象名称从不区分大小写,但列内容始终区分大小写,这与 SQL Server 的默认设置相反.
Worth noting that Redshift object names are never case sensitive but column content always is, which is the opposite of SQL Server defaults.