Hive数据仓库行转列

查了很多资料发现网上很多文章都是转发和抄袭,有些问题。这里分享一个自己项目中使用的行转列例子,供大家参考。代码如下:

SELECT
  my_id,
  nm_cd_map['A'] AS my_cd_a,
  nm_cd_map['B'] AS my_cd_b,
  nm_cd_map['C'] AS my_cd_c,
  nm_num_map['A'] AS my_num_a,
  nm_num_map['B'] AS my_num_b,
  nm_num_map['C'] AS my_num_c
FROM
  (
    SELECT
      t.my_id,
      STR_TO_MAP(my_nm_cds,';',':') AS nm_cd_map,
      STR_TO_MAP(my_nm_nums,';',':') AS nm_num_map
    FROM
      (
        SELECT
          my_id,
          CONCAT_WS(';',COLLECT_LIST(CONCAT(my_nm,':',my_cd))) AS my_nm_cds,
          CONCAT_WS(';',COLLECT_LIST(CONCAT(my_nm,':',my_num))) AS my_nm_nums
        FROM
          (
            SELECT '1' AS my_id,'A' AS my_nm,'D01' AS my_cd,19 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'B' AS my_nm,'D04' AS my_cd,18 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'C' AS my_nm,'D02' AS my_cd,17 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'A' AS my_nm,'D03' AS my_cd,16 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'B' AS my_nm,'D05' AS my_cd,15 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'C' AS my_nm,'D06' AS my_cd,14 AS my_num
          )
        GROUP BY my_id
      ) t
  ) t
WHERE 1=1;

如果是在SparkSQL或Presto平台,或者阿里云的MaxCompute平台,还可使用如下方式:

-- 其实也可使用CONCAT然后STR_TO_MAP的方式,或者用MAP_FROM_ARRAYS,再或者用数组排序后ARRAY[n] AS的方式
SELECT
  my_id,
  nm_cd_map['A'] AS my_cd_a,
  nm_cd_map['B'] AS my_cd_b,
  nm_cd_map['C'] AS my_cd_c,
  nm_num_map['A'] AS my_num_a,
  nm_num_map['B'] AS my_num_b,
  nm_num_map['C'] AS my_num_c
FROM
  (
    SELECT
      t.my_id,
      MAP_FROM_ENTRIES(COLLECT_LIST(nm_cd)) AS nm_cd_map,
      MAP_FROM_ENTRIES(COLLECT_LIST(nm_num)) AS nm_num_map
    FROM
      (
        SELECT
          my_id,
          my_nm,
          my_cd,
          my_num,
          STRUCT(my_nm,my_cd) AS nm_cd,
          STRUCT(my_nm,my_num) AS nm_num
        FROM
          (
            SELECT '1' AS my_id,'A' AS my_nm,'D01' AS my_cd,19 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'B' AS my_nm,'D04' AS my_cd,18 AS my_num
            UNION ALL
            SELECT '1' AS my_id,'C' AS my_nm,'D02' AS my_cd,17 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'A' AS my_nm,'D03' AS my_cd,16 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'B' AS my_nm,'D05' AS my_cd,15 AS my_num
            UNION ALL
            SELECT '2' AS my_id,'C' AS my_nm,'D06' AS my_cd,14 AS my_num
          )
      ) t
    GROUP BY my_id
  ) t
WHERE 1=1;

相关推荐

  1. Hive数据仓库

    2024-02-21 06:14:05       52 阅读
  2. 【大数据面试题】38 说说 Hive 怎么

    2024-02-21 06:14:05       24 阅读
  3. 利用pandas进行数据转行

    2024-02-21 06:14:05       37 阅读
  4. Hive进阶函数:inline() 和 struct() ,一

    2024-02-21 06:14:05       57 阅读
  5. hive函数stack(int n, v_1, v_2, ..., v_k)

    2024-02-21 06:14:05       43 阅读

最近更新

  1. docker php8.1+nginx base 镜像 dockerfile 配置

    2024-02-21 06:14:05       94 阅读
  2. Could not load dynamic library ‘cudart64_100.dll‘

    2024-02-21 06:14:05       100 阅读
  3. 在Django里面运行非项目文件

    2024-02-21 06:14:05       82 阅读
  4. Python语言-面向对象

    2024-02-21 06:14:05       91 阅读

热门阅读

  1. starrocks对大量数据怎么实现hash join

    2024-02-21 06:14:05       45 阅读
  2. npm install一直卡在 sill idealTree buildDeps

    2024-02-21 06:14:05       47 阅读
  3. 机器学习速成

    2024-02-21 06:14:05       56 阅读
  4. 6CCS3ML1 Machine Learning

    2024-02-21 06:14:05       36 阅读
  5. 机器学习 -- 矩阵和向量

    2024-02-21 06:14:05       47 阅读
  6. PaddleSeg分割框架解读[02] 配置文件config详解

    2024-02-21 06:14:05       49 阅读
  7. 74.如何实现无XML零配置的SpringMVC

    2024-02-21 06:14:05       47 阅读
  8. git项目管理

    2024-02-21 06:14:05       54 阅读
  9. linux docker部署深度学习环境(docker还是conda)

    2024-02-21 06:14:05       56 阅读
  10. singularity-ce-4.1.0 + go 完整安装步骤,及报错解决

    2024-02-21 06:14:05       47 阅读