Hive: creating a custom Python UDF


add file hdfs://home/user/py3_script/;
set spark.yarn.dist.archives=hdfs://home/user/py3.tar.gz;
set spark.shuffle.hdfs.enabled=true;
set spark.shuffle.io.maxRetries=1;
set spark.shuffle.io.retryWait=0s;
set spark.network.timeout=120s;

INSERT OVERWRITE TABLE seains.image_infos PARTITION (date = '${date}')

SELECT
    TRANSFORM(*) USING 'py3.tar.gz/py3/bin/python py3_script/script.py' AS (
        mid,
        uri,
        count
    )
from 
(
    select
        mid,
        sum(count)
    from 
        log_hourly
    where p_date = '${date}'
    group by mid
) t
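
For context: by default, TRANSFORM hands each input row to the script as one line on stdin, with the columns converted to strings and separated by tabs, and NULL written as the literal `\N`. Each line the script prints to stdout is split on tabs into the columns named in the AS clause. A minimal sketch of that row format follows; the helper names `parse_row` and `format_row` are hypothetical and are not part of the original script.

```python
NULL_MARKER = r"\N"  # default text representation of NULL in TRANSFORM input


def parse_row(line):
    """Split one stdin line into fields, mapping the NULL marker to None."""
    fields = line.rstrip("\n").split("\t")
    return [None if f == NULL_MARKER else f for f in fields]


def format_row(fields):
    """Join output fields into one tab-separated line for stdout."""
    return "\t".join(NULL_MARKER if f is None else str(f) for f in fields)


if __name__ == "__main__":
    # Sample row taken from the comment in the script; a NULL column is appended.
    row = parse_row("7321132368127836199\t66\t0\t\n")
    print(format_row(row + [None]))
```

Keeping parsing and formatting in two small functions makes it easy to check the column count on both sides of the script before submitting the job.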
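
# py3_script/script.py: reads tab-separated rows from stdin, writes tab-separated rows to stdout.
import sys

BATCH_SIZE = 100     # assumed value; the original referenced an undefined `get_batch`
EXPECTED_FIELDS = 4  # must match the number of columns the query passes in


def get_feature(mids):
    # Not shown in the original post. Expected to return a dict that maps
    # each mid to the list of output fields (mid, uri, count).
    raise NotImplementedError("supply get_feature()")


def flush(mids):
    # Look up one batch and emit one tab-separated line per mid on stdout.
    res = get_feature(mids)
    # Log to stderr only: anything printed to stdout is parsed as output rows.
    print("processed %d mids" % len(mids), file=sys.stderr)
    for mid in mids:
        print('\t'.join(str(x) for x in res[mid]))


total_mids = []
mid_infos = {}

for line in sys.stdin:
    # line = "7321132368127836199\t66\t0\t"
    arr = line.strip('\n').split('\t')
    if len(arr) != EXPECTED_FIELDS:
        continue

    mid = arr[0]
    total_mids.append(mid)
    mid_infos[mid] = arr

    if len(total_mids) >= BATCH_SIZE:
        flush(total_mids)
        total_mids = []
        mid_infos = {}

# Emit the final, possibly partial, batch.
if total_mids:
    flush(total_mids)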
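
The batching logic can be checked locally, without a cluster, by feeding it lines from memory instead of stdin. The sketch below is only an illustration of that idea: `process_stream`, `fake_get_feature`, the `uri-` values and the batch size of 2 are all hypothetical stand-ins, since the post does not show the real feature lookup.

```python
import io


def fake_get_feature(mids):
    # Hypothetical stand-in for the real lookup: returns (mid, uri, count) per mid.
    return {mid: [mid, "uri-" + mid, 0] for mid in mids}


def process_stream(lines, out, batch_size=2, get_feature=fake_get_feature):
    """Read tab-separated rows, look up features in batches, write rows to out."""
    batch = []

    def flush():
        if not batch:
            return
        res = get_feature(batch)
        for mid in batch:
            out.write("\t".join(str(x) for x in res[mid]) + "\n")
        batch.clear()

    for line in lines:
        arr = line.rstrip("\n").split("\t")
        if not arr[0]:
            continue  # skip rows without a mid
        batch.append(arr[0])
        if len(batch) >= batch_size:
            flush()
    flush()  # emit the last, possibly partial, batch


if __name__ == "__main__":
    buf = io.StringIO()
    process_stream(["a\t1\n", "b\t2\n", "c\t3\n"], buf)
    print(buf.getvalue(), end="")
```

With three input rows and a batch size of 2, the third row is only written by the final `flush()` call, which is the case that is easy to miss when the flush happens only inside the loop.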




