Reading HDFS data with Python

Use the Python hdfs library to access HDFS.

import pandas as pd
from hdfs import InsecureClient
import os

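Replace hdfs_ip with the IP address of your HDFS NameNode host, and change the port and user name to match your cluster. Port 50070 is the default NameNode web port in Hadoop 2.x; Hadoop 3.x uses 9870 by default.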

client_hdfs = InsecureClient('http://hdfs_ip:50070', user='my_user')
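
Since the host, port, and user differ per cluster, one option is to read them from environment variables instead of editing the script. The sketch below is only an illustration: the helper name hdfs_settings and the variable names HDFS_HOST, HDFS_PORT, and HDFS_USER are assumptions and are not part of the hdfs library. The defaults mirror the placeholder values used above.

```python
import os


def hdfs_settings(env=os.environ):
    """Build the WebHDFS URL and user name from environment variables.

    HDFS_HOST, HDFS_PORT and HDFS_USER are hypothetical names chosen for
    this sketch; the fallbacks are the placeholder values from the example.
    """
    host = env.get("HDFS_HOST", "hdfs_ip")
    port = env.get("HDFS_PORT", "50070")
    user = env.get("HDFS_USER", "my_user")
    return f"http://{host}:{port}", user


url, user = hdfs_settings()
print(url, user)
```

The returned values can then be passed on as InsecureClient(url, user=user).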

Reading data from HDFS

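# Open the HDFS file as a text stream and parse it with pandas.
# index_col=0 uses the first CSV column as the DataFrame index.
with client_hdfs.read('/user/hdfs/wiki/helloworld.csv', encoding='utf-8') as reader:
    df = pd.read_csv(reader, index_col=0)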

Writing data to HDFS

# ==== write file
liste_hello = ['hello1','hello2']
liste_world = ['world1','world2']
df = pd.DataFrame(data = {'hello' : liste_hello, 'world': liste_world})

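# Write a DataFrame to HDFS as CSV.
# By default write() raises an error if the file already exists;
# pass overwrite=True to replace it.
with client_hdfs.write('/user/hdfs/wiki/helloworld.csv', encoding='utf-8') as writer:
    df.to_csv(writer)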
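
The write and read snippets fit together because df.to_csv() writes the DataFrame index as the first CSV column, which is why the read side passes index_col=0. The sketch below checks that round trip locally. It assumes an in-memory io.StringIO buffer as a stand-in for the HDFS writer and reader, so no cluster is needed and no hdfs library call is involved.

```python
import io

import pandas as pd

# Same sample frame as in the write example.
df = pd.DataFrame(data={'hello': ['hello1', 'hello2'],
                        'world': ['world1', 'world2']})

# Stand-in for the HDFS writer: an in-memory text buffer (assumption).
buffer = io.StringIO()
df.to_csv(buffer)          # the index is written as the first column

# Stand-in for the HDFS reader: rewind and parse the same buffer.
buffer.seek(0)
df_back = pd.read_csv(buffer, index_col=0)

print(df_back)
```

Without index_col=0, the index would come back as an extra unnamed data column rather than as the index.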
