首页后端开发Python使用Arrow管理数据(apache arrow使用场景)

使用Arrow管理数据(apache arrow使用场景)

时间2023-04-03 15:06:43发布访客分类Python浏览783
导读:Formathttps://arrow.apache.orgApache Arrow defines a language-independent columnar memory format for flat and hierarchic...

Format

https://arrow.apache.org

Apache Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also supports zero-copy reads for lightning-fast data access without serialization overhead.

Language Supported

Libraries are available for C, C++, C#, Go, Java, JavaScript, Julia, MATLAB, Python, R, Ruby, and Rust.

R

install.packages("arrow")
library(arrow)
# write iris to iris.arrow and compressed by zstd
arrow::write_ipc_file(iris,'iris.arrow', compression =  "zstd",compression_level=1)
# read iris.arrow as DataFrame
iris=arrow::read_ipc_file('iris.arrow')

python

# conda install -y pandas pyarrow
import pandas as pd
# read iris.arrow as DataFrame
iris=pd.read_feather('iris.arrow')
# write iris to iris.arrow and compressed by zstd
iris.to_feather('iris.arrow',compression='zstd', compression_level=1)

Julia

using Pkg
Pkg.add(["Arrow","DataFrames"])

using Arrow, DataFrames
# read iris.arrow as DataFrame
iris = Arrow.Table("iris.arrow") |>
     DataFrame
# write iris to iris.arrow, using 8 threads and compressed by zstd
Arrow.write("iris.arrow",iris,compress=:zstd,ntasks=8)

声明:本文内容由网友自发贡献,本站不承担相应法律责任。对本内容有异议或投诉,请联系2913721942#qq.com核实处理,我们将尽快回复您,谢谢合作!

python

若转载请注明出处: 使用Arrow管理数据(apache arrow使用场景)
本文地址: https://pptw.com/jishu/820.html
pytest学习和使用24-如何清空allure报告历史记录?我每次都手动删除,有点Low了~ 快速部署Vue.js前端项目(vue前端怎么部署)

游客 回复需填写必要信息