MaxCompute PyODPS basic data operations

Latest recommended article published on 2024-05-01 10:02:45

x_ing_feng

Original link: https://help.aliyun.com/document_detail/90412.html?spm=a2c4g.11186623.6.804.508d55ccFBWJqU

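1. DataWorks, the development console for Alibaba Cloud MaxCompute, can run Python code. A newly created PyODPS node contains a global variable named odps or o. This is the ODPS entry object, and you use it to access table data in DataWorks.

2. Tables can be created from PyODPS, but this is not recommended; creating them in a SQL node is more direct. See https://help.aliyun.com/document_detail/90412.html?spm=a2c4g.11186623.2.8.12d744cfBBmd6V#concept-lhx-tmf-cfb

3. Running SQL from PyODPS: the execute_sql() and run_sql() methods of the entry object execute SQL statements. Both return a task instance.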

o.execute_sql('select * from dual')  # Runs synchronously: blocks until the SQL statement has finished.
instance = o.run_sql('select * from dual')  # Runs asynchronously: returns without waiting.
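Because run_sql() returns before the statement has finished, the caller has to wait on the instance at some point. A minimal sketch of that pattern follows. The helper name run_sql_async is made up for illustration, and o is assumed to be the entry object that the PyODPS node provides.

```python
def run_sql_async(o, sql):
    """Submit `sql` asynchronously, then wait for it to finish.

    `o` is assumed to be the PyODPS entry object that a DataWorks
    PyODPS node exposes as a global variable. This helper is a
    hypothetical wrapper, not part of PyODPS itself.
    """
    instance = o.run_sql(sql)    # returns right away with the task instance
    # Other work could be done here while the SQL runs on the server.
    instance.wait_for_success()  # blocks until done; raises if the task failed
    return instance
```

In a PyODPS node this could be called as run_sql_async(o, 'select * from dual').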

4. The instance returned by a SQL run can call open_reader() directly to read the results of the statement.

with o.execute_sql('select * from dual').open_reader() as reader:
    for record in reader:
        print(record)  # Process each record.
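When the result set is small, it can be convenient to collect the rows into a plain list instead of processing them inside the loop. The sketch below is one way to do that. The helper name fetch_rows is hypothetical, and it assumes that each record exposes its column values through a values attribute.

```python
def fetch_rows(o, sql):
    """Run `sql` synchronously and return each result row as a list of values.

    `o` is assumed to be the PyODPS entry object. `fetch_rows` is a
    hypothetical helper written for this example.
    """
    instance = o.execute_sql(sql)  # blocks until the statement has finished
    with instance.open_reader() as reader:
        # Assumption: record.values holds the column values of one row.
        return [list(record.values) for record in reader]
```

This holds the whole result in memory, so it only suits queries that return a modest number of rows.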

5. Reading table data

Use the read_table() method of the entry object, for example:

for record in o.read_table('test_table', partition='pt=test'):
    print(record)  # Process one record.
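"Processing" a record usually means reading individual fields from it. As a rough sketch, the function below collects one column from a table by indexing each record with the column name. The helper name column_values and its parameters are illustrative, not part of PyODPS.

```python
def column_values(o, table_name, column, partition=None):
    """Collect the values of one column from a table or one of its partitions.

    `o` is assumed to be the PyODPS entry object. `column_values` is a
    hypothetical helper written for this example.
    """
    values = []
    for record in o.read_table(table_name, partition=partition):
        values.append(record[column])  # read a field by its column name
    return values
```

For the table in the example above, a call might look like column_values(o, 'test_table', 'some_column', partition='pt=test'), where some_column is a placeholder for a real column name.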

If you only need to look at the first few records of a table (fewer than 10,000), call head() on the table object.
```
t = o.get_table('dual')
for record in t.head(3):
    print(record)  # Process each Record object.
```

You can also call open_reader() on the table object to read its data.

With a with statement:

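with t.open_reader(partition='pt=test') as reader:
    count = reader.count
    # This slice can be repeated with different ranges until all `count` records are read; the reads could be parallelized.
    for record in reader[5:10]:
        print(record)  # Process one record.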

Without a with statement:

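reader = t.open_reader(partition='pt=test')
count = reader.count
# This slice can be repeated with different ranges until all `count` records are read; the reads could be parallelized.
for record in reader[5:10]:
    print(record)  # Process one record.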
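The comment in the snippets above says the slice can be repeated until all count records have been read. One way to sketch that loop is shown below. The helper name iter_batches and the batch_size parameter are made up for illustration; the code relies only on reader.count and slicing, both of which appear above.

```python
def iter_batches(reader, batch_size):
    """Yield the records of `reader` in consecutive lists of up to `batch_size`.

    `reader` is assumed to expose `count` and slicing, as the table
    reader above does. `iter_batches` is a hypothetical helper.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    count = reader.count
    for start in range(0, count, batch_size):
        end = min(start + batch_size, count)  # the last batch may be shorter
        yield list(reader[start:end])
```

Each (start, end) range is independent of the others, so the same ranges could be handed to separate workers if the reads need to run in parallel.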
