Introduction to Starbase, a Python API Module for HBase

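This article introduces Starbase, a Python API for HBase that provides functions for creating, manipulating, and managing HBase tables. Example code shows how to install it, create tables, insert, update, and delete data, and run batch operations.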

The following guest post is provided by Artur Barseghyan, a web developer currently employed by Goldmund, Wyldebeast & Wunderliebe in The Netherlands.

Python is my personal (and primary) programming language of choice and also happens to be the primary programming language at my company. So, when starting to work with a new technology, I prefer to use a clean and easy (Pythonic!) API.

After studying tons of articles on the web, reading (and writing) white papers, and doing basic performance tests (sometimes hard if you’re on a tight schedule), my company recently selected Cloudera for our Big Data platform (including using Apache HBase as our data store for Apache Hadoop), with Cloudera Manager serving a role as “one console to rule them all.”

However, I was surprised shortly thereafter to learn about the absence of a working Python wrapper around the REST API for HBase (aka Stargate). I decided to write one in my free time, and the result, ladies and gentlemen, was Starbase (GPL).

In this post, I will provide some code samples and briefly explain what work has been done on Starbase. I assume that the reader of this blog post already has a basic understanding of HBase (that is, of tables, column families, qualifiers, and so on).
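For readers who want a quick refresher on that data model: a row can be pictured as a nested mapping from column family to qualifier to value, and HBase addresses a single cell as family:qualifier. The sketch below uses plain Python dicts only. The helper names flatten_row and nest_row are hypothetical and are not part of Starbase or HBase.

```python
def flatten_row(nested):
    """Turn {family: {qualifier: value}} into {'family:qualifier': value}."""
    flat = {}
    for family, cells in nested.items():
        for qualifier, value in cells.items():
            flat['%s:%s' % (family, qualifier)] = value
    return flat


def nest_row(flat):
    """Turn {'family:qualifier': value} back into {family: {qualifier: value}}."""
    nested = {}
    for name, value in flat.items():
        # Split on the first colon only, so a qualifier containing ':' survives.
        family, qualifier = name.split(':', 1)
        nested.setdefault(family, {})[qualifier] = value
    return nested


row = {
    'column1': {'key11': 'value 11', 'key12': 'value 12'},
    'column2': {'key21': 'value 21'},
}
flat = flatten_row(row)
print(flat)
print(nest_row(flat) == row)
```

Both shapes carry the same information; the nested form is easier to read, while the flat form mirrors how HBase itself names cells.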

1. Installation

Next, I’ll show you some frequently used commands and use cases. But first, install the current version of Starbase from the CheeseShop (PyPI).

# pip install starbase

Import the module:

>>> from starbase import Connection

…and create a connection instance. Starbase defaults to 127.0.0.1:8000; if your settings are different, specify them here.

>>> c = Connection()
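If the REST server is not on the default address, the settings can be passed when the connection is created. The sketch below is a hedged example: it assumes that Connection accepts host and port keyword arguments, and the environment variable names HBASE_REST_HOST and HBASE_REST_PORT are made up for illustration. The starbase import happens only inside make_connection, so the settings helper can be tried without the package or a running server.

```python
import os


def connection_settings(environ=None):
    """Build connection keyword arguments, falling back to Starbase's defaults.

    HBASE_REST_HOST and HBASE_REST_PORT are hypothetical variable names
    chosen for this example.
    """
    environ = os.environ if environ is None else environ
    return {
        'host': environ.get('HBASE_REST_HOST', '127.0.0.1'),
        'port': int(environ.get('HBASE_REST_PORT', '8000')),
    }


def make_connection(environ=None):
    """Create a Starbase connection (assumes host/port keyword arguments)."""
    from starbase import Connection  # imported lazily; requires starbase
    return Connection(**connection_settings(environ))


print(connection_settings({}))
print(connection_settings({'HBASE_REST_HOST': '10.0.0.5',
                           'HBASE_REST_PORT': '8080'}))
```

Keeping the settings in one small function makes it easy to switch between a local test server and a cluster without touching the rest of the code.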

2. API Usage Examples

2.1 Listing all tables

Assuming two existing tables named table1 and table2, the following call prints them.

>>> c.tables()

['table1', 'table2']

2.2 Table design operations

Whenever you need to operate on a table, you first create a table instance.

Create a table instance (note that no table is created at this step):

>>> t = c.table('table3')

Create a new table with columns 'column1', 'column2', 'column3' (here the table is actually created):

>>> t.create('column1', 'column2', 'column3')

201

Check whether the table exists:

>>> t.exists()

True

Show the table's columns:

>>> t.columns()

['column1', 'column2', 'column3']

Add columns to the table ('column4', 'column5', 'column6', 'column7'):

>>> t.add_columns('column4', 'column5', 'column6', 'column7')

200

Drop columns ('column6', 'column7'):

>>> t.drop_columns('column6', 'column7')

201

Drop the entire table:

>>> t.drop()

200

2.3 Table data operations

Insert data into a single row:

>>> t.insert(
...     'my-key-1',
...     {
...         'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
...         'column2': {'key21': 'value 21', 'key22': 'value 22'},
...         'column3': {'key31': 'value 31', 'key32': 'value 32'}
...     }
... )

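200

Note that you may also use the native way of naming columns and cells (qualifiers). The result of the following is equal to the result of the previous example.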

>>> t.insert(
...     'my-key-1a',
...     {
...         'column1:key11': 'value 11', 'column1:key12': 'value 12', 'column1:key13': 'value 13',
...         'column2:key21': 'value 21', 'column2:key22': 'value 22',
...         'column3:key31': 'value 31', 'column3:key32': 'value 32'
...     }
... )

200

Update row data:

>>> t.update(
...     'my-key-1',
...     {'column4': {'key41': 'value 41', 'key42': 'value 42'}}
... )

200

Remove a row cell (qualifier):

>>> t.remove('my-key-1', 'column4', 'key41')

200

Remove a row column (column family):

>>> t.remove('my-key-1', 'column4')

200

Remove an entire row:

>>> t.remove('my-key-1')

200

Fetch a single row with all columns:

>>> t.fetch('my-key-1')

{
    'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
    'column2': {'key21': 'value 21', 'key22': 'value 22'},
    'column3': {'key31': 'value 31', 'key32': 'value 32'}
}

Fetch a single row with selected columns (limit to the 'column1' and 'column2' columns):

>>> t.fetch('my-key-1', ['column1', 'column2'])

{
    'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
    'column2': {'key21': 'value 21', 'key22': 'value 22'},
}

Narrow the result set even more (limit to cells 'key11' and 'key13' of column 'column1' and cell 'key32' of column 'column3'):

>>> t.fetch('my-key-1', {'column1': ['key11', 'key13'], 'column3': ['key32']})

{
    'column1': {'key11': 'value 11', 'key13': 'value 13'},
    'column3': {'key32': 'value 32'}
}

Note that you may also use the native means of naming the columns and cells (qualifiers). The example below does exactly the same thing as the example above.

>>> t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'])

{
    'column1': {'key11': 'value 11', 'key13': 'value 13'},
    'column3': {'key32': 'value 32'}
}

If you set the perfect_dict argument to False, you’ll get the native data structure:

>>> t.fetch('my-key-1', ['column1:key11', 'column1:key13', 'column3:key32'], perfect_dict=False)

{
    'column1:key11': 'value 11', 'column1:key13': 'value 13',
    'column3:key32': 'value 32'
}

2.4 Batch operations on table data

Batch operations (insert and update) work similarly to routine insert and update, but are done in a batch. You are advised to operate in batch as much as possible.
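To make the advice concrete, here is a toy model of the stack-then-commit idea. It is not Starbase's implementation: the ToyBatch class and its send callback are hypothetical stand-ins, and the optional size limit is modeled on the size argument described in the note at the end of this section. The point is only that 5,000 stacked rows can go out in a handful of requests instead of 5,000.

```python
class ToyBatch(object):
    """Collects rows and hands them to `send` in groups (illustration only)."""

    def __init__(self, send, size=None):
        self.send = send      # callable that receives a list of (key, data)
        self.size = size      # optional auto-commit threshold
        self.stack = []

    def insert(self, key, data):
        self.stack.append((key, data))
        # Auto-commit as soon as the stack reaches the configured size.
        if self.size is not None and len(self.stack) >= self.size:
            self.commit()

    def commit(self):
        # Send the stacked rows in one go, then start with an empty stack.
        if self.stack:
            self.send(list(self.stack))
            self.stack = []


requests = []  # each element stands for one round trip to the server
batch = ToyBatch(requests.append, size=1000)
data = {'column1': {'key11': 'value 11'}}
for i in range(5000):
    batch.insert('my-key-%s' % i, data)
batch.commit()  # flush whatever is left on the stack

print(len(requests))                  # number of simulated requests
print(sum(len(r) for r in requests))  # total rows sent
```

With a size of 1000, the toy model sends the 5,000 rows in five groups; without a size it would send them all in the single explicit commit.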

In the example below, we will insert 5,000 records in a batch:

>>> data = {
...     'column1': {'key11': 'value 11', 'key12': 'value 12', 'key13': 'value 13'},
...     'column2': {'key21': 'value 21', 'key22': 'value 22'},
... }
>>> b = t.batch()
>>> for i in range(0, 5000):
...     b.insert('my-key-%s' % i, data)
...
>>> b.commit(finalize=True)

{'method': 'PUT', 'response': [200], 'url': 'table3/bXkta2V5LTA='}

In the example below, we will update 5,000 records in a batch:

>>> data = {
...     'column3': {'key31': 'value 31', 'key32': 'value 32'},
... }
>>> b = t.batch()
>>> for i in range(0, 5000):
...     b.update('my-key-%s' % i, data)
...
>>> b.commit(finalize=True)

{'method': 'POST', 'response': [200], 'url': 'table3/bXkta2V5LTA='}

Note: The table batch method accepts an optional size argument (int). If set, an auto-commit is fired each time the stack is full.
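The url field in the responses above looks opaque, but its last segment appears to be the Base64-encoded row key, which is consistent with how the HBase REST interface encodes keys and values in its JSON payloads. A minimal standard-library check (the helper name decode_row_key is hypothetical):

```python
import base64


def decode_row_key(url):
    """Decode the Base64 row key from a 'table/encoded-key' style URL."""
    table, encoded_key = url.split('/', 1)
    return table, base64.b64decode(encoded_key).decode('utf-8')


decoded = decode_row_key('table3/bXkta2V5LTA=')
print(decoded)

# Encoding the first key of the loop gives back the same URL segment.
print(base64.b64encode(b'my-key-0').decode('ascii'))
```

The decoded key is my-key-0, the first key stacked in the loop.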

2.5 Table data search (row scanning)

A table scanning feature is in development. At the moment it is only possible to fetch all rows from a table (a full table scan); row-key range scans are not yet supported. The result set is returned as a generator.

>>> t.fetch_all_rows()
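Because the result is a generator, rows are produced lazily and can be consumed one at a time or cut short. The sketch below uses a stand-in generator, fake_rows, in place of t.fetch_all_rows() so that it runs without a server. The row shape it yields is an assumption for illustration, not documented Starbase output.

```python
from itertools import islice


def fake_rows(count):
    """Stand-in for t.fetch_all_rows(): yields one row dict at a time."""
    for i in range(count):
        yield {'column1': {'key11': 'value %s' % i}}


rows = fake_rows(5000)

# Take only the first three rows; the remaining ones are not generated yet.
first_three = list(islice(rows, 3))
print(first_three)

# A generator is consumed as it is read, so iteration resumes at the fourth row.
fourth = next(rows)
print(fourth)
```

For a large table this matters: looping over the generator keeps one row in memory at a time, whereas wrapping it in list() would materialize every row at once.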
