[Original] Connecting to HBase from Python via Thrift

1. Install Thrift

Download thrift-0.8.0.tar.gz, extract it, and build from the extracted directory (the last step, make install, is run as root):

[admin@server1 thrift-0.8.0]$ ./configure
[admin@server1 thrift-0.8.0]$ make
[root@server1 thrift-0.8.0]# make install

 

Check the version to confirm that the installation succeeded:

[admin@server1 thrift-0.8.0]$ thrift -version

Thrift version 0.8.0
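
If a setup script needs to confirm the compiler version instead of reading it by eye, the output above can be parsed. The following is a minimal sketch: parse_thrift_version is a hypothetical helper name, and it assumes the output has the "Thrift version X.Y.Z" form shown above.

```python
import re


def parse_thrift_version(output):
    """Extract (major, minor, patch) from `thrift -version` output.

    Assumes the 'Thrift version X.Y.Z' format shown in this post and
    returns None when the text does not match.
    """
    match = re.search(r"Thrift version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


# Example using the output captured in this post
version = parse_thrift_version("Thrift version 0.8.0")
print(version)
```

The returned tuple can be compared directly, for example version >= (0, 8, 0), before running the code generation step.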

 

2. Start Thrift

 

Generate the Python library files used to access HBase through Thrift:

[admin@server1 ~]$ thrift --gen py ~/hbase-0.90.5/src/main/resources/org/apache/hadoop/hbase/thrift/Hbase.thrift

 

Inspect the generated gen-py directory:

[admin@server1 ~]$ ll -R gen-py
gen-py:
total 4
drwxrwxr-x. 2 admin admin 4096 Dec 16 02:46 hbase
-rw-rw-r--. 1 admin admin    0 Dec 16 02:46 __init__.py

gen-py/hbase:
total 320
-rw-rw-r--. 1 admin admin    221 Dec 16 02:46 constants.py
-rw-rw-r--. 1 admin admin 274722 Dec 16 02:46 Hbase.py
-rwxr-xr-x. 1 admin admin  10966 Dec 16 02:46 Hbase-remote
-rw-rw-r--. 1 admin admin     43 Dec 16 02:46 __init__.py
-rw-rw-r--. 1 admin admin  25704 Dec 16 02:46 ttypes.py

 

Copy the gen-py/hbase directory into Python's site-packages directory (as root):

[root@server1 admin]# cp -R gen-py/hbase/   /usr/lib/python2.6/site-packages/
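
The copy above needs root access. As an alternative, a script can add the gen-py directory to its import path at runtime. This is a minimal sketch, assuming gen-py was generated in the current user's home directory as in the listing above; add_gen_py_to_path is a hypothetical helper name.

```python
import os
import sys


def add_gen_py_to_path(gen_py_dir):
    """Put the Thrift-generated gen-py directory on sys.path (only once)."""
    gen_py_dir = os.path.abspath(os.path.expanduser(gen_py_dir))
    if gen_py_dir not in sys.path:
        sys.path.insert(0, gen_py_dir)
    return gen_py_dir


# Assumed location: gen-py in the home directory of the current user
path = add_gen_py_to_path("~/gen-py")
# After this call, `from hbase import Hbase` can resolve to
# gen-py/hbase/Hbase.py, provided that directory exists and the thrift
# Python package itself is importable.
```

This keeps the generated code next to the project instead of in the system-wide package directory, which makes it easier to regenerate after an HBase upgrade.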

 

Start the HBase Thrift server on port 7777:

[admin@server1 ~]$ hbase thrift -p 7777 start

starting thrift, logging to /home/admin/hbase-0.90.5/logs/hbase-admin-thrift-server1.out

 

Check the service with jps; ThriftServer appears in the process list, so it started successfully:

[admin@server1 ~]$ jps
3034 HMaster
2712 HQuorumPeer
2249 NameNode
2371 SecondaryNameNode
11296 Jps
2435 JobTracker
11268 ThriftServer
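
jps only shows that the process exists. To confirm that the port actually accepts connections before running a client, a plain TCP check is enough. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper name, and the 2-second default timeout is an arbitrary choice.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unreachable, unresolved host name, or timed out
        return False
    sock.close()
    return True
```

With the setup in this post, port_is_open('server1', 7777) would be the call to make. It only proves that something is listening on the port, not that the listener is the HBase Thrift server.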

 

3. Connect to HBase

 

HBase contains four tables:

hbase(main):003:0> list
TABLE
jvmMonitor
stations
test
test2
4 row(s) in 0.0170 seconds

 

The test table contains the following rows:

hbase(main):002:0> scan 'test'
ROW                         COLUMN+CELL
 row1                       column=data:1, timestamp=1353256169818, value=value2
 row1                       column=data:2423, timestamp=1353256192813, value=value2
 row1                       column=data:fff, timestamp=1353256202096, value=value2
 row2                       column=data:2, timestamp=1353256325610, value=value2
 row3                       column=data:3, timestamp=1353256320975, value=value3
 row4                       column=data:4, timestamp=1353256315140, value=value4
4 row(s) in 0.4310 seconds

 

Create a file named testThrift.py with the following code:

#!/usr/bin/python
from thrift.transport import TSocket
from thrift.protocol import TBinaryProtocol
from hbase import Hbase

# Connect to the HBase Thrift server started on server1, port 7777
transport = TSocket.TSocket('server1', 7777)
protocol = TBinaryProtocol.TBinaryProtocol(transport)
client = Hbase.Client(protocol)

transport.open()
try:
    print('\n all tables in hbase:')
    print(client.getTableNames())
    print('\n row1 of test:')
    print(client.getRow('test', 'row1'))
finally:
    # Always release the socket, even if a call fails
    transport.close()

 

Run the program:

[admin@server1 python]$ python testThrift.py

 all tables in hbase:
['jvmMonitor', 'stations', 'test', 'test2']

 row1 of test:
[TRowResult(columns={'data:1': TCell(timestamp=1353256169818, value='value2'), 'data:2423': TCell(timestamp=1353256192813, value='value2'), 'data:fff': TCell(timestamp=1353256202096, value='value2')}, row='row1')]
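
The getRow result is a list of TRowResult objects whose columns map column names to TCell objects. When only the values matter, it can be flattened into plain dictionaries. This is a minimal sketch: rows_to_dict is a hypothetical helper name, and the two namedtuples are stand-ins for the generated classes in hbase.ttypes so that the example runs without a Thrift server. Their field names are taken from the output above.

```python
from collections import namedtuple

# Stand-ins for the Thrift-generated classes in hbase.ttypes (assumption:
# only the attributes visible in the output above are needed).
TCell = namedtuple("TCell", ["value", "timestamp"])
TRowResult = namedtuple("TRowResult", ["row", "columns"])


def rows_to_dict(row_results):
    """Map each row key to a {column: value} dict, dropping timestamps."""
    return dict(
        (result.row,
         dict((col, cell.value) for col, cell in result.columns.items()))
        for result in row_results
    )


# The getRow('test', 'row1') result from this post, rebuilt by hand
sample = [TRowResult(row="row1", columns={
    "data:1": TCell(value="value2", timestamp=1353256169818),
    "data:2423": TCell(value="value2", timestamp=1353256192813),
    "data:fff": TCell(value="value2", timestamp=1353256202096),
})]
print(rows_to_dict(sample))
```

In a real script the list returned by client.getRow('test', 'row1') would be passed in place of sample, since the function only reads the row, columns, and value attributes.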

 

posted @ 2014-05-28 08:44  lihui1625