2025年4月12日 星期六 乙巳(蛇)年 正月十三 设为首页 加入收藏
rss
您当前的位置:首页 > 计算机 > 编程开发 > Python

python连接clickhouse数据库

时间:08-27来源:作者:点击数:28
简介

Yandex在2016年6月15日开源了一个数据分析的数据库,名字叫做ClickHouse,这对保守俄罗斯人来说是个特大事。更让人惊讶的是,这个列式存储数据库的跑分要超过很多流行的商业MPP数据库软件,例如Vertica。如果你没有听过Vertica,那你一定听过 Michael Stonebraker,2014年图灵奖的获得者,PostgreSQL和Ingres发明者(Sybase和SQL Server都是继承Ingres而来的), Paradigm4和SciDB的创办者。Michael Stonebraker于2005年创办Vertica公司,后来该公司被HP收购,HP Vertica成为MPP列式存储商业数据库的高性能代表,Facebook就购买了Vertica数据用于用户行为分析。简单的说,ClickHouse作为分析型数据库,有三大特点:一是跑分快,二是功能多,三是文艺范

官网地址:https://clickhouse.tech/

官方文档:https://clickhouse.tech/docs/zh/single/

clickhouse-driver

ClickHouse没有官方的Python接口,有个第三方的库,叫clickhouse-driver,GitHub地址是:mymarilyn/clickhouse-driver: ClickHouse Python Driver with native interface support

安装:

  • pip install clickhouse-driver

使用方法如下:

  • from clickhouse_driver import Client
  • client = Client(host='localhost', database='default', user='default', password='')
  • client.execute('SHOW DATABASES')
  • ==========================================================
  • >>> from clickhouse_driver import connect
  • >>>
  • >>> conn = connect('clickhouse://localhost')
  • >>> cursor = conn.cursor()
  • >>>
  • >>> cursor.execute('SHOW TABLES')
  • >>> cursor.fetchall()
  • [('test',)]
clickhouse-sqlalchemy

安装

  • pip install clickhouse-sqlalchemy==0.1.4
  • pip install sqlalchemy==1.3.19

使用

  • # -*- coding:utf-8 -*-
  • from clickhouse_sqlalchemy import make_session
  • from sqlalchemy import create_engine
  • conf = {
  • "user": "default",
  • "password": "",
  • "server_host": "47.104",
  • "port": "8123",
  • "db": "test"
  • }
  • connection = 'clickhouse://{user}:{password}@{server_host}:{port}/{db}'.format(**conf)
  • engine = create_engine(connection, pool_size=100, pool_recycle=3600, pool_timeout=20)
  • def get_session(engine):
  • return make_session(engine)
  • def execute(sql):
  • session = get_session(engine)
  • cursor = session.execute(sql)
  • try:
  • fields = cursor._metadata.keys
  • return [dict(zip(fields, item)) for item in cursor.fetchall()]
  • finally:
  • cursor.close()
  • session.close()
  • query='SHOW TABLES'
  • result=execute(query)
  • print(result)
pandahouse

github:https://github.com/kszucs/pandahouse

安装

  • pip install pandahouse

Writing dataframe to clickhouse

  • from pandahouse.core import to_clickhouse
  • connection = {'host': 'http://clickhouse-host:8123',
  • 'database': 'test','user':'user' ,'password':'password','encoding':'utf-8'}
  • affected_rows = to_clickhouse(df, table='name', connection=connection)

Reading arbitrary clickhouse query to pandas

  • from pandahouse.core import read_clickhouse
  • df = read_clickhouse('SELECT * FROM {db}.table', index_col='id',
  • connection=connection)
clickhouse2pandas

github:https://github.com/lee19840806/clickhouse2pandas

安装

  • pip install clickhouse2pandas

使用

  • import clickhouse2pandas as ch2pd
  • connection_url = 'http://user:password@clickhouse_host:8123'
  • query = 'select * from system.numbers limit 1000000'
  • df = ch2pd.select(connection_url, query)
  • # df is a pandas dataframe converted from ClickHouse query result

API Reference

  • clickhouse2pandas.select(connection_url, query = None, convert_to = 'DataFrame', settings = None)
其它阅读

ClickHouse表引擎到底怎么选:

https://www.cdsy.xyz/computer/soft/database/other_database/240827/cd63085.html

clickHouse可视化查询工具:

https://www.cdsy.xyz/computer/soft/database/other_database/240827/cd63086.html

方便获取更多学习、工作、生活信息请关注本站微信公众号城东书院 微信服务号城东书院 微信订阅号
推荐内容
相关内容
栏目更新
栏目热门
本栏推荐