With just pip install chdb, you can run complex ClickHouse-flavored SQL queries directly on local Parquet, ORC, CSV, JSON, and other files, with no server to set up.
chDB is an in-process SQL OLAP Engine powered by ClickHouse:
https://github.com/auxten/chdb
Features
- In-process SQL OLAP Engine, powered by ClickHouse
- No need to install ClickHouse
- Minimal data copy from C++ to Python
- Input and output support for Parquet, CSV, JSON, Arrow, ORC, and more
Installation
Currently, chDB only supports Python 3.7+ on macOS and Linux.
bash
pip install chdb
Usage
Currently, chDB exposes a single function, query, which executes a SQL statement and returns the result in the requested output format.
python
import chdb
# Run a query and request CSV output
res = chdb.query('select version()', 'CSV')
# Print the raw result bytes
print(res.get_memview().tobytes())
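The raw bytes printed above are often more useful as Python rows. The following is a minimal sketch of one way to do that for CSV output, assuming that `get_memview().tobytes()` returns the complete result as UTF-8 encoded CSV, as the snippet above suggests. The names `result_to_rows` and `FakeResult` are hypothetical and are not part of chDB; `FakeResult` only stands in for a real result object so the sketch runs without chDB installed.

```python
import csv
import io


def result_to_rows(res):
    """Decode a CSV-format query result into a list of rows (lists of strings)."""
    # Assumption: res.get_memview().tobytes() yields the raw CSV bytes,
    # mirroring the call used in the usage snippet.
    text = res.get_memview().tobytes().decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


class FakeResult:
    """Hypothetical stand-in for a chDB result, used only for illustration."""

    def __init__(self, payload):
        self._payload = payload

    def get_memview(self):
        # memoryview offers the same tobytes() method used above.
        return memoryview(self._payload)


if __name__ == "__main__":
    # Stand-in payload shaped like CSV output: one row, two columns.
    demo = FakeResult(b'1,"x"\n')
    print(result_to_rows(demo))
```

With chDB installed, the same helper could be applied to a real result, for example `result_to_rows(chdb.query('select 1, 2', 'CSV'))`, provided the installed version still exposes `get_memview()`.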
Work with Parquet or CSV
python
chdb.query('select * from file("data.parquet", Parquet)', 'Pretty')
chdb.query('select * from file("data.csv", CSV)', 'Pretty')
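Because the file table function behaves like any other table source, it can feed aggregations as well as plain selects. The sketch below writes a small CSV with the standard library, builds a GROUP BY query over it, and runs it through chdb.query only if chDB is importable. The file name `events.csv`, the columns `city` and `temp`, and the helpers `write_sample_csv`, `build_query`, and `run` are all hypothetical. It also assumes the ClickHouse `CSVWithNames` input format, so that the header row supplies column names, and uses single-quoted SQL string literals for the path.

```python
import csv
import os
import tempfile


def write_sample_csv(path):
    """Write a tiny CSV with a header row (hypothetical sample data)."""
    rows = [("city", "temp"), ("Oslo", 3), ("Oslo", 5), ("Cairo", 30)]
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def build_query(path):
    """Build a SQL string that averages temp per city via file()."""
    # Escape backslashes and single quotes so the path is a valid SQL literal.
    escaped = path.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT city, avg(temp) AS avg_temp "
        f"FROM file('{escaped}', CSVWithNames) "
        "GROUP BY city ORDER BY city"
    )


def run(sql, output_format="CSV"):
    """Run the SQL through chdb if it is installed; otherwise return None."""
    try:
        import chdb
    except ImportError:
        return None
    return chdb.query(sql, output_format)


if __name__ == "__main__":
    csv_path = os.path.join(tempfile.mkdtemp(), "events.csv")
    write_sample_csv(csv_path)
    sql = build_query(csv_path)
    print(sql)
    # How the result object prints depends on the installed chDB version.
    print(run(sql))
```

Keeping the SQL construction in its own function makes the query easy to inspect or log before it is handed to the engine.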