GitHub - GlareDB/glaredb: GlareDB: An analytics DBMS for distributed data

About

Data exists everywhere: on your laptop, in Postgres, in Snowflake, and as files in S3. It comes in various formats such as Parquet, CSV, and JSON. Regardless of where and how it is stored, getting the insights you need usually takes multiple steps spanning several destinations.

GlareDB is designed to query your data wherever it lives using SQL that you already know.

Install

Install/update glaredb in the current directory:

curl https://glaredb.com/install.sh | sh

It may be helpful to install the binary in a location on your PATH. For example, ~/.local/bin.
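
To confirm that the directory you picked is actually on your PATH, a small check like the following may help. This is a minimal sketch: the helper name `is_on_path` is hypothetical, and `~/.local/bin` is only the example location mentioned above.

```python
import os
from pathlib import Path
from typing import Optional


def is_on_path(directory: str, path_env: Optional[str] = None) -> bool:
    """Return True if `directory` is one of the entries in PATH.

    `path_env` defaults to the PATH of the current process; pass a string
    explicitly to check a different value.
    """
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    target = Path(directory).expanduser().resolve()
    # Skip empty entries, which appear when PATH has a stray separator.
    entries = [entry for entry in path_env.split(os.pathsep) if entry]
    return any(Path(entry).expanduser().resolve() == target for entry in entries)


if __name__ == "__main__":
    # Example location from the text above; adjust to wherever you installed glaredb.
    print(is_on_path("~/.local/bin"))
```

If this prints `False`, either move the binary or add the directory to PATH in your shell profile.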

If you prefer manual installation, download, extract, and run the GlareDB binary from a release on our releases page.

Getting Started

After installing, get up and running with:

Local CLI

To start a local session, run the binary:

./glaredb

Or, you can execute SQL and immediately return (try it out!):

# Query a CSV on Hugging Face
./glaredb --query "SELECT * FROM \
'https://huggingface.co/datasets/fka/awesome-chatgpt-prompts/raw/main/prompts.csv';"
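
The same one-shot invocation can be driven from a script. The sketch below uses only the `--query` flag shown above; the helper names `build_query_command` and `run_query` are hypothetical, and it assumes the binary sits at `./glaredb`.

```python
import subprocess
from typing import List


def build_query_command(sql: str, binary: str = "./glaredb") -> List[str]:
    """Build the argument list for a one-shot query.

    Passing a list (rather than a single shell string) means the SQL text
    needs no extra shell quoting or escaping.
    """
    return [binary, "--query", sql]


def run_query(sql: str, binary: str = "./glaredb") -> str:
    """Run the query and return whatever the CLI writes to stdout."""
    result = subprocess.run(
        build_query_command(sql, binary),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the process exits non-zero
    )
    return result.stdout


if __name__ == "__main__":
    print(run_query("SELECT 'hello world';"))
```

Because the arguments are passed as a list, quotes inside the SQL do not need the backslash escaping that the shell example requires.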

To see all options use --help:

./glaredb --help

Hybrid Execution

Sign up at https://console.glaredb.com for a free fully-managed deployment of GlareDB.

Copy the connection string from GlareDB Cloud and pass it to the CLI, for example:

./glaredb --cloud-url="glaredb://user:pass@host:port/deployment"
# or
./glaredb
> \open "glaredb://user:pass@host:port/deployment"

Read our announcement on Hybrid Execution for more information.
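
Before pasting a connection string into a script, it can be worth checking that it has the `glaredb://user:pass@host:port/deployment` shape shown above. The following is a minimal sketch using only the Python standard library; the helper name `parse_cloud_url` is hypothetical, and the example URL uses placeholder values rather than a real deployment.

```python
from urllib.parse import urlsplit


def parse_cloud_url(url: str) -> dict:
    """Split a glaredb:// connection string into its parts.

    Raises ValueError if the scheme is wrong, the user or host is missing,
    or the port is not numeric. The password itself is not returned, so the
    result is safe to log.
    """
    parts = urlsplit(url)
    if parts.scheme != "glaredb":
        raise ValueError(f"expected a glaredb:// URL, got scheme {parts.scheme!r}")
    if not parts.hostname or not parts.username:
        raise ValueError("connection string is missing a user or host")
    return {
        "user": parts.username,
        "host": parts.hostname,
        "port": parts.port,  # None if absent; accessing it raises ValueError if non-numeric
        "deployment": parts.path.lstrip("/"),
        "has_password": parts.password is not None,
    }


if __name__ == "__main__":
    # Placeholder values only; substitute the string copied from GlareDB Cloud.
    print(parse_cloud_url("glaredb://user:pass@example.com:1234/deployment"))
```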

Using GlareDB in Python

Install the official GlareDB Python library:
```
pip install glaredb
```

Import and use glaredb:

import glaredb
con = glaredb.connect()
con.sql("select 'hello world';").show()

To use Hybrid Execution, sign up at https://console.glaredb.com and use the connection string for your deployment. For example:

import glaredb
con = glaredb.connect("glaredb://user:pass@host:port/deployment")
con.sql("select 'hello hybrid exec';").show()

GlareDB works with Pandas and Polars DataFrames out of the box:

import glaredb
import polars as pl

df = pl.DataFrame(
    {
        "A": [1, 2, 3, 4, 5],
        "fruits": ["banana", "banana", "apple", "apple", "banana"],
        "B": [5, 4, 3, 2, 1],
        "cars": ["beetle", "audi", "beetle", "beetle", "beetle"],
    }
)

con = glaredb.connect()

df = con.sql("select * from df where fruits = 'banana'").to_polars()

print(df)

Local Server

The server subcommand can be used to launch a server process for GlareDB:

./glaredb server

To see all options for running in server mode, use --help:

./glaredb server --help

When launched as a server process, GlareDB can be reached on port 6543 using a Postgres client. The following example uses psql to connect to a locally running server:

psql "host=localhost user=glaredb dbname=glaredb port=6543"
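
If the client cannot connect, a quick first check is whether anything is listening on the port at all. The sketch below is an assumption-laden helper (the name `server_is_listening` is hypothetical) that only tests whether a TCP connection to port 6543 succeeds; it does not verify that the process on the other end speaks the Postgres protocol.

```python
import socket


def server_is_listening(host: str = "localhost", port: int = 6543, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    print(server_is_listening())
```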

Configure the First Data Source

You can use a demo Postgres instance at pg.demo.glaredb.com. Adding this Postgres instance as a data source is as easy as running the following command:

CREATE EXTERNAL DATABASE my_pg
    FROM postgres
    OPTIONS (
        host = 'pg.demo.glaredb.com',
        port = '5432',
        user = 'demo',
        password = 'demo',
        database = 'postgres',
    );
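
When several data sources need to be registered, the statement can be generated from a dictionary of options. This is a minimal sketch that mirrors the shape of the statement above; the helper name `create_external_database_sql` is hypothetical, and it assumes string values are escaped by doubling single quotes, as in standard SQL.

```python
from typing import Dict


def create_external_database_sql(name: str, source: str, options: Dict[str, str]) -> str:
    """Render a CREATE EXTERNAL DATABASE statement from a dict of options."""
    if not name.isidentifier() or not source.isidentifier():
        raise ValueError("name and source must be plain identifiers")

    def quote(value: str) -> str:
        # Assumption: single quotes inside a value are escaped by doubling them.
        return "'" + str(value).replace("'", "''") + "'"

    option_lines = ",\n".join(
        f"    {key} = {quote(value)}" for key, value in options.items()
    )
    return (
        f"CREATE EXTERNAL DATABASE {name}\n"
        f"FROM {source}\n"
        f"OPTIONS (\n{option_lines}\n);"
    )


if __name__ == "__main__":
    # Values taken from the demo instance described above.
    print(
        create_external_database_sql(
            "my_pg",
            "postgres",
            {
                "host": "pg.demo.glaredb.com",
                "port": "5432",
                "user": "demo",
                "password": "demo",
                "database": "postgres",
            },
        )
    )
```

The resulting string can then be passed to `con.sql(...)` or to the CLI's `--query` flag.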

Once the data source has been added, it can be queried using fully qualified table names:

SELECT *
FROM my_pg.public.lineitem
WHERE l_shipdate <= date '1998-12-01' - INTERVAL '90' DAY
LIMIT 5;
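
To see which rows the date filter keeps, the cutoff can be worked out by hand. The sketch below assumes the interval in the query above is meant as 90 days.

```python
from datetime import date, timedelta

# Cutoff used by the WHERE clause, assuming the interval is 90 days.
cutoff = date(1998, 12, 1) - timedelta(days=90)

print(cutoff.isoformat())  # 1998-09-02
```

Under that assumption, the query returns line items shipped on or before 1998-09-02.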

Check out the docs to learn about all supported data sources. Many data sources can be connected to the same GlareDB instance.

Done with this data source? Remove it with the following command:

DROP DATABASE my_pg;

Supported Data Sources

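| Source | Read | `INSERT INTO` | `COPY TO` | Table Function | External Table | External Database |
|---|---|---|---|---|---|---|
| Databases | -- | -- | -- | -- | -- | -- |
| MySQL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| PostgreSQL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MariaDB (via mysql) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MongoDB | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Microsoft SQL Server | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
| Snowflake | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
| BigQuery | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
| Cassandra/ScyllaDB | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
| ClickHouse | ✅ | 🚧 | 🚧 | ✅ | ✅ | ✅ |
| Oracle | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
| ADBC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
| ODBC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
| Database Files | -- | -- | -- | -- | -- | -- |
| SQLite | ✅ | ✅ | 🚧 | ✅ | ✅ | ✅ |
| Microsoft Excel | ✅ | 🚧 | 🚧 | ✅ | ✅ | ➖ |
| DuckDB | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
| File Formats | -- | -- | -- | -- | -- | -- |
| Apache Arrow | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
| Apache Parquet | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
| CSV | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
| JSON | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
| BSON | ✅ | 🚧 | ✅ | ✅ | ✅ | ➖ |
| Apache Avro | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ➖ |
| Apache ORC | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | ➖ |
| Table Formats | -- | -- | -- | -- | -- | -- |
| Lance | ✅ | ✅ | ✅ | ✅ | ✅ | ➖ |
| Delta | ✅ | ✅ | ✅ | ✅ | ✅ | ➖ |
| Iceberg | ✅ | 🚧 | 🚧 | ✅ | ✅ | ➖ |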

✅ = Supported ➖ = Not Applicable 🚧 = Not Yet Supported

Building from Source

Building GlareDB requires Rust/Cargo to be installed. Check out rustup for an easy way to install Rust on your system.

Running the following command, which uses the just command runner, will build a release binary:

just build --release

The compiled release binary can be found in target/release/glaredb.

Documentation

Browse the GlareDB documentation at docs.glaredb.com.

Contributing

Contributions welcome! Check out CONTRIBUTING.md for how to get started.

License

See LICENSE. Unless otherwise noted, this license applies to all files in this repository.

Acknowledgements

GlareDB is proudly powered by Apache DataFusion and Apache Arrow. We are grateful for the work of the Apache Software Foundation and the community around these projects.

