DuckDB - Distributed Client-side Query

Create a session, share the worker URL with volunteers, then start runs.

This is a DuckDB-WASM project.

Session

Larger chunks mean fewer round trips but more work per volunteer tab. Leave blank to use the server default.
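
To make the trade-off concrete, here is a rough sketch of how a dataset might be cut into chunks. The function name plan_chunks and the row-range scheme are assumptions for illustration only; the coordinator may split work differently (for example, by file or by Parquet row group).

```python
def plan_chunks(total_rows: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split total_rows into (offset, length) ranges of at most chunk_size rows."""
    if total_rows < 0 or chunk_size <= 0:
        raise ValueError("total_rows must be >= 0 and chunk_size must be > 0")
    return [
        (offset, min(chunk_size, total_rows - offset))
        for offset in range(0, total_rows, chunk_size)
    ]


# Hypothetical dataset of 1,000,000 rows with two candidate chunk sizes.
small = plan_chunks(1_000_000, 50_000)    # many small chunks, many round trips
large = plan_chunks(1_000_000, 250_000)   # few large chunks, more work per tab
print(len(small), len(large))
```

Each chunk costs one assignment and one result upload, so the chunk count is a reasonable proxy for the number of round trips.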

Start Run

One URL per line. Parquet and CSV are supported. Each file must serve CORS headers (Access-Control-Allow-Origin: *).
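
Before submitting a list of URLs, it can help to check the headers a file host returns. The helper below is a minimal sketch, not part of this project: allows_any_origin is a hypothetical name, and it only tests for the wildcard value named above.

```python
def allows_any_origin(headers: dict[str, str]) -> bool:
    """Return True if the response headers allow any origin (value "*")."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("access-control-allow-origin") == "*"


# Example with hand-written headers (no network access needed).
print(allows_any_origin({"Access-Control-Allow-Origin": "*"}))
print(allows_any_origin({"Content-Type": "application/octet-stream"}))
```

The headers can come from any HTTP client, such as the response headers of a HEAD request. Some servers only send the header when the request carries an Origin header, so a negative result here is a hint rather than proof.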

Runs in each volunteer browser. Must reference the table partition_data.

Runs on the server against partial_results (UNION ALL of all worker outputs).
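
The two queries work as a pair: the worker query reduces each chunk to a small partial result, and the merge query combines those partials. The sketch below illustrates the pattern under stated assumptions. It uses Python's built-in sqlite3 as a stand-in for DuckDB so it runs without extra dependencies, and the columns category and amount, both SQL strings, and the helper functions are hypothetical. Only the table names partition_data and partial_results come from this page.

```python
import sqlite3

# Hypothetical worker query: runs once per chunk against partition_data.
WORKER_SQL = """
    SELECT category, SUM(amount) AS total, COUNT(*) AS n
    FROM partition_data
    GROUP BY category
"""

# Hypothetical merge query: runs once against partial_results.
MERGE_SQL = """
    SELECT category, SUM(total) AS total, SUM(n) AS n
    FROM partial_results
    GROUP BY category
    ORDER BY category
"""


def run_worker(rows):
    """Simulate one volunteer tab: load one chunk and run the worker query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE partition_data (category TEXT, amount REAL)")
    con.executemany("INSERT INTO partition_data VALUES (?, ?)", rows)
    result = con.execute(WORKER_SQL).fetchall()
    con.close()
    return result


def run_merge(partials):
    """Simulate the server: stack all worker outputs, then run the merge query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE partial_results (category TEXT, total REAL, n INTEGER)"
    )
    for partial in partials:  # equivalent to UNION ALL of the worker outputs
        con.executemany("INSERT INTO partial_results VALUES (?, ?, ?)", partial)
    result = con.execute(MERGE_SQL).fetchall()
    con.close()
    return result


# Two made-up chunks, as if handled by two different volunteers.
chunks = [
    [("a", 1.0), ("b", 2.0)],
    [("a", 3.0), ("a", 4.0)],
]
merged = run_merge([run_worker(chunk) for chunk in chunks])
print(merged)
```

Note that the worker returns a sum and a count rather than an average. Averaging per-chunk averages would weight small chunks too heavily, so the final average is better computed in the merge step as total / n.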

Run status

Starting…
0 / ? chunks complete

Merge complete. The result is a Parquet file ready to download.

Download result

Schema Inspector

Inspect the column schema and row count of a remote Parquet file before running a job. The file must serve CORS headers (Access-Control-Allow-Origin: *).

Schema

Sample data (10 rows)

About this project

This is an open-source experiment. Bugs should be expected.

How it works: The researcher creates a session here and shares the worker URL with others. Each volunteer opens that URL in their browser, which then acts as a worker. The coordinator splits the dataset into chunks and distributes them across all connected workers. Each worker fetches and processes its chunk locally using DuckDB-WASM, so the source data is never sent to a central server. Only the partial results are returned to the coordinator, which merges them into a final result available for download.

View source on GitHub