
Performance Tuning

Tier: Advanced

Canonical reference: docs/PERFORMANCE.md and the TL;DR docs/PERFORMANCE_TLDR.md. Benchmarks: docs/BENCHMARKS.md and the live dashboard at qsv.dathere.com/benchmarks.

This page is a workflow guide to performance — when to reach for which knob, in what order. Numbers quoted on this page link to BENCHMARKS.md as the source of truth.

The five-minute rule

If you have five minutes to make qsv faster on your workload, do these in order (a combined sketch follows the list):

  1. Index your file. qsv index data.csv — one-time cost (~14s on 15 GB). Speeds up 9 commands.
  2. Pre-populate the stats cache. qsv stats --stats-jsonl data.csv — makes frequency, schema, validate, pragmastat, pivotp, describegpt, sqlp, and scoresql smarter.
  3. Set QSV_AUTOINDEX_SIZE. Auto-creates the index for any file above the threshold; auto-refreshes stale ones.
  4. For Polars commands, generate a Polars schema. qsv schema --polars data.csv — written to data.pschema.json, picked up automatically by sqlp / joinp / pivotp.
  5. For files > RAM, use extsort / extdedup instead of sort / dedup.
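Putting the five steps together (file names and the threshold are illustrative):

qsv index data.csv                     # 1. one-time index
qsv stats --stats-jsonl data.csv       # 2. pre-populate the stats cache
export QSV_AUTOINDEX_SIZE=100000000    # 3. auto-index files above ~100 MB
qsv schema --polars data.csv           # 4. Polars schema → data.pschema.json
# 5. for files > RAM: extsort / extdedup (see Memory management below)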

Most users stop here. The rest of this page is for the remaining 1%.

When to index, and what gets faster

qsv builds a <file>.idx sidecar that lets commands skip parsing rows they don't need.

Command              Without index                  With index
count                scans the file                 instantaneous
sample (reservoir)   scans the file                 random I/O, no full scan
slice                parses rows up to the slice    parses only the sliced rows
stats                streaming, single-threaded     streaming, multithreaded
frequency            single-threaded                multithreaded
split                single-threaded                multithreaded
schema               single-threaded                multithreaded
luau                 sequential mode only           unlocks random-access mode + LASTROW
search               sequential                     parallel (per-chunk)

The index is a small fraction of the source file: a 15 GB CSV indexes in ~14 seconds, producing a ~27 MB index.

qsv index huge.csv
# Or — auto-index everything above 100 MB:
export QSV_AUTOINDEX_SIZE=100000000

See Indexing, Compression & Diff → index.

Stats cache — the secret weapon

Running qsv stats --stats-jsonl <file> writes two sidecar files:

  • <file>.stats.csv — human-readable
  • <file>.stats.csv.data.jsonl — machine-readable (the file other commands actually consume)

Every "smart" command (🪄 in the README legend) automatically picks up the JSONL and uses it:

  • frequency — short-circuits all-unique columns (would otherwise blow memory)
  • schema — skips redundant type inference
  • validate — faster generated-schema runs
  • pragmastat — date-aware mode requires the cache
  • pivotp — smart aggregation auto-selection
  • tojsonl — smart JSON type inference
  • sqlp / scoresql — query plan analysis
  • describegpt — deterministic statistical context for the LLM
  • sample --systematic/--weighted/--cluster — extra checks against cardinality

Set QSV_STATSCACHE_MODE=force to automatically create the cache whenever a smart command runs on a file that doesn't have one yet. See Stats Cache & Caching.
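A typical flow (data.csv is illustrative):

qsv stats --stats-jsonl data.csv    # writes data.stats.csv and data.stats.csv.data.jsonl
qsv frequency data.csv              # reuses the cache; all-unique columns short-circuit

# Or let smart commands build a missing cache on first use:
export QSV_STATSCACHE_MODE=force
qsv schema data.csv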

Multithreading

qsv detects logical CPUs and uses all of them by default. Cap with QSV_MAX_JOBS=N if you want to share CPU with other workloads.
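For example, to leave cores free for other workloads (the cap of 4 is illustrative):

export QSV_MAX_JOBS=4          # use at most 4 parallel jobs
qsv stats --everything data.csv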

Always-multithreaded (🚀): apply, applydp, blake3, datefmt, dedup, diff, excel, extsort, geocode, joinp, jsonl, replace, snappy, sort, sqlp, template, to, tojsonl, validate.

Multithreaded with an index (🏎️): frequency, sample, schema, search, searchset, split, stats.

For the full mapping, see the README legend.

Memory management

Commands marked 🤯 load the entire file into memory:

  • dedup (unless --sorted)
  • pragmastat
  • reverse (unless an index is present — then constant memory)
  • sort — for files > RAM use extsort (see the sketch after this list)
  • stats — for --cardinality / --quartiles / --median modes
  • table
  • transpose (unless --multipass or --long)
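For the sort and dedup entries above, the external variants spill to disk instead of holding everything in RAM. A sketch (the spill path is illustrative):

export QSV_TMPDIR=/mnt/nvme/tmp        # fast spill location, NVMe preferred
qsv extsort huge.csv huge-sorted.csv
qsv extdedup huge.csv huge-deduped.csv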

Commands marked 😣 use memory proportional to column cardinality:

  • frequency
  • schema
  • tojsonl

For huge files where you can't avoid memory mode, set the safety env vars:

export QSV_MEMORY_CHECK=1                  # opt into pre-check
export QSV_FREEMEMORY_HEADROOM_PCT=10      # reserve 10% of free RAM

With QSV_MEMORY_CHECK=1, qsv computes available memory (including swap), applies a platform multiplier (1.3× macOS, 1.15× Linux, 1.0× Windows), subtracts the headroom, and aborts if the file won't fit.
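Worked example, reading that rule literally (Linux; all numbers illustrative):

# 8 GB free RAM + 2 GB swap                          = 10 GB available
# × 1.15 (Linux platform multiplier)                 = 11.5 GB
# − 10% headroom, here taken of the adjusted figure  = ~10.4 GB budget
# A 12 GB file aborts up front; an 8 GB file proceeds.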

Approximate algorithms — bounded memory for huge cardinalities

stats and frequency can switch to Apache DataSketches for constant-memory operation:

# stats: approximate quartiles (t-digest) and cardinality (HyperLogLog)
qsv stats --everything --cardinality-method approx --quantile-method approx data.csv

# frequency: top-K with bounded error
qsv frequency --sketch-method frequent_items --sketch-map-size 16384 data.csv

Both add ~1.5% error in exchange for O(1) memory per column. Use when exact methods OOM.

Build-time optimizations

target-cpu=native

Prebuilt binaries for x86_64 are conservative to avoid SIGILL on older CPUs. If you build from source on your own machine, opt into your CPU's full instruction set:

CARGO_BUILD_RUSTFLAGS='-C target-cpu=native' \
  cargo build --release --locked --bin qsv -F all_features

Apple Silicon, Windows-on-ARM, IBM Power, and IBM Z prebuilds always have CPU optimizations enabled — no need to rebuild on those platforms.

Nightly Release Builds

qsv ships a nightly-toolchain prebuilt with extra optimizations enabled. See docs/PERFORMANCE.md#nightly-release-builds.

Allocator choice

qsv builds with mimalloc by default. For huge-file workloads, jemalloc is sometimes faster. Both have their own environment variables — see the mimalloc options and jemalloc options docs.

To see which allocator your binary uses, run qsv --version; the allocator appears in the version string.
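A build sketch for jemalloc, assuming the allocators are feature-gated as mimalloc (default) and jemallocator; verify the exact feature names in your version's Cargo.toml:

# drop the default mimalloc feature, enable jemalloc instead (assumed feature name)
cargo build --release --locked --bin qsv \
  --no-default-features -F all_features,jemallocator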

Buffer sizes

export QSV_RDR_BUFFER_CAPACITY=131072    # reader buffer, default 128 KB
export QSV_WTR_BUFFER_CAPACITY=524288    # writer buffer, default 512 KB

Bump up on slow disks (network mounts, spinning disks). Bump down for memory-constrained environments.

Polars-specific tuning

Polars-powered commands (🐻‍❄️): color, count, joinp, lens, pivotp, prompt, schema, scoresql, sqlp. They respond to Polars's env vars:

export POLARS_VERBOSE=1                # log plan + execution to stderr
export POLARS_PANIC_ON_ERR=1           # panic instead of returning an error
export POLARS_BACKTRACE_IN_ERR=1       # include backtrace in errors

For the full set, see Polars env vars.

The single biggest Polars optimization is the Polars schema (qsv schema --polars data.csv). It tells Polars the column types without an inference scan.
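A sketch (the column name and query are illustrative; sqlp registers the input under the alias _t_1):

qsv schema --polars data.csv     # writes data.pschema.json
qsv sqlp data.csv 'SELECT col, COUNT(*) AS n FROM _t_1 GROUP BY col'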

Tuning checklist

  • qsv index <file> (or QSV_AUTOINDEX_SIZE)
  • qsv stats --stats-jsonl <file> (or QSV_STATSCACHE_MODE=force)
  • qsv schema --polars <file> (for Polars commands)
  • Replace sort → extsort and dedup → extdedup for files > RAM
  • For huge cardinalities: --cardinality-method approx / --quantile-method approx
  • For huge files: --sketch-method frequent_items in frequency
  • Set QSV_MAX_JOBS to share CPU; QSV_MEMORY_CHECK=1 for safety
  • Build with target-cpu=native if compiling from source
  • Set QSV_TMPDIR to a fast spill location (NVMe preferred)
  • If on x86_64 and getting SIGILL, switch to the portable qsvp variant
