simdrng

simdrng is a library of C++20 random number generators with both scalar and SIMD-accelerated implementations, plus nanobind-powered Python bindings that drop straight into numpy and scipy.

It pairs best-in-class scalar generators (xoshiro256++, Philox, the ChaCha cipher core) with hand-tuned SIMD backends and picks the implementation at runtime via xsimd dispatch, so a binary built for a given architecture selects the widest instruction set the host CPU supports (for example AVX-512 on an x86 server, NEON on an ARM laptop) without being recompiled for each machine.

Core generator families

  • Xoshiro256++ — all-purpose 64-bit generator (scalar, SIMD dispatch, -march=native)

  • SplitMix64 — seeding helper used to expand a 64-bit seed into engine state

  • ChaCha 8 / 12 / 20 — counter-based cipher core (scalar, SIMD dispatch, native)

  • Philox 2x32 / 4x32 / 2x64 / 4x64 — stateless counter-based, trivially parallel (scalar, SIMD dispatch, native)

Choosing a generator

Following Vigna’s all-purpose-versus-special-purpose framing (extended to the counter-based families):

  • Default to xoshiro256++, the all-purpose 64-bit generator: 256-bit state, passes every statistical test its authors are aware of, very fast, not cryptographically secure. See Xoshiro256++.

  • Need stateless, seekable, trivially-parallel streams? Use Philox — each work item derives its own sub-stream from (seed, counter) with no coordination. See Philox.

  • Need cryptographic-grade quality / a CSPRNG? Use ChaCha (8 / 12 / 20). See ChaCha.

  • Seeding is always done with SplitMix64 (Vigna’s recommendation); it is a seeding helper, not a stream generator. See SplitMix64.

The rationale behind this guidance is laid out in the Choosing a generator guide.

Quick start

Note: ready-to-run Compiler Explorer links let you try simdrng without cloning anything.

#include <cstdint>                 // std::uint64_t
#include <simdrng/xoshiro.hpp>

simdrng::Xoshiro rng(42);          // construct from a 64-bit seed
std::uint64_t x = rng();           // next 64-bit value
double u = rng.uniform();          // double in [0, 1)

Every generator satisfies the standard UniformRandomBitGenerator requirements, so it composes with std::uniform_int_distribution and friends; uniform() is a faster path to a double in [0, 1).

Parallel streams

Pass the optional thread_id / cluster_id arguments to carve out independent, non-overlapping xoshiro streams per thread and per node; these streams are built on xoshiro’s jump() / long_jump(). Philox instead derives each work item’s sub-stream from (seed, counter) with no coordination. See the per-family Guides below.
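The counter-based approach can be sketched with a plain scalar Philox-4x32 using 10 rounds, following the round structure published by Salmon et al. The names Counter, Key, philox4x32_10 and block_for are hypothetical, and the way the seed is mapped to the key and the work item to the upper counter words is an assumption chosen for illustration, not simdrng's actual layout.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Counter = std::array<std::uint32_t, 4>;
using Key = std::array<std::uint32_t, 2>;

// Philox-4x32 with 10 rounds: a keyed bijection on the 128-bit counter.
constexpr Counter philox4x32_10(Counter ctr, Key key) {
    constexpr std::uint64_t M0 = 0xD2511F53, M1 = 0xCD9E8D57;  // round multipliers
    constexpr std::uint32_t W0 = 0x9E3779B9, W1 = 0xBB67AE85;  // key increments
    for (int round = 0; round < 10; ++round) {
        const std::uint64_t p0 = M0 * ctr[0];  // 32x32 -> 64-bit product
        const std::uint64_t p1 = M1 * ctr[2];
        ctr = { static_cast<std::uint32_t>(p1 >> 32) ^ ctr[1] ^ key[0],
                static_cast<std::uint32_t>(p1),
                static_cast<std::uint32_t>(p0 >> 32) ^ ctr[3] ^ key[1],
                static_cast<std::uint32_t>(p0) };
        key[0] += W0;  // bump the key between rounds
        key[1] += W1;
    }
    return ctr;
}

// Hypothetical helper: the block-th 128-bit output of work item `item`.
// No shared state is needed; any thread can compute any block directly.
constexpr Counter block_for(std::uint64_t seed, std::uint64_t item, std::uint64_t block) {
    const Key key = { static_cast<std::uint32_t>(seed),
                      static_cast<std::uint32_t>(seed >> 32) };
    const Counter ctr = { static_cast<std::uint32_t>(block),
                          static_cast<std::uint32_t>(block >> 32),
                          static_cast<std::uint32_t>(item),
                          static_cast<std::uint32_t>(item >> 32) };
    return philox4x32_10(ctr, key);
}

inline void demo_streams() {
    const Counter a = block_for(42, /*item=*/0, /*block=*/0);
    const Counter b = block_for(42, /*item=*/1, /*block=*/0);
    assert(a != b);  // distinct counters give distinct outputs under one key
}
```

Because the output depends only on (seed, counter), seeking is a matter of choosing the counter value, whereas a jump-based scheme such as xoshiro's advances a sequential state by a fixed, very large number of steps per stream.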

Next reads

  • New here? Start with Installation, then the Examples.

  • Choosing a generator? See the per-family Guides.

  • Reproducibility, periods and the uniform() rationale: References.
