The two ground-truth example sessions are hosted on Cloudflare R2 (~378 GiB total, 29,266 files). R2 serves direct downloads with no egress fees.
Biometric / likeness notice. These sessions contain video of an identifiable human performer - the author/operator himself, who consents to publication of his own likeness (he is the sole subject). The data is published for research verification of the recording protocol; it is a biometric/likeness corpus, so use it accordingly and respectfully. No other person is depicted.
download.sh

The TruthBeam repo ships a friendly tiered downloader - grab a slice instead of the whole corpus:

| ./download.sh <tier> | Size | What you get |
|---|---|---|
| scores | ~2 MB | Path A inputs; the headline metric is recomputable at truthbeam.com |
| models | ~1.1 GB | verifier (456 MB) + F-A v1 forger checkpoints |
| sample [d2\|v10] | ~180 MB | a taste of one session: metadata + 8 preview/emission pairs + 2 raw frames - enough to see the data |
| video | ~640 MB | the hand-made 2023 video (+ the 64 s intro) |
| session d2\|v10 | 232 / 146 GiB | a full ground-truth session |
| all | 378 GiB | everything |
The headline result is recomputable from these public files at truthbeam.com, which carries the verification scripts and mechanics; sample lets you open real previews and emissions. The full per-file lists and IPFS CIDs are below.
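Because the tiers range from a few megabytes to hundreds of gibibytes, it can help to print what a tier will cost before launching it. The following is a minimal sketch only: `tier_plan` is a hypothetical helper, not part of the TruthBeam repo, the sizes are copied from the tier table, and it assumes (from the `sample [d2|v10]` notation) that the session argument is optional for `sample` and required for `session`. It prints the command rather than running it.

```shell
#!/usr/bin/env bash
# tier_plan <tier> [d2|v10]
# Hypothetical preflight helper: prints the approximate download size of a
# tier (figures copied from the tier table) and the matching ./download.sh
# command. It does not download anything.
tier_plan() {
  tp_tier=$1
  tp_session=${2:-}
  case "$tp_tier" in
    scores) tp_size="~2 MB" ;;
    models) tp_size="~1.1 GB" ;;
    video)  tp_size="~640 MB" ;;
    all)    tp_size="378 GiB" ;;
    sample)
      # Assumption: the session argument is optional for the sample tier.
      case "$tp_session" in
        ""|d2|v10) tp_size="~180 MB" ;;
        *) echo "sample takes d2 or v10" >&2; return 2 ;;
      esac ;;
    session)
      case "$tp_session" in
        d2)  tp_size="232 GiB" ;;
        v10) tp_size="146 GiB" ;;
        *) echo "session tier needs d2 or v10" >&2; return 2 ;;
      esac ;;
    *) echo "unknown tier: $tp_tier" >&2; return 2 ;;
  esac
  # Only sample and session carry a session argument in the table.
  case "$tp_tier" in
    sample|session) tp_args="$tp_tier${tp_session:+ $tp_session}" ;;
    *)              tp_args="$tp_tier" ;;
  esac
  echo "$tp_size: ./download.sh $tp_args"
}
```

For example, `tier_plan session d2` prints the 232 GiB figure next to the command, which is a useful pause before committing disk space and bandwidth.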
Every object has its own direct URL. The complete, machine-readable lists (one URL per
line, ready for wget -i / curl) are:
- downloads/d2_files.txt - Session A (D2), 17,987 files, 232 GiB
- downloads/v10_files.txt - Session B (V10), 11,279 files, 146 GiB

Per-object MD5 checksums for every file (paths relative to each session root):
(The raw captures and emission tiles also carry per-frame BLAKE3 file hashes in each
session's chain_log; the MD5 lists are a transfer-integrity convenience. The verification
mechanics live at truthbeam.com.)
Bulk download, e.g.:
wget -x -c -i downloads/d2_files.txt # mirrors the session tree locally (resumable)
# verify a mirrored session tree (run from the sessions/d2/ directory):
md5sum -c d2_md5sums.txt
# or a single file:
curl -O https://data.truthbeam.com/sessions/d2/verification_bundle.json
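
If you mirror only part of a session, checking against the full MD5 list reports every absent file as a failure. A minimal sketch of a partial workflow follows. It assumes GNU coreutils `md5sum` (8.25 or later, for `--ignore-missing`), that the checksum list is in standard `md5sum` format, and that each URL in the file list contains its session-relative path. `select_urls` and `verify_partial` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# select_urls <list-file> <path-fragment>
# Prints only the URLs that contain the fragment (fixed-string match).
# Assumption: each line of the list is a URL containing the
# session-relative path, e.g. ".../derived/Recordings_previews/...".
select_urls() {
  grep -F -- "$2" "$1"
}

# verify_partial <md5-list>
# Run from the session root. Checks only the files present locally:
# --ignore-missing skips entries that were never downloaded, and
# --quiet suppresses the per-file "OK" lines so only problems are shown.
verify_partial() {
  md5sum -c --ignore-missing --quiet -- "$1"
}

# Example use (left commented so that sourcing this file downloads nothing):
#   select_urls downloads/d2_files.txt /derived/Recordings_previews/ | wget -x -c -i -
#   # then, from the mirrored sessions/d2/ directory:
#   verify_partial d2_md5sums.txt
```

Note that `md5sum` exits non-zero when `--ignore-missing` is set and no listed file was found at all, so an empty mirror is reported as a failure rather than passing silently.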
The dataset is also content-addressed on IPFS (permanence courtesy; gateway availability varies). Each session is its own reproducible CID (UnixFS directory DAG):
- bafybeicrssbic35534es3sbwyzhlw7reboh6wy75htmo53ke5mfsphkmwi
- bafybeier2sfcjrrgw7amne3lwogise6umyeyf6qgivgrmvx3to4vsdsbcm

Every CID is reproducible from the bytes:
ipfs add --only-hash --recursive --cid-version=1 --raw-leaves --chunker=size-1048576 <session>
(The full per-unit manifest - sessions, models, evidence - is published as CID_MANIFEST.json.)
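
The reproducibility claim can be checked mechanically by recomputing a session's root CID and comparing it with the published value. The sketch below wraps the `ipfs add` flags quoted above and adds `--quieter`, which makes `ipfs add` print only the final hash. `check_cid` is a hypothetical helper name, and the sketch assumes a working `ipfs` (Kubo) binary on the PATH.

```shell
#!/usr/bin/env bash
# check_cid <session-dir> <published-cid>
# Recomputes the directory's root CID without storing anything
# (--only-hash) and compares it with the published value.
check_cid() {
  cc_dir=$1
  cc_want=$2
  # --quieter prints only the final (root) hash of the directory.
  cc_got=$(ipfs add --only-hash --recursive --cid-version=1 --raw-leaves \
             --chunker=size-1048576 --quieter "$cc_dir") || return 2
  if [ "$cc_got" = "$cc_want" ]; then
    echo "match: $cc_got"
  else
    echo "MISMATCH: computed $cc_got, published $cc_want" >&2
    return 1
  fi
}
```

A root CID covers every file name and byte under the directory, so the local tree must be complete and free of stray files (a checksum list or an editor backup dropped into the session root would change the result).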
PolieBotics.mp4

An original record: the hand-made 2023 PolieBotics video (≈2:08, ~608 MiB, H.264/MP4) - a human-authored record from before the LLM-assisted rebuild, and one of the project's primary records to verify against (see the README's provenance & error note), not an authority in itself. It is committed by hash here, so its integrity is verifiable independently of where it is hosted:
- 8fbdb64ddd248246e7a8d840fa191467ab24ea79058047deb0ea537af95c0e92
- 00d0e4531c1896ff72bf1ac7b7f2a4146af4f8ee5b08a63bc8708f333feb87b7

636,733,091 bytes · duration 127.99 s

Available on R2 at https://data.truthbeam.com/pinata/PolieBotics.mp4, and on IPFS at CID QmP8JDfeBCunq4VQ8f6XUbiLJK55dG9jLav7k5q2HpnmxS (Pinata-pinned; resolves on public gateways, availability varies). Verify with b3sum PolieBotics.mp4 or sha256sum PolieBotics.mp4. (BLAKE3 is the project's canonical commitment - the same hash family that commits each dataset frame in chain_log.csv.)
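
The check can be scripted so that a mismatch fails loudly instead of relying on comparing 64 hex characters by eye. A minimal sketch, assuming `sha256sum` and optionally `b3sum` are installed: it runs whichever of the two tools is available and requires each computed digest to equal one of the digests you pass in. `check_digests` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# check_digests <file> <published-digest>...
# Runs b3sum and sha256sum (whichever are installed) on the file and
# requires every computed digest to equal one of the published digests.
# Fails if any tool's digest is unmatched or if neither tool is installed.
check_digests() {
  dg_file=$1
  shift
  dg_ran=0
  for dg_tool in b3sum sha256sum; do
    if ! command -v "$dg_tool" >/dev/null 2>&1; then
      echo "skip: $dg_tool not installed" >&2
      continue
    fi
    dg_ran=1
    # Both tools print "<hex digest>  <file name>"; keep the digest only.
    dg_got=$("$dg_tool" "$dg_file" | cut -d' ' -f1)
    dg_hit=0
    for dg_want in "$@"; do
      if [ "$dg_got" = "$dg_want" ]; then dg_hit=1; fi
    done
    if [ "$dg_hit" -eq 1 ]; then
      echo "ok: $dg_tool $dg_got"
    else
      echo "FAIL: $dg_tool $dg_got matches no published digest" >&2
      return 1
    fi
  done
  [ "$dg_ran" -eq 1 ]
}
```

Call it as `check_digests PolieBotics.mp4 <digest> <digest>` with the two published 64-character values. Passing both means you do not need to know in advance which tool produced which digest.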
The PolieProboscis model - the demonstration-hardware rig, hinged 3-part DFCP SPHERIC (in-repo):
proboscis/PolieProboscis_DFCP_Spheric_hinged_20260116.zip
(~8.8 MB; body + top cover + hinge pins; thanks to Alfonso). Included for reference, not yet
test-printed.
(The PoliePals mask - a bauta worn by the masked PoliePals, Alfonso's execution, with novel reflective conical-hole eyes - lives in the PoliePals imaginal layer: poliepals.com → The mask.)
Session A (D2)

Start here:
- README_BUNDLE.md - what the bundle contains
- CLAIMS.md - the claims this session supports
- manifest.json · pretty - full file manifest + digests
- verification_bundle.json · pretty - the auditable evidence bundle
- verify_report.json - verifier output

Logs:
Frame data (5,992 each - see d2_files.txt for per-frame URLs):
- Recordings/ - raw captures
- derived/Emissions/ - emitted tiles
- derived/Recordings_previews/ - preview renders

Session B (V10)

Start here:
- README_BUNDLE.md - what the bundle contains
- CLAIMS.md - the claims this session supports
- manifest.json · pretty - full file manifest + digests
- verification_bundle.json · pretty - the auditable evidence bundle
- verify_report.json - verifier output

Logs:
Frame data (3,743 each - see v10_files.txt for per-frame URLs):
- Recordings/ - raw captures
- derived/Emissions/ - emitted tiles
- derived/Recordings_previews/ - preview renders
- ai_payloads/ - AI improv payloads (V10 only)

All rights reserved. The dataset is published for inspection and verification under the non-binding statement of intent in LICENSE; no licence is granted by this publication.
This page is an LLM-mediated dataset: the same content as DOWNLOADS.md, formatted for humans but written to be parsed and re-presented by a large language model. Point your own LLM at it to explain, check, or summarise. The raw markdown twin is at DOWNLOADS.md (and a .txt copy).