RUBRIC STUDIO OPEN · SOURCE LISTED · RELEASE PROOF REQUIRED

Read the source. Then run it.

The rubric IDE behind Human Data OS, with source listed for review. Write criterion-level rubrics, score samples, calibrate against gold labels, and export a packet a reviewer can open. The rubrics and golden answers you author here become the reviewed datasets Human Data OS delivers. Hosted preview, package, and local-build proof require current external evidence before availability is marketed.

The same rubric machinery that powers Human Data OS. Source link listed; package, release, checksum, and desktop trust proof are required before binary availability is marketed.

Read the source · Run it in the browser · Read the docs

Local: build proof required

MIT: read the license

Trust proof: required

HOW IT WORKS

Three steps. On your machine.

Write the criteria. Score with a mock or your own judge. Diff the wording and see the score impact. After a competitor lost four terabytes of data, including the identities of its workers, nobody wants tooling that pools their data. Rubric Studio Open never does.

STEP 01

WRITE

Author the criteria

Criterion-level rubrics in a project folder with schema validation, examples, evidence requirements, and theme tags.
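As a rough illustration of what criterion-level validation could look like, here is a minimal sketch in Python. The `Criterion` record, its field names, and the `validate` rules are all hypothetical; they are not taken from the rubric-spec schema and only show the general shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    """Hypothetical criterion record; field names are illustrative only."""
    id: str
    description: str
    max_points: int
    evidence_required: bool = True
    examples: list[str] = field(default_factory=list)
    themes: list[str] = field(default_factory=list)


def validate(criterion: Criterion) -> list[str]:
    """Return human-readable problems; an empty list means the criterion passes."""
    problems = []
    if not criterion.id.strip():
        problems.append("id is empty")
    if len(criterion.description.split()) < 5:
        problems.append("description is too short to score against")
    if criterion.max_points <= 0:
        problems.append("max_points must be positive")
    if not criterion.examples:
        problems.append("no worked example is provided")
    return problems
```

Returning a list of problems rather than raising on the first one lets an editor show every issue with a criterion at once.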

STEP 02

MEASURE

Calibrate against gold

Bring expert scores into the calibration tab. Compute agreement. Probe judge bias. Rank criteria that need work.
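One common way to compute agreement between a judge and gold labels is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is a plain-Python version for categorical labels; it is an assumption that this statistic matches what the calibration tab reports, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(gold: list[str], judge: list[str]) -> float:
    """Cohen's kappa between two equal-length lists of categorical labels."""
    if len(gold) != len(judge) or not gold:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(gold)
    # Fraction of items where the judge matches the gold label.
    observed = sum(g == j for g, j in zip(gold, judge)) / n
    # Agreement expected if both raters labeled independently at their own rates.
    gold_counts = Counter(gold)
    judge_counts = Counter(judge)
    expected = sum(gold_counts[label] * judge_counts[label] for label in gold_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```

For example, with gold labels `["pass", "pass", "fail", "fail"]` and judge labels `["pass", "pass", "fail", "pass"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.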

STEP 03

EXPORT

Diff and export

Semantic rubric changes next to score-impact overlays. Export rubric specs, manifests, and intake packets a reviewer can open.

THE PRODUCT SURFACE

The rubric IDE should look like the place the rubric is made.

Captured from Rubric Studio listed-source materials: authoring, scoring, calibration, semantic diffing, and export. Local-build proof is required before runtime availability is marketed.

AUTHOR CRITERIA

Author criteria inside the project, not a doc beside it.

Criterion metadata, examples, validation, search, and project navigation stay in one local rubric workbench.

PREVIEW SCORING

Preview scoring before the rubric leaves the machine.

Run local or BYO judges against sample responses and inspect criterion-level evidence before promotion.

CALIBRATE AGAINST GOLD

Calibration makes disagreement visible.

Gold labels, judge agreement, drift, and readiness checks show which criteria are stable enough to export.

DIFF SCORE IMPACT

A wording diff should show scoring consequences.

Review rubric changes next to score-impact signals so reviewers can tell whether edits changed behavior.
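To make the idea concrete, the sketch below pairs a word-level diff of a criterion's wording with a simple score-impact summary over the same samples. It uses Python's standard `difflib`; the function names and the per-sample score dictionaries are hypothetical and do not describe how Rubric Studio computes its overlays.

```python
import difflib
from statistics import mean


def wording_diff(old: str, new: str) -> list[str]:
    """Word-level diff: '- word' marks a removed word, '+ word' an added one."""
    return [d for d in difflib.ndiff(old.split(), new.split()) if d[0] in "+-"]


def score_impact(before: dict[str, float], after: dict[str, float]) -> dict:
    """Compare per-sample scores from the old and new wording."""
    shared = sorted(before.keys() & after.keys())
    deltas = {sid: after[sid] - before[sid] for sid in shared}
    return {
        "mean_delta": mean(deltas.values()) if deltas else 0.0,
        "changed": [sid for sid in shared if deltas[sid] != 0],
    }
```

A mean delta near zero can hide offsetting changes, which is why the sketch also lists the individual samples whose scores moved.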

EXPORT PORTABLE ARTIFACTS

The export is a review packet, not a screenshot.

Ship rubric specs, manifests, judge cards, framework adapters, and intake packets with explicit provenance.

WHAT COMES OUT

Every run leaves a portable file.

Every project leaves a folder. Every export is portable. A reviewer opens it without an account, and your CI runs it without our servers.

The intake packet is the handoff to Human Data OS. Rubric Studio Open is where you author and score locally and inspect every criterion yourself. Human Data OS is where the packet meets the verified expert bench, the evidence is signed, and the record that survives an audit is kept. When the EU AI Act provenance rules take effect in August 2026, that signed record is what traces a training dataset back to who reviewed it.

rubric.toml

Portable rubric in the rubric-spec schema. Validated, linted, diffable, and adapter-ready.

↳ ARTIFACT

judge-card.md

Disclosure card for the judge prompt: calibration results, known bias, use envelope, limits.

↳ ARTIFACT

eval-run-manifest.json

Reproducible scoring envelope with provenance, hashes, and the exact data the run touched.
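A manifest of this kind usually rests on content hashes, so that anyone can later check the run touched exactly the data it claims. The following is a minimal sketch using Python's standard `hashlib` and `json`; the key names (`judge`, `rubric_sha256`, `samples`) are assumptions for illustration and are not the actual eval-run-manifest.json schema.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(rubric_text: str, samples: dict[str, str], judge: str) -> str:
    """Serialize a hypothetical run manifest with hashes of the rubric and each sample."""
    manifest = {
        "judge": judge,
        "rubric_sha256": sha256_hex(rubric_text.encode("utf-8")),
        "samples": {
            sid: sha256_hex(text.encode("utf-8"))
            for sid, text in sorted(samples.items())
        },
    }
    # Sorted keys keep the output byte-stable, so the manifest itself diffs cleanly.
    return json.dumps(manifest, indent=2, sort_keys=True)
```

Because the output is deterministic for the same inputs, two runs over identical data produce identical manifests, and any change to the rubric or a sample shows up as a changed hash.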

↳ ARTIFACT

Framework adapters

Exports for Inspect, OpenAI Evals, Promptfoo, Hugging Face, and lm-eval-harness.

↳ ARTIFACT

Intake packets

Packaged .auraonepkg with a privacy preview before handoff to AuraOne reviewers.

↳ ARTIFACT

SOURCE AND RELEASE PROOF

Source listed. Release proof required.

Read the listed source and review the project format. Package, release, checksum, desktop trust, and hosted preview proof are required before install or binary claims are marketed.

Checksums, signing, and release artifacts require current external proof.

Read the source

RELATED OPEN SURFACES

Next to this in AuraOne Open.

AGENT STUDIO OPEN

Open the agent. Own the trace.

Local IDE for MCP and A2A agents. Replay a run, compare, export the trace.

See the page →

ROBOTICS STUDIO OPEN

Review robot datasets without uploading the robot.

Scrub sensor streams. Cluster failures. Export reviewed subsets, on disk.

See the page →

TRUST TOOLKIT

The provenance an audit asks for.

Eval manifests, dataset cards, contamination audits. The packages rubric-spec, iaa-kit, judge-bench, and judge-card.

See the page →

RUBRIC STUDIO OPEN

A folder a reviewer can diff.

The listed build is the single-user IDE; local-build proof is required before runtime availability is marketed. Bring Human Data OS in when the problem becomes shared state: shared authorship, approval queues, the verified expert bench, and the signed record that survives an audit. Until then, the tools are yours.

Read the source · Source and proof status · Read the docs

AuraOne · Rubric Studio: Author criteria · Calibrate 0.82 κ · Diff 14 changes · git-friendly rubric project