Author the criteria
Criterion-level rubrics in a project folder with schema validation, examples, evidence requirements, and theme tags.
→The rubric IDE behind Human Data OS, with source listed for review. Write criterion-level rubrics, score samples, calibrate against gold labels, and export a packet a reviewer can open. The rubrics and golden answers you author here become the reviewed datasets Human Data OS delivers. Hosted preview, package, and local-build proof require current external evidence before availability is marketed.
The same rubric machinery that powers Human Data OS. Source link listed; package, release, checksum, and desktop trust proof are required before binary availability is marketed.
Write the criteria. Score with a mock or your own judge. Diff the wording, see the score impact. After a competitor lost four terabytes, including who its workers were, nobody wants tooling that pools their data. This never does.
Criterion-level rubrics in a project folder with schema validation, examples, evidence requirements, and theme tags.
→Bring expert scores into the calibration tab. Compute agreement. Probe judge bias. Rank criteria that need work.
→Semantic rubric changes next to score-impact overlays. Export rubric specs, manifests, and intake packets a reviewer can open.
Captured from Rubric Studio listed-source materials: authoring, scoring, calibration, semantic diffing, and export. Local-build proof is required before runtime availability is marketed.

Criterion metadata, examples, validation, search, and project navigation stay in one local rubric workbench.

Run local or BYO judges against sample responses and inspect criterion-level evidence before promotion.

Gold labels, judge agreement, drift, and readiness checks show which criteria are stable enough to export.

Review rubric changes next to score-impact signals so reviewers can tell whether edits changed behavior.

Ship rubric specs, manifests, judge cards, framework adapters, and intake packets with explicit provenance.
Every project leaves a folder. Every export is portable. A reviewer opens it without an account, and your CI runs it without our servers.
The intake packet is the handoff to Human Data OS. Open is where you author and score locally and inspect every criterion yourself. Human Data OS is where the packet meets the verified expert bench, signs the evidence, and keeps the record that survives an audit. When the EU AI Act provenance rules take effect in August 2026, that signed record is what traces a training dataset back to who reviewed it.
Portable rubric in the rubric-spec schema. Validated, linted, diffable, and adapter-ready.
Disclosure card for the judge prompt: calibration results, known bias, use envelope, limits.
Reproducible scoring envelope with provenance, hashes, and the exact data the run touched.
Exports for Inspect, OpenAI Evals, Promptfoo, Hugging Face, and lm-eval-harness.
Packaged .auraonepkg with a privacy preview before handoff to AuraOne reviewers.
Read the listed source and review the project format. Package, release, checksum, desktop trust, and hosted preview proof are required before install or binary claims are marketed.
Checksums, signing, and release artifacts require current external proof.
Local IDE for MCP and A2A agents. Replay a run, compare, export the trace.
See the page →Scrub sensor streams. Cluster failures. Export reviewed subsets, on disk.
See the page →Eval manifests, dataset cards, contamination audits. The packages rubric-spec, iaa-kit, judge-bench, and judge-card.
See the page →The listed build is the single-user IDE; local-build proof is required before runtime availability is marketed. Bring Human Data OS in when the problem becomes shared state: shared authorship, approval queues, the verified expert bench, and the signed record that survives an audit. Until then, the tools are yours.