Rubric Studio

Rubrics that become model memory.

Turn prompts, model outputs, and expert judgment into governed rubrics, graded evaluations, and model-performance memory.

What it is

The rules, the review, and the grading record in one place.

Rubric Studio gives teams a governed way to define how model outputs are judged. It keeps authoring, AI-drafted criteria, expert approval, worker grading, evidence, QA, adjudication, scorecards, and exports attached to the same evaluation record.

How it works

Prompt to scorecard, with evidence in the middle.

01

Start with the work

Bring the prompt, model output, risk level, and task type into one authoring surface.

02

Draft or author

Write criteria by hand or let AI draft the first spine for expert review.

03

Approve the version

AI-drafted rubrics stay blocked until an expert approves and activates them.

04

Grade with evidence

Workers score each criterion, attach required evidence, and see blockers before they submit.

05

Feed the scorecard

Submitted grades contribute to model scorecards and regression memory, as sketched below.
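The five steps above amount to a small state machine: a rubric is drafted, gated until an expert approves it, graded against with evidence, and its grades accumulate into a scorecard. A minimal Python sketch of that lifecycle, with every name and field assumed for illustration rather than taken from Rubric Studio's API:

```python
from dataclasses import dataclass
from enum import Enum

class RubricState(Enum):
    DRAFT = "draft"    # hand-written or AI-drafted (step 02)
    ACTIVE = "active"  # expert-approved and activated (step 03)

@dataclass
class Rubric:
    criteria: list[str]
    state: RubricState = RubricState.DRAFT
    version: int = 1

    def approve(self) -> None:
        # AI-drafted rubrics stay blocked until an expert activates them.
        self.state = RubricState.ACTIVE

@dataclass
class Grade:
    rubric_version: int
    scores: dict[str, int]  # criterion -> score (step 04)

def grade(rubric: Rubric, scores: dict[str, int]) -> Grade:
    # Grading against an unapproved rubric is blocked outright.
    if rubric.state is not RubricState.ACTIVE:
        raise PermissionError("rubric not approved; grading is blocked")
    return Grade(rubric.version, scores)

# Submitted grades accumulate into the model scorecard (step 05).
scorecard: list[Grade] = []
r = Rubric(criteria=["accuracy", "citation quality"])
r.approve()
scorecard.append(grade(r, {"accuracy": 2, "citation quality": 1}))
```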

Who it is for

Built for teams that need a defensible grade.

AI labs grading model behavior before release.
Enterprise AI teams replacing spreadsheet evals.
Regulated decisioning programs that need proof.

Roles

Author
Creates the rubric spine and criterion rules.
Approver
Reviews risk, evidence gates, and activation state.
Grader
Scores model output criterion by criterion.
QA
Checks submitted grades and requests rework when needed.
Adjudicator
Resolves disagreement with the evidence thread intact.

Evidence-first design

Criteria can require quotes, source citations, screenshots, test output, or reviewer notes. Submission stays blocked until every required criterion is scored and its required evidence is attached.
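In practice this gate is a completeness check over criteria and their attached evidence. A minimal sketch, assuming hypothetical criterion fields such as "required" and "required_evidence" (not the documented schema):

```python
def submit_blockers(criteria: list[dict], responses: dict[str, dict]) -> list[str]:
    """Return human-readable blockers; submission proceeds only if empty."""
    blockers = []
    for c in criteria:
        r = responses.get(c["id"], {})
        # A required criterion with no score blocks submission.
        if c.get("required") and r.get("score") is None:
            blockers.append(f"{c['id']}: missing score")
        # Each declared evidence type must be attached, e.g. "quote", "screenshot".
        for kind in c.get("required_evidence", []):
            if kind not in r.get("evidence", {}):
                blockers.append(f"{c['id']}: missing {kind}")
    return blockers

criteria = [
    {"id": "accuracy", "required": True, "required_evidence": ["quote"]},
    {"id": "tone", "required": False, "required_evidence": []},
]
responses = {"accuracy": {"score": 2, "evidence": {}}}
print(submit_blockers(criteria, responses))  # ['accuracy: missing quote']
```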

Exports and governance

Versioned rubric JSON
Graded JSONL (sample below)
Criteria CSV
Evidence packets
Blind-mode export controls
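As a rough illustration of what one graded-JSONL record might look like, with all field names assumed for illustration rather than taken from the documented export schema:

```python
import json

record = {
    "rubric_version": "accuracy-rubric@3",
    "task_id": "task-001",
    "grader": "worker-17",  # identity fields like this are stripped by blind-mode export controls
    "scores": {"accuracy": 2, "citation_quality": 1},
    "evidence": {"accuracy": [{"type": "quote", "value": "..."}]},
    "qa_status": "approved",
}
print(json.dumps(record))  # one record per line in the export
```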
Part of Evaluation Studio

Write the rule. Grade the work. Keep the memory.