faustIndex — AI Reliability Standard


faustIndex is a measurement framework for evaluating AI output reliability. It is a research standard, not a product.

What is the faustIndex?

The faustIndex defines measurable dimensions of reliability for LLM-generated outputs, separating rhetorical confidence from verification-backed quality. It is intended for research reproducibility and internal product discipline, not for end-user marketing scores.

How it’s calculated

The full methodology will appear in our forthcoming paper on arXiv (link to follow). The calculation maps structured verification signals (tests, witnesses, and cross-model checks) to a compact reliability profile suitable for comparing systems under controlled conditions.
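
Until the paper is out, the following is only an illustrative sketch of what "signals in, profile out" could look like. It is not the faustIndex calculation. The names `VerificationSignal`, `SIGNAL_KINDS`, and `reliability_profile` are hypothetical, and the per-kind pass rate is an assumed stand-in for whatever the published methodology defines.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical signal kinds, named after the three examples in the text.
SIGNAL_KINDS = ("test", "witness", "cross_model")


@dataclass(frozen=True)
class VerificationSignal:
    kind: str     # one of SIGNAL_KINDS
    passed: bool  # whether this independent check succeeded


def reliability_profile(signals):
    """Return the pass rate per signal kind (assumed aggregation, not the real one).

    A kind with no recorded checks maps to None rather than 0.0, so that
    "never verified" is not confused with "verified and failed".
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for signal in signals:
        if signal.kind not in SIGNAL_KINDS:
            raise ValueError(f"unknown signal kind: {signal.kind!r}")
        total[signal.kind] += 1
        passed[signal.kind] += int(signal.passed)
    return {
        kind: (passed[kind] / total[kind] if total[kind] else None)
        for kind in SIGNAL_KINDS
    }


example_signals = [
    VerificationSignal("test", True),
    VerificationSignal("test", False),
    VerificationSignal("witness", True),
]
example_profile = reliability_profile(example_signals)
# example_profile == {"test": 0.5, "witness": 1.0, "cross_model": None}
```

Keeping the profile as one value per dimension, rather than collapsing it into a single number, matches the stated goal of comparing systems under controlled conditions instead of publishing a score.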

How MOBLUEHQ products use it

blueForge and downstream vertical terminals use faustIndex-style verification as a design constraint: agents propose actions, but the substrate records whether independent checks passed before outputs ship. Consumer-facing apps inherit that discipline indirectly through tooling, not through a public “faust score.”
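
The "propose, then record checks before shipping" constraint can be sketched as a small gate. This is a minimal illustration under assumptions, not blueForge's actual substrate or API. `VerificationLog`, `record`, `can_ship`, and the check names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class VerificationLog:
    """Hypothetical record of independent check results per proposed output."""

    required_checks: frozenset
    _results: dict = field(default_factory=dict)

    def record(self, output_id, check_name, passed):
        # Store the latest result for this check on this output.
        self._results.setdefault(output_id, {})[check_name] = bool(passed)

    def can_ship(self, output_id):
        # An output ships only if every required check was recorded as passed.
        # A missing result counts as a failure; with no required checks,
        # this returns True.
        results = self._results.get(output_id, {})
        return all(results.get(name) is True for name in self.required_checks)


log = VerificationLog(required_checks=frozenset({"unit_tests", "cross_model"}))
log.record("out-1", "unit_tests", True)
shippable_before = log.can_ship("out-1")  # False: cross_model not yet recorded
log.record("out-1", "cross_model", True)
shippable_after = log.can_ship("out-1")   # True: all required checks passed
```

The design point is that the agent's own confidence never appears in `can_ship`. Only recorded results from independent checks decide whether an output ships.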

Read the paper

Faust sections 4–6 (full treatment): arXiv link forthcoming. Until publication, treat this page as the canonical statement that faustIndex is research infrastructure, not a SKU.