Testing & evaluation

A layered quality system: unit tests, evals, tracing, red-teaming, and benchmarking — in one harness.

Section: testing-evaluation · scene id testing-evaluation-overview · tutorial 04-testing-evaluation