Evaluation Wizard

The evaluation wizard walks you through configuring and launching an evaluation against your Target. The first choice you make determines how the rest of the wizard unfolds.

Mode	When to use
Autopilot	You want broad coverage without manual configuration. Spectral generates all test ingredients from your Knowledge Base.
Custom	You want full control over every aspect of the evaluation.

Autopilot

Select dimensions

Choose which dimensions to include. Your selection determines how Spectral generates the agents:

Compliance: agents attempt to elicit Principle violations through indirect phrasing, social engineering, and escalating requests.
Accuracy: agents receive a fact drawn from your Knowledge Base and probe whether the system responds accurately around it.
Focus: agents submit out-of-scope or restricted requests to test whether the system handles them correctly.

Your choice here also affects the dimensions that Spectral will examine after each conversation:

	Completion	Accuracy	Compliance	Focus	Responsiveness
Accuracy selected	✓	✓			✓
Compliance selected	✓	✓	✓		✓
Focus selected				✓
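
The table can also be read as a lookup from each selected dimension to the set of dimensions examined. The sketch below is only an illustration: the names `EXAMINED_BY_SELECTION` and `examined_dimensions` are hypothetical and not part of any Spectral API, and it assumes that selecting several dimensions examines the union of their rows, which this page does not state.

```python
# Hypothetical lookup mirroring the table above; not a Spectral API.
EXAMINED_BY_SELECTION = {
    "Accuracy": {"Completion", "Accuracy", "Responsiveness"},
    "Compliance": {"Completion", "Accuracy", "Compliance", "Responsiveness"},
    "Focus": {"Focus"},
}


def examined_dimensions(selected):
    """Return the sorted dimensions examined for the selected dimensions.

    Assumption: combining selections examines the union of their table rows.
    """
    examined = set()
    for name in selected:
        examined |= EXAMINED_BY_SELECTION[name]
    return sorted(examined)


print(examined_dimensions(["Accuracy", "Focus"]))
# ['Accuracy', 'Completion', 'Focus', 'Responsiveness']
```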

Set Knowledge Base scope

Choose which documents from your Knowledge Base should inform the evaluation. Narrowing the selection focuses the evaluation on a specific area of your system; using the full Knowledge Base gives broader coverage.

Set depth

Control how extensively Spectral probes your system. Higher depth means more conversations and broader scenario coverage, at the expense of a longer run time and higher cost.

Depth	Description
Quick Scan	Fast, surface-level check. Best for smoke tests and quick regressions.
Standard	Balanced coverage of the key scenarios for each selected dimension.
Thorough	In-depth evaluation with broad scenario coverage.
Deep Dive	Coming soon.

Launch

Review your configuration, then launch the evaluation. Spectral automatically generates the tasks, Principles, and personas that drive its agents, based on the mode you selected, your Target description, and your Knowledge Base.
