Principles

Principles are global behavioral guidelines that define what your target system must always do, or never do, regardless of what a user asks. They are the foundation of Compliance evaluation: Spectral uses them to construct scenarios that actively pressure those boundaries and detect violations.

Examples of Principles:

Never recommend competitors.
Do not offer legal advice.
Always disclose that you are an AI when asked.

Sources

Common Principles are curated and maintained by Principled Intelligence. They are available on every Target out of the box, organized into categories that cover the most critical governance areas. Use them as a baseline before adding anything specific to your system.

Customer-provided Principles extend or override the baseline for your specific context. You can add them in three ways: enter them manually in natural language, let Spectral extract them automatically from documents in your Knowledge Base, or upload dedicated policy documents for extraction.
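As a rough illustration of how customer-provided Principles might layer on top of the Common baseline, here is a minimal Python sketch. It is not Spectral's API. The source does not say how an override is expressed, so the `Principle` record, its `key`, `source`, and `replaces` fields, and the `effective_principles` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principle:
    """Hypothetical record for illustration; not Spectral's actual schema."""
    key: str                        # hypothetical identifier
    text: str
    source: str                     # "common" or "customer"
    replaces: Optional[str] = None  # hypothetical: key of the baseline Principle to override


def effective_principles(common, custom):
    """Start from the baseline, then apply customer Principles on top."""
    merged = {p.key: p for p in common}
    for p in custom:
        if p.replaces is not None:
            merged.pop(p.replaces, None)  # override: drop the baseline entry
        merged[p.key] = p                 # extend: add the customer entry
    return list(merged.values())


common = [
    Principle("ai-disclosure", "Always disclose that you are an AI when asked.", "common"),
    Principle("no-legal", "Do not offer legal advice.", "common"),
]
custom = [
    Principle("no-competitors", "Never recommend competitors.", "customer"),
    Principle(
        "legal-referral",
        "Do not offer legal advice; refer users to the legal team.",
        "customer",
        replaces="no-legal",
    ),
]
result = effective_principles(common, custom)
```

In this sketch the first customer Principle extends the baseline and the second overrides a baseline entry, so the effective set still has three Principles.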

Categories

A Principle can belong to at most one category, and assigning one is optional. Categories group related Principles and help you navigate the library. You can create your own categories to organize custom Principles.
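The "at most one category" rule can be pictured as an optional field on each Principle. The following Python sketch is an assumption made for illustration only: the `Principle` record and the `group_by_category` helper are hypothetical names, and the category assignments shown are examples rather than Spectral's actual mapping.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principle:
    """Hypothetical record for illustration; not Spectral's actual schema."""
    text: str
    category: Optional[str] = None  # None means the Principle is uncategorized


def group_by_category(principles):
    """Group Principle texts by category name; uncategorized ones go under None."""
    groups = defaultdict(list)
    for p in principles:
        groups[p.category].append(p.text)
    return dict(groups)


principles = [
    Principle("Do not offer legal advice.", "Human Oversight & Professional Boundaries"),
    Principle("Always disclose that you are an AI when asked.", "Transparency & AI Identity"),
    Principle("Never recommend competitors."),  # left uncategorized in this example
]
grouped = group_by_category(principles)
```

Because `category` holds a single optional value rather than a list, a Principle in this sketch cannot end up in two categories at once.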

The categories below are available out of the box on every Target:

Category	Description
Anti-Discrimination & Hate Speech	Blocks content that attacks, dehumanizes, or stereotypes individuals or groups based on identity characteristics.
Bias & Fairness	Requires balanced, stereotype-free responses that treat all users with equal respect.
Brand & Corporate Responsibility	Keeps the system aligned with the organization's voice. Prevents unauthorized commitments, competitor disparagement, and confidential disclosures.
Criminal Activity Prevention	Blocks assistance with planning, facilitating, or executing criminal activity.
Honest Communication	Requires truthful, consistent responses. The system must not fabricate expertise, overstate confidence, or contradict itself without acknowledgment.
Human Oversight & Professional Boundaries	Blocks personalized advice in high-stakes domains (medicine, law, finance) and defers to qualified professionals.
Privacy & Data Protection	Protects personal data, minimizes collection, and blocks assistance with unauthorized data access or surveillance.
Regulated Substances	Blocks facilitation of illegal production, trafficking, or misuse of controlled substances.
Robustness & Adversarial Resistance	Requires safe behavior under jailbreaks, prompt injection, role-play circumvention, and other manipulation attempts.
Safety & Harm Prevention	Blocks dangerous or exploitative content, with heightened standards around child protection and crisis situations.
Self-Harm Prevention	Requires safe, supportive responses when users express self-harm or suicidal intent. Blocks content that could worsen distress.
Sexual Content	Blocks harmful sexual content, including harassment, explicit material, and grooming behavior.
Transparency & AI Identity	Requires honest disclosure of the system's AI nature, capabilities, and limitations.
User Autonomy & Empowerment	Requires the system to support informed decision-making, avoid manipulative patterns, and respect user agency.
Weapons & Dangerous Materials	Blocks instructions or guidance related to weapons, explosives, or dangerous materials.
