Principles
Principles are global behavioral guidelines that define what your target system must always do, or never do, regardless of what a user asks. They are the foundation of Compliance evaluation: Spectral uses them to construct scenarios that actively pressure those boundaries and to detect violations.
Examples of Principles:
- Never recommend competitors.
- Do not offer legal advice.
- Always disclose that you are an AI when asked.
Sources
Common Principles are curated and maintained by Principled Intelligence. They are available on every Target out of the box, organized into categories that cover the most critical governance areas. Use them as a baseline before adding anything specific to your system.
Customer-provided Principles extend or override the baseline for your specific context. You can add them in three ways: enter them manually in natural language, let Spectral extract them automatically from documents in your Knowledge Base, or upload dedicated policy documents for extraction.
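To make "extend or override" concrete, here is a minimal sketch in Python. It assumes each Principle is identified by a short key, which is a hypothetical convention chosen for illustration; the keys, the `effective_principles` helper, and the reworded custom entry are not part of Spectral, and its actual override rules may differ.

```python
# Hypothetical keys and data, for illustration only.
common = {
    "ai-disclosure": "Always disclose that you are an AI when asked.",
    "legal-advice": "Do not offer legal advice.",
}
custom = {
    "competitors": "Never recommend competitors.",       # new key: extends the baseline
    "legal-advice": "Do not offer legal or tax advice.",  # existing key: overrides the baseline
}


def effective_principles(common: dict[str, str], custom: dict[str, str]) -> dict[str, str]:
    """Merge both sources; on a key collision the customer-provided entry wins."""
    return {**common, **custom}


effective = effective_principles(common, custom)
```

In this sketch the merged set holds three Principles: the untouched common one, the overridden one, and the new customer-provided one.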
Categories
Assigning a category is optional, and a Principle can belong to at most one. Categories group related Principles and help you navigate the library. You can also create your own categories to organize custom Principles.
The categories below are available out of the box on every Target:
| Category | Description |
|---|---|
| Anti-Discrimination & Hate Speech | Blocks content that attacks, dehumanizes, or stereotypes individuals or groups based on identity characteristics. |
| Bias & Fairness | Requires balanced, stereotype-free responses that treat all users with equal respect. |
| Brand & Corporate Responsibility | Keeps the system aligned with the organization's voice. Prevents unauthorized commitments, competitor disparagement, and confidential disclosures. |
| Criminal Activity Prevention | Blocks assistance with planning, facilitating, or executing criminal activity. |
| Honest Communication | Requires truthful, consistent responses. The system must not fabricate expertise, overstate confidence, or contradict itself without acknowledgment. |
| Human Oversight & Professional Boundaries | Blocks personalized advice in high-stakes domains (medicine, law, finance) and defers to qualified professionals. |
| Privacy & Data Protection | Protects personal data, minimizes collection, and blocks assistance with unauthorized data access or surveillance. |
| Regulated Substances | Blocks facilitation of illegal production, trafficking, or misuse of controlled substances. |
| Robustness & Adversarial Resistance | Requires safe behavior under jailbreaks, prompt injection, role-play circumvention, and other manipulation attempts. |
| Safety & Harm Prevention | Blocks dangerous or exploitative content, with heightened standards around child protection and crisis situations. |
| Self-Harm Prevention | Requires safe, supportive responses when users express self-harm or suicidal intent. Blocks content that could worsen distress. |
| Sexual Content | Blocks harmful sexual content including harassment, explicit material, and grooming behavior. |
| Transparency & AI Identity | Requires honest disclosure of the system's AI nature, capabilities, and limitations. |
| User Autonomy & Empowerment | Requires the system to support informed decision-making, avoid manipulative patterns, and respect user agency. |
| Weapons & Dangerous Materials | Blocks instructions or guidance related to weapons, explosives, or dangerous materials. |
Tags
Each Principle can carry one or more tags for finer-grained organization within a category. Tags are free-form: create any tag you need and assign it to any Principle. Use them to mark regulatory frameworks (e.g., EU AI Act, GDPR), business areas, or any other grouping relevant to your context.
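The relationship between Principles, categories, and tags can be pictured with a small data model. The sketch below is illustrative only: the `Principle` class, its field names, the `filter_by_tag` helper, and the specific category and tag assignments are assumptions made for this example, not Spectral's schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Principle:
    """Hypothetical model of a Principle; not Spectral's actual schema."""
    text: str                       # the guideline in natural language
    category: Optional[str] = None  # at most one category, or none
    tags: frozenset[str] = field(default_factory=frozenset)  # free-form tags


def filter_by_tag(principles: list[Principle], tag: str) -> list[Principle]:
    """Return the Principles that carry the given tag."""
    return [p for p in principles if tag in p.tags]


# Illustrative assignments; the tags here are examples, not recommendations.
principles = [
    Principle("Do not offer legal advice.",
              category="Human Oversight & Professional Boundaries",
              tags=frozenset({"EU AI Act"})),
    Principle("Always disclose that you are an AI when asked.",
              category="Transparency & AI Identity",
              tags=frozenset({"EU AI Act", "GDPR"})),
    Principle("Never recommend competitors."),  # no category, no tags
]

gdpr = filter_by_tag(principles, "GDPR")
```

Modeling `category` as a single optional field enforces the "at most one" rule by construction, while a set of tags allows any number of labels per Principle.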