Principles

Principles are global behavioral guidelines that define what your target system must always do, or never do, regardless of what a user asks. They are the foundation of Compliance evaluation: Spectral uses them to construct scenarios that actively pressure those boundaries and detect violations.

Examples of Principles:

  • Never recommend competitors.
  • Do not offer legal advice.
  • Always disclose that you are an AI when asked.

Sources

Common Principles are curated and maintained by Principled Intelligence. They are available on every Target out of the box, organized into categories that cover the most critical governance areas. Use them as a baseline before adding anything specific to your system.

Customer-provided Principles extend or override the baseline for your specific context. You can add them in three ways: enter them manually in natural language, let Spectral extract them automatically from documents in your Knowledge Base, or upload dedicated policy documents for extraction.

Categories

A Principle can belong to at most one category, and assigning one is optional. Categories group related Principles and help you navigate the library. You can create your own categories to organize custom Principles.

The categories below are available out of the box on every Target:

| Category | Description |
| --- | --- |
| Anti-Discrimination & Hate Speech | Blocks content that attacks, dehumanizes, or stereotypes individuals or groups based on identity characteristics. |
| Bias & Fairness | Requires balanced, stereotype-free responses that treat all users with equal respect. |
| Brand & Corporate Responsibility | Keeps the system aligned with the organization's voice. Prevents unauthorized commitments, competitor disparagement, and confidential disclosures. |
| Criminal Activity Prevention | Blocks assistance with planning, facilitating, or executing criminal activity. |
| Honest Communication | Requires truthful, consistent responses. The system must not fabricate expertise, overstate confidence, or contradict itself without acknowledgment. |
| Human Oversight & Professional Boundaries | Blocks personalized advice in high-stakes domains (medicine, law, finance) and defers to qualified professionals. |
| Privacy & Data Protection | Protects personal data, minimizes collection, and blocks assistance with unauthorized data access or surveillance. |
| Regulated Substances | Blocks facilitation of illegal production, trafficking, or misuse of controlled substances. |
| Robustness & Adversarial Resistance | Requires safe behavior under jailbreaks, prompt injection, role-play circumvention, and other manipulation attempts. |
| Safety & Harm Prevention | Blocks dangerous or exploitative content, with heightened standards around child protection and crisis situations. |
| Self-Harm Prevention | Requires safe, supportive responses when users express self-harm or suicidal intent. Blocks content that could worsen distress. |
| Sexual Content | Blocks harmful sexual content including harassment, explicit material, and grooming behavior. |
| Transparency & AI Identity | Requires honest disclosure of the system's AI nature, capabilities, and limitations. |
| User Autonomy & Empowerment | Requires the system to support informed decision-making, avoid manipulative patterns, and respect user agency. |
| Weapons & Dangerous Materials | Blocks instructions or guidance related to weapons, explosives, or dangerous materials. |

Tags

Each Principle can carry one or more tags for finer-grained organization within a category. Tags are free-form: you can create any tag you need and assign it to any Principle. Use them to mark regulatory frameworks (e.g., EU AI Act, GDPR), business areas, or any other grouping relevant to your context.
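
To make the relationship between Principles, categories, and tags concrete, here is a minimal Python sketch of how you might model them in your own tooling. It is not Spectral's schema or API: the `Principle` class, its field names, and the helper functions are all hypothetical, and the category and tag assignments in the sample data are only illustrative.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Principle:
    """Hypothetical local model of a Principle (not Spectral's actual schema)."""
    text: str
    category: Optional[str] = None      # at most one category; None if uncategorized
    tags: frozenset[str] = frozenset()  # free-form labels, e.g. "GDPR"


def group_by_category(principles):
    """Group Principles under their category; uncategorized ones go under None."""
    groups = defaultdict(list)
    for p in principles:
        groups[p.category].append(p)
    return dict(groups)


def with_tag(principles, tag):
    """Return the Principles that carry the given tag."""
    return [p for p in principles if tag in p.tags]


# Illustrative data only: these category and tag assignments are assumptions.
principles = [
    Principle(
        "Do not offer legal advice.",
        category="Human Oversight & Professional Boundaries",
        tags=frozenset({"EU AI Act"}),
    ),
    Principle(
        "Always disclose that you are an AI when asked.",
        category="Transparency & AI Identity",
        tags=frozenset({"EU AI Act"}),
    ),
    Principle("Never recommend competitors."),  # no category assigned
]

for p in with_tag(principles, "EU AI Act"):
    print(p.category, "|", p.text)
```

In this sketch the single optional `category` field captures the "at most one category" rule, while a set of tags lets the same Principle appear in several cross-cutting groupings at once.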