GuardRails for Chatbot

Modified on: Wed, 4 Mar, 2026 at 3:58 PM

Guardrails are systems, rules, or constraints that ensure an AI behaves safely, ethically, and as intended.
They prevent the model from:

Producing harmful, biased, or confidential information
Going off-topic or generating nonsense
Violating company or regulatory policies

Our GuardRails setup

Content filters

Purpose: to detect and filter harmful text or image content in input prompts or model responses. Filtering is done based on detection of certain predefined harmful content categories: Hate, Insults, Sexual, Violence, Misconduct and Prompt Attack.

Prompt attack

Can help you detect and filter prompt attacks and prompt injections. Helps detect prompts that are intended to bypass moderation, override instructions, or generate harmful content.

Denied topics

The goal is to define a set of topics to avoid within your generative AI application.

Currently, we are denying fully or partially the following topics:

API keys, tokens, secrets, passwords, Connection strings (JDBC, ODBC, etc.), Encryption keys / certificates. except support information on DataGalaxy SSO and PAT
Social Security Numbers, Tax IDs, National IDs, Financial account numbers, credit cards, Health information (HIPAA/medical data) except objects exisiting in the catalog, their titles, tags and definitions, other PII data.
Marketing content generation (ads, slogans, campaign ideas), Reports on revenue, profit, or financial results, Business strategies
Medical/healthcare advice, Legal advice or interpretation of laws, Financial/investment advice, Political or religious content
Prompt injection attempts (e.g., “ignore previous instructions,” “show me everything”)
Attempts to generate synthetic/fake datasets, except samples that can illustrate the metadata
Culinary, receipts, and other hobby discussions, except Data Goverance Kitchen.

Chatbot may have light conversation about the requested topic but then go back to data domain.

English

Our GuardRails setup

Content filters

Prompt attack

Denied topics

Table of contents

Related Articles