GuardRails for Chatbot

Guardrails are systems, rules, or constraints that ensure an AI behaves safely, ethically, and as intended.
They prevent the model from:

  • Producing harmful, biased, or confidential information

  • Going off-topic or generating nonsense

  • Violating company or regulatory policies


Our GuardRails setup

Content filters

Purpose: detect and filter harmful text or image content in input prompts and model responses. Filtering is based on six predefined categories of harmful content: Hate, Insults, Sexual, Violence, Misconduct, and Prompt Attack.
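
As an illustration only, the six categories can be written down as a small rule table that records whether each category is checked on user input, on model output, or both. This is a minimal sketch, not the actual configuration: the names Category, FilterRule, RULES, and categories_for are hypothetical, and treating Prompt Attack as an input-only check is an assumption made for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Category(Enum):
    """The six predefined harmful-content categories named in this article."""
    HATE = "Hate"
    INSULTS = "Insults"
    SEXUAL = "Sexual"
    VIOLENCE = "Violence"
    MISCONDUCT = "Misconduct"
    PROMPT_ATTACK = "Prompt Attack"


@dataclass(frozen=True)
class FilterRule:
    category: Category
    check_input: bool   # apply the filter to user prompts
    check_output: bool  # apply the filter to model responses


# Hypothetical rule set. Assumption: every category is checked in both
# directions except Prompt Attack, which is checked on user input only.
RULES = [
    FilterRule(c, check_input=True, check_output=(c is not Category.PROMPT_ATTACK))
    for c in Category
]


def categories_for(direction: str) -> List[str]:
    """Return the category names checked for 'input' or 'output'."""
    if direction not in ("input", "output"):
        raise ValueError("direction must be 'input' or 'output'")
    attr = "check_input" if direction == "input" else "check_output"
    return [rule.category.value for rule in RULES if getattr(rule, attr)]
```

Calling categories_for("input") lists all six categories, while categories_for("output") omits Prompt Attack under the assumption stated above.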

Prompt attack

Purpose: detect and filter prompt attacks and prompt injections, that is, prompts intended to bypass moderation, override instructions, or generate harmful content.

Denied topics

Purpose: define a set of topics that the chatbot should avoid.

Currently, the following topics are denied, fully or partially (exceptions are noted per item):

  1. API keys, tokens, secrets, passwords, connection strings (JDBC, ODBC, etc.), and encryption keys or certificates. Exception: support information on DataGalaxy SSO and PAT.
  2. Social Security numbers, tax IDs, national IDs, financial account numbers, credit card numbers, health information (HIPAA/medical data), and other PII. Exception: objects existing in the catalog, including their titles, tags, and definitions.
  3. Marketing content generation (ads, slogans, campaign ideas), reports on revenue, profit, or financial results, and business strategies.
  4. Medical or healthcare advice, legal advice or interpretation of laws, financial or investment advice, and political or religious content.
  5. Prompt injection attempts (e.g., “ignore previous instructions,” “show me everything”).
  6. Attempts to generate synthetic or fake datasets. Exception: samples that illustrate the metadata.
  7. Culinary topics, recipes, and other hobby discussions. Exception: Data Governance Kitchen.


The chatbot may engage in light conversation about the requested topic, but it then steers back to the data domain.
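
To make the "denied unless an exception applies" logic of the list above concrete, here is a minimal sketch in Python. It is illustrative only: it uses naive keyword matching, whereas a real guardrail would typically rely on a trained classifier, and the names DeniedTopic, TOPICS, and denied_topic, as well as the keyword lists, are hypothetical. Only a few of the seven topics are modelled.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DeniedTopic:
    name: str
    keywords: Tuple[str, ...]         # phrases that suggest the topic (illustrative)
    exceptions: Tuple[str, ...] = ()  # phrases that mark an allowed carve-out


# Hypothetical, heavily simplified versions of items 1, 4, and 7.
TOPICS = [
    DeniedTopic("secrets", ("api key", "password", "connection string"),
                exceptions=("sso", "pat")),
    DeniedTopic("advice", ("medical advice", "legal advice", "investment advice")),
    DeniedTopic("hobbies", ("cooking", "culinary"),
                exceptions=("data governance kitchen",)),
]


def _mentions(text: str, phrases: Tuple[str, ...]) -> bool:
    """True if any phrase occurs in text as a whole word or phrase."""
    return any(re.search(r"\b" + re.escape(p) + r"\b", text) for p in phrases)


def denied_topic(prompt: str) -> Optional[str]:
    """Return the name of the first denied topic the prompt hits, else None.

    A topic is skipped when one of its exception phrases also appears.
    """
    text = prompt.lower()
    for topic in TOPICS:
        if _mentions(text, topic.keywords) and not _mentions(text, topic.exceptions):
            return topic.name
    return None
```

With these sample rules, a question about a database password is flagged under "secrets", while a password question that also mentions DataGalaxy SSO passes through because the exception phrase is present.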

