Content Moderation and Security

Ensuring the integrity of AI-driven conversations is essential for any professional web presence. The Content Moderation and Security features in Antimanual give administrators the tools needed to maintain a safe, respectful, and brand-compliant environment. By implementing robust filtering and custom response protocols, you can prevent the chatbot from engaging with inappropriate topics or disclosing sensitive information.

Table of Contents
- Overview of AI Moderation
- Managing Blocked Words and Phrases
- Customizing Block Response Messages
- Security Best Practices
- Frequently Asked Questions

Overview of AI Moderation

Moderation in Antimanual acts as a safety layer between the Large Language Model (LLM) and your end user. While other settings define how the bot speaks, the moderation settings define what the bot is strictly forbidden from discussing. This is a Pro feature designed to mitigate the risks associated with AI hallucinations or malicious user prompts.

Managing Blocked Words and Phrases

The primary mechanism for safeguarding your chatbot is the Blocked Words/Phrases filter. This system lets you define a comma-separated list of restricted terms. When a user's input contains any of these terms, or when the AI's generated response triggers one of these keywords, the system intercepts the communication.

Common categories for your blocklist include:
- Competitor Names: Prevent the bot from comparing your services to specific rivals.
- Profanity and Hate Speech: An essential layer of protection for maintaining professional standards.
- Sensitive Topics: Restrict discussions of legal, medical, or financial advice that your company is not authorized to provide.
- Technical Jargon: Prevent the AI from discussing internal server paths or database structures if they are accidentally leaked into the context.

Customizing Block Response Messages

When a moderation filter is triggered, the chatbot does not simply crash.
Instead, it delivers a "Block Response Message": a customizable fallback message that informs the user that the current line of inquiry cannot be fulfilled. A well-crafted block message should be professional and redirect the user back to safe topics. For example, instead of a blunt "Access Denied," consider:

"I am sorry, but I am not authorized to discuss that topic. Is there anything else I can help you with regarding our documentation?"

This maintains the tone you established during your initial setup.

Security Best Practices

Beyond simple word filtering, securing your AI involves a proactive approach to how data is handled. Antimanual encrypts all API keys before storage, ensuring that your OpenAI or Gemini credentials are never exposed in the browser environment.

To further enhance security, consider the following:
- Regular Audits: Periodically review your conversation logs to see whether users are attempting to bypass your filters via "prompt injection."
- Limit Scope: Include only the necessary data in your Knowledge Base so the AI never has access to information it shouldn't share.
- Visibility Rules: Use the widget display settings to hide the chatbot from sensitive pages where it isn't needed.

Frequently Asked Questions

Does the moderation filter slow down the response time?
The filtering process happens in milliseconds and typically has no perceptible impact on the user experience.

Can I use Regex in the blocked words list?
Currently, the system supports comma-separated string matching. It is recommended to include various forms of a word (e.g., singular and plural) for maximum coverage.

Is moderation available in the free version?
Advanced Content Moderation and custom block responses are exclusive to the Pro version of Antimanual.
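To make the mechanics concrete, here is a minimal sketch of how a comma-separated blocklist with a fallback "Block Response Message" can work. This is an illustration only, not Antimanual's actual implementation; the function names, the sample blocklist, and the fallback wording are all assumptions.

```python
# Illustrative sketch of blocklist moderation. Names and data are
# hypothetical; Antimanual's internal implementation may differ.

def parse_blocklist(raw: str) -> list[str]:
    """Split a comma-separated settings string into normalized terms."""
    return [term.strip().lower() for term in raw.split(",") if term.strip()]

def moderate(text: str, blocklist: list[str], fallback: str) -> str:
    """Return the text unchanged, or the fallback if a blocked term appears.

    Applied to both the user's input and the model's generated reply,
    as described in the section above.
    """
    lowered = text.lower()
    if any(term in lowered for term in blocklist):
        return fallback
    return text

# Hypothetical configuration values for the example.
blocklist = parse_blocklist("rivalcorp, internal server, database schema")
fallback = ("I am sorry, but I am not authorized to discuss that topic. "
            "Is there anything else I can help you with?")

print(moderate("How do you compare to RivalCorp?", blocklist, fallback))
print(moderate("What are your business hours?", blocklist, fallback))
```

Note that plain substring matching, as sketched here, is intentionally broad: "rivalcorp" also matches "RivalCorps". That breadth is why the FAQ below recommends listing multiple word forms rather than relying on exact matches.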