Knowledge Base Management

Summary: The Knowledge Base is the central intelligence hub for the Antimanual plugin, where your business data is transformed into high-dimensional embeddings. By indexing diverse sources such as WordPress content, PDFs, and web URLs, you give the AI the context it needs to generate accurate, brand-aligned responses through your site's chatbot interface.

The Knowledge Base serves as the central intelligence hub for the Antimanual plugin. It is the repository where your business-specific information is stored, processed, and transformed into high-dimensional embeddings. By indexing diverse data sources, you provide the AI with the context it needs to generate accurate, brand-aligned content and to give helpful, informed responses through the chatbot interface.

Table of Contents
Understanding the Statistics Dashboard
Indexing WordPress Content
Advanced Document Integration (PDF & OCR)
Web Crawling and External Data
Technical Integration via GitHub
Frequently Asked Questions

Understanding the Statistics Dashboard

When you open the Knowledge Base management screen, a real-time statistics panel is displayed. Use this dashboard to monitor how much data your AI has ingested and to confirm that you remain within your plan's limits. The panel tracks four primary metrics:

Total KB Items: The cumulative count of all indexed data points across all sources.
WordPress Posts: The number of internal site pages and posts currently in the index.
Website URLs: The number of external web pages crawled and stored.
WP Posts Limit: A usage bar showing your current consumption against your plan's maximum threshold (e.g., 30/100 items).

Before proceeding with heavy indexing, ensure your AI provider settings are properly configured, as the embedding process relies on your selected AI provider.
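As a rough illustration of how to read the WP Posts Limit bar before a bulk indexing run, the sketch below computes the remaining capacity and the percentage used from the dashboard metrics. The class and function names are hypothetical and are not part of the Antimanual plugin; the sample values simply reuse the 30/100 example above, with the other counts made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class KnowledgeBaseStats:
    """Hypothetical mirror of the four dashboard metrics (names are illustrative)."""
    total_kb_items: int
    wordpress_posts: int
    website_urls: int
    wp_posts_limit: int


def remaining_wp_capacity(stats: KnowledgeBaseStats) -> int:
    """Number of WordPress posts that can still be indexed under the plan limit."""
    return max(stats.wp_posts_limit - stats.wordpress_posts, 0)


def usage_percent(stats: KnowledgeBaseStats) -> float:
    """Share of the WP Posts Limit already consumed, as a percentage."""
    if stats.wp_posts_limit <= 0:
        return 0.0
    return 100.0 * stats.wordpress_posts / stats.wp_posts_limit


# Illustrative values only: 30 of 100 WordPress posts indexed, plus 12 crawled URLs.
stats = KnowledgeBaseStats(
    total_kb_items=42, wordpress_posts=30, website_urls=12, wp_posts_limit=100
)
print(remaining_wp_capacity(stats))  # 70
print(usage_percent(stats))          # 30.0
```

If the remaining capacity is smaller than the number of posts you plan to add, it is worth pruning outdated items or filtering the selection before starting the import.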
Indexing WordPress Content

The primary source of knowledge for most implementations is the existing WordPress database. Antimanual allows you to index posts, pages, and custom post types (such as documentation from EazyDocs), either selectively or globally. To maintain a high-quality index, the left sidebar provides filtering tools that let you narrow content by post type, by status (Added vs. Not Added), or by whether an item needs an update because it was modified recently.

For Pro users, the Auto-Sync feature automates this upkeep. When it is enabled, any newly published post or update to existing content is automatically re-indexed into the Knowledge Base without manual intervention, so your AI assistant's knowledge stays current.

Advanced Document Integration (PDF & OCR)

Essential business knowledge is often locked inside static documents. The PDF Integration (Pro) module allows you to upload external files directly. Unlike a standard text import, this system uses an extraction engine that supports multi-page documents and incorporates Optical Character Recognition (OCR), so even scanned PDFs or image-heavy documents are converted into searchable, indexable text for the AI.

Once uploaded, these documents are managed in a centralized list where you can see which embedding model was used for each file and remove outdated documentation so that stale material is no longer supplied to the AI as context.

Web Crawling and External Data

To supplement your internal data, the Website URL (Pro) feature enables you to ingest content from any public-facing web page. When you enter a target URL, the Antimanual crawler visits the page, extracts the primary content, and filters out "noise" such as navigation menus, sidebars, and footers, so the AI learns only from the informative text on the page. This is particularly useful for companies that rely on third-party research, partner documentation, or news feeds.
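The noise-filtering step described for web crawling can be sketched with Python's standard `html.parser` module. This is a minimal illustration under stated assumptions, not the plugin's actual crawler: the set of tags treated as noise and the names `MainTextExtractor` and `extract_main_text` are hypothetical, and the page-fetching step is omitted.

```python
from html.parser import HTMLParser

# Tags treated as "noise" in this sketch; the plugin's real rules are not documented here.
NOISE_TAGS = {"nav", "aside", "footer", "header", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collects text that is not nested inside any noise tag."""

    def __init__(self):
        super().__init__()
        self._noise_depth = 0  # how many noise tags we are currently inside
        self._chunks = []      # text fragments kept for indexing

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self._noise_depth > 0:
            self._noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._noise_depth == 0:
            self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_main_text(html):
    """Return the page text with navigation, sidebars, and footers removed."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = """
<html><body>
  <nav><a href="/">Home</a></nav>
  <main><h1>Pricing</h1><p>Plans start at the basic tier.</p></main>
  <footer>Copyright notice</footer>
</body></html>
"""
print(extract_main_text(sample))  # Pricing Plans start at the basic tier.
```

A tag-based filter like this is the simplest possible approach; a production crawler would likely also weigh text density or page structure, which is beyond this sketch.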
Technical Integration via GitHub

For developer-focused websites or technical products, the GitHub Repository Synchronization (Pro) provides a direct pipeline to your source code documentation. By connecting your GitHub account, you can select specific repositories for the AI to index. The system focuses on README files and Markdown documentation, making it an ideal solution for generating technical support replies or developer guides.

Frequently Asked Questions

How does the AI "remember" my content?
The system converts your text into numerical vectors (embeddings) using OpenAI technology. When a query is made, the system finds the most relevant vectors in your Knowledge Base to provide context to the AI model.

Is there a limit to how much text I can add manually?
Yes, the Manual Text Snippets tool allows for entries of up to 5,000 characters per snippet, which is ideal for specific business rules or short guidelines.

Can I remove items from the Knowledge Base?
Absolutely. Every source (WordPress, PDF, URL, etc.) has a management table where you can delete specific items to prevent them from influencing AI responses.
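The FAQ answer on how the AI "remembers" content describes a nearest-vector lookup. The sketch below shows that idea using cosine similarity over toy three-dimensional vectors. It is an assumption-laden illustration: real embeddings have far more dimensions and come from your AI provider, the source does not say which similarity measure Antimanual uses, and the function names and sample entries here are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the texts of the k index entries most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy vectors standing in for real embeddings; the entries are made up for illustration.
index = [
    ("Returns policy snippet", [0.9, 0.1, 0.0]),
    ("Shipping times page", [0.1, 0.9, 0.0]),
    ("README setup steps", [0.0, 0.1, 0.9]),
]
query = [0.8, 0.2, 0.0]  # pretend embedding of a question about returns

print(top_k(query, index, k=1))  # ['Returns policy snippet']
```

The retrieved texts are what would be passed to the AI model as context, which is why deleting an item from a management table stops it from influencing later responses.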