WeKnora - LLM-Powered Document Understanding & Retrieval Framework
WeKnora is a cutting-edge, open-source framework engineered by Tencent to revolutionize how we interact with and extract value from documents using large language models (LLMs). At its heart, it embraces the Retrieval-Augmented Generation (RAG) paradigm, which enhances LLM outputs by grounding them in relevant, retrieved document chunks. This ensures responses are not only accurate but also deeply contextualized, making it ideal for scenarios involving intricate, multi-format documents like PDFs, Word files, text, Markdown, and even images (via OCR and captioning).
Core Architecture and Workflow
The framework's modular design is one of its standout features, allowing seamless integration and customization. Key components include:
- Document Parsing: Multimodal preprocessing extracts structured content from diverse formats, unifying them into semantic views for easier handling.
- Vector Processing and Indexing: Uses embedding models (locally hosted models or remote APIs for families such as BGE and GTE) to create semantic vectors, stored in vector backends like PostgreSQL (pgvector) or Elasticsearch.
- Intelligent Retrieval: Employs hybrid strategies such as BM25 for keyword matching, dense vector retrieval, and GraphRAG for knowledge graph-enhanced recall. It supports cross-knowledge base retrieval and configurable thresholds for precision.
- LLM Inference: Integrates with models like Qwen, DeepSeek, or local ones via Ollama, enabling reasoning, multi-turn conversations, and prompt-based control.
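To make the hybrid retrieval step above concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), a common technique for merging a BM25 keyword ranking with a dense-vector ranking. WeKnora's actual fusion logic may differ; the function name and document IDs below are illustrative.

```python
def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of doc IDs with Reciprocal Rank Fusion.

    Each ranking lists doc IDs best-first. A document's fused score is
    the sum of 1 / (k + rank) over every list it appears in, so items
    ranked highly by multiple retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative rankings from two retrievers over the same chunk store.
bm25_hits = ["doc_a", "doc_c", "doc_b"]    # keyword (BM25) ranking
dense_hits = ["doc_b", "doc_a", "doc_d"]   # dense-vector ranking

fused = rrf_fuse([bm25_hits, dense_hits])
```

Documents retrieved by both strategies (doc_a, doc_b) outrank those found by only one, which is the intuition behind combining lexical and semantic recall.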
This pipeline is visualized in the architecture diagram, showcasing how raw documents flow through parsing, embedding, retrieval, and generation to produce insightful answers.
Key Features and Innovations
WeKnora distinguishes itself with advanced capabilities tailored for real-world applications:
- Agent Mode (ReACT): A ReACT-based agent that iteratively calls tools—including built-in knowledge base retrieval, MCP (Model Context Protocol) tools, and web search (e.g., DuckDuckGo)—to generate comprehensive reports. This supports reflection and multi-iteration processes for complex queries.
- Multi-Type Knowledge Bases: Handles FAQ and document types with versatile import options: folder uploads, URL fetching, tag management, and online entry. Knowledge graphs can be built to visualize relationships, enhancing retrieval relevance.
- Conversation Strategy Control: Fine-tune behaviors via configurations for Agent/normal modes, retrieval thresholds, prompts, and model selection, ensuring precise multi-turn dialogues.
- Extensibility and Integration: MCP tool support (with uvx/npx launchers and transport methods like Stdio/HTTP/SSE) allows plugging in external services. Web search extensibility broadens access to real-time information.
- User-Friendly Interfaces: An intuitive web UI for knowledge base management, conversation (with mode switching and tool visualization), and settings. RESTful APIs enable programmatic access.
- Security and Deployment: Emphasizes local/private cloud setups with login authentication (from v0.1.3), async task management via MQ, and automatic DB migrations. Docker Compose simplifies deployment, with profiles for features like Neo4j graphs or MinIO storage.
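The Agent Mode described above follows the classic ReACT pattern: the model alternates a reasoning step ("thought") with a tool call ("action") and folds each observation back into the context. The sketch below is a deliberately minimal, self-contained version with a stubbed tool and a scripted model; WeKnora's agent adds reflection, MCP tool dispatch, and web search on top of this loop.

```python
def react_agent(question, tools, llm, max_iterations=3):
    """Minimal ReACT loop: alternate reasoning and tool calls until
    the model emits a 'finish' action carrying the final answer."""
    transcript = [f"Question: {question}"]
    for _ in range(max_iterations):
        step = llm("\n".join(transcript))      # thought + proposed action
        transcript.append(step["thought"])
        if step["action"] == "finish":
            return step["input"]               # final answer
        observation = tools[step["action"]](step["input"])
        transcript.append(f"Observation: {observation}")
    return "No answer within the iteration budget."

# Stub tool and scripted "LLM" so the loop runs end to end.
def kb_search(query):
    return "WeKnora supports BM25, dense, and GraphRAG retrieval."

scripted = iter([
    {"thought": "Thought: I should search the knowledge base.",
     "action": "kb_search", "input": "retrieval strategies"},
    {"thought": "Thought: The observation answers the question.",
     "action": "finish", "input": "BM25, dense vectors, and GraphRAG"},
])
answer = react_agent("Which retrieval strategies are supported?",
                     {"kb_search": kb_search},
                     lambda prompt: next(scripted))
```

Swapping the stubbed pieces for a real model client and a retrieval endpoint turns this skeleton into a working agent; the iteration cap is what keeps multi-step tool use bounded.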
Recent updates in v0.2.0 introduce an optimized UI, infrastructure upgrades, and enhanced multi-tenant model sharing, making the framework production-ready.
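Since the RESTful APIs enable programmatic access, a client typically authenticates and posts a query against a knowledge base. The snippet below only builds such a request with the standard library; the endpoint path and payload field names are hypothetical placeholders, so consult the project's API docs for the real schema.

```python
import json
import urllib.request

def build_query_request(base_url, kb_id, question, api_key):
    """Build (but do not send) a POST request for a knowledge-base query.

    The /api/v1/query path and the payload keys are illustrative
    placeholders, not WeKnora's documented endpoint.
    """
    payload = json.dumps({"knowledge_base_id": kb_id,
                          "query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/query",
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )

req = build_query_request("http://localhost:8080", "kb-123",
                          "What does clause 4.2 say?", "secret-token")
# urllib.request.urlopen(req) would send it once the service is running.
```

Keeping request construction separate from transmission makes the client easy to unit-test and to point at a local or private-cloud deployment.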
Application Scenarios
WeKnora shines in diverse domains:
- Enterprise Knowledge Management: Streamlines internal docs, policies, and manuals for efficient Q&A, cutting training and support costs.
- Academic Research: Aids paper retrieval, report analysis, and literature organization to speed up reviews.
- Technical Support: Powers product manuals and troubleshooting in customer service.
- Legal/Compliance: Retrieves contract clauses and regulations to mitigate risks.
- Medical Assistance: Supports literature searches and guideline analysis for better decisions.
It also integrates with the WeChat Dialog Open Platform for zero-code deployment in WeChat ecosystems like Official Accounts and Mini Programs.
Getting Started and Ecosystem
Setup is straightforward: clone the repo, configure the .env file, and run Docker Compose (e.g., docker compose up -d for core services). Access the web UI at localhost, create knowledge bases, and configure models via an intuitive interface. For developers, fast dev mode skips Docker rebuilds, supporting hot-reloads and debugging.
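The setup steps above amount to a short command sequence. This is a sketch: the example env filename and the optional profile name are assumptions to adapt to what the repository actually ships.

```shell
# Clone the repository and prepare the environment file
git clone https://github.com/Tencent/WeKnora.git
cd WeKnora
cp .env.example .env    # filename assumed; set model endpoints, DB credentials

# Start the core services in the background
docker compose up -d

# Optional feature profiles (profile name illustrative — check docker-compose.yml)
# docker compose --profile neo4j up -d
```

Compose profiles let you keep heavyweight extras such as the graph database or object storage out of the default stack until you need them.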
The project is MIT-licensed, with active community contributions encouraged via GitHub. It boasts over 8,000 stars, reflecting its growing impact in the AI retrieval space. For deeper dives, check the API docs, troubleshooting FAQ, and knowledge graph guide.
In summary, WeKnora bridges the gap between raw documents and actionable insights, empowering users with a robust, extensible RAG framework that's secure, efficient, and adaptable to evolving AI needs.
