LightRAG: Simple and Fast Retrieval-Augmented Generation
LightRAG is an open-source project from the HKU Data Science Lab (HKUDS) that rethinks Retrieval-Augmented Generation (RAG) around simplicity, speed, and effectiveness. Where traditional RAG systems often rely on complex pipelines and heavy computational resources, LightRAG streamlines the process with a dual-level retrieval mechanism: local retrieval for entity-focused questions and global retrieval over a dynamically constructed knowledge graph. This design handles complex, high-level queries that require an understanding of an entire corpus, and in the authors' benchmarks it outperforms baselines such as NaiveRAG, RQ-RAG, HyDE, and GraphRAG across agriculture, computer science, legal, and mixed-domain datasets.
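As a concrete sketch of how the two retrieval levels are exposed, the snippet below follows the usage pattern from the project's README (OpenAI bindings, an OPENAI_API_KEY in the environment); exact module paths and initialization calls may vary across versions:

```python
import asyncio

from lightrag import LightRAG, QueryParam
from lightrag.kg.shared_storage import initialize_pipeline_status
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

async def main():
    # All index state (KV store, vectors, graph) defaults to files under working_dir.
    rag = LightRAG(
        working_dir="./rag_storage",
        embedding_func=openai_embed,
        llm_model_func=gpt_4o_mini_complete,
    )
    await rag.initialize_storages()
    await initialize_pipeline_status()

    # Indexing: entities and relations are extracted and the knowledge graph is built.
    await rag.ainsert("Your document text goes here.")

    # Local mode: entity-focused retrieval for narrow, specific questions.
    print(await rag.aquery(
        "What does the text say about topic X?",
        param=QueryParam(mode="local"),
    ))

    # Global mode: retrieval over the knowledge graph for corpus-wide questions.
    print(await rag.aquery(
        "What are the main themes across the corpus?",
        param=QueryParam(mode="global"),
    ))

    # Hybrid mode combines both retrieval levels in a single query.
    print(await rag.aquery(
        "How do the themes connect to topic X?",
        param=QueryParam(mode="hybrid"),
    ))

asyncio.run(main())
```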
At its core, LightRAG uses a lightweight indexing pipeline that extracts entities and relations from documents with an LLM, builds a knowledge graph for global context, and indexes text chunks with vector embeddings for precise local search. The system supports a range of storage backends, including JSON files, PostgreSQL, Neo4j, and MongoDB, enabling flexible deployment from local prototypes to production-scale applications. Key features include multimodal document processing via integration with RAG-Anything for handling PDFs, images, tables, and equations; reranking for improved retrieval accuracy; citation tracking; and entity merging and deletion for knowledge graph maintenance.
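Swapping backends is done through constructor arguments that name a storage class. A minimal sketch, assuming the storage class names and environment variables documented by the project (Neo4JStorage for the graph, PGKVStorage and PGVectorStorage for PostgreSQL), which should be checked against your installed version:

```python
import os

from lightrag import LightRAG
from lightrag.llm.openai import gpt_4o_mini_complete, openai_embed

# Backend connection settings are read from the environment.
os.environ.setdefault("NEO4J_URI", "neo4j://localhost:7687")
os.environ.setdefault("NEO4J_USERNAME", "neo4j")
os.environ.setdefault("NEO4J_PASSWORD", "password")

rag = LightRAG(
    working_dir="./rag_storage",
    embedding_func=openai_embed,
    llm_model_func=gpt_4o_mini_complete,
    graph_storage="Neo4JStorage",      # knowledge graph in Neo4j
    kv_storage="PGKVStorage",          # document/chunk KV store in PostgreSQL
    vector_storage="PGVectorStorage",  # embeddings in PostgreSQL with pgvector
)
```

Leaving these arguments out falls back to the JSON-file defaults under working_dir, which is usually the right choice for local prototyping.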
Installation is straightforward via PyPI (pip install lightrag-hku) or from source, with bindings for OpenAI, Hugging Face, Ollama, and LlamaIndex models. The project ships a Web UI and API server for easy interaction, observability via Langfuse, and evaluation tooling based on RAGAS. With over 24,000 GitHub stars and an upcoming EMNLP 2025 publication, LightRAG has gained significant traction in the AI community for its balance of performance and usability, making advanced RAG accessible to developers and researchers alike.
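For a fully local deployment, the model bindings can be pointed at Ollama. The sketch below mirrors the project's Ollama example; the chat model name, embedding model, and embedding dimension are placeholders to match whatever you have pulled locally:

```python
from lightrag import LightRAG
from lightrag.llm.ollama import ollama_model_complete, ollama_embed
from lightrag.utils import EmbeddingFunc

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=ollama_model_complete,
    llm_model_name="qwen2.5:7b",  # any chat model served by your local Ollama
    embedding_func=EmbeddingFunc(
        embedding_dim=768,  # must match the embedding model's output size
        max_token_size=8192,
        func=lambda texts: ollama_embed(
            texts,
            embed_model="nomic-embed-text",
            host="http://localhost:11434",
        ),
    ),
)
```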
Related projects from HKUDS include RAG-Anything for multimodal RAG, VideoRAG for long-context video understanding, and MiniRAG for simplified small-model RAG, forming a robust ecosystem for next-generation retrieval systems.
