Memvid — Portable single-file memory layer for AI agents
Memvid is an infrastructure component that packages an AI system's memory (content, embeddings, indices, metadata and WAL) into a single append-only file (.mv2). Rather than relying on server-hosted vector databases or complex RAG pipelines, Memvid stores a compressed, versioned timeline of immutable Smart Frames that enable sub-millisecond retrieval, timeline inspection, branching and reproducible memory states.
Core concepts
- Smart Frames: immutable frames that contain content, timestamps, checksums and metadata. Frames are grouped and compressed similarly to video encoding to achieve efficiency and parallel reads.
- Single-file (.mv2): everything required for search and retrieval lives in one portable file — data segments, WAL, lex/vec indexes, time index and table-of-contents footer.
- Append-only & crash-safe: writes append new frames; existing data is never mutated, enabling safe commits and easy rollback/time-travel debugging.
- Model-agnostic & offline-first: works with any model and can operate fully offline.
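The append-only, checksummed frame idea described above can be illustrated with a minimal in-memory sketch. This is not Memvid's implementation or the .mv2 on-disk format: the `Frame` and `FrameLog` types, their field names, and the use of FNV-1a as a stand-in checksum are all assumptions made for illustration.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical frame record. Field names are illustrative only and do not
/// reflect the real .mv2 layout.
#[derive(Debug, Clone)]
pub struct Frame {
    pub id: u64,
    pub timestamp_ms: u128,
    pub checksum: u64,
    pub content: Vec<u8>,
}

/// FNV-1a 64-bit hash, used here only as a simple stand-in checksum.
pub fn fnv1a(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= *b as u64;
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// An append-only log: frames can be added and read, never edited in place.
#[derive(Debug, Default)]
pub struct FrameLog {
    frames: Vec<Frame>,
}

impl FrameLog {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append a new frame and return its id. Existing frames are untouched.
    pub fn append(&mut self, content: &[u8]) -> u64 {
        let id = self.frames.len() as u64;
        let timestamp_ms = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_millis())
            .unwrap_or(0);
        self.frames.push(Frame {
            id,
            timestamp_ms,
            checksum: fnv1a(content),
            content: content.to_vec(),
        });
        id
    }

    /// Read-only access; no mutable accessor is exposed.
    pub fn get(&self, id: u64) -> Option<&Frame> {
        self.frames.get(id as usize)
    }

    /// Recompute every checksum and compare it with the stored value.
    pub fn verify(&self) -> bool {
        self.frames.iter().all(|f| fnv1a(&f.content) == f.checksum)
    }

    pub fn len(&self) -> usize {
        self.frames.len()
    }

    pub fn is_empty(&self) -> bool {
        self.frames.is_empty()
    }
}
```

In this sketch an "update" is simply another `append`, so earlier frames stay available for inspection and `verify` can detect corrupted content.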
Key features
- Persistent, versioned memory that is portable and shareable
- Fast local retrieval (sub-5ms) with predictive caching
- Time-travel debugging: rewind/replay/branch memory states for auditing and reproducibility
- Capsule context (.mv2): shareable memory capsules with rules and expiry; optional encryption (.mv2e)
- Codec intelligence: adaptive compression and upgradeable codecs
- Built-in vector (HNSW) + lexical (Tantivy/BM25) index support
- SDKs & CLI: Node, Python, Rust (memvid-core) and memvid-cli
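To make the time-travel feature more concrete, here is a small sketch of how rewind and branch can fall out of an append-only history: rewinding is reading a prefix of the log, and branching is copying that prefix into a new log that then diverges. The `Timeline` and `Entry` types and their methods are hypothetical and are not part of the Memvid SDKs.

```rust
/// Hypothetical entry in an append-only timeline.
#[derive(Debug, Clone, PartialEq)]
pub struct Entry {
    pub seq: u64,
    pub text: String,
}

/// Hypothetical timeline; entries are only ever appended.
#[derive(Debug, Clone, Default)]
pub struct Timeline {
    entries: Vec<Entry>,
}

impl Timeline {
    /// Append an entry and return its sequence number.
    pub fn append(&mut self, text: &str) -> u64 {
        let seq = self.entries.len() as u64;
        self.entries.push(Entry { seq, text: text.to_string() });
        seq
    }

    /// Rewind: a read-only view of the state as of `seq` (inclusive).
    /// A `seq` past the end yields the full history.
    pub fn as_of(&self, seq: u64) -> &[Entry] {
        let end = (seq as usize).saturating_add(1).min(self.entries.len());
        &self.entries[..end]
    }

    /// Branch: copy the history up to `seq` into a new timeline that can
    /// diverge without affecting the original.
    pub fn branch_at(&self, seq: u64) -> Timeline {
        Timeline { entries: self.as_of(seq).to_vec() }
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }

    pub fn is_empty(&self) -> bool {
        self.entries.is_empty()
    }
}
```

Because nothing is mutated in place, any earlier state can be reproduced exactly from the same history, which is what makes replay useful for auditing.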
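Typical use cases
- Long-running AI agents (persistent memory)
- Offline-first assistants or mobile agents
- Enterprise knowledge bases and customer-support corpus retrieval
- Auditable AI workflows and debugging (timeline replay)
- Fast vector and full-text search over codebases and document collections
File format and architecture (overview)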
┌────────────────────────────┐
│ Header (4KB)               │  Magic, version, capacity
├────────────────────────────┤
│ Embedded WAL (1-64MB)      │  Crash recovery
├────────────────────────────┤
│ Data Segments              │  Compressed frames
├────────────────────────────┤
│ Lex Index                  │  Tantivy full-text
├────────────────────────────┤
│ Vec Index                  │  HNSW vectors
├────────────────────────────┤
│ Time Index                 │  Chronological ordering
├────────────────────────────┤
│ TOC (Footer)               │  Segment offsets
└────────────────────────────┘
There are no extra sidecar files (for example .wal, .lock or .shm); everything is encapsulated in the .mv2 file.
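The role of a footer table of contents in a single-file layout can be sketched with a toy format. Everything here is an assumption for illustration: the `TOY1` magic, the 17-byte TOC entry (kind, offset, length), and the trailing 8-byte TOC offset are invented and do not describe the actual .mv2 byte layout.

```rust
use std::convert::TryInto;

/// Hypothetical magic bytes; not the real .mv2 magic.
pub const MAGIC: &[u8; 4] = b"TOY1";

/// Size of one toy TOC entry: kind (u8) + offset (u64 LE) + length (u64 LE).
const ENTRY_SIZE: usize = 17;

/// Build a toy single-file image: magic, section payloads, a footer TOC,
/// and finally the TOC's own offset in the last 8 bytes.
pub fn build(sections: &[(u8, &[u8])]) -> Vec<u8> {
    let mut out = MAGIC.to_vec();
    let mut toc = Vec::new();
    for (kind, payload) in sections {
        toc.push((*kind, out.len() as u64, payload.len() as u64));
        out.extend_from_slice(payload);
    }
    let toc_offset = out.len() as u64;
    for (kind, off, len) in &toc {
        out.push(*kind);
        out.extend_from_slice(&off.to_le_bytes());
        out.extend_from_slice(&len.to_le_bytes());
    }
    out.extend_from_slice(&toc_offset.to_le_bytes());
    out
}

/// Locate a section by reading the footer first, then slicing the payload.
/// Returns None for a bad magic, a malformed footer, or a missing section.
pub fn find_section(file: &[u8], wanted: u8) -> Option<&[u8]> {
    if file.len() < 12 || file[..4] != MAGIC[..] {
        return None;
    }
    let tail = file.len() - 8;
    let toc_offset = u64::from_le_bytes(file[tail..].try_into().ok()?) as usize;
    if toc_offset < 4 || toc_offset > tail {
        return None;
    }
    for entry in file[toc_offset..tail].chunks_exact(ENTRY_SIZE) {
        if entry[0] == wanted {
            let off = u64::from_le_bytes(entry[1..9].try_into().ok()?) as usize;
            let len = u64::from_le_bytes(entry[9..17].try_into().ok()?) as usize;
            let end = off.checked_add(len)?;
            return file.get(off..end);
        }
    }
    None
}
```

Placing the TOC at the end suits an append-only file: new segments can be written after the existing data and a fresh footer appended, without rewriting earlier bytes.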
Quick start (Rust example)
use memvid_core::{Memvid, PutOptions, SearchRequest};

fn main() -> memvid_core::Result<()> {
    // Create a new memory file.
    let mut mem = Memvid::create("knowledge.mv2")?;
    // Attach a title, URI and tag to the content being stored.
    let opts = PutOptions::builder()
        .title("Meeting Notes")
        .uri("mv2://meetings/2024-01-15")
        .tag("project", "alpha")
        .build();
    mem.put_bytes_with_options(b"Q4 planning discussion...", opts)?;
    // Commit the appended content.
    mem.commit()?;
    // Search and print up to ten hits.
    let response = mem.search(SearchRequest {
        query: "planning".into(),
        top_k: 10,
        ..Default::default()
    })?;
    for hit in response.hits {
        println!("{}: {}", hit.title.unwrap_or_default(), hit.text);
    }
    Ok(())
}
SDKs and integrations
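Official CLI, Node.js, Python and Rust SDKs are available. Features such as vec, clip, whisper and lex can be enabled as needed to support vector search, visual and audio embeddings, full-text search and temporal parsing.
Who it is for
AI teams, researchers and product engineers who need a persistent, portable and auditable memory layer, especially projects that build long-running agents, target offline scenarios or want to avoid a centralized vector database.
Support and license
- Website and documentation: https://www.memvid.com
- License: Apache-2.0
- Contact: contact@memvid.com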
