Most code searches surface filenames or a few matching symbols; they don't reveal the cross-file rationale, design motives, or multimodal signals buried in docs, diagrams, and videos. Building a single graph that combines AST-extracted code structure with LLM-extracted semantic edges gives you a persistent, queryable map of a project's concepts and why they exist. That map saves repeated token-heavy reads and surfaces surprising connections.
What Sets It Apart
- Deterministic code-first extraction: tree-sitter ASTs extract classes, functions, imports, call graphs and docstrings locally (no API calls for code), so structural edges are exact and reproducible.
- Multimodal semantic layer: documents, PDFs, images, and videos are processed with LLM subagents to produce semantic nodes and edges; every relation is tagged EXTRACTED / INFERRED / AMBIGUOUS with confidence metadata.
- Graph-native similarity (no embeddings): semantic edges are part of the NetworkX graph and influence Leiden community detection directly — no separate embedding step or vector DB required.
- Practical outputs and integrations: exports graph.html, graph.json, and a plain-language GRAPH_REPORT.md; runs as a CLI skill inside many assistants and can serve the graph over an MCP/HTTP server for team access.
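
To make the first two points concrete, here is a minimal sketch of deterministic structural extraction and tagged edges. It uses Python's standard-library `ast` module as a stand-in for tree-sitter, so it only handles Python source. The `Edge` record, the relation names (`imports`, `defines`, `calls`, `rationale_for`), and the confidence values are assumptions for illustration, not graphify's actual schema.

```python
import ast
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Edge:
    """Hypothetical edge record; not graphify's real schema."""
    source: str
    target: str
    relation: str
    tag: str           # EXTRACTED, INFERRED, or AMBIGUOUS
    confidence: float


def extract_structural_edges(module_name: str, source: str) -> list[Edge]:
    """Parse one module locally and return its structural edges in sorted order."""
    tree = ast.parse(source)
    edges: set[Edge] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                edges.add(Edge(module_name, alias.name, "imports", "EXTRACTED", 1.0))
        elif isinstance(node, ast.ImportFrom) and node.module:
            edges.add(Edge(module_name, node.module, "imports", "EXTRACTED", 1.0))
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Simplification: nested functions are not given fully qualified names.
            qualified = f"{module_name}.{node.name}"
            edges.add(Edge(module_name, qualified, "defines", "EXTRACTED", 1.0))
            for child in ast.walk(node):
                # Only direct name calls such as load(...); attribute calls are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    edges.add(Edge(qualified, child.func.id, "calls", "EXTRACTED", 1.0))
    # Sorting makes the output identical on every run for the same input.
    return sorted(edges)


SAMPLE = '''
import os
from pathlib import Path

def load(path):
    return Path(path).read_text()

def main():
    print(load(os.getcwd()))
'''

edges = extract_structural_edges("demo", SAMPLE)

# A semantic edge would come from an LLM pass over docs; this one is hand-written
# here purely to show the tag and confidence fields on a non-structural relation.
semantic = Edge("docs/design.md", "demo.load", "rationale_for", "INFERRED", 0.6)

for edge in edges + [semantic]:
    print(edge)
```

Because the structural pass is a pure function of the source text, rerunning it yields the same edge list, which is what lets structural edges be treated as exact while semantic edges carry an explicit tag and confidence.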
Who It's For
Great fit if you need faster, repeatable answers about architecture, design rationale, or cross-cutting concerns in a repo that mixes code and rich documentation. Teams that want an always-on assistant integration (Claude Code, Codex, Gemini CLI, Copilot CLI, Cursor, etc.) will get the most value. Look elsewhere if you only need simple text search over a small set of files, where the setup overhead and the LLM calls for non-code assets may not be worth it, or if you require a fully cloud-hosted managed service: graphify is designed to run locally or within your own infrastructure, and it can run headless in CI.
Practical Notes
- Privacy: code extraction and video transcription can run fully locally; docs and images require an LLM backend (configurable: Gemini, Claude, OpenAI, Ollama, Bedrock, etc.).
- Team workflow: graphify-out/ is intended to be committed so everyone shares the same map; hooks support auto-rebuilds and union-merge of graph.json on concurrent commits.
- Scale/tradeoffs: large graphs (thousands of nodes or more) can be heavy for browser visualization; for heavy corpora, use the JSON export, the MCP server, or cluster-only exports.
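
The union-merge idea from the team-workflow note can be sketched as follows. This is only an illustration of what merging two concurrent versions of a graph file could look like. The JSON layout (`nodes` with an `id`, `edges` keyed by `source`, `target`, and `relation`) and the `union_merge` helper are hypothetical, and graphify's real graph.json format and merge hook may differ.

```python
def union_merge(ours: dict, theirs: dict) -> dict:
    """Union two graph dicts; on attribute conflicts the later graph (theirs) wins."""
    nodes: dict = {}
    edges: dict = {}
    for graph in (ours, theirs):
        for node in graph.get("nodes", []):
            nodes.setdefault(node["id"], {}).update(node)
        for edge in graph.get("edges", []):
            key = (edge["source"], edge["target"], edge["relation"])
            edges.setdefault(key, {}).update(edge)
    # Sorted output keeps the merged file stable, so diffs stay small.
    return {
        "nodes": [nodes[k] for k in sorted(nodes)],
        "edges": [edges[k] for k in sorted(edges)],
    }


ours = {
    "nodes": [{"id": "auth.login"}, {"id": "docs/adr-007"}],
    "edges": [{"source": "docs/adr-007", "target": "auth.login",
               "relation": "rationale_for", "tag": "INFERRED"}],
}
theirs = {
    "nodes": [{"id": "auth.login", "community": 3}, {"id": "auth.logout"}],
    "edges": [{"source": "auth.logout", "target": "auth.login",
               "relation": "calls", "tag": "EXTRACTED"}],
}

merged = union_merge(ours, theirs)
print([n["id"] for n in merged["nodes"]])
print(len(merged["edges"]))
```

Keying nodes by id and edges by (source, target, relation) means neither side's additions are dropped, and merging a graph with itself leaves it unchanged.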
