Most code LMs struggle with repository-local details (imports, APIs, conventions) because supplying those facts either bloats the prompt or requires per-repo fine-tuning. Code2LoRA's core insight is to encode repository context as lightweight LoRA adapters produced by a hypernetwork, so you get repository-specific knowledge without longer prompts or a full fine-tune per repo.
Key Findings
- Competitive accuracy at far lower deployment cost: Code2LoRA-Static matches the per-repository LoRA upper bound on the static track (63.8% cross-repo / 66.2% in-repo exact match), showing that hypernetwork-generated adapters can reach the accuracy of per-repo tuning. This means similar code-comprehension gains from a single adapter-generation step instead of a separate fine-tuning run per repo.
- Handles active development: Code2LoRA-Evo maintains an adapter state that a GRU updates once per code diff, and achieves 60.3% cross-repo exact match on the evolution track (+5.2 percentage points over a single shared LoRA). In practice, the cost of staying current with code changes is a small state update per change.
- Scale and benchmark: The authors release RepoPeftBench (604 Python repos; static: ~40K train / 12K test tasks; evolution: ~215K train / 87K test tasks), enabling reproducible comparisons for repo-aware parameter-efficient fine-tuning.
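To make the numbers above concrete, here is a minimal sketch of an exact-match metric and of the baseline implied by the reported gain. The whitespace stripping is an assumption; the benchmark's actual normalization is not described here. The 55.1% figure is simply 60.3 minus 5.2, inferred from the two reported numbers rather than quoted from the paper, and the sample predictions are made up for illustration.

```python
def exact_match(predictions, references):
    """Fraction of predictions identical to their reference.

    Assumption: surrounding whitespace is stripped before comparing;
    the benchmark may normalize differently.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)


# Hypothetical model outputs and targets, for illustration only.
preds = ["import os", "return x + 1", "foo(bar) "]
refs = ["import os", "return x+1", "foo(bar)"]
score = exact_match(preds, refs)  # 2 of 3 match exactly

# Percentage-point arithmetic from the reported evolution-track result.
evo_em = 60.3    # Code2LoRA-Evo cross-repo exact match (%)
gain_pp = 5.2    # reported gain over a single shared LoRA
shared_lora_em = round(evo_em - gain_pp, 1)  # implied baseline (%), inferred

print(f"toy exact match: {score:.3f}")
print(f"implied shared-LoRA exact match: {shared_lora_em}%")
```

Exact match is strict: the second prediction differs from its reference only in spacing and still counts as a miss, which is worth keeping in mind when reading the percentages.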
Good fit and tradeoffs
Great fit if you need repository-aware code completion or synthesis at scale but want to avoid sending large context windows or maintaining a fully fine-tuned model per repo. Code2LoRA is especially useful when many repositories share a base model but each benefits from a lightweight repo-specific adapter. Look elsewhere if you need exact, up-to-the-moment grounding in data that appears only at runtime (RAG-style retrieval may still be preferable), or if your infrastructure cannot absorb the complexity of adapter generation plus a small state update per commit.
How it works (brief)
A hypernetwork consumes a repository snapshot (or a diff-updated hidden state in the Evo variant) and generates LoRA adapter weights for a base code LM. Static mode produces one adapter per snapshot; Evo mode uses a GRU to update a hidden representation per commit and outputs updated adapter weights, avoiding token-length increases at inference time.
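The flow described above can be sketched in a few dozen lines of dependency-free Python. This is a toy illustration under stated assumptions, not the authors' implementation: `ToyHyperNet`, `ToyGRUCell`, and `lora_forward` are hypothetical names, the hypernetwork is reduced to a single linear map, the repository and diff embeddings are made-up vectors, and all dimensions are tiny. It shows a state vector being turned into low-rank factors A and B, a GRU-style update of that state per diff, and the adapted layer computing W0·x + B·(A·x).

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.3):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gate(W, U, x, h, act):
    """Elementwise act(W x + U h)."""
    return [act(a + b) for a, b in zip(matvec(W, x), matvec(U, h))]


class ToyHyperNet:
    """Hypothetical hypernetwork: one linear map from a state to LoRA factors."""

    def __init__(self, state_dim, d_model, rank, seed=0):
        rng = random.Random(seed)
        self.d_model, self.rank = d_model, rank
        self.W_a = rand_matrix(rank * d_model, state_dim, rng)
        self.W_b = rand_matrix(d_model * rank, state_dim, rng)

    def generate(self, state):
        d, r = self.d_model, self.rank
        a_flat = matvec(self.W_a, state)
        b_flat = matvec(self.W_b, state)
        A = [a_flat[i * d:(i + 1) * d] for i in range(r)]  # rank x d_model
        B = [b_flat[i * r:(i + 1) * r] for i in range(d)]  # d_model x rank
        return A, B


class ToyGRUCell:
    """Standard GRU update; stands in for the per-diff state update."""

    def __init__(self, input_dim, state_dim, seed=1):
        rng = random.Random(seed)
        self.W = {k: rand_matrix(state_dim, input_dim, rng) for k in "zrh"}
        self.U = {k: rand_matrix(state_dim, state_dim, rng) for k in "zrh"}

    def step(self, h, x):
        z = gate(self.W["z"], self.U["z"], x, h, sigmoid)  # update gate
        r = gate(self.W["r"], self.U["r"], x, h, sigmoid)  # reset gate
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = gate(self.W["h"], self.U["h"], x, rh, math.tanh)
        # Convention used here: z weights the candidate state.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


def lora_forward(W0, A, B, x, scale=1.0):
    """Frozen base layer plus low-rank update: W0 x + scale * B (A x)."""
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(matvec(W0, x), delta)]


d_model, rank, state_dim, diff_dim = 4, 2, 3, 5
W0 = rand_matrix(d_model, d_model, random.Random(42))  # frozen base weights
hyper = ToyHyperNet(state_dim, d_model, rank)
gru = ToyGRUCell(diff_dim, state_dim)

# Static mode: one adapter from one snapshot embedding (made-up values).
repo_embedding = [0.5, -0.2, 0.1]
A, B = hyper.generate(repo_embedding)

# Evo mode: fold each diff embedding into the state, then regenerate.
h = repo_embedding
for diff_embedding in ([0.1, 0.0, -0.3, 0.2, 0.4], [0.0, 0.5, 0.1, -0.1, 0.0]):
    h = gru.step(h, diff_embedding)
A_evo, B_evo = hyper.generate(h)

x = [1.0, 0.0, -1.0, 0.5]  # same-length input in both modes
y_static = lora_forward(W0, A, B, x)
y_evo = lora_forward(W0, A_evo, B_evo, x)
print(y_static)
print(y_evo)
```

Note that the input `x` has the same length in both modes: repository knowledge enters through A and B rather than through extra prompt tokens, which is the property the section highlights.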
Practical notes
The paper links an anonymous code release; model checkpoints and the RepoPeftBench dataset are available on Hugging Face. The method emphasizes parameter-efficient adaptation and continuous updating for evolving codebases rather than retrieval-heavy prompt construction.
