@kishore-s-15 kishore-s-15 commented Nov 23, 2025

Add Core Space Merge Method

Summary

This PR adds the Core Space merge method to mergekit, implementing the algorithm from "Accurate and Efficient Low-Rank Model Merging in Core Space" (Panariello et al., NeurIPS 2025).

Core Space enables efficient merging of LoRA-adapted models by operating in a compact, SVD-aligned subspace, providing significant computational savings while guaranteeing information preservation.

Motivation

LoRA (Low-Rank Adaptation) has become a popular technique for parameter-efficient fine-tuning. However, existing merge methods in mergekit operate in full parameter space, which:

  • Wastes computational resources when merging low-rank adaptations
  • Doesn't account for misaligned LoRA subspaces
  • Scales poorly with model dimension

Core Space addresses these issues by:

  • Operating in compact core space (rank² vs dimension²)
  • Using SVD to align LoRA subspaces before merging
  • Preserving information exactly: projecting into core space and back is lossless
  • Supporting heterogeneous ranks across adapters
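To make the "rank² vs dimension²" comparison concrete, here is a rough back-of-the-envelope count. The layer size, ranks, and adapter count are illustrative assumptions rather than figures from this PR, and the helper name `core_vs_full_entries` is hypothetical. It assumes the core matrix has side length equal to the sum of the adapter ranks, which is the size of the concatenated bases in the algorithm described further down.

```python
def core_vs_full_entries(d_out: int, d_in: int, ranks: list[int]) -> tuple[int, int]:
    """Return (entries in one core matrix, entries in one full weight delta)."""
    total_rank = sum(ranks)  # number of columns in the concatenated LoRA factors
    # A basis cannot have more columns than the dimension it lives in.
    core_rows = min(total_rank, d_out)
    core_cols = min(total_rank, d_in)
    return core_rows * core_cols, d_out * d_in


# Assumed example: a 4096 x 4096 projection and four rank-16 adapters.
core, full = core_vs_full_entries(4096, 4096, [16, 16, 16, 16])
print(f"core entries: {core}, full entries: {full}, ratio: {full // core}x")
```

Under these assumptions each core matrix holds 64 × 64 entries, while a full-space delta holds 4096 × 4096.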

Changes Made

New Files

1. mergekit/merge_methods/core_space.py (~300 lines)

Complete implementation of the Core Space algorithm:

Classes:

  • CoreSpaceMerge(MergeMethod) - Main merge method class
  • CoreSpaceTask(Task[torch.Tensor]) - Task execution class

Key Methods:

  • _compute_reference_bases() - Computes U_B and V_A via SVD on concatenated LoRA matrices
  • _extract_lora_matrices() - Extracts or approximates A and B matrices from models
  • _core_space_merge() - Projects to core space, merges, and reconstructs
  • _weighted_average() - Fallback for non-LoRA weights
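The "approximates A and B matrices" step can be pictured as a truncated SVD of a weight delta (fine-tuned weight minus base weight). The following is a minimal sketch under that assumption, not the PR's actual code: the function name `approximate_lora_factors` and the choice to split the singular values evenly between the two factors are both hypothetical. The clamp mirrors the "rank >= 1" fix mentioned in the commit messages.

```python
import torch


def approximate_lora_factors(
    delta: torch.Tensor, rank: int
) -> tuple[torch.Tensor, torch.Tensor]:
    """Factor a (d_out x d_in) weight delta into B (d_out x r) and A (r x d_in)."""
    # Clamp the rank into [1, min(d_out, d_in)] so the factors are never empty.
    rank = max(1, min(rank, min(delta.shape)))
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    sqrt_s = S[:rank].sqrt()
    B = U[:, :rank] * sqrt_s             # scale each kept column of U
    A = sqrt_s.unsqueeze(1) * Vh[:rank]  # scale each kept row of V^T
    return B, A


torch.manual_seed(0)
true_B = torch.randn(32, 4, dtype=torch.float64)
true_A = torch.randn(4, 24, dtype=torch.float64)
delta = true_B @ true_A  # a delta that is exactly rank 4

B, A = approximate_lora_factors(delta, rank=4)
print(tuple(B.shape), tuple(A.shape), torch.allclose(B @ A, delta, atol=1e-8))
```

When the delta has rank higher than the requested rank, `B @ A` is the best rank-r approximation in the Frobenius norm rather than an exact reconstruction.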

Algorithm:

# 1. Concatenate LoRA matrices (B_i: d_out x r_i, A_i: r_i x d_in)
B_concat = torch.cat([B_1, B_2, ..., B_n], dim=1)  # Horizontal: d_out x sum(r_i)
A_concat = torch.cat([A_1, A_2, ..., A_n], dim=0)  # Vertical: sum(r_i) x d_in

# 2. Compute reference bases via SVD
U_B, _, _ = SVD(B_concat)  # left singular vectors of B_concat
_, _, V_A = SVD(A_concat)  # right singular vectors of A_concat, as columns (V, not V^T)

# 3. Project to core space
Core_i = U_B^T @ B_i @ A_i @ V_A

# 4. Merge in core space
Core_merged = mean(Core_1, Core_2, ..., Core_n)

# 5. Reconstruct
ΔW_merged = U_B @ Core_merged @ V_A^T
W_final = W_base + ΔW_merged
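The five steps above can be sketched as runnable PyTorch. This is an illustration under stated assumptions, not the code in `core_space.py`: the function name `core_space_merge` and its signature are hypothetical, and it uses `torch.linalg.svd`, which returns V^T, so the right basis is transposed before use.

```python
import torch


def core_space_merge(
    base: torch.Tensor, Bs: list[torch.Tensor], As: list[torch.Tensor]
) -> torch.Tensor:
    """Merge LoRA updates B_i @ A_i in core space and add the result to `base`."""
    # 1. Concatenate: B_i is (d_out x r_i), A_i is (r_i x d_in).
    B_concat = torch.cat(Bs, dim=1)  # d_out x sum(r_i)
    A_concat = torch.cat(As, dim=0)  # sum(r_i) x d_in

    # 2. Reference bases. torch.linalg.svd returns V^T as its third output.
    U_B, _, _ = torch.linalg.svd(B_concat, full_matrices=False)
    _, _, Vh_A = torch.linalg.svd(A_concat, full_matrices=False)
    V_A = Vh_A.T

    # 3. Project each adapter. Multiplying the small factors first avoids
    #    ever forming a d_out x d_in product per adapter.
    cores = [(U_B.T @ B) @ (A @ V_A) for B, A in zip(Bs, As)]

    # 4. Merge in core space (plain mean, as in the description).
    core_merged = torch.stack(cores).mean(dim=0)

    # 5. Reconstruct the full-size update and apply it to the base weight.
    return base + U_B @ core_merged @ V_A.T


torch.manual_seed(0)
d_out, d_in = 48, 40
base = torch.randn(d_out, d_in, dtype=torch.float64)
ranks = (4, 8)  # heterogeneous ranks are fine: all cores share one shape
Bs = [torch.randn(d_out, r, dtype=torch.float64) for r in ranks]
As = [torch.randn(r, d_in, dtype=torch.float64) for r in ranks]

merged = core_space_merge(base, Bs, As)
expected = base + sum(B @ A for B, A in zip(Bs, As)) / len(Bs)
print(torch.allclose(merged, expected, atol=1e-8))
```

Because U_B spans the column space of every B_i and V_A spans the row space of every A_i, projecting and reconstructing loses nothing. With a plain mean, which is linear, the result therefore coincides with averaging the full-space updates, which is what the final check compares against.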

2. tests/test_core_space.py (10 tests)

Comprehensive test suite covering:

  • Method initialization and configuration
  • SVD reference basis computation
  • LoRA weight detection logic
  • Weighted averaging fallback
  • Low-rank approximation via SVD
  • Core space projection and reconstruction
  • Parameter handling
  • Multi-model merge workflow
  • Comparison with naive merging

All tests are passing.

3. examples/core_space.yml

Example configuration demonstrating basic usage with multiple LoRA adapters.

Configuration

Basic usage:

models:
  - model: gpt2
    parameters:
      weight: 0.5
  - model: gpt2
    parameters:
      weight: 1.0

merge_method: core_space
base_model: gpt2
dtype: float32

References

Paper:

@inproceedings{panariello2025accurate,
  title = {Accurate and Efficient Low-Rank Model Merging in Core Space},
  author = {Panariello, Aniello and Marczak, Daniel and Magistri, Simone and 
            Porrello, Angelo and Twardowski, Bart{\l}omiej and 
            Bagdanov, Andrew D. and Calderara, Simone and van de Weijer, Joost},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2025}
}

Original Implementation: https://github.com/apanariello4/core-space-merging


Note

Introduces the Core Space (core_space) merge method for SVD‑aligned LoRA merging, registers it, and adds documentation, example config, and unit tests.

  • Merge Methods:
    • Add CoreSpaceMerge with CoreSpaceTask in mergekit/merge_methods/core_space.py implementing SVD‑aligned core‑space merging with fallback weighted average; exposes weight parameter and reference URL.
    • Register method in mergekit/merge_methods/registry.py.
  • Docs:
    • Update README.md method table to include Core Space entry.
    • Extend docs/merge_methods.md with a new Core Space section (concept, algorithm, inputs/params, example, reference).
  • Examples:
    • Add examples/core_space.yml demonstrating configuration.
  • Tests:
    • Add tests/test_core_space.py covering initialization, SVD bases, low‑rank approximation, core‑space projection, parameter handling, multi‑adapter merge simulation, and edge cases.

Written by Cursor Bugbot for commit 94c2baf.

github-actions bot commented Nov 23, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@kishore-s-15
Author

I have read the CLA Document and I hereby sign the CLA

@cursor cursor bot left a comment

This PR is being reviewed by Cursor Bugbot

- Fix: Ensure rank >= 1 to prevent degenerate matrices
- Fix: Remove incorrect lora_A/lora_B identity pairing
- Fix: Update tests to match concatenation implementation
- Clarify: Works with full models, not separate lora weight files
- Add: Test for zero rank edge case
- Update: Documentation to explain model requirements

- Fix: Ensure rank >= 1 (prevent zero rank bug)
- Fix: Remove incorrect lora_A/lora_B pairing logic
- Fix: Apply weight parameter as global scaling factor
- Clarify: Equal weighting in core space is architectural constraint
- Update: Tests to match implementation
- Update: Documentation to accurately describe behavior

Critical fixes:
- Fix: Exclude base model from weighted average (task vector approach)
- Fix: Apply weight parameter as global scaling factor
- Fix: Ensure rank >= 1 to prevent degenerate matrices
- Add: Test for base model exclusion in averaging
- Update: Documentation to clarify weight parameter behavior

The weighted average now correctly computes task vectors,
averages them, then adds back to base - matching task
arithmetic principles.
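
The fallback described in this commit message (task vectors, averaged, then added back to the base) might look roughly like the sketch below. It is an illustration, not the PR's `_weighted_average()`: the function name `task_vector_average` is hypothetical, and applying `weight` as a single global scale on the averaged task vector is an assumption drawn from the "global scaling factor" wording above.

```python
import torch


def task_vector_average(
    base: torch.Tensor, finetuned: list[torch.Tensor], weight: float = 1.0
) -> torch.Tensor:
    """Return base + weight * mean(model_i - base).

    The base model is the reference point only; it is not one of the
    tensors being averaged.
    """
    if not finetuned:
        return base.clone()
    task_vectors = [w - base for w in finetuned]  # one delta per fine-tuned model
    mean_delta = torch.stack(task_vectors).mean(dim=0)
    return base + weight * mean_delta


base = torch.ones(2, 2)
models = [base + 2.0, base + 4.0]  # task vectors of 2 and 4, so the mean is 3
result = task_vector_average(base, models, weight=0.5)
print(result)
```

Including the base tensor in the mean would instead shrink the averaged delta toward zero, which is the behavior the fix removes.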