add Mamba.swift #401
base: main
Conversation
```swift
public func loraLinearLayers() -> MLXLMCommon.LoRALinearLayers {
    // TODO ???
    return []
}
```
I wasn't sure what to do for this.
Normally the q and v projection layers from attention:

```swift
public func loraLinearLayers() -> LoRALinearLayers {
    model.layers.map { ($0.attention, ["q_proj", "v_proj"]) }
}
```

but this model doesn't seem to have an Attention layer. It works with any linear layers, so perhaps the x_proj and dt_proj layers in MambaBlock?

Otherwise maybe just remove the method. Also, FWIW, this type would need to conform to LoRAModel for this to work.
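A hedged sketch of that suggestion, assuming the model mirrors the mlx-lm Python layout (blocks stored on a `backbone.layers` array, each exposing its `MambaBlock` as `mixer`); the actual property names in this PR may differ:

```swift
// Sketch only: `backbone`, `layers`, and `mixer` are assumed names
// taken from the mlx-lm Python port, not necessarily this PR's code.
public func loraLinearLayers() -> LoRALinearLayers {
    backbone.layers.map { ($0.mixer, ["x_proj", "dt_proj"]) }
}
```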
```swift
import Foundation
import MLX
import MLXFast
```
MLXFast is included in MLX now -- you can remove this import.
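So the import list above can presumably be trimmed to:

```swift
import Foundation
import MLX  // MLXFast is included in MLX now, so no separate import is needed
```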
```swift
// port of https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/models/mamba.py

struct StringKey: CodingKey, ExpressibleByStringLiteral {
```
You don't need this, see below.
And if you have to keep it, please make it private.
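For reference, a minimal private version might look like this; it's a sketch of a typical dynamic CodingKey wrapper, not necessarily the PR's exact declaration:

```swift
// Sketch: wraps an arbitrary string so it can serve as a JSON key
// at decode time; integer keys are not supported.
private struct StringKey: CodingKey, ExpressibleByStringLiteral {
    let stringValue: String
    var intValue: Int? { nil }

    init?(stringValue: String) { self.stringValue = stringValue }
    init?(intValue: Int) { return nil }
    init(stringLiteral value: String) { self.stringValue = value }
}
```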
```swift
try container
    .decodeIfPresent(Int.self, forKey: .hiddenSize)
    ?? fallback
    .decode(Int.self, forKey: "d_model")
```
You can do this with:
```swift
enum CodingKeys: String, CodingKey {
    case modelType = "model_type"
    case vocabSize = "vocab_size"
    case hiddenSize = "hidden_size"
    case dModel = "d_model"
}
```

then:

```swift
hiddenSize = try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
    ?? container.decode(Int.self, forKey: .dModel)
```

#316 will also provide a good solution to this, but it isn't merged yet.
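Put together, the decoding init might read roughly like the sketch below; the property names and surrounding type are assumptions based on the keys above, not the PR's actual declarations:

```swift
// Sketch: field names mirror the CodingKeys above; the real
// configuration struct likely declares more properties.
public init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    modelType = try container.decode(String.self, forKey: .modelType)
    vocabSize = try container.decode(Int.self, forKey: .vocabSize)
    // Prefer "hidden_size", falling back to the legacy "d_model" key.
    hiddenSize = try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
        ?? container.decode(Int.self, forKey: .dModel)
}
```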
I'll just wait for #316 and update the PR using the new macro.