Conversation

@nwh (Contributor) commented Sep 29, 2025

No description provided.

    public func loraLinearLayers() -> MLXLMCommon.LoRALinearLayers {
        // TODO ???
        return []
    }
Contributor Author:

I wasn't sure what to do for this.

Collaborator:

Normally the q and v projection layers from attention:

    public func loraLinearLayers() -> LoRALinearLayers {
        model.layers.map { ($0.attention, ["q_proj", "v_proj"]) }
    }

but this model doesn't seem to have an Attention layer. LoRA works with any linear layers, though, so perhaps the x_proj and dt_proj layers in MambaBlock?

Otherwise maybe just remove the method. Also, FWIW, this type would need to conform to LoRAModel for this to work.
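For concreteness, the suggestion above could be sketched as an extension. This is only a sketch under assumptions: the `MambaModel` type name and the `backbone.layers` property path are placeholders, not taken from this PR, and would need to match the actual model structure:

    // Sketch only: `MambaModel` and `backbone.layers` are assumed names.
    extension MambaModel: LoRAModel {
        public func loraLinearLayers() -> LoRALinearLayers {
            // Adapt the state-space projections in each MambaBlock, since
            // this architecture has no attention q_proj/v_proj layers.
            backbone.layers.map { ($0, ["x_proj", "dt_proj"]) }
        }
    }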


    import Foundation
    import MLX
    import MLXFast
Collaborator:

This is included in MLX now -- you can remove this import.


    // port of https://github.com/ml-explore/mlx-lm/blob/main/mlx_lm/models/mamba.py

    struct StringKey: CodingKey, ExpressibleByStringLiteral {
Collaborator:

You don't need this, see below.

Collaborator:

And if you have to keep it, please make it private.

    try container
        .decodeIfPresent(Int.self, forKey: .hiddenSize)
        ?? fallback
            .decode(Int.self, forKey: "d_model")
Collaborator:

You can do this with:

    enum CodingKeys: String, CodingKey {
        case modelType = "model_type"
        case vocabSize = "vocab_size"
        case hiddenSize = "hidden_size"
        case dModel = "d_model"

then:

    hiddenSize = try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
        ?? container.decode(Int.self, forKey: .dModel)
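As a self-contained sketch of this fallback pattern, using plain Foundation Codable (the `MambaConfig` type and its fields here are hypothetical, reduced to just the key in question):

    import Foundation

    // Hypothetical minimal config: accept either "hidden_size"
    // or the legacy "d_model" key for the same value.
    struct MambaConfig: Decodable {
        var hiddenSize: Int

        enum CodingKeys: String, CodingKey {
            case hiddenSize = "hidden_size"
            case dModel = "d_model"
        }

        init(from decoder: Decoder) throws {
            let container = try decoder.container(keyedBy: CodingKeys.self)
            // Prefer "hidden_size"; fall back to "d_model" if it is absent.
            hiddenSize =
                try container.decodeIfPresent(Int.self, forKey: .hiddenSize)
                ?? container.decode(Int.self, forKey: .dModel)
        }
    }

    // Usage: both spellings decode to the same property.
    let json = #"{"d_model": 768}"#.data(using: .utf8)!
    let config = try JSONDecoder().decode(MambaConfig.self, from: json)
    // config.hiddenSize is 768

This avoids the custom `StringKey` type entirely, since `d_model` becomes an ordinary case in `CodingKeys`.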

#316 will also provide a good solution to this, but it isn't merged yet.

Contributor Author:

I'll just wait for #316 and update the PR using the new macro.
