# Mambular: Tabular Deep Learning with Mamba Architectures
Mambular is a Python package that brings the power of Mamba architectures to tabular data, offering a suite of deep learning models for regression, classification, and distributional regression tasks. Designed with ease of use in mind, Mambular models adhere to scikit-learn's `BaseEstimator` interface, making them highly compatible with the familiar scikit-learn ecosystem. This means you can fit, predict, and evaluate using Mambular models just as you would with any traditional scikit-learn model, but with the added performance and flexibility of deep learning.
## Features
## Preprocessing
Mambular simplifies the preprocessing stage of model development with a comprehensive set of techniques to prepare your data for Mamba architectures. Our preprocessing module is designed to be both powerful and easy to use, offering a variety of options to efficiently transform your tabular data.
### Data Type Detection and Transformation
Mambular automatically identifies the type of each feature in your dataset and applies the most appropriate transformations for numerical and categorical variables. This includes:

- **Ordinal Encoding**: Categorical features are seamlessly transformed into numerical values, preserving their inherent order and making them model-ready.
- **One-Hot Encoding**: For nominal data, Mambular employs one-hot encoding to capture the presence or absence of categories without imposing ordinality.
- **Binning**: Numerical features can be discretized into bins, a useful technique for handling continuous variables in certain modeling contexts.
- **Decision Tree Binning**: Optionally, Mambular can use decision trees to find the optimal binning strategy for numerical features, enhancing model interpretability and performance.
- **Normalization**: Mambular can handle numerical features directly, without first turning them into categorical features; standard preprocessing steps such as per-feature normalization are supported.
- **Standardization**: Similarly, standardization can be used instead of normalization.
- **PLE**: Periodic Linear Encodings for numerical features can enhance performance for tabular deep learning methods.

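
As a rough illustration (using plain scikit-learn, not Mambular's internal implementation), the two binning options above can be sketched like this:

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = X[:, 0] ** 2  # toy continuous target

# Equal-width binning of a numerical feature into 5 bins.
binner = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="uniform")
X_binned = binner.fit_transform(X)

# Decision-tree binning: reuse the split thresholds of a shallow tree as bin
# edges, so bin boundaries fall where they are most informative about the target.
tree = DecisionTreeRegressor(max_leaf_nodes=5).fit(X, y)
edges = np.sort(tree.tree_.threshold[tree.tree_.threshold != -2])  # -2 marks leaves
X_tree_binned = np.digitize(X[:, 0], edges)
```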
### Handling Missing Values
Our preprocessing pipeline effectively handles missing data by using mean imputation for numerical features and mode imputation for categorical features. This ensures that your models receive complete data inputs without needing manual intervention.

Additionally, Mambular can manage unknown categorical values during inference by incorporating classical `<UNK>` tokens in categorical preprocessing.
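
In scikit-learn terms (a sketch of the general idea, not Mambular's own code), this behaviour corresponds to:

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({"age": [25.0, np.nan, 40.0], "city": ["NY", "NY", np.nan]})

# Mean imputation for numerical features, mode imputation for categorical ones.
age = SimpleImputer(strategy="mean").fit_transform(df[["age"]])
city = SimpleImputer(strategy="most_frequent").fit_transform(df[["city"]])

# Unknown categories at inference time map to a reserved value, like an <UNK> token.
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1).fit(city)
unseen = enc.transform([["LA"]])
```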
### Flexible and Customizable
While Mambular excels in automating the preprocessing workflow, it also offers flexibility. You can customize the preprocessing steps to fit the unique needs of your dataset, ensuring that you're not locked into a one-size-fits-all approach.
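
As one hypothetical illustration of this kind of per-column control (written with plain scikit-learn rather than Mambular's own preprocessor), one might route columns through hand-picked transformers:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

df = pd.DataFrame({"income": [30.0, 45.0, 60.0], "color": ["r", "g", "r"]})

# Route numerical and categorical columns through different transformers.
prep = ColumnTransformer([
    ("num", StandardScaler(), ["income"]),
    ("cat", OneHotEncoder(), ["color"]),
])
Xt = prep.fit_transform(df)  # one scaled column + one-hot columns
```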
By integrating Mambular's preprocessing module into your workflow, you're not just preparing your data for deep learning; you're optimizing it for excellence. This commitment to data quality is what sets Mambular apart, making it an indispensable tool in your machine learning arsenal.
## Fit a Model
Fitting a model in Mambular is as simple as it gets. All models in Mambular are scikit-learn `BaseEstimator`s, so the `.fit` method is implemented for every one of them. This also means you can use the rest of the scikit-learn machinery, such as its built-in hyperparameter optimization tools.

```python
from mambular.models import MambularClassifier

# Initialize and fit your model
model = MambularClassifier(
    d_model=64,
    n_layers=8,
    numerical_preprocessing="ple",
    n_bins=50
)

# X can be a pd.DataFrame or anything easily convertible to one, e.g. a np.array
model.fit(X, y, max_epochs=150, lr=1e-04)
```
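
Because every model is a `BaseEstimator`, scikit-learn tooling such as `GridSearchCV` applies directly. Sketched here with a stand-in scikit-learn estimator; a Mambular model with a grid over its own hyperparameters would slot in the same way:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=120, n_features=6, random_state=0)

# Any BaseEstimator can be dropped into GridSearchCV; swap in a Mambular
# model and its hyperparameters to tune it the same way.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": [2, 4]},
    cv=3,
)
search.fit(X, y)
```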
Predictions are also easily obtained:
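
Since the estimators follow the familiar scikit-learn API, prediction is the usual `predict` / `predict_proba` pattern; shown below with a stand-in scikit-learn classifier in place of a fitted Mambular model:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=4, random_state=0)
clf = LogisticRegression().fit(X, y)  # stand-in for a fitted MambularClassifier

preds = clf.predict(X)         # predicted class labels
proba = clf.predict_proba(X)   # predicted class probabilities
```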
These distribution classes allow `MambularLSS` to flexibly model a wide variety of data types and distributions, providing users with the tools needed to capture the full complexity of their data.
### Getting Started with MambularLSS:
To integrate distributional regression into your workflow with `MambularLSS`, start by initializing the model with your desired configuration, similar to other Mambular models:

```python
from mambular.models import MambularLSS

# Initialize the MambularLSS model
model = MambularLSS(
    dropout=0.2,
    d_model=64,
    n_layers=8,
)

# Fit the model to your data
model.fit(
    X,
    y,
    max_epochs=150,
    lr=1e-04,
    patience=10,
    family="normal",  # define your distribution
)
```
If you find this project useful in your research or in a scientific publication, please consider citing:

```BibTeX
@software{mambular2024,
  title={Mambular: Tabular Deep Learning with Mamba Architectures},
  author={Anton Frederik Thielmann and Manish Kumar and Christoph Weisser and Benjamin Saefken and Soheila Samiee},
}
```