Release v1.2.0
This update enhances the preprocessing and embedding layers in the mambular
package, introducing several key improvements:
- Feature-Specific Preprocessing: The `Preprocessor` class now includes a feature preprocessing dictionary, enabling different preprocessing strategies for each feature.
- Support for Unstructured Data: The model can now handle a combination of tabular features and unstructured data, such as images and text.
- Latent Representation Generation: It is now possible to generate latent representations of the input data, which can support downstream modeling and interpretability.
These changes enhance flexibility and extend mambular's capabilities to more diverse data modalities.
Preprocessing improvements:
`mambular/preprocessing/preprocessor.py`: Added the `feature_preprocessing` parameter to allow custom preprocessing techniques for individual columns. Updated the `fit` method to use this parameter for both numerical and categorical features. [1] [2] [3] [4] [5]
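The idea of a per-feature preprocessing dictionary can be sketched in plain Python. This is a minimal illustration, not mambular's implementation: `resolve_strategies` is a hypothetical helper, and the strategy labels are placeholders that have not been checked against the values `Preprocessor` accepts. It shows one plausible behavior, where columns named in the dictionary get their own strategy and all other columns fall back to a default.

```python
from typing import Dict, List, Optional


def resolve_strategies(
    columns: List[str],
    default: str,
    feature_preprocessing: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: pick a preprocessing strategy for each column.

    Columns listed in `feature_preprocessing` use their own strategy;
    every other column uses `default`.
    """
    overrides = feature_preprocessing or {}
    # Fail early on overrides that refer to columns that do not exist.
    unknown = set(overrides) - set(columns)
    if unknown:
        raise KeyError(f"Overrides given for unknown columns: {sorted(unknown)}")
    return {col: overrides.get(col, default) for col in columns}


# Strategy names below are illustrative labels only.
strategies = resolve_strategies(
    columns=["age", "income", "height"],
    default="standardization",
    feature_preprocessing={"age": "quantile", "income": "minmax"},
)
print(strategies)
```

Rejecting overrides for unknown columns is a design choice of this sketch; it surfaces typos in column names before any fitting happens.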
Embedding layer updates:
`mambular/arch_utils/layer_utils/embedding_layer.py`: Modified the `forward` method to handle categorical embeddings with different dimensions and process them correctly. [1] [2]
Allow unstructured data as inputs:
`mambular/arch_utils/layer_utils/embedding_layer.py`: Modified the `forward` method to handle `num_features`, `cat_features` and pre-embedded unstructured data. [1] [2]
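As a shape-level sketch of what combining these three input kinds can look like, the plain-Python example below builds one token sequence for a single sample. It is not the code in `embedding_layer.py`: `build_token_sequence`, `project`, and the linear projection of pre-embedded vectors to a shared width are all assumptions made for illustration, and a real layer would operate on batched tensors.

```python
from typing import List, Optional, Sequence

Vector = List[float]


def project(vec: Sequence[float], weight: Sequence[Sequence[float]]) -> Vector:
    """Multiply a vector of length d_in by a (d_in x d_model) weight matrix."""
    if len(vec) != len(weight):
        raise ValueError("vector length must match the number of weight rows")
    d_model = len(weight[0])
    return [sum(v * row[j] for v, row in zip(vec, weight)) for j in range(d_model)]


def build_token_sequence(
    num_tokens: Sequence[Sequence[float]],
    cat_tokens: Sequence[Sequence[float]],
    pre_embedded: Optional[Sequence[Sequence[float]]] = None,
    proj_weight: Optional[Sequence[Sequence[float]]] = None,
) -> List[Vector]:
    """Hypothetical helper: one token per feature, all of the same width."""
    tokens = [list(t) for t in num_tokens] + [list(t) for t in cat_tokens]
    if pre_embedded is not None:
        if proj_weight is None:
            raise ValueError("proj_weight is required for pre-embedded inputs")
        # Assumed step: map externally computed embeddings (e.g. from an
        # image or text encoder) to the width of the tabular tokens.
        tokens.extend(project(vec, proj_weight) for vec in pre_embedded)
    if len({len(t) for t in tokens}) > 1:
        raise ValueError("all tokens must share the same width")
    return tokens


sequence = build_token_sequence(
    num_tokens=[[1.0, 0.0]],             # one numerical feature, width 2
    cat_tokens=[[0.0, 1.0]],             # one categorical feature, width 2
    pre_embedded=[[1.0, 2.0, 3.0]],      # one external embedding, width 3
    proj_weight=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # maps width 3 to 2
)
print(sequence)
```

The point of the sketch is only that tabular tokens and externally produced embeddings end up in one sequence of equal-width vectors that a downstream model can consume.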
Get latent representation of tables:
`mambular/base_models/basemodel.py`: Updated the `encode` method to accept a single `data` parameter instead of separate `num_features` and `cat_features` parameters. [1] [2]