[BUG] list index out of range from dataloader #235

Open
YuanfengZhang opened this issue Mar 2, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@YuanfengZhang

Describe the bug
IndexError: list index out of range is raised from the dataloader when fitting on data taken from a pandas DataFrame.

To Reproduce
Steps to reproduce the behavior:

from mambular.models import MambularRegressor
import pandas as pd
from sklearn.model_selection import train_test_split
import torch

torch.set_float32_matmul_precision('high')

df = pd.read_csv('./df.csv')
(X_train,
 X_test,
 y_train,
 y_test) = train_test_split(df[[i for i in df.columns if 'f' in i]].values,
                            df[['y']].values,
                            random_state=42)

regressor = MambularRegressor()
regressor.fit(X_train, y_train, max_epochs=4)
print(regressor.evaluate(X_test, y_test))

Expected behavior
No error, regressor fitted.

Screenshots
Here is the full output:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[14], line 7
      4 torch.set_float32_matmul_precision('high')
      6 regressor = MambularRegressor()
----> 7 regressor.fit(X_train, y_train, max_epochs=4)
      8 print(regressor.evaluate(X_test, y_test))

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/mambular/models/sklearn_base_regressor.py:398, in SklearnBaseRegressor.fit(self, X, y, val_size, X_val, y_val, embeddings, embeddings_val, max_epochs, random_state, batch_size, shuffle, patience, monitor, mode, lr, lr_patience, lr_factor, weight_decay, checkpoint_path, dataloader_kwargs, train_metrics, val_metrics, rebuild, **trainer_kwargs)
    388 # Initialize the trainer and train the model
    389 self.trainer = pl.Trainer(
    390     max_epochs=max_epochs,
    391     callbacks=[
   (...)    396     **trainer_kwargs,
    397 )
--> 398 self.trainer.fit(self.task_model, self.data_module)  # type: ignore
    400 best_model_path = checkpoint_callback.best_model_path
    401 if best_model_path:

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py:539, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    537 self.state.status = TrainerStatus.RUNNING
    538 self.training = True
--> 539 call._call_and_handle_interrupt(
    540     self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    541 )

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/call.py:47, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
     45     if trainer.strategy.launcher is not None:
     46         return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
---> 47     return trainer_fn(*args, **kwargs)
     49 except _TunerExitException:
     50     _call_teardown_hook(trainer)

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py:575, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    568 assert self.state.fn is not None
    569 ckpt_path = self._checkpoint_connector._select_ckpt_path(
    570     self.state.fn,
    571     ckpt_path,
    572     model_provided=True,
    573     model_connected=self.lightning_module is not None,
    574 )
--> 575 self._run(model, ckpt_path=ckpt_path)
    577 assert self.state.stopped
    578 self.training = False

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py:982, in Trainer._run(self, model, ckpt_path)
    977 self._signal_connector.register_signal_handlers()
    979 # ----------------------------
    980 # RUN THE TRAINER
    981 # ----------------------------
--> 982 results = self._run_stage()
    984 # ----------------------------
    985 # POST-Training CLEAN UP
    986 # ----------------------------
    987 log.debug(f"{self.__class__.__name__}: trainer tearing down")

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py:1024, in Trainer._run_stage(self)
   1022 if self.training:
   1023     with isolate_rng():
-> 1024         self._run_sanity_check()
   1025     with torch.autograd.set_detect_anomaly(self._detect_anomaly):
   1026         self.fit_loop.run()

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/trainer/trainer.py:1053, in Trainer._run_sanity_check(self)
   1050 call._call_callback_hooks(self, "on_sanity_check_start")
   1052 # run eval step
-> 1053 val_loop.run()
   1055 call._call_callback_hooks(self, "on_sanity_check_end")
   1057 # reset logger connector

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/loops/utilities.py:179, in _no_grad_context.<locals>._decorator(self, *args, **kwargs)
    177     context_manager = torch.no_grad
    178 with context_manager():
--> 179     return loop_run(self, *args, **kwargs)

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/loops/evaluation_loop.py:119, in _EvaluationLoop.run(self)
    117 @_no_grad_context
    118 def run(self) -> list[_OUT_DICT]:
--> 119     self.setup_data()
    120     if self.skip:
    121         return []

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/loops/evaluation_loop.py:201, in _EvaluationLoop.setup_data(self)
    198 self._max_batches = []
    199 for dl in combined_loader.flattened:
    200     # determine number of batches
--> 201     length = len(dl) if has_len_all_ranks(dl, trainer.strategy, allow_zero_length) else float("inf")
    202     limit_batches = getattr(trainer, f"limit_{stage.dataloader_prefix}_batches")
    203     num_batches = _parse_num_batches(stage, length, limit_batches)

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/pytorch/utilities/data.py:99, in has_len_all_ranks(dataloader, strategy, allow_zero_length_dataloader_with_multiple_devices)
     93 def has_len_all_ranks(
     94     dataloader: object,
     95     strategy: "pl.strategies.Strategy",
     96     allow_zero_length_dataloader_with_multiple_devices: bool = False,
     97 ) -> TypeGuard[Sized]:
     98     """Checks if a given object has ``__len__`` method implemented on all ranks."""
---> 99     local_length = sized_len(dataloader)
    100     if local_length is None:
    101         # __len__ is not defined, skip these checks
    102         return False

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/lightning/fabric/utilities/data.py:52, in sized_len(dataloader)
     49 """Try to get the length of an object, return ``None`` otherwise."""
     50 try:
     51     # try getting the length
---> 52     length = len(dataloader)  # type: ignore [arg-type]
     53 except (TypeError, NotImplementedError):
     54     length = None

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/torch/utils/data/dataloader.py:532, in DataLoader.__len__(self)
    530     return length
    531 else:
--> 532     return len(self._index_sampler)

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/torch/utils/data/sampler.py:365, in BatchSampler.__len__(self)
    363     return len(self.sampler) // self.batch_size  # type: ignore[arg-type]
    364 else:
--> 365     return (len(self.sampler) + self.batch_size - 1) // self.batch_size

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/torch/utils/data/sampler.py:128, in SequentialSampler.__len__(self)
    127 def __len__(self) -> int:
--> 128     return len(self.data_source)

File ~/mambaforge/envs/mambular/lib/python3.12/site-packages/mambular/data_utils/dataset.py:47, in MambularDataset.__len__(self)
     46 def __len__(self):
---> 47     return len(self.num_features_list[0])

IndexError: list index out of range
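
The last frame suggests that num_features_list was empty when len() was called, so indexing element 0 failed. That reading is inferred from the traceback only. The sketch below uses a hypothetical ToyDataset (not the real MambularDataset) to show the failure pattern, plus one possible guard that raises a clearer error:

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset holding one array per numerical feature."""

    def __init__(self, num_features_list):
        self.num_features_list = num_features_list

    def __len__(self):
        # Same pattern as the last frame of the traceback:
        # fails with IndexError when the list is empty.
        return len(self.num_features_list[0])


class GuardedToyDataset(ToyDataset):
    """Same data layout, but with an explicit check before indexing."""

    def __len__(self):
        if not self.num_features_list:
            raise ValueError(
                "No numerical features were found; check how X was passed to fit()."
            )
        return len(self.num_features_list[0])


def describe_len(dataset):
    """Return the dataset length, or the exception type name if len() fails."""
    try:
        return len(dataset)
    except (IndexError, ValueError) as exc:
        return type(exc).__name__
```

With a non-empty list both classes report the number of rows; with an empty list the first raises IndexError and the guarded one raises ValueError with a message that points at the input.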

Desktop (please complete the following information):

  • OS: Ubuntu 24.04
  • Python 3.12.9
  • Mambular Version 1.2.1

Here is my conda.yaml to create the env:

name: mambular
channels:
  - rapidsai
  - conda-forge
  - nvidia
  - defaults
dependencies:
  - pip
  - pip:
    - mambular
    - 'polars[excel,pyarrow]'
  - conda-forge::jupyterlab
  - conda-forge::shap

Additional context
The df.csv file is attached.

df.csv

@YuanfengZhang YuanfengZhang added the bug Something isn't working label Mar 2, 2025
@AnFreTh
Collaborator

AnFreTh commented Mar 2, 2025

Thanks for raising this. As a quick fix, removing .values from X should solve the issue.
But let's leave the issue open so that we can implement a check/fix in the package.

(X_train, X_test, y_train, y_test) = train_test_split(
    df[[i for i in df.columns if "f" in i]], df[["y"]].values.squeeze(-1), random_state=42
)

As a side note: I would suggest normalizing your targets before training. This is not done internally, so you need to do it yourself.
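
One way to do this is with scikit-learn's StandardScaler, fitted on the training targets only. This is a minimal sketch and not something the package provides; fit_target_scaler and unscale are hypothetical helper names, and the demo values are made up:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def fit_target_scaler(y_train):
    """Fit a scaler on the training targets only; return (scaler, scaled 1-D targets)."""
    scaler = StandardScaler()
    # StandardScaler expects a 2-D array, so reshape to a single column first.
    y_scaled = scaler.fit_transform(np.asarray(y_train, dtype=float).reshape(-1, 1))
    return scaler, y_scaled.ravel()


def unscale(scaler, y_scaled):
    """Map scaled targets or predictions back to the original units."""
    y_2d = np.asarray(y_scaled, dtype=float).reshape(-1, 1)
    return scaler.inverse_transform(y_2d).ravel()


# Made-up targets, for illustration only.
y_train_demo = np.array([10.0, 20.0, 30.0, 40.0])
scaler, y_train_scaled = fit_target_scaler(y_train_demo)
y_roundtrip = unscale(scaler, y_train_scaled)
```

The scaled targets would be passed to fit, and predictions would go through unscale before being compared with the original y_test.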

@YuanfengZhang
Author

Thank you so much. I didn't expect a difference between X and y.
When removing .values from both, an error occurs: ValueError: could not determine the shape of object type 'DataFrame'
When adding .values to both, the IndexError: list index out of range appears.
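
Going by the three cases reported in this thread, the only combination that runs is X as a DataFrame and y as a 1-D array. The sketch below coerces inputs into that shape before calling fit. coerce_fit_inputs is a hypothetical helper, not part of mambular, and it is an untested assumption that a DataFrame rebuilt from an array behaves like the original DataFrame inside version 1.2.1:

```python
import numpy as np
import pandas as pd


def coerce_fit_inputs(X, y, feature_names=None):
    """Return (X as DataFrame, y as 1-D ndarray for a single target column)."""
    if isinstance(X, pd.DataFrame):
        X_df = X
    else:
        X_arr = np.asarray(X)
        if feature_names is None:
            # Made-up column names; any unique strings would do.
            feature_names = [f"f{i}" for i in range(X_arr.shape[1])]
        X_df = pd.DataFrame(X_arr, columns=feature_names)

    y_arr = np.asarray(y)
    if y_arr.ndim == 2 and y_arr.shape[1] == 1:
        # A one-column DataFrame or (n, 1) array becomes a flat (n,) array.
        y_arr = y_arr[:, 0]
    return X_df, y_arr


# Made-up data, for illustration only.
demo = pd.DataFrame({"f0": [1.0, 2.0, 3.0], "f1": [4.0, 5.0, 6.0], "y": [0.1, 0.2, 0.3]})
X_df, y_arr = coerce_fit_inputs(demo[["f0", "f1"]].values, demo[["y"]])
```

Here X goes in as a NumPy array and y as a one-column DataFrame, which are the two input forms that failed above, and both come out in the form the quick fix uses.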

@AnFreTh
Collaborator

AnFreTh commented Mar 2, 2025

Yes, it's definitely a mistake on our side. We'll fix it in the next release. This shouldn't happen, and you are absolutely correct that there should not be a difference.
