It would be very helpful if the ParallelRunStep class accepted pandas DataFrames as inputs.
I understand that ParallelRunStep currently only allows these input types: [DatasetConsumptionConfig, PipelineOutputFileDataset, PipelineOutputTabularDataset, OutputFileDatasetConfig, OutputTabularDatasetConfig, LinkFileOutputDatasetConfig, LinkTabularOutputDatasetConfig].
Would it be possible to allow DataFrames as inputs to ParallelRunStep? Is this a use case the Azure ML dev team would consider? Passing a DataFrame currently fails with the following exception:
Exception Traceback (most recent call last)
<ipython-input-27-215e373515cb> in <module>
7 output=output_dir,
8 allow_reuse=False,
----> 9 arguments=None
10 )
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/steps/parallel_run_step.py in __init__(self, name, parallel_run_config, inputs, output, side_inputs, arguments, allow_reuse)
155 side_inputs=side_inputs,
156 arguments=arguments,
--> 157 allow_reuse=allow_reuse,
158 )
159
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/core/_parallel_run_step_base.py in __init__(self, name, parallel_run_config, inputs, output, side_inputs, arguments, allow_reuse)
259
260 self._process_inputs_output_dataset_configs()
--> 261 self._validate()
262 self._get_pystep_inputs()
263
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/core/_parallel_run_step_base.py in _validate(self)
329 """Validate input params to init parallel run step class."""
330 self._validate_arguments()
--> 331 self._validate_inputs()
332 self._validate_output()
333 self._validate_parallel_run_config()
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/core/_parallel_run_step_base.py in _validate_inputs(self)
410
411 if self._inputs:
--> 412 self._input_ds_type = self._get_input_type(self._inputs[0])
413 for input_ds in self._inputs:
414 if self._input_ds_type != self._get_input_type(input_ds):
/anaconda/envs/azureml_py36/lib/python3.6/site-packages/azureml/pipeline/core/_parallel_run_step_base.py in _get_input_type(self, in_ds)
399 ds_mapping_type = INPUT_TYPE_DICT[input_type]
400 else:
--> 401 raise Exception("Step input must be of any type: {}, found {}".format(ALLOWED_INPUT_TYPES, input_type))
402 return ds_mapping_type
403
Exception: Step input must be of any type: (<class 'azureml.data.dataset_consumption_config.DatasetConsumptionConfig'>, <class 'azureml.pipeline.core.pipeline_output_dataset.PipelineOutputFileDataset'>, <class 'azureml.pipeline.core.pipeline_output_dataset.PipelineOutputTabularDataset'>, <class 'azureml.data.output_dataset_config.OutputFileDatasetConfig'>, <class 'azureml.data.output_dataset_config.OutputTabularDatasetConfig'>, <class 'azureml.data.output_dataset_config.LinkFileOutputDatasetConfig'>, <class 'azureml.data.output_dataset_config.LinkTabularOutputDatasetConfig'>), found <class 'pandas.core.frame.DataFrame'>
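To make the failure easier to follow, here is a minimal sketch of the kind of type check the traceback points at. It is not the SDK's actual code: the classes are empty stand-ins for the real azureml and pandas types, the contents of `INPUT_TYPE_DICT` are assumed, and the mixed-type error message is hypothetical. Only the "Step input must be of any type" message is taken from the traceback above.

```python
# Stand-in classes (hypothetical placeholders, NOT the real azureml/pandas types).
class DatasetConsumptionConfig:
    pass


class OutputFileDatasetConfig:
    pass


class DataFrame:  # stands in for pandas.DataFrame
    pass


# Assumed mapping from an accepted input class to a dataset kind.
INPUT_TYPE_DICT = {
    DatasetConsumptionConfig: "tabular",
    OutputFileDatasetConfig: "file",
}
ALLOWED_INPUT_TYPES = tuple(INPUT_TYPE_DICT)


def get_input_type(in_ds):
    """Return the dataset kind for an input, or raise if its class is not allowed."""
    input_type = type(in_ds)
    if input_type in INPUT_TYPE_DICT:
        return INPUT_TYPE_DICT[input_type]
    # Message format copied from the traceback in this issue.
    raise Exception(
        "Step input must be of any type: {}, found {}".format(ALLOWED_INPUT_TYPES, input_type)
    )


def validate_inputs(inputs):
    """Check that every input is allowed and that all inputs share one kind."""
    if not inputs:
        return None
    first_kind = get_input_type(inputs[0])
    for input_ds in inputs:
        if get_input_type(input_ds) != first_kind:
            # Assumption: the traceback does not show what happens on a mismatch.
            raise Exception("All step inputs must be of the same kind")
    return first_kind


if __name__ == "__main__":
    print(validate_inputs([DatasetConsumptionConfig()]))
    try:
        validate_inputs([DataFrame()])
    except Exception as err:
        print(err)
```

Under this reading, a DataFrame is rejected purely because its class is not in the allowed list, so the request amounts to either adding a DataFrame entry to that list or converting the DataFrame to one of the accepted dataset types before constructing the step.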