tfless_tfds requires TensorFlow #11058

@gafderks

Description

Short description
The code in the documentation that is meant to demonstrate TF-less TFDS actually requires TensorFlow to run.

Environment information

  • Operating System: Windows 11

  • Python version: 3.12

  • tensorflow-datasets/tfds-nightly version: tfds-nightly==4.9.9.dev202505300044

  • tensorflow/tf-nightly version: Not installed

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
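
The environment listed above can be double-checked with a short stdlib-only script. This is a sketch and not part of the original report: the helper name `describe_env` is hypothetical, and it only reports whether TensorFlow is importable and which of the relevant distributions are installed.

```python
import importlib.metadata
import importlib.util
import platform


def describe_env(packages=("tensorflow", "tf-nightly",
                           "tensorflow-datasets", "tfds-nightly")):
    """Return Python/OS info and installed versions (None if not installed)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return {
        "python": platform.python_version(),
        "os": platform.platform(),
        # find_spec returns None when the top-level module cannot be found.
        "tensorflow_importable":
            importlib.util.find_spec("tensorflow") is not None,
        "versions": versions,
    }


if __name__ == "__main__":
    for key, value in describe_env().items():
        print(f"{key}: {value}")
```

For the setup described in this issue, `tensorflow_importable` would be expected to be `False` and only `tfds-nightly` to have a version.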

Reproduction instructions
Run the code from this page in the documentation: https://www.tensorflow.org/datasets/tfless_tfds#:~:text=%25%25writefile%20no_tensorflow,be%20decoded.%27)

import os

os.environ.pop("TFDS_DATA_DIR", None)

import tensorflow_datasets as tfds

try:
    import tensorflow as tf
except ImportError:
    print("No TensorFlow found...")

ds = tfds.data_source(
    "fashion_mnist",
    try_gcs=False,
)
print("...but the data source could still be loaded...")
ds["train"][0]
print("...and the records can be decoded.")

Link to logs

PS C:\Users\username\Code\myproject> & c:/Users/username/Code/myproject/build_venv/Scripts/python.exe c:/Users/username/Code/myproject/tryout.py
No TensorFlow found...
WARNING:absl:Variant folder C:\Users\username\tensorflow_datasets\fashion_mnist\3.0.1 has no dataset_info.json
Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to C:\Users\username\tensorflow_datasets\fashion_mnist\3.0.1...
Dl Completed...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 47.35 url/s]
Dl Size...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 4422102/4422102 [00:00<00:00, 209400396.35 MiB/s]
***************************************************************
Failed to import TensorFlow. Please note that TensorFlow is not installed by default when you install TFDS. This allows you to choose to install either `tf-nightly` or `tensorflow`. Please install the most recent version of TensorFlow, by following instructions at https://tensorflow.org/install.
***************************************************************


Dl Completed...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 64.53 url/s]
Dl Size...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 4427250/4427250 [00:00<00:00, 142845743.17 MiB/s]
***************************************************************
Failed to import TensorFlow. Please note that TensorFlow is not installed by default when you install TFDS. This allows you to choose to install either `tf-nightly` or `tensorflow`. Please install the most recent version of TensorFlow, by following instructions at https://tensorflow.org/install.
***************************************************************


Dl Completed...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 85.67 url/s]
Dl Size...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 30849130/30849130 [00:00<00:00, 880939482.81 MiB/s]
Dl Completed...:  75%|███████████████████████████████████████████████████████████████████████████████████████▊                             | 3/4 [00:00<00:00, 81.10 url/s]
Failed to import TensorFlow. Please note that TensorFlow is not installed by default when you install TFDS. This allows you to choose to install either `tf-nightly` or `tensorflow`. Please install the most recent version of TensorFlow, by following instructions at https://tensorflow.org/install.

***************************************************************

Extraction completed...: 0 file [00:00, ? file/s]█████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 102.53 url/s]
Dl Size...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████| 30878645/30878645 [00:00<00:00, 741591032.26 MiB/s]
Dl Completed...: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 96.07 url/s]


***************************************************************
Failed to import TensorFlow. Please note that TensorFlow is not installed by default when you install TFDS. This allows you to choose to install either `tf-nightly` or `tensorflow`. Please install the most recent version of TensorFlow, by following instructions at https://tensorflow.org/install.
Traceback (most recent call last):
***************************************************************


  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\extractor.py", line 106, in _extract
    for path, handle in iter_archive(from_path, method):
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\extractor.py", line 220, in iter_gzip
    with _open_or_pass(arch_f) as fobj:
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\contextlib.py", line 137, in __enter__
    return next(self.gen)
           ^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\extractor.py", line 166, in _open_or_pass
    with tf.io.gfile.GFile(path_or_fobj, 'rb') as f_obj:
         ^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\etils\epy\lazy_imports_utils.py", line 118, in __getattr__
    return getattr(self._module, name)
                   ^^^^^^^^^^^^
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\functools.py", line 995, in __get__
    val = self.func(instance)
          ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\etils\epy\lazy_imports_utils.py", line 79, in _module
    module = importlib.import_module(self.module_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\importlib\__init__.py", line 90, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1381, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1354, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1318, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'tensorflow'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "c:\Users\username\Code\myproject\tryout.py", line 12, in <module>
    ds = tfds.data_source(
         ^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\logging\__init__.py", line 176, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\load.py", line 836, in data_source
    _download_and_prepare_builder(dbuilder, download, download_and_prepare_kwargs)
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\load.py", line 518, in _download_and_prepare_builder
    dbuilder.download_and_prepare(**download_and_prepare_kwargs)
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\logging\__init__.py", line 176, in __call__
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 763, in download_and_prepare
    self._download_and_prepare(
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1808, in _download_and_prepare       
    split_infos = self._generate_splits(dl_manager, download_config)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1758, in _generate_splits
    split_generators = self._split_generators(  # pylint: disable=unexpected-keyword-arg
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\image_classification\mnist.py", line 119, in _split_generators      
    mnist_files = dl_manager.download_and_extract(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\download_manager.py", line 754, in download_and_extract
    return _map_promise(self._download_extract, url_or_urls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\download_manager.py", line 782, in _map_promise       
    res = tree.map_structure(lambda p: p.get(), all_promises)  # Wait promises
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tree\__init__.py", line 428, in map_structure
    [func(*args) for args in zip(*map(flatten, structures))])
     ^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\download_manager.py", line 782, in <lambda>
    res = tree.map_structure(lambda p: p.get(), all_promises)  # Wait promises
                                       ^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\promise\promise.py", line 512, in get
    return self._target_settled_value(_raise=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\promise\promise.py", line 516, in _target_settled_value
    return self._target()._settled_value(_raise)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\promise\promise.py", line 226, in _settled_value
    reraise(type(raise_val), raise_val, self._traceback)
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\six.py", line 724, in reraise
    raise value
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\promise\promise.py", line 844, in handle_future_result
    resolve(future.result())
            ^^^^^^^^^^^^^^^
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\concurrent\futures\_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\concurrent\futures\_base.py", line 401, in __get_result
    raise self._exception
  File "C:\Users\username\AppData\Local\Programs\Python\Python312\Lib\concurrent\futures\thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\username\Code\myproject\build_venv\Lib\site-packages\tensorflow_datasets\core\download\extractor.py", line 128, in _extract
    raise ExtractError(msg) from err
tensorflow_datasets.core.download.extractor.ExtractError: Error while extracting C:\Users\username\tensorflow_datasets\downloads\fashion_mnist\fashion-mnist_t10k-images-idx3-ubyteNG5VuUjZc6l-WNI1Hd4WpIS9QV1FlSl2M7sI8D22oHM.gz to C:\Users\username\tensorflow_datasets\downloads\extracted\GZIP.fashion-mnist_t10k-images-idx3-ubyteNG5VuUjZc6l-WNI1Hd4WpIS9QV1FlSl2M7sI8D22oHM.gz: No module named 'tensorflow'

Expected behavior
I expect the output from the documentation:

No TensorFlow found...
...but the data source could still be loaded...
WARNING:absl:OpenCV is not installed. We recommend using OpenCV because it is faster according to our benchmarks. Defaulting to PIL to decode images...
...and the records can be decoded.

Additional context
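The problematic line seems to be in tensorflow_datasets/core/download/extractor.py (line 166, in _open_or_pass, according to the traceback above):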

with tf.io.gfile.GFile(path_or_fobj, 'rb') as f_obj:

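Here, tf.io is used. Accessing it triggers the lazy import of TensorFlow, which raises ModuleNotFoundError when TensorFlow is not installed.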
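
For illustration, a TensorFlow-free version of that step could look roughly like the sketch below. It is an assumption, not the actual TFDS implementation or a proposed patch: the names `open_or_pass` and `read_gzip` are hypothetical, only the standard library is used, and only local paths are handled (unlike tf.io.gfile, which also supports remote filesystems such as GCS).

```python
import contextlib
import gzip
import os
import tempfile


@contextlib.contextmanager
def open_or_pass(path_or_fobj):
    """Yield a binary file object for a local path, or pass a file object through."""
    if isinstance(path_or_fobj, (str, os.PathLike)):
        # Built-in open() replaces tf.io.gfile.GFile for local files.
        with open(path_or_fobj, "rb") as f_obj:
            yield f_obj
    else:
        yield path_or_fobj


def read_gzip(path_or_fobj):
    """Return the decompressed bytes of a gzip archive (path or file object)."""
    with open_or_pass(path_or_fobj) as fobj:
        with gzip.GzipFile(fileobj=fobj) as gz:
            return gz.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        archive = os.path.join(tmp_dir, "sample.gz")
        with gzip.open(archive, "wb") as f:
            f.write(b"no tensorflow needed")
        print(read_gzip(archive))
```

Whether a stdlib fallback like this is acceptable for TFDS depends on how remote paths should be handled when TensorFlow is absent, which the report does not cover.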

Metadata

Labels

bug: Something isn't working