Docs: Running unit-tests against a release candidate #804

Closed
@Michael-J-Ward

Goal: Document how to run the unit-tests against a release candidate

Previously, I had simply checked out the repo at the release tag, built datafusion-python with maturin develop, and ran the tests.

This morning, I tried running the unit-tests strictly against the release candidate of datafusion-python hosted on test.pypi.org.

The original attempt failed in a surprising way, so I'll share what I tried and found.

Please don't hesitate to correct me or suggest a better way to do something.


Setting up the attempt to verify

# clone a fresh repo - Important to start fresh instead of in your existing repo
git clone https://github.com/apache/datafusion-python.git
cd datafusion-python

# checkout the release commit
git fetch --tags
git checkout 40.0.0-rc1

# create the env
python3 -m venv venv
source venv/bin/activate

# install release candidate
pip install --extra-index-url https://test.pypi.org/simple/ datafusion==40.0.0
pip install pytest

Problem 1) The location of the tests causes pytest to import the local datafusion source code instead of the installed datafusion package

Notice that the error originates from our local repo code (python/datafusion/__init__.py:29) instead of the installed candidate (venv/lib/python3.11/site-packages/datafusion/__init__.py:29).

❯ pytest python/datafusion/tests/test_wrapper_coverage.py 
ImportError while loading conftest '/home/mike/workspace/tmp/datafusion-python/python/datafusion/tests/conftest.py'.
python/datafusion/__init__.py:29: in <module>
    from .context import (
python/datafusion/context.py:22: in <module>
    from ._internal import SessionConfig as SessionConfigInternal
E   ModuleNotFoundError: No module named 'datafusion._internal'

This is why the setup starts with cloning a fresh repo instead of re-using your existing datafusion-python clone. In a fresh clone the local package has no compiled _internal module, so the import fails loudly as shown above. Your existing clone almost certainly has a ./python/datafusion/_internal.abi3.so created by a previous maturin develop run, and that file gets imported silently instead. It doesn't get cleaned up with cargo clean, and maturin doesn't appear to have a clean command that removes it.
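If you do need to reuse an existing clone, a small script can list leftover compiled extension modules before testing. This is a minimal sketch, not part of the project: `find_stale_extensions` is a hypothetical helper, the `python/datafusion` default and the `_internal*.so` pattern come from the paths mentioned above, and the `_internal*.pyd` pattern for Windows builds is an assumption.

```python
from pathlib import Path
from typing import List


def find_stale_extensions(repo_root: str, package_dir: str = "python/datafusion") -> List[Path]:
    """List compiled `_internal` extension files left inside the source package.

    Hypothetical helper: it only reports files, it does not delete anything.
    """
    pkg = Path(repo_root) / package_dir
    # `.so` matches the file named in this issue; `.pyd` is an assumed Windows equivalent.
    patterns = ("_internal*.so", "_internal*.pyd")
    found: List[Path] = []
    for pattern in patterns:
        found.extend(sorted(pkg.glob(pattern)))
    return found


if __name__ == "__main__":
    # Run from the root of the clone; prints nothing if the clone is clean.
    for path in find_stale_extensions("."):
        print(f"stale build artifact: {path}")
```

An empty result only means no matching file sits directly in the package directory; it does not prove the tests will import the installed wheel.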

This was the source of my original "surprising" test failure. My _internal.abi3.so was from my work on #802, which made it appear as if the release candidate had been compiled against the wrong datafusion version.
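One way to confirm which copy of a package Python would resolve is to ask the import system for the module's location without importing it. The sketch below uses the standard-library `importlib.util.find_spec`; `module_origin` and `is_from_site_packages` are hypothetical helper names. The answer depends on `sys.path` at the time of the call, so it is most informative when called from inside the pytest session (for example from a temporary test), not from a separate shell.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def module_origin(name: str) -> Optional[Path]:
    """Return the file a top-level module would be loaded from, or None.

    Uses find_spec, which locates a top-level module without importing it.
    """
    spec = importlib.util.find_spec(name)
    # Built-in and frozen modules have no file location.
    if spec is None or not spec.has_location or spec.origin is None:
        return None
    return Path(spec.origin).resolve()


def is_from_site_packages(name: str) -> bool:
    """True if the module resolves to a path under a site-packages directory."""
    origin = module_origin(name)
    return origin is not None and "site-packages" in origin.parts


if __name__ == "__main__":
    # Prints None when datafusion cannot be found on the current sys.path.
    print(module_origin("datafusion"))
```

For the release-candidate check, the printed path should point into venv/lib/.../site-packages/datafusion rather than python/datafusion in the clone.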

Problem 2) typing_extensions is not installed as a dependency

After moving the tests directory out of the package, running the test uses the installed datafusion package (venv/lib/python3.11/site-packages/datafusion/__init__.py), but typing_extensions was not installed as a dependency when we ran pip install --extra-index-url https://test.pypi.org/simple/ datafusion==40.0.0.

❯ mv python/datafusion/tests python-tests
❯ pytest python-tests/test_wrapper_coverage.py
ImportError while loading conftest '/home/mike/workspace/tmp/datafusion-python/python-tests/conftest.py'.
python-tests/conftest.py:19: in <module>
    from datafusion import SessionContext
venv/lib/python3.11/site-packages/datafusion/__init__.py:29: in <module>
    from .context import (
venv/lib/python3.11/site-packages/datafusion/context.py:30: in <module>
    from datafusion.dataframe import DataFrame
venv/lib/python3.11/site-packages/datafusion/dataframe.py:26: in <module>
    from typing_extensions import deprecated
E   ModuleNotFoundError: No module named 'typing_extensions'

So we install the dependency and re-run the test, and now it's green.

❯ pip install typing_extensions
Collecting typing_extensions
  Using cached typing_extensions-4.12.2-py3-none-any.whl.metadata (3.0 kB)
Using cached typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Installing collected packages: typing_extensions
Successfully installed typing_extensions-4.12.2
❯ pytest python-tests/test_wrapper_coverage.py 
========================================================================================== test session starts ===========================================================================================
platform linux -- Python 3.11.9, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/mike/workspace/tmp/datafusion-python
configfile: pyproject.toml
collected 1 item                                                                                                                                                                                         

python-tests/test_wrapper_coverage.py .                                                                                                                                                            [100%]

=========================================================================================== 1 passed in 0.04s ============================================================================================
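A missing runtime dependency like this can be detected before running the whole suite by checking whether the needed modules are importable in the fresh environment. This is a minimal sketch under assumptions: `missing_modules` is a hypothetical helper, and the list of names to check is supplied by hand (here only `typing_extensions`, the one module that failed above).

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found on sys.path.

    Only top-level names are supported; find_spec on a dotted name
    would import the parent package as a side effect.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The module that was missing in this issue; extend the list as needed.
    missing = missing_modules(["typing_extensions"])
    if missing:
        print("not installed:", ", ".join(missing))
    else:
        print("all listed modules are importable")
```

Run inside the activated venv, this reports the gap right after the pip install step instead of at test collection time.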
