What happened?
Code that works fine under Ibis 8.0.0 no longer works under Ibis 9.5.0, and the cause appears to be specifying an INTEGER column type. My actual code is different, but I was able to reduce it to this:
>>> ibis.read_csv("doesntexist.csv", columns = {"whatever": "INTEGER"})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/api.py", line 1437, in read_csv
return con.read_csv(sources, table_name=table_name, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py", line 765, in read_csv
options.append(C.columns.eq(make_struct_argument(columns)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py", line 752, in make_struct_argument
typ = dt.dtype(typ)
^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/common/dispatch.py", line 140, in call
return impl(arg, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/core.py", line 77, in from_string
return DataType.from_string(value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/core.py", line 172, in from_string
return parse(value)
^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/parse.py", line 210, in parse
return ty.parse(text)
^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/parsy/__init__.py", line 98, in parse
(result, _) = (self << eof).parse_partial(stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/parsy/__init__.py", line 112, in parse_partial
raise ParseError(result.expected, stream, result.furthest)
parsy.ParseError: expected 'EOF' at 0:3
Some notes:
It doesn't matter if a file exists at the specified path
The error doesn't happen if I don't pass the columns kwarg
The error doesn't happen if I specify STRING instead of INTEGER
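The `0:3` position in the error suggests that the parser accepted the first three characters of `INTEGER` (`INT`) as a complete type name and then expected the input to end. The toy parser below illustrates that failure mode with the standard library only. It is an assumption about the mechanism, not Ibis's actual grammar, and `KNOWN_TYPES`, `ToyParseError`, `toy_parse`, and `error_for` are hypothetical names.

```python
import re

# Hypothetical, simplified list of type names; NOT Ibis's real grammar.
KNOWN_TYPES = ("int64", "int32", "int", "string")
_TYPE_RE = re.compile("|".join(KNOWN_TYPES), flags=re.IGNORECASE)


class ToyParseError(ValueError):
    """Raised when the toy parser cannot consume the whole input."""


def toy_parse(text: str) -> str:
    # Match a known type name at the start of the input.
    match = _TYPE_RE.match(text)
    if match is None:
        raise ToyParseError("expected a type name at 0:0")
    # After the type name, the input must be exhausted (the "EOF" check).
    if match.end() != len(text):
        raise ToyParseError(f"expected 'EOF' at 0:{match.end()}")
    return match.group(0).lower()


def error_for(text: str):
    # Return the error message for `text`, or None if it parses cleanly.
    try:
        toy_parse(text)
    except ToyParseError as exc:
        return str(exc)
    return None
```

In this sketch, `error_for("INTEGER")` yields `expected 'EOF' at 0:3`, while `toy_parse("STRING")` succeeds, which mirrors the behaviour described in the notes above.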
What version of ibis are you using?
8.0.0: Bug doesn't show up
9.5.0: Bug does show up
What backend(s) are you using, if any?
DuckDB
Relevant log output
$ uv python pin 3.12
Pinned `.python-version` to `3.12`
~/tmp/ibis-read-csv-bug
$ uv venv
Using CPython 3.12.7
Creating virtual environment at: .venv
Activate with: source .venv/bin/activate.fish
~/tmp/ibis-read-csv-bug
$ source .venv/bin/activate.fish
~/tmp/ibis-read-csv-bug
.venv $ uv pip install "ibis-framework[duckdb]==8.0.0"
Resolved 25 packages in 85ms
Installed 25 packages in 109ms
+ atpublic==4.1.0
+ bidict==0.23.1
+ duckdb==0.10.3
+ duckdb-engine==0.15.0
+ ibis-framework==8.0.0
+ markdown-it-py==3.0.0
+ mdurl==0.1.2
+ multipledispatch==1.0.0
+ numpy==1.26.4
+ packaging==24.2
+ pandas==2.2.3
+ parsy==2.1
+ pyarrow==15.0.2
+ pyarrow-hotfix==0.6
+ pygments==2.19.1
+ python-dateutil==2.9.0.post0
+ pytz==2024.2
+ rich==13.9.4
+ six==1.17.0
+ sqlalchemy==2.0.37
+ sqlalchemy-views==0.3.2
+ sqlglot==20.11.0
+ toolz==0.12.1
+ typing-extensions==4.12.2
+ tzdata==2024.2
~/tmp/ibis-read-csv-bug
.venv $ python
Python 3.12.7 (main, Oct 16 2024, 07:12:08) [Clang 18.1.8 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ibis
>>> ibis.__version__
'8.0.0'
>>> ibis.read_csv("data.csv", columns = {"whatever": "INTEGER"})
DatabaseTable: ibis_read_csv_kz4cq22gqnbbha65cs2zejizfa
whatever int32
>>> ^D
~/tmp/ibis-read-csv-bug 25s
.venv $ uv pip install "ibis-framework[duckdb]==9.5.0"
Resolved 20 packages in 12ms
Uninstalled 2 packages in 400ms
Installed 2 packages in 24ms
- ibis-framework==8.0.0
+ ibis-framework==9.5.0
- sqlglot==20.11.0
+ sqlglot==25.20.2
~/tmp/ibis-read-csv-bug
.venv $ python
Python 3.12.7 (main, Oct 16 2024, 07:12:08) [Clang 18.1.8 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ibis
>>> ibis.__version__
'9.5.0'
>>> ibis.read_csv("data.csv", columns = {"whatever": "INTEGER"})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/api.py", line 1437, in read_csv
return con.read_csv(sources, table_name=table_name, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py", line 765, in read_csv
options.append(C.columns.eq(make_struct_argument(columns)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/backends/duckdb/__init__.py", line 752, in make_struct_argument
typ = dt.dtype(typ)
^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/common/dispatch.py", line 140, in call
return impl(arg, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/core.py", line 77, in from_string
return DataType.from_string(value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/core.py", line 172, in from_string
return parse(value)
^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/ibis/expr/datatypes/parse.py", line 210, in parse
return ty.parse(text)
^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/parsy/__init__.py", line 98, in parse
(result, _) = (self << eof).parse_partial(stream)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/bryce/tmp/ibis-read-csv-bug/.venv/lib/python3.12/site-packages/parsy/__init__.py", line 112, in parse_partial
raise ParseError(result.expected, stream, result.furthest)
parsy.ParseError: expected 'EOF' at 0:3
>>> ibis.read_csv("data.csv", columns = {"whatever": "STRING"})
DatabaseTable: ibis_read_csv_33s6j4mt55horeoglmqdct7jma
whatever string
Code of Conduct
I agree to follow this project's Code of Conduct
It looks like this was introduced along with the ability to pass the columns and types arguments in c1dcf67, and the parsing bit hasn't changed since it was merged.
This is good: it means that fixing it won't break any code that wasn't already failing after 8.0.0, because the implementation has been slightly broken ever since it was introduced.
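Until a fix lands, one possible workaround is to translate DuckDB-style type names into Ibis type strings before passing `columns`. This is a minimal sketch: the helper `to_ibis_columns` and the `DUCKDB_TO_IBIS` table are hypothetical, the mapping is deliberately small, and it has not been verified against 9.5.0. The `INTEGER` to `int32` entry follows the 8.0.0 output in the log above.

```python
# Hypothetical mapping from DuckDB-style type names to Ibis type strings.
# Only a few common types are covered; extend as needed.
DUCKDB_TO_IBIS = {
    "INTEGER": "int32",
    "BIGINT": "int64",
    "DOUBLE": "float64",
    "VARCHAR": "string",
    "BOOLEAN": "boolean",
}


def to_ibis_columns(columns: dict[str, str]) -> dict[str, str]:
    """Return a copy of `columns` with known DuckDB names translated.

    Names that are not in the mapping are passed through unchanged.
    """
    return {
        name: DUCKDB_TO_IBIS.get(typ.upper(), typ)
        for name, typ in columns.items()
    }
```

Assuming the parser accepts Ibis type strings, the reduced example would become `ibis.read_csv("data.csv", columns=to_ibis_columns({"whatever": "INTEGER"}))`.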