bug: compilation failure in table expression mixing ibis.row_number(), .isin() and .sample() #8058

Closed
ogrisel opened this issue Jan 22, 2024 · 10 comments
Labels: bug (Incorrect behavior inside of ibis), tes-required-for-release (Things that must be addressed before *release* of `main` after merging in `the-epic-split`)

Comments

@ogrisel
Contributor

ogrisel commented Jan 22, 2024

What happened?

The following snippet, which uses table.sample, raises a low-level SQLAlchemy exception.

import ibis

t = ibis.memtable({"a": range(30)})
t = t.mutate(_id=ibis.row_number())
test = t.sample(fraction=0.25, seed=0)
t = t.mutate(is_test=t._id.isin(test._id))
train = t.filter(~t.is_test)
ibis.to_sql(train)

raises:

Traceback (most recent call last):
  Cell In[2], line 9
    ibis.show_sql(train)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/expr/sql.py:336 in show_sql
    print(to_sql(expr, dialect=dialect), file=file)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/expr/sql.py:410 in to_sql
    sql = backend._to_sql(expr, **kwargs)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/__init__.py:187 in _to_sql
    sql = self.compile(expr, **kwargs).compile(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/__init__.py:395 in compile
    return self.compiler.to_ast_ensure_limit(expr, limit, params=params).compile()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/base.py:37 in compile
    compiled_queries = [q.compile() for q in self.queries]
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/base.py:37 in <listcomp>
    compiled_queries = [q.compile() for q in self.queries]
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:209 in compile
    frag = self._compile_table_set()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:237 in _compile_table_set
    return self.table_set_formatter_class(self, self.table_set).get_result()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:37 in get_result
    self.join_tables.append(self._format_table(op))
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:150 in _format_table
    result = ctx.get_compiled_expr(op)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:74 in get_compiled_expr
    result = self._compile_subquery(node)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:37 in _compile_subquery
    return self._to_sql(op, sub_ctx)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:40 in _to_sql
    return self.compiler.to_sql(expr, ctx)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:460 in to_sql
    return query.compile()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:219 in compile
    frag = step(frag)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:251 in _add_select
    arg = self._translate(op, named=True)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/query_builder.py:239 in _translate
    return translator.get_result()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:224 in get_result
    translated = self.translate(self.node)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:256 in translate
    return formatter(self, op)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/registry.py:228 in _alias
    return t.translate(op.arg)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:256 in translate
    return formatter(self, op)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/registry.py:219 in _in_column
    options = t.translate(ops.TableArrayView(op.options.to_expr().as_table()))
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:256 in translate
    return formatter(self, op)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/registry.py:141 in _table_array_view
    table = ctx.get_compiled_expr(op.table)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:74 in get_compiled_expr
    result = self._compile_subquery(node)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:37 in _compile_subquery
    return self._to_sql(op, sub_ctx)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/compiler/translator.py:40 in _to_sql
    return self.compiler.to_sql(expr, ctx)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:460 in to_sql
    return query.compile()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:209 in compile
    frag = self._compile_table_set()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:237 in _compile_table_set
    return self.table_set_formatter_class(self, self.table_set).get_result()
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:37 in get_result
    self.join_tables.append(self._format_table(op))
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/ibis/backends/base/sql/alchemy/query_builder.py:152 in _format_table
    result = alias if hasattr(alias, "name") else result.alias(alias)
AttributeError: 'str' object has no attribute 'alias'

EDIT: on main the SQL generation works but querying DuckDB fails: #8058 (comment)

The same code with a non-sample-based selection of the test set, such as the following, works as expected:

t = ibis.memtable({"a": range(30)})
t = t.mutate(_id=ibis.row_number())
test = t.filter(t._id < 10)
t = t.mutate(is_test=t._id.isin(test._id))
train = t.filter(~t.is_test)
ibis.to_sql(train)

Alternatively, keeping the call to sample but not using ibis.row_number also works:

t = ibis.memtable({"a": range(30)})
test = t.sample(fraction=0.25, seed=0)
t = t.mutate(is_test=t.a.isin(test.a))
train = t.filter(~t.is_test)
ibis.to_sql(train)
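
For reference, the result the first snippet is meant to produce can be sketched in plain Python. This is only an illustration of the intended semantics, not of how ibis or DuckDB compute it: `random.Random` stands in for the backend's Bernoulli sampler (an assumption), so the rows selected for a given seed will not match DuckDB's.

```python
import random

# Equivalent of ibis.memtable({"a": range(30)}) plus
# mutate(_id=ibis.row_number()): a zero-based row index per row.
rows = [{"a": a, "_id": i} for i, a in enumerate(range(30))]

# Stand-in for t.sample(fraction=0.25, seed=0): Bernoulli sampling keeps
# each row independently with probability 0.25 (hypothetical sampler,
# not DuckDB's).
rng = random.Random(0)
test_ids = {row["_id"] for row in rows if rng.random() < 0.25}

# Equivalent of mutate(is_test=t._id.isin(test._id)) and filter(~t.is_test).
for row in rows:
    row["is_test"] = row["_id"] in test_ids
train = [row for row in rows if not row["is_test"]]

print(len(test_ids), len(train))
```

Whatever the sampler picks, the train and test sets should partition the 30 rows, which is the property the ibis expression is expected to preserve.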

Note: the first snippet was suggested when discussing the possibility to pass a fixed integer seed to ibis.random:

What version of ibis are you using?

  • ibis 7.2.0
  • sqlalchemy 2.0.25

What backend(s) are you using, if any?

DuckDB

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@ogrisel ogrisel added the bug Incorrect behavior inside of ibis label Jan 22, 2024
@ogrisel ogrisel changed the title from "bug: AttributeError: 'str' object has no attribute 'alias' compiling a table expression that involves .isin() and .sample()" to "bug: AttributeError: 'str' object has no attribute 'alias' compiling a table expression that involves ibis.row_number(), .isin() and .sample()" Jan 22, 2024
@lostmygithubaccount
Member

looks like this might already be fixed on main -- I am able to run your first code snippet (changing show_sql to to_sql per a deprecation) and it produces:

WITH t0 AS (
  SELECT
    t2.a AS a,
    ROW_NUMBER() OVER () - 1 AS _id
  FROM ibis_pandas_memtable_w4odh7uo5nesdc3vigve5xmk7u AS t2
)
SELECT
  t1.a,
  t1._id,
  t1.is_test
FROM (
  SELECT
    t0.a AS a,
    t0._id AS _id,
    t0._id IN (
      SELECT
        t2_1._id
      FROM t2 AS t2_1 TABLESAMPLE BERNOULLI (25.0 PERCENT) REPEATABLE (0)
    ) AS is_test
  FROM t0
) AS t1
WHERE
  NOT t1.is_test
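
Note that in the SQL above, the `IN` subquery reads `FROM t2 AS t2_1`, while `t2` is only an alias introduced inside the `t0` CTE. The sketch below reproduces that scoping problem in isolation. It uses Python's built-in `sqlite3` instead of DuckDB, a hypothetical table named `base`, and omits `TABLESAMPLE` (all assumptions made for illustration), so it shows the general SQL scoping rule rather than the exact DuckDB behavior.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE base (a INTEGER)")  # hypothetical stand-in table
con.executemany("INSERT INTO base VALUES (?)", [(i,) for i in range(5)])


def run(sql):
    """Return (rows, None) on success or (None, error message) on failure."""
    try:
        return con.execute(sql).fetchall(), None
    except sqlite3.OperationalError as exc:
        return None, str(exc)


# Same shape as the generated query: the subquery refers to the alias t2,
# which only exists inside the CTE body.
broken_rows, broken_error = run("""
WITH t0 AS (SELECT t2.a AS a FROM base AS t2)
SELECT t0.a FROM t0
WHERE t0.a NOT IN (SELECT t2_1.a FROM t2 AS t2_1)
""")

# Referring to the underlying table by its real name resolves correctly.
fixed_rows, fixed_error = run("""
WITH t0 AS (SELECT t2.a AS a FROM base AS t2)
SELECT t0.a FROM t0
WHERE t0.a NOT IN (SELECT t2_1.a FROM base AS t2_1 WHERE t2_1.a < 2)
ORDER BY t0.a
""")

print(broken_error)
print(fixed_rows)
```

A query can therefore compile to a string without error and still fail at execution time, so running `to_sql` alone does not confirm that the expression is usable.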

@lostmygithubaccount
Member

lostmygithubaccount commented Jan 22, 2024

it does fail on the-epic-split branch with a different error

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[1], line 8
      6 t = t.mutate(is_test=t._id.isin(test._id))
      7 train = t.filter(~t.is_test)
----> 8 ibis.to_sql(train)

File ~/repos/ibis/ibis/expr/sql.py:380, in to_sql(expr, dialect, **kwargs)
    377     else:
    378         read = write = getattr(backend, "_sqlglot_dialect", dialect)
--> 380 sql = backend._to_sql(expr.unbind(), **kwargs)
    381 (pretty,) = sg.transpile(sql, read=read, write=write, pretty=True)
    382 return SQLString(pretty)

File ~/repos/ibis/ibis/backends/base/sqlglot/__init__.py:99, in SQLGlotBackend._to_sql(self, expr, **kwargs)
     98 def _to_sql(self, expr: ir.Expr, **kwargs) -> str:
---> 99     return self.compile(expr, **kwargs)

File ~/repos/ibis/ibis/backends/base/sqlglot/__init__.py:93, in SQLGlotBackend.compile(self, expr, limit, params, **kwargs)
     89 def compile(
     90     self, expr: ir.Expr, limit: str | None = None, params=None, **kwargs: Any
     91 ):
     92     """Compile an Ibis expression to a SQL string."""
---> 93     query = self._to_sqlglot(expr, limit=limit, params=params, **kwargs)
     94     sql = query.sql(dialect=self.compiler.dialect, pretty=True)
     95     self._log(sql)

File ~/repos/ibis/ibis/backends/duckdb/__init__.py:112, in Backend._to_sqlglot(self, expr, limit, params, **_)
    109 def _to_sqlglot(
    110     self, expr: ir.Expr, limit: str | None = None, params=None, **_: Any
    111 ):
--> 112     sql = super()._to_sqlglot(expr, limit=limit, params=params)
    114     table_expr = expr.as_table()
    115     geocols = frozenset(
    116         name for name, typ in table_expr.schema().items() if typ.is_geospatial()
    117     )

File ~/repos/ibis/ibis/backends/base/sqlglot/__init__.py:80, in SQLGlotBackend._to_sqlglot(self, expr, limit, params, **_)
     77 if params is None:
     78     params = {}
---> 80 sql = self.compiler.translate(table_expr.op(), params=params)
     81 assert not isinstance(sql, sge.Subquery)
     83 if isinstance(sql, sge.Table):

File ~/repos/ibis/ibis/backends/base/sqlglot/compiler.py:258, in SQLGlotCompiler.translate(self, op, params)
    246 # substitute parameters immediately to avoid having to define a
    247 # ScalarParameter translation rule
    248 #
    249 # this lets us avoid threading `params` through every `translate_val`
    250 # call only to be used in the one place it would be needed: the
    251 # ScalarParameter `translate_val` rule
    252 params = {
    253     # remove aliases from scalar parameters
    254     param.op().replace(unwrap_scalar_parameter): value
    255     for param, value in (params or {}).items()
    256 }
--> 258 op = op.replace(
    259     replace_scalar_parameter(params) | reduce(operator.or_, self.rewrites)
    260 )
    261 op, ctes = sqlize(op)
    263 aliases = {}

File ~/repos/ibis/ibis/common/graph.py:405, in Node.replace(self, replacer, filter, context)
    383 """Match and replace nodes in the graph according to a given pattern.
    384 
    385 The pattern matching system is used to match nodes in the graph and replace them
   (...)
    402 The root node of the graph with the replaced nodes.
    403 """
    404 replacer = _coerce_replacer(replacer, context)
--> 405 results = self.map(replacer, filter=filter)
    406 return results.get(self, self)

File ~/repos/ibis/ibis/common/graph.py:256, in Node.map(self, fn, filter)
    250 for node in graph:
    251     # minor optimization to directly recurse into the children
    252     kwargs = {
    253         k: _recursive_lookup(v, results)
    254         for k, v in zip(node.__argnames__, node.__args__)
    255     }
--> 256     results[node] = fn(node, results, **kwargs)
    258 return results

File ~/repos/ibis/ibis/common/graph.py:180, in _coerce_replacer.<locals>.fn(node, _, **kwargs)
    173 def fn(node, _, **kwargs):
    174     # need to first reconstruct the node from the possible rewritten
    175     # children, so we can match on the new node containing the rewritten
    176     # child arguments, this way we can propagate the rewritten nodes
    177     # upward in the hierarchy, using a specialized __recreate__ method
    178     # improves the performance by 17% compared node.__class__(**kwargs)
    179     recreated = node.__recreate__(kwargs)
--> 180     if (result := obj.match(recreated, ctx)) is NoMatch:
    181         return recreated
    182     else:

File ~/repos/ibis/ibis/common/patterns.py:909, in AnyOf.match(self, value, context)
    907 def match(self, value, context):
    908     for pattern in self.patterns:
--> 909         result = pattern.match(value, context)
    910         if result is not NoMatch:
    911             return result

File ~/repos/ibis/ibis/common/patterns.py:909, in AnyOf.match(self, value, context)
    907 def match(self, value, context):
    908     for pattern in self.patterns:
--> 909         result = pattern.match(value, context)
    910         if result is not NoMatch:
    911             return result

    [... skipping similar frames: AnyOf.match at line 909 (2 times)]

File ~/repos/ibis/ibis/common/patterns.py:909, in AnyOf.match(self, value, context)
    907 def match(self, value, context):
    908     for pattern in self.patterns:
--> 909         result = pattern.match(value, context)
    910         if result is not NoMatch:
    911             return result

File ~/repos/ibis/ibis/common/patterns.py:375, in Replace.match(self, value, context)
    374 def match(self, value, context):
--> 375     value = self.matcher.match(value, context)
    376     if value is NoMatch:
    377         return NoMatch

File ~/repos/ibis/ibis/common/patterns.py:1287, in Object.match(self, value, context)
   1284 except AttributeError:
   1285     return NoMatch
-> 1287 result = pattern.match(attr, context)
   1288 if result is NoMatch:
   1289     return NoMatch

File ~/repos/ibis/ibis/common/patterns.py:381, in Replace.match(self, value, context)
    378 # use the `_` reserved variable to record the value being replaced
    379 # in the context, so that it can be used in the replacer pattern
    380 context["_"] = value
--> 381 return self.replacer.resolve(context)

File ~/repos/ibis/ibis/common/deferred.py:394, in Call.resolve(self, context)
    392 func = self.func.resolve(context)
    393 args = tuple(arg.resolve(context) for arg in self.args)
--> 394 kwargs = {k: v.resolve(context) for k, v in self.kwargs.items()}
    395 return func(*args, **kwargs)

File ~/repos/ibis/ibis/common/deferred.py:394, in <dictcomp>(.0)
    392 func = self.func.resolve(context)
    393 args = tuple(arg.resolve(context) for arg in self.args)
--> 394 kwargs = {k: v.resolve(context) for k, v in self.kwargs.items()}
    395 return func(*args, **kwargs)

File ~/repos/ibis/ibis/common/deferred.py:513, in Sequence.resolve(self, context)
    512 def resolve(self, context):
--> 513     return self.typ(v.resolve(context) for v in self.values)

File ~/repos/ibis/ibis/common/deferred.py:513, in <genexpr>(.0)
    512 def resolve(self, context):
--> 513     return self.typ(v.resolve(context) for v in self.values)

File ~/repos/ibis/ibis/common/deferred.py:222, in Variable.resolve(self, context)
    221 def resolve(self, context):
--> 222     return context[self.name]

KeyError: 'y'

@lostmygithubaccount lostmygithubaccount added tes-required-for-release Things that must be addressed before *release* of `main` after merging in `the-epic-split` tes-required-for-merge Issues that must addressed before merging the-epic-split branch into main labels Jan 22, 2024
@jcrist
Member

jcrist commented Jan 22, 2024

looks like this might already be fixed on main

Yeah, this was fixed in main by #7961

@cpcloud
Member

cpcloud commented Jan 22, 2024

cc @kszucs Can you take a look at this one for the-epic-split?

@ogrisel
Contributor Author

ogrisel commented Jan 24, 2024

I tried with the main branch. Indeed, to_sql now works, but when I tried to execute the query with DuckDB, I got the following error:

>>> import ibis
... 
... t = ibis.memtable({"a": range(30)})
... t = t.mutate(_id=ibis.row_number())
... test = t.sample(fraction=0.25, seed=0)
... t = t.mutate(is_test=t._id.isin(test._id))
... train = t.filter(~t.is_test)
... train.execute()
Traceback (most recent call last):
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1910 in _execute_context
    self.dialect.do_execute(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/default.py:736 in do_execute
    cursor.execute(statement, parameters)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/duckdb_engine/__init__.py:160 in execute
    self.__c.execute(statement, parameters)
CatalogException: Catalog Error: Table with name t2 does not exist!
Did you mean "temp.information_schema.tables"?

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  Cell In[28], line 8
    train.execute()
  File ~/code/ibis/ibis/expr/types/core.py:322 in execute
    return self._find_backend(use_default=True).execute(
  File ~/code/ibis/ibis/backends/base/sql/__init__.py:342 in execute
    with self._safe_raw_sql(sql, **kwargs) as cursor:
  File ~/miniforge3/envs/dev/lib/python3.11/contextlib.py:137 in __enter__
    return next(self.gen)
  File ~/code/ibis/ibis/backends/base/sql/alchemy/__init__.py:205 in _safe_raw_sql
    yield con.execute(*args, **kwargs)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1385 in execute
    return meth(self, multiparams, params, _EMPTY_EXECUTION_OPTS)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/sql/elements.py:334 in _execute_on_connection
    return connection._execute_clauseelement(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1577 in _execute_clauseelement
    ret = self._execute_context(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1953 in _execute_context
    self._handle_dbapi_exception(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:2134 in _handle_dbapi_exception
    util.raise_(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/util/compat.py:211 in raise_
    raise exception
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/base.py:1910 in _execute_context
    self.dialect.do_execute(
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/sqlalchemy/engine/default.py:736 in do_execute
    cursor.execute(statement, parameters)
  File ~/miniforge3/envs/dev/lib/python3.11/site-packages/duckdb_engine/__init__.py:160 in execute
    self.__c.execute(statement, parameters)
ProgrammingError: (duckdb.duckdb.CatalogException) Catalog Error: Table with name t2 does not exist!
Did you mean "temp.information_schema.tables"?
[SQL: WITH t0 AS 
(SELECT t2.a AS a, row_number() OVER () - ? AS _id 
FROM ibis_pandas_memtable_6rxmv3tjrjauvgll5a6mzbefse AS t2)
 SELECT t1.a, t1._id, t1.is_test 
FROM (SELECT t0.a AS a, t0._id AS _id, t0._id IN (SELECT t2_1._id 
FROM t2 AS t2_1 TABLESAMPLE bernoulli(25.0 PERCENT) REPEATABLE (0)) AS is_test 
FROM t0) AS t1 
WHERE NOT t1.is_test]
[parameters: (1,)]
(Background on this error at: https://sqlalche.me/e/14/f405)
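
The generated SQL above fails because `t2` exists only as an alias inside the body of the `t0` CTE, while the `IN` subquery refers to it as if it were a table. The sketch below reproduces that scoping problem in miniature. It rests on several assumptions: it uses SQLite from the Python standard library rather than DuckDB, the table name `memtable` is made up, the window function is dropped, and a deterministic `% 4` filter stands in for `TABLESAMPLE`. The second query is only one way to make the reference valid, not necessarily the SQL ibis should emit.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE memtable (a INTEGER)")
con.executemany("INSERT INTO memtable VALUES (?)", [(i,) for i in range(30)])

# Same shape as the failing query: `t2` is an alias local to the CTE body,
# yet the IN-subquery names it as a table.
broken = """
WITH t0 AS (
    SELECT t2.a AS a, t2.a AS _id
    FROM memtable AS t2
)
SELECT t0.a, t0._id
FROM t0
WHERE t0._id NOT IN (SELECT t2_1._id FROM t2 AS t2_1)
"""

try:
    con.execute(broken).fetchall()
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# Pointing the subquery at the CTE itself makes the name resolvable.
# The `% 4` filter is a stand-in for the sampling step.
fixed = """
WITH t0 AS (
    SELECT t2.a AS a, t2.a AS _id
    FROM memtable AS t2
)
SELECT t0.a, t0._id
FROM t0
WHERE t0._id NOT IN (SELECT t2_1._id FROM t0 AS t2_1 WHERE t2_1._id % 4 = 0)
"""
rows = con.execute(fixed).fetchall()

print(error)
print(len(rows))
```

The first query is rejected because the subquery cannot see `t2`; the second runs and keeps only the rows whose `_id` is not in the stand-in sample.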

@ogrisel
Contributor Author

ogrisel commented Jan 26, 2024

@cpcloud @lostmygithubaccount in case you had not seen my last comment, this expression still fails on main, even if no longer at compile time.

@jcrist
Member

jcrist commented Jan 26, 2024

Thanks @ogrisel. The part of the compiler this is related to is pretty complex. I'll take a look to see if there's a quick fix we can do on main that can get this to work. After the 8.0 release we're going to merge the-epic-split branch, which is a full rewrite of the SQL compiler - if the fix is more complex I suspect we'll want to just fix it once in the rewrite rather than investing time fixing it on main.

@jcrist
Member

jcrist commented Jan 26, 2024

Ok, after looking at this for a bit I'm going to say this is too hairy a bug to fix in the SQLAlchemy compiler at this time (the issue has to do with aliases being used incorrectly when compiling subqueries, but only sometimes). I suggest we focus on fixing this bug in the-epic-split branch (we also get a failure on that branch, but before compilation completes).

A cleaned up reproducible example:

import ibis
from ibis import _

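t = ibis.memtable({"a": range(30)}).mutate(id=ibis.row_number())
# The sample must be a separate expression so that isin() checks
# membership against the sampled ids rather than against the table itself.
test = t.sample(fraction=0.25, seed=0)
query = t.mutate(is_test=_.id.isin(test.id)).filter(~_.is_test)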

# Generate SQL for duckdb (currently fails here)
print(ibis.to_sql(query))

# Execute on duckdb
query.execute()

@kszucs: this currently fails on TES in one of the patterns with a KeyError, can you take a look?

@jcrist jcrist changed the title bug: AttributeError: 'str' object has no attribute 'alias' compiling a table expression that involves ibis.row_number(), .isin() and .sample() bug: compilation failure in table expression mixing ibis.row_number(), .isin() and .sample() Jan 26, 2024
@ogrisel
Contributor Author

ogrisel commented Jan 26, 2024

Sounds good to me.

@kszucs kszucs self-assigned this Jan 28, 2024
@jcrist jcrist moved this from backlog to review in Ibis planning and roadmap Jan 30, 2024
@jcrist jcrist removed the tes-required-for-merge Issues that must addressed before merging the-epic-split branch into main label Jan 30, 2024
@kszucs
Member

kszucs commented Feb 3, 2024

Resolved by #8124

@kszucs kszucs closed this as completed Feb 3, 2024
@github-project-automation github-project-automation bot moved this from review to done in Ibis planning and roadmap Feb 3, 2024