Releases: ibis-project/ibis
5.0.0
5.0.0 (2023-03-15)
⚠ BREAKING CHANGES
- api: Snowflake identifiers are now kept as is from the database. Many table names and column names may now be in SHOUTING CASE. Adjust code accordingly.
- backend: Backends now raise `ibis.common.exceptions.UnsupportedOperationError` in more places during compilation. You may need to catch this error type instead of the previous type, which differed between backends.
- ux: `Table.info` now returns an expression
- ux: Passing a sequence of column names to `Table.drop` is removed. Replace `drop(cols)` with `drop(*cols)`.
- The `spark` plugin alias is removed. Use `pyspark` instead
- ir: removed `ibis.expr.scope` and `ibis.expr.timecontext` modules, access them under `ibis.backends.base.df.<module>`
- some methods have been removed from the top-level `ibis.<backend>` namespaces, access them on a connected backend instance instead.
- common: removed `ibis.common.geospatial`, import the functions from `ibis.backends.base.sql.registry.geospatial`
- datatypes: `JSON` is no longer a subtype of `String`
- datatype: `Category`, `CategoryValue`/`Column`/`Scalar` are removed. Use string types instead.
- ux: The `metric_name` argument to `value_counts` is removed. Use `Table.relabel` to change the metric column's name.
- deps: the minimum version of `parsy` is now 2.0
- ir/backends: removed the following symbols:
  - `ibis.backends.duckdb.parse_type()` function
  - `ibis.backends.impala.Backend.set_database()` method
  - `ibis.backends.pyspark.Backend.set_database()` method
  - `ibis.backends.impala.ImpalaConnection.ping()` method
  - `ibis.expr.operations.DatabaseTable.change_name()` method
  - `ibis.expr.operations.ParseURL` class
  - `ibis.expr.operations.Value.to_projection()` method
  - `ibis.expr.types.Table.get_column()` method
  - `ibis.expr.types.Table.get_columns()` method
  - `ibis.expr.types.StringValue.parse_url()` method
- schema: `Schema.from_dict()`, `.delete()` and `.append()` methods are removed
- datatype: `struct_type.pairs` is removed, use `struct_type.fields` instead
- datatype: `Struct(names, types)` is not supported anymore, pass a dictionary to `Struct` constructor instead
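
For the `Struct(names, types)` removal, code that kept field names and types in two parallel lists can pair them into a dictionary first. The following is a minimal sketch in plain Python, not ibis code: the field names and type strings are made up for illustration, and whether the `Struct` constructor accepts type names as strings in your ibis version is an assumption to check.

```python
# Hypothetical parallel lists, as they might have been passed to the
# old Struct(names, types) form.
names = ["id", "name", "score"]
types = ["int64", "string", "float64"]

# Pair each field name with its type. Python dicts keep insertion order,
# so the field order of the original lists is preserved.
fields = dict(zip(names, types))
print(fields)

# With ibis installed, this mapping would then be passed to the Struct
# constructor, as the note above describes (assumed usage, not run here):
#   import ibis.expr.datatypes as dt
#   struct_type = dt.Struct(fields)
```

The same pairing works if the second list already holds ibis datatype objects rather than strings.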
Features
- add `max_columns` option for table repr (a3aa236)
- add examples API (b62356e)
- api: add `map`/`array` accessors for easy conversion of JSON to stronger-typed values (d1e9d11)
- api: add array to string join operation (74de349)
- api: add builtin support for relabeling columns to snake case (1157273)
- api: add support for passing a mapping to `ibis.map` (d365fd4)
- api: allow single argument set operations (bb0a6f0)
- api: implement `to_pandas()` API for ecosystem compatibility (cad316c)
- api: implement isin (ac31db2)
- api: make `cache` evaluate only once per session per expression (5a8ffe9)
- api: make create_table uniform (833c698)
- api: more selectors (5844304)
- api: upcast pandas DataFrames to memtables in `rlz.table` rule (8dcfb8d)
- backends: implement `ops.Time` for sqlalchemy backends (713cd33)
- bigquery: add `BIGNUMERIC` type support (5c98ea4)
- bigquery: add UUID literal support (ac47c62)
- bigquery: enable subqueries in select statements (ef4dc86)
- bigquery: implement create and drop table method (5f3c22c)
- bigquery: implement create_view and drop_view method (a586473)
- bigquery: support creating tables from in-memory tables (c3a25f1)
- bigquery: support in-memory tables (37e3279)
- change Rich repr of dtypes from blue to dim (008311f)
- clickhouse: implement `ArrayFilter` translation (f2144b6)
- clickhouse: implement `ops.ArrayMap` (45000e7)
- clickhouse: implement `ops.MapLength` (fc82eaa)
- clickhouse: implement ops.Capitalize (914c64c)
- clickhouse: implement ops.ExtractMillisecond (ee74e3a)
- clickhouse: implement ops.RandomScalar (104aeed)
- clickhouse: implement ops.StringAscii (a507d17)
- clickhouse: implement ops.TimestampFromYMDHMS, ops.DateFromYMD (05f5ae5)
- clickhouse: improve error message for invalid types in literal (e4d7799)
- clickhouse: support asof_join (7ed5143)
- common: add abstract mapping collection with support for set operations (7d4aa0f)
- common: add support for variadic positional and variadic keyword annotations (baea1fa)
- common: hold typehint in the annotation objects (b3601c6)
- common: support `Callable` arguments and return types in `Validator.from_annotable()` (ae57c36)
- common: support positional only and keyword only arguments in annotations (340dca1)
- dask/pandas: raise OperationNotDefinedError exc for not defined operations (2833685)
- datafusion: implement ops.Degrees, ops.Radians (7e61391)
- datafusion: implement ops.Exp (7cb3ade)
- datafusion: implement ops.Pi, ops.E (5a74cb4)
- datafusion: implement ops.RandomScalar (5d1cd0f)
- datafusion: implement ops.StartsWith (8099014)
- datafusion: implement ops.StringAscii (b1d7672)
- datafusion: implement ops.StrRight (016a082)
- datafusion: implement ops.Translate (2fe3fc4)
- datafusion: support substr without end (a19fd87)
- datatype/schema: support datatype and schema declaration using type annotated classes (6722c31)
- datatype: enable inference of `Decimal` type (8761732)
- datatype: implement `Mapping` abstract base class for `StructType` (5df2022)
- deps: add Python 3.11 support and tests ([6f3f759](https://github.com/ibis-project/ibis/commit...
4.1.0
4.1.0 (2023-01-25)
Features
- add `ibis.get_backend` function (2d27df8)
- add py.typed to allow mypy to type check packages that use ibis (765d42e)
- api: add `ibis.set_backend` function (e7fabaf)
- api: add selectors for easier selection of columns (306bc88)
- bigquery: add JS UDF support (e74328b)
- bigquery: add SQL UDF support (db24173)
- bigquery: add to_pyarrow method (30157c5)
- bigquery: implement bitwise operations (55b69b1)
- bigquery: implement ops.Typeof (b219919)
- bigquery: implement ops.ZeroIfNull (f4c5607)
- bigquery: implement struct literal (c5f2a1d)
- clickhouse: properly support native boolean types (31cc7ba)
- common: add support for annotating with coercible types (ae4a415)
- common: make frozendict truly immutable (1c25213)
- common: support annotations with typing.Literal (6f89f0b)
- common: support generic mapping and sequence type annotations (ddc6603)
- dask: support `connect()` with no arguments (67eed42)
- datatype: add optional timestamp scale parameter (a38115a)
- datatypes: add `as_struct` method to convert schemas to structs (64be7b1)
- duckdb: add `read_json` function for consuming newline-delimited JSON files (65e65c1)
- mssql: add a bunch of missing types (c698d35)
- mssql: implement inference for `DATETIME2` and `DATETIMEOFFSET` (aa9f151)
- nicer repr for Backend.tables (0d319ca)
- pandas: support `connect()` with no arguments (78cbbdd)
- polars: allow ibis.polars.connect() to function without any arguments (d653a07)
- polars: handle casting to scaled timestamps (099d1ec)
- postgres: add `Map(string, string)` support via the built-in `HSTORE` extension (f968f8f)
- pyarrow: support conversion to pyarrow map and struct types (54a4557)
- snowflake: add more array operations (8d8bb70)
- snowflake: add more map operations (7ae6e25)
- snowflake: any/all/notany/notall reductions (ba1af5e)
- snowflake: bitwise reductions (5aba997)
- snowflake: date from ymd (035f856)
- snowflake: fix array slicing (bd7af2a)
- snowflake: implement `ArrayCollect` (c425f68)
- snowflake: implement `NthValue` (0dca57c)
- snowflake: implement `ops.Arbitrary` (45f4f05)
- snowflake: implement `ops.StructColumn` (41698ed)
- snowflake: implement `StringSplit` (e6acc09)
- snowflake: implement `StructField` and struct literals (286a5c3)
- snowflake: implement `TimestampFromUNIX` (314637d)
- snowflake: implement `TimestampFromYMDHMS` (1eba8be)
- snowflake: implement `typeof` operation (029499c)
- snowflake: implement exists/not exists (7c8363b)
- snowflake: implement extract millisecond (3292e91)
- snowflake: make literal maps and params work (dd759d3)
- snowflake: regex extract, search and replace (9c82179)
- snowflake: string to timestamp (095ded6)
- sqlite: implement `_get_schema_using_query` in SQLite backend (7ff84c8)
- trino: compile timestamp types with scale (67683d3)
- trino: enable `ops.ExistsSubquery` and `ops.NotExistsSubquery` (9b9b315)
- trino: map parameters (53bd910)
- ux: improve error message when column is not found (b527506)
Bug Fixes
- backend: read the default backend setting in `_default_backend` (11252af)
- bigquery: move connection logic to do_connect (42f2106)
- bigquery: remove invalid operations from registry (911a080)
- bigquery: resolve deprecation warnings for `StructType` and `Schema` (c9e7078)
- clickhouse: fix position call (702de5d)
- correctly visualize array type (26b0b3f)
- deps: make sure pyarrow is not an implicit dependency (10373f4)
- duckdb: make `read_csv` on URLs work (9e61816)
- duckdb: only try to load extensions when necessary for csv (c77bde7)
- duckdb: remove invalid operations from registry (ba2ec59)
- fallback to default backend with `to_pyarrow`/`to_pyarrow_batches` (a1a6902)
- impala: remove broken alias elision (32b120f)
- ir: error for `order_by` on nonexistent column (57b1dd8)
- ir: ops.Where output shape should consider all arguments (...
4.0.0
4.0.0 (2023-01-09)
⚠ BREAKING CHANGES
- functions, methods and classes marked as deprecated are removed now
- ir: replace `HLLCardinality` with `ApproxCountDistinct` and `CMSMedian` with `ApproxMedian` operations.
- backends: the datatype of returned execution results now more closely matches that of the ibis expression's type. Downstream code may need to be adjusted.
- ir: the `JSONB` type is replaced by the `JSON` type.
- dev-deps: expression types have been removed from `ibis.expr.api`. Use `import ibis.expr.types as ir` to access these types.
- common: removed `@immutable_property` decorator, use `@attribute.default` instead
- timestamps: the `timezone` argument to `to_timestamp` is gone. This was only supported in the BigQuery backend. Append `%Z` to the format string and the desired time zone to the input column if necessary.
- deps: ibis now supports at minimum duckdb 0.3.3. Please upgrade your duckdb install as needed.
- api: previously `ibis.connect` would return a `Table` object when calling `connect` on a parquet/csv file. This now returns a backend containing a single table created from that file. When possible users may use `ibis.read` instead to read files into ibis tables.
- api: `histogram()`'s `closed` argument no longer exists because it never had any effect. Remove it from your `histogram` method calls.
- pandas/dask: the Pandas and Dask backends now interpret casting ints to/from timestamps as seconds since the unix epoch, matching other backends.
- datafusion: `register_csv` and `register_parquet` are removed. Pass filename to `register` method instead.
- ir: `ops.NodeList` and `ir.List` are removed. Use tuples to represent sequence of expressions instead.
- api: `re_extract` now follows `re.match` behavior. In particular, the `0`th group is now the entire string if there's a match, otherwise the groups are 1-based.
- datatypes: enums are now strings. Likely no action needed since no functionality existed.
- ir: Replace `t[t.x.topk(...)]` with `t.semi_join(t.x.topk(...), "x")`.
- ir: `ir.Analytic.type()` and `ir.TopK.type()` methods are removed.
- api: the default limit for table/column expressions is now `None` (meaning no limit).
- ir: join changes: previously all column names that collided between `left` and `right` tables were renamed with an appended suffix. Now for the case of inner joins with only equality predicates, colliding columns that are known to be equal due to the join predicates aren't renamed.
- impala: kerberos support is no longer installed by default for the `impala` backend. To add support you'll need to install the `kerberos` package separately.
- ir: `ops.DeferredSortKey` is removed. Use `ops.SortKey` directly instead.
- ir: `ibis.common.grounds.Annotable` is mutable by default now
- ir: `node.has_resolved_name()` is removed, use `isinstance(node, ops.Named)` instead; `node.resolve_name()` is removed, use `node.name` instead
- ir: removed `ops.Node.flat_args()`, directly use `node.args` property instead
- ir: removed `ops.Node.inputs` property, use the multipledispatched `get_node_arguments()` function in the pandas backend
- ir: `Node.blocks()` method has been removed.
- ir: `HasSchema` mixin class is no longer available, directly subclass `ops.TableNode` and implement schema property instead
- ir: Removed `Node.output_type` property in favor of abstractmethod `Node.to_expr()` which now must be explicitly implemented
- ir: `Expr(Op(Expr(Op(Expr(Op)))))` is now represented as `Expr(Op(Op(Op)))`, so code using ibis internals must be migrated
- pandas: Use timezone conversion functions to compute the original machine localized value
- common: use `ibis.common.validators.{Parameter, Signature}` instead
- ir: `ibis.expr.lineage.lineage()` is now removed
- ir: removed `ir.DestructValue`, `ir.DestructScalar` and `ir.DestructColumn`, use `table.unpack()` instead
- ir: removed `Node.root_tables()` method, use `ibis.expr.analysis.find_immediate_parent_tables()` instead
instead - impala: use other methods for pinging the database
Features
- add experimental decorator (791335f)
- add to_pyarrow and to_pyarrow_batches (a059cf9)
- add unbind method to expressions (4b91b0b), closes #4536
- add way to specify sqlglot dialect on backend (f1c0608)
- alchemy: implement json getitem for sqlalchemy backends (7384087)
- api: add `agg` alias for `aggregate` (907583f)
- api: add `agg` alias to `group_by` (6b6367c)
- api: add `ibis.read` top level API function (e67132c)
- api: add JSON `__getitem__` operation (3e2efb4)
- api: implement `__array__` (1402347)
- api: make `drop` variadic (1d69702)
- api: return object from `to_sql` to support notebook syntax highlighting (87c9833)
- api: use `rich` for interactive `__repr__` (04758b8)
- backend: make `ArrayCollect` filterable (1e1a5cf)
- backends/mssql: add backend support for Microsoft Sql Server (fc39323)
- bigquery: add ops.DateFromYMD, ops.TimeFromHMS, ops.TimestampFromYMDHMS (a4a7936)
- bigquery: add ops.ExtractDayOfYear (30c547a)
- bigquery: add support for correlation (4df9f8b)
- bigquery: implement `argmin` and `argmax` (40c5f0d)
- bigquery: implement `pi` and `e` (b91370a)
- bigquery: implement array repeat (09d1e2f)
- bigquery: implement JSON getitem functionality (9c0e775)
- bigquery: implement ops.ArraySlice (49414ef)
- bigquery: implement ops.Capitalize (5757bb0)
- bigquery: implement ops.Clip (5495d6d)
- bigquery: implement ops.Degrees, ops.Radians (5119b93)
- bigquery: implement ops.ExtractWeekOfYear (477d287)
- bigquery: implement ops.RandomScalar (5dc8482)
- bigquery: implement ops.StructColumn, ops.ArrayColumn (2bbf73c)
- bigquery: implement ops.Translate (77a4b3e)
- bigquery: implement ops.NthValue (b43ba28)
- bigquery: move bigquery backend back into the main repo (cd5e881)
- clickhouse: handle more options in `parse_url` implementation (874c5c0)
- clickhouse: implement `INTERSECT ALL`/`EXCEPT ALL` (f65fbc3)
- clickhouse: implement quantile/multiquantile (96d7d1b)
- common: support function annotations with both typehints and rules (7e23f3e)
- dask: implement `mode` aggregation (017f07a)
- dask: implement json getitem (381d805)
- datafusion: convert column expressions to...
3.2.0
3.2.0 (2022-09-15)
Features
- add api to get backend entry points (0152f5e)
- api: add `and_` and `or_` helpers (94bd4df)
- api: add `argmax` and `argmin` column methods (b52216a)
- api: add `distinct` to `Intersection` and `Difference` operations (cd9a34c)
- api: add `ibis.memtable` API for constructing in-memory table expressions (0cc6948)
- api: add `ibis.sql` to easily get a formatted SQL string (d971cc3)
- api: add `Table.unpack()` and `StructValue.lift()` APIs for projecting struct fields (ced5f53)
- api: allow transmute-style select method (d5fc364)
- api: implement all bitwise operators (7fc5073)
- api: promote `psql` to a `show_sql` public API (877a05d)
- clickhouse: add dataframe external table support for memtables (bc86aa7)
- clickhouse: add enum, ipaddr, json, lowcardinality to type parser (8f0287f)
- clickhouse: enable support for working window functions (310a5a8)
- clickhouse: implement `argmin` and `argmax` (ee7c878)
- clickhouse: implement bitwise operations (348cd08)
- clickhouse: implement struct scalars (1f3efe9)
- dask: implement `StringReplace` execution (1389f4b)
- dask: implement ungrouped `argmin` and `argmax` (854aea7)
- deps: support duckdb 0.5.0 (47165b2)
- duckdb: handle query parameters in `ibis.connect` (fbde95d)
- duckdb: implement `argmin` and `argmax` (abf03f1)
- duckdb: implement bitwise xor (ca3abed)
- duckdb: register tables from pandas/pyarrow objects (36e48cc)
- duckdb: support unsigned integer types (2e67918)
- impala: implement bitwise operations (c5302ab)
- implement dropna for SQL backends (8a747fb)
- log: make BaseSQLBackend._log print by default (12de5bb)
- mysql: register BLOB types (1e4fb92)
- pandas: implement `argmin` and `argmax` (bf9b948)
- pandas: implement `NotContains` on grouped data (976dce7)
- pandas: implement `StringReplace` execution (578795f)
- pandas: implement Contains with a group by (c534848)
- postgres: implement bitwise xor (9b1ebf5)
- pyspark: add option to treat nan as null in aggregations (bf47250)
- pyspark: implement `ibis.connect` for pyspark (a191744)
- pyspark: implement `Intersection` and `Difference` (9845a3c)
- pyspark: implement bitwise operators (33cadb1)
- sqlalchemy: implement bitwise operator translation (bd9f64c)
- sqlalchemy: make `ibis.connect` with sqlalchemy backends (b6cefb9)
- sqlalchemy: properly implement `Intersection` and `Difference` (2bc0b69)
- sql: implement `StringReplace` translation (29daa32)
- sqlite: implement bitwise xor and bitwise not (58c42f9)
- support `table.sort_by(ibis.random())` (693005d)
- type-system: infer pandas' string dtype (5f0eb5d)
- ux: add duckdb as the default backend (8ccb81d)
- ux: use `rich` to format `Table.info()` output (67234c3)
- ux: use `sqlglot` for pretty printing SQL (a3c81c5)
- variadic union, intersect, & difference functions (05aca5a)
Bug Fixes
- api: make sure column names that are already inferred are not overwritten (6f1cb16)
- api: support deferred objects in existing API functions (241ce6a)
- backend: ensure that chained limits respect prior limits (02a04f5)
- backends: ensure select after filter works (e58ca73)
- backends: only recommend installing ibis-foo when foo is a known backend (ac6974a)
- base-sql: fix String-generating backend string concat implementation (3cf78c1)
- clickhouse: add IPv4/IPv6 literal inference (0a2f315)
- clickhouse: cast repeat `times` argument to `UInt64` (b643544)
- clickhouse: fix listing tables from databases with no tables (08900c3)
- compilers: make sure memtable rows have names in the SQL string compilers (18e7f95)
- compiler: use `repr` for SQL string `VALUES` data (75af658)
- dask: ensure predicates are computed before projections (5cd70e1)
- dask: implement timestamp-date binary comparisons (48d5058)
- dask: set dask upper bound due to large scale test breakage (796c645), closes #9221
- decimal: add decimal type inference (3fe3fd8)
- deps: update dependency duckdb-engine to >=0.1.8,<0.4.0 (113dc8f)
- deps: update dependency duckdb-engine t...
3.1.0
3.1.0 (2022-07-26)
Features
- add `__getattr__` support to `StructValue` (75bded1)
- allow selection subclasses to define new node args (2a7dc41)
- api: accept `Schema` objects in public `ibis.schema` (0daac6c)
- api: add `.tables` accessor to `BaseBackend` (7ad27f0)
- api: add `e` function to public API (3a07e70)
- api: add `ops.StructColumn` operation (020bfdb)
- api: add cume_dist operation (6b6b185)
- api: add toplevel ibis.connect() (e13946b)
- api: handle literal timestamps with timezone embedded in string (1ae976b)
- api: ibis.connect() default to duckdb for parquet/csv extensions (ff2f088)
- api: make struct metadata more convenient to access (3fd9bd8)
- api: support tab completion for backends (eb75fc5)
- api: underscore convenience api (81716da)
- api: unnest (98ecb09)
- backends: allow column expressions from non-foreign tables on the right side of `isin`/`notin` (e1374a4)
- base-sql: implement trig and math functions (addb2c1)
- clickhouse: add ability to pass arbitrary kwargs to Clickhouse do_connect (583f599)
- clickhouse: implement `ops.StructColumn` operation (0063007)
- clickhouse: implement array collect (8b2577d)
- clickhouse: implement ArrayColumn (1301f18)
- clickhouse: implement bit aggs (f94a5d2)
- clickhouse: implement clip (12dfe50)
- clickhouse: implement covariance and correlation (a37c155)
- clickhouse: implement degrees (7946c0f)
- clickhouse: implement proper type serialization (80f4ab9)
- clickhouse: implement radians (c7b7f08)
- clickhouse: implement strftime (222f2b5)
- clickhouse: implement struct field access (fff69f3)
- clickhouse: implement trig and math functions (c56440a)
- clickhouse: support subsecond timestamp literals (e8698a6)
- compiler: restore `intersect_class` and `difference_class` overrides in base SQL backend (2c46a15)
- dask: implement trig functions (e4086bb)
- dask: implement zeroifnull (38487db)
- datafusion: implement negate (69dd64d)
- datafusion: implement trig functions (16803e1)
- duckdb: add register method to duckdb backend to load parquet and csv files (4ccc6fc)
- duckdb: enable find_in_set test (377023d)
- duckdb: enable group_concat test (4b9ad6c)
- duckdb: implement `ops.StructColumn` operation (211bfab)
- duckdb: implement approx_count_distinct (03c89ad)
- duckdb: implement approx_median (894ce90)
- duckdb: implement arbitrary first and last aggregation (8a500bc)
- duckdb: implement NthValue (1bf2842)
- duckdb: implement strftime (aebc252)
- duckdb: return the `ir.Table` instance from DuckDB's `register` API (0d05d41)
- mysql: implement FindInSet (e55bbbf)
- mysql: implement StringToTimestamp (169250f)
- pandas: implement bitwise aggregations (37ff328)
- pandas: implement degrees (25b4f69)
- pandas: implement radians (6816b75)
- pandas: implement trig functions (1fd52d2)
- pandas: implement zeroifnull (48e8ed1)
- postgres/duckdb: implement covariance and correlation (464d3ef)
- postgres: implement ArrayColumn (7b0a506)
- pyspark: implement approx_count_distinct (1fe1d75)
- pyspark: implement approx_median (07571a9)
- pyspark: implement covariance and correlation (ae818fb)
- pyspark: implement degrees (f478c7c)
- pyspark: implement nth_value (abb559d)
- pyspark: implement nullifzero (640234b)
- pyspark: implement radians (18843c0)
- pyspark: implement trig functions (fd7621a)
- pyspark: implement Where (32b9abb)
- pyspark: implement xor (550b35b)
- pyspark: implement zeroifnull (db13241)
- pyspark: topk support (9344591)
- sqlalchemy: add degrees and radians (8b7415f)
- sqlalchemy: add xor translation rule (2921664)
- sqlalchemy: allow non-primitive arrays ([4e02918](4e02918...
3.0.2
3.0.1
3.0.0
3.0.0 (2022-04-25)
⚠ BREAKING CHANGES
- ir: The following are breaking changes due to simplifying expression internals
  - `ibis.expr.datatypes.DataType.scalar_type` and `DataType.column_type` factory methods have been removed, `DataType.scalar` and `DataType.column` class fields can be used to directly construct a corresponding expression instance (though prefer to use `operation.to_expr()`)
  - `ibis.expr.types.ValueExpr._name` and `ValueExpr._dtype` fields are not accessible anymore. While these were not supposed to be used directly, the `ValueExpr.has_name()`, `ValueExpr.get_name()` and `ValueExpr.type()` methods are now the only way to retrieve the expression's name and datatype.
  - `ibis.expr.operations.Node.output_type` is now a property, not a method; decorate those methods with `@property`
  - `ibis.expr.operations.ValueOp` subclasses must define `output_shape` and `output_dtype` properties from now on (note the datatype abbreviation `dtype` in the property name)
  - `ibis.expr.rules.cast()`, `scalar_like()` and `array_like()` rules have been removed
- api: Replace `t["a"].distinct()` with `t[["a"]].distinct()`.
- deps: The sqlalchemy lower bound is now 1.4
- ir: Schema.names and Schema.types attributes now have tuple type rather than list
- expr: Columns that were added or used in an aggregation or mutation would be alphabetically sorted in compiled SQL outputs. This was a vestige from when Python dicts didn't preserve insertion order. Now columns will appear in the order in which they were passed to `aggregate` or `mutate`
- api: `dt.float` is now `dt.float64`; use `dt.float32` for the previous behavior.
- ir: Relation-based `execute_node` dispatch rules must now accept tuples of expressions.
- ir: removed ibis.expr.lineage.{roots,find_nodes} functions
- config: Use `ibis.options.graphviz_repr = True` to enable
- hdfs: Use `fsspec` instead of HDFS from ibis
- udf: Vectorized UDF coercion functions are no longer a public API.
- The minimum supported Python version is now Python 3.8
- config: `register_option` is no longer supported, please submit option requests upstream
- backends: Read tables with pandas.read_hdf and use the pandas backend
- The CSV backend is removed. Use Datafusion for CSV execution.
- backends: Use the datafusion backend to read parquet files
- `Expr() -> Expr.pipe()`
- coercion functions previously in expr/schema.py are now in udf/vectorized.py
- api: `materialize` is removed. Joins with overlapping columns now have suffixes.
- kudu: use impala instead: https://kudu.apache.org/docs/kudu_impala_integration.html
- Any code that was relying implicitly on string-y behavior from UUID datatypes will need to add an explicit cast first.
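
The column-ordering change for `aggregate` and `mutate` can be pictured with plain Python dictionaries. This is a minimal sketch, not ibis code: the column names and expression strings are hypothetical, and sorting the keys only stands in for the alphabetical ordering described in the note above.

```python
# Hypothetical named expressions, in the order a caller might pass them
# to aggregate or mutate.
exprs = {"total": "sum(x)", "count": "count(x)", "avg": "mean(x)"}

# Before 3.0.0 (as described above): columns came out alphabetically sorted.
old_order = sorted(exprs)

# From 3.0.0: columns keep the order in which they were passed.
new_order = list(exprs)

print(old_order)  # ['avg', 'count', 'total']
print(new_order)  # ['total', 'count', 'avg']
```

Downstream code that relied on positional column access after an aggregation may need to select columns by name instead.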
Features
- add repr_html for expressions to print as tables in ipython (cd6fa4e)
- add duckdb backend (667f2d5)
- allow construction of decimal literals (3d9e865)
- api: add `ibis.asc` expression (efe177e), closes #1454
- api: add has_operation API to the backend (4fab014)
- api: implement type for SortExpr (ab19bd6)
- clickhouse: implement string concat for clickhouse (1767205)
- clickhouse: implement StrRight operation (67749a0)
- clickhouse: implement table union (e0008d7)
- clickhouse: implement trim, pad and string predicates (a5b7293)
- datafusion: implement Count operation (4797a86)
- datatypes: unbounded decimal type (f7e6f65)
- date: add ibis.date(y,m,d) functionality (26892b6), closes #386
- duckdb/postgres/mysql/pyspark: implement `.sql` on tables for mixing sql and expressions (00e8087)
- duckdb: add functionality needed to pass integer to interval test (e2119e8)
- duckdb: implement _get_schema_using_query (93cd730)
- duckdb: implement now() function (6924f50)
- duckdb: implement regexp replace and extract (18d16a7)
- implement `force` argument in sqlalchemy backend base class (9df7f1b)
- implement coalesce for the pyspark backend (8183efe)
- implement semi/anti join for the pandas backend (cb36fc5)
- implement semi/anti join for the pyspark backend (3e1ba9c)
- implement the remaining clickhouse joins (b3aa1f0)
- ir: rewrite and speed up expression repr (45ce9b2)
- mysql: implement _get_schema_from_query (456cd44)
- mysql: move string join impl up to alchemy for mysql (77a8eb9)
- postgres: implement _get_schema_using_query (f2459eb)
- pyspark: implement Distinct for pyspark (4306ad9)
- pyspark: implement log base b for pyspark (527af3c)
- pyspark: implement percent_rank and enable testing (c051617)
- repr: add interval info to interval repr (df26231)
- sqlalchemy: implement ilike (43996c0)
- sqlite: implement date_truncate (3ce4f2a)
- sqlite: implement ISO week of year (714ff7b)
- sqlite: implement string join and concat (6f5f353)
- support of arrays and tuples for clickhouse (db512a8)
- ver: dynamic version identifiers (408f862)
Bug Fixes
- added wheel to pyproject toml for venv users (b0b8e5c)
- allow major version changes in CalVer dependencies (9c3fbe5)
- annotable: allow optional arguments at any position (778995f), closes #3730
- api: add ibis.map and .struct (327b342), closes #3118
- api: map string multiplication with integer to repeat method (b205922)
- api: thread suffixes parameter to individual join methods (31a9aff)
- change TimestampType to Timestamp (e0750be)
- clickhouse: disconnect from clickhouse when computing version (11cbf08)
- clickhouse: use a context manager for execution ([a471225](https://github.com/ibis-project/ibi...
2.1.1
2.1.0
2.1.0 (2022-01-12)
Bug Fixes
- consider all packages' entry points (b495cf6)
- datatypes: infer bytes literal as binary #2915 (#3124) (887efbd)
- deps: bump minimum dask version to 2021.10.0 (e6b5c09)
- deps: constrain numpy to ensure wheels are used on windows (70c308b)
- deps: update dependency clickhouse-driver to ^0.1 || ^0.2.0 (#3061) (a839d54)
- deps: update dependency geoalchemy2 to >=0.6,<0.11 (4cede9d)
- deps: update dependency pyarrow to v6 (#3092) (61e52b5)
- don't force backends to override do_connect until 3.0.0 (4b46973)
- execute materialized joins in the pandas and dask backends (#3086) (9ed937a)
- literal: allow creating ibis literal with uuid (#3131) (b0f4f44)
- restore the ability to have more than two option levels (#3151) (fb4a944)
- sqlalchemy: fix correlated subquery compilation (43b9010)
- sqlite: defer db connection until needed (#3127) (5467afa), closes #64
Features
- allow column_of to take a column expression (dbc34bb)
- ci: More readable workflow job titles (#3111) (d8fd7d9)
- datafusion: initial implementation for Arrow Datafusion backend (3a67840), closes #2627
- datafusion: initial implementation for Arrow Datafusion backend (75876d9), closes #2627
- make dayofweek impls conform to pandas semantics (#3161) (9297828)
Reverts
- "ci: install gdal for fiona" (8503361)