AWS Data Wrangler 1.1.0
New Functionalities
- Support for nested arrays and structs on `wr.s3.to_parquet()` #206
- Support for reading Parquet/Athena/Redshift results chunked by number of rows #192 (see the sketch after this list)
- Add `custom_classifications` to `wr.emr.create_cluster()` #193
- Support for Docker on EMR #193
- Add `kms_key_id`, `max_file_size`, and `region` arguments to `wr.db.unload_redshift()` #197
- Add `catalog_versioning` argument to `wr.s3.to_csv()` and `wr.s3.to_parquet()` #198
- Add `keep_files` and `ctas_temp_table_name` arguments to `wr.athena.read_sql_*()` #203
- Add `replace_filenames` argument to `wr.s3.copy_objects()` #215
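The chunked reads and the new `catalog_versioning` flag are the most visible additions, so here is a minimal usage sketch. Bucket, database, and table names are placeholders, and the described effect of `catalog_versioning` (keeping previous Glue table versions when overwriting) is my reading of the feature, not spelled out in these notes.

```python
import awswrangler as wr
import pandas as pd

# Chunked read (#192): passing an integer chunksize yields DataFrames of at
# most that many rows instead of materializing the whole result at once.
# "my_db" and the query are placeholders.
for df_chunk in wr.athena.read_sql_query(
    sql="SELECT * FROM my_table",
    database="my_db",
    chunksize=100_000,
):
    print(len(df_chunk))

# Catalog versioning (#198): assumption - with catalog_versioning=True an
# overwrite keeps previous versions of the Glue Catalog table definition.
# Path, database, and table names are placeholders.
wr.s3.to_parquet(
    df=pd.DataFrame({"id": [1, 2], "value": ["a", "b"]}),
    path="s3://my-bucket/my_table/",
    dataset=True,
    mode="overwrite",
    database="my_db",
    table="my_table",
    catalog_versioning=True,
)
```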
Enhancements
- `wr.s3.to_csv()` and `wr.s3.to_parquet()` no longer need the delete table permission to overwrite a catalog table #198
- Added support for UUID on `wr.db.read_sql_query()` (PostgreSQL) #200 (see the sketch after this list)
- Refactoring of Athena encryption and workgroup support #212
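A minimal sketch of the UUID enhancement, assuming a PostgreSQL table with a `uuid` column and a SQLAlchemy engine built from a placeholder connection string; how UUID values end up typed in the resulting DataFrame is not specified in these notes.

```python
import awswrangler as wr
import sqlalchemy

# Placeholder connection string; any SQLAlchemy PostgreSQL engine should work.
engine = sqlalchemy.create_engine("postgresql+psycopg2://user:pass@host:5432/my_db")

# With #200, querying a table that contains a uuid column no longer trips up
# the type conversion. "my_schema.my_table" and "id" are placeholders.
df = wr.db.read_sql_query(
    sql="SELECT id, payload FROM my_schema.my_table",
    con=engine,
)
print(df.dtypes)
```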
Bug Fix
- Support for reading fully NULL columns from PostgreSQL, MySQL, and Redshift #218
Thanks
We thank the following contributors/users for their work on this release:
@robkano, @luigift, @parasml, @OElesin, @jar-no1, @keatmin, @pmleveque, @sapientderek, @jadayn, @igorborgest.
P.S. The Lambda Layer zip file and the Glue wheel/egg are available below. Just upload them and run!
P.P.S. AWS Data Wrangler relies on compiled dependencies (C/C++), so there is no support for Glue PySpark for now (only Glue Python Shell).