Releases: aws/aws-sdk-pandas
AWS Data Wrangler 2.0.0
Breaking changes
- sqlalchemy and psycopg2 dependencies were replaced by redshift_connector and pg8000
- All wr.db.* functions were distributed into wr.redshift.*, wr.postgresql.* and wr.mysql.* (Tutorial)
- Redshift COPY and UNLOAD functions were refactored into wr.redshift.* (Tutorial)
- wr.catalog.get_engine() was replaced by wr.redshift.connect(), wr.postgresql.connect() and wr.mysql.connect() (Tutorial)
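As a rough migration sketch for the breaking changes above, the snippet below shows a Redshift read before and after 2.0.0. The function name read_from_redshift and its glue_connection parameter are hypothetical, and the code assumes a Glue Catalog connection with that name already exists. The awswrangler import is placed inside the function only so the sketch can be loaded without the library installed.

```python
def read_from_redshift(sql: str, glue_connection: str):
    """Run `sql` on Redshift through a Glue Catalog connection and return a DataFrame.

    `glue_connection` is a hypothetical Glue Catalog connection name.
    """
    # Imported lazily so this sketch can be loaded without awswrangler installed.
    import awswrangler as wr

    # Before 2.0.0 (SQLAlchemy engine):
    #   engine = wr.catalog.get_engine(connection=glue_connection)
    #   df = wr.db.read_sql_query(sql, con=engine)

    # From 2.0.0 (redshift_connector connection):
    con = wr.redshift.connect(connection=glue_connection)
    try:
        return wr.redshift.read_sql_query(sql, con=con)
    finally:
        # The connection is a plain DB-API connection, so close it explicitly.
        con.close()
```

The same pattern should apply to wr.postgresql.connect() and wr.mysql.connect(), each paired with the read and write functions of its own module.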
New Functionalities
Enhancements
- Improved general S3 I/O performance by removing eventual consistency guardrails (Reference)
- Add retry with decorrelated jitter for Athena and Glue Catalog calls to overcome throttling in high concurrency scenarios.
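For readers unfamiliar with the technique, here is a minimal, standalone sketch of retrying with decorrelated jitter. It is not the library's internal implementation: ThrottlingError, call_with_retry, and the default base, cap and max_attempts values are all assumptions made for illustration. Each delay is drawn uniformly between the base and three times the previous delay, then capped.

```python
import random
import time
from typing import Callable, Iterator, Optional, Tuple, Type, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Hypothetical stand-in for a service throttling error."""


def decorrelated_jitter_delays(
    base: float, cap: float, attempts: int, rng: random.Random
) -> Iterator[float]:
    """Yield `attempts` delays, each drawn from [base, 3 * previous] and capped at `cap`."""
    delay = base
    for _ in range(attempts):
        delay = min(cap, rng.uniform(base, delay * 3))
        yield delay


def call_with_retry(
    func: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...] = (ThrottlingError,),
    base: float = 0.1,
    cap: float = 10.0,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> T:
    """Call `func`, sleeping with decorrelated jitter between failed attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    rng = rng or random.Random()
    delays = decorrelated_jitter_delays(base, cap, max_attempts - 1, rng)
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(next(delays))
    raise RuntimeError("unreachable")
```

Because each delay depends on the previous one rather than on the attempt number, concurrent clients that were throttled at the same moment tend to spread out instead of retrying in lockstep.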
Docs
- Updates regarding all new functionalities
- Add Amazon Timestream tutorial
- Add Amazon Timestream tutorial 2
AWS re:Invent related news
- AWS Lambda now supports up to 10 GB of memory and 6 vCPU cores
- Amazon S3 now delivers strong read-after-write consistency
- AWS Lambda now supports container images as a packaging format
- Serverless Batch Scheduling with AWS Batch and AWS Fargate
Thanks
We thank the following contributors/users for their work on this release:
@Brooke-white, @danielwo, @sapientderek, @pmleveque, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.10.1
New Functionalities
Enhancements
Bug Fix
- Fix Athena read with ctas_approach=False and chunksize=True #458
- Fix overwriting of non-enforced configs #450
Docs
Thanks
We thank the following contributors/users for their work on this release:
@tuannguyen0901, @bryanyang0528, @czagoni, @jesusch, @danielwo, @DonghanYang, @eric-valente, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.10.0
New Functionalities
- Add configurable Endpoint URL for AWS services #418
- Add global environment configuration for Athena workgroups #437
Enhancements
- Support for Apache Arrow 2.0.0 #436
- Allow Decimal to float casting for wr.db.read_sql_query() #431
- Allow unsafe conversions for wr.db.read_sql_query() #427
Bug Fix
- QuickSight functions now allow usernames with "/" #434
- Fix duplicated carriage return for wr.s3.to_csv() when running on Windows.
Thanks
We thank the following contributors/users for their work on this release:
@martinSpears-ECS, @imanebosch, @Eric-He-98, @brombach, @Thomas-Hirsch, @vuchetichbalint, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.6
Enhancements
- Add encrypted Glue connection management #413
Bug Fix
- Double carriage return when using \r\n as line terminator (s3.to_csv()) #415
- s3.read_parquet failing with some timezone-aware columns #417
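To illustrate how a double carriage return like the one in #415 can arise in general, the sketch below writes CSV text that already ends in \r\n through a text stream that also translates \n on output. Whether this is the exact cause inside s3.to_csv() is an assumption; the helper write_text is hypothetical and uses only the standard library.

```python
import io


def write_text(payload: str, newline: str) -> bytes:
    """Write `payload` through a text wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(payload)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep `raw` from being closed together with the wrapper
    return data


csv_text = "a,b\r\n1,2\r\n"

# newline="\r\n" mimics Windows text mode: every "\n" written becomes "\r\n",
# so a terminator that is already "\r\n" turns into "\r\r\n".
translated = write_text(csv_text, newline="\r\n")

# newline="" disables translation, so the terminator is written as given.
untouched = write_text(csv_text, newline="")

print(translated)
print(untouched)
```

Opening the output with newline="" (or writing bytes directly) is the usual way to keep an explicit line terminator intact across platforms.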
Thanks
We thank the following contributors/users for their work on this release:
@jeanbaptistepriez, @mike-at-upside, @Thiago-Dantas, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.5
Enhancements
Bug Fix
- [Parquet Read] Fix index recovery combined with columns filter #408
Docs
- Handling and documenting ctas_approach for custom data sources #392
Thanks
We thank the following contributors/users for their work on this release:
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.4
Enhancements
- Add s3_additional_kwargs for wr.s3.copy_objects() and wr.s3.merge_datasets() #388
- Add data_source argument for Athena queries #392
- Handling Parquet tinyint columns on Redshift loads #400
Bug Fix
- Fix issue with Hive partitions compatibility. #397
- Fix missing catalog_id arguments in partitioned wr.s3.to_parquet() calls #399
- Remove adaptive retry for boto3 resource. #403
Docs
- A few updates.
Thanks
We thank the following contributors/users for their work on this release:
@timgates42, @bvsubhash, @DonghanYang, @sl-antoinelaborde, @Xiangyu-C, @tuannguyen0901, @JPFrancoia, @sapientderek, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.3
Bug Fix
- Fix bug for wr.s3.read_parquet() with timezone offset. #385
Thanks
We thank the following contributors/users for their work on this release:
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.2
Bug Fix
Thanks
We thank the following contributors/users for their work on this release:
@tasq-inc, @chrisrana, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.1
Enhancements
- Significant Amazon S3 I/O speed up for big files #377
- Create Parquet Datasets with columns with CamelCase names #380
Bug Fix
- Read Parquet error for some files created by DMS #376
Docs
- A few updates.
Thanks
We thank the following contributors/users for their work on this release:
@jarretg, @chrisrana, @vikramshitole, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!
AWS Data Wrangler 1.9.0
Breaking changes
- Global configuration s3fs_block_size was replaced by s3_block_size #370
New Functionalities
- Automatic recovery of Pandas indexes from Parquet files. #366
- Automatic recovery of Pandas time zones from Parquet files. #366
- Optional schema evolution disabling through the new schema_evolution argument. #353
Enhancements
- s3fs dependency was replaced by built-in code. #370
- Significant Amazon S3 I/O speed up for high latency environments (e.g. local, on-premises). #370
Bug Fix
Docs
- A few updates.
Thanks
We thank the following contributors/users for their work on this release:
@isrsal, @bppont, @weishao-aws, @alexifm, @Digma, @samcon, @TerrellV, @msantino, @alvaropc, @luigift, @igorborgest.
P.S. Lambda Layer zip file and Glue wheel/egg files are available below. Just upload it and run!