Option to disable hive partitioning wild cards #232

Open
@niydt

Description

The Avro files we are trying to load into Redshift are stored in folders with "=" in their names, e.g.

    event_type=users.behaviors.app.FirstSession/

When loading data from the following S3 prefix:

    com.hoopladigital.brazecurrentsstaging/StagingCurrentFull/dataexport.prod-03.S3.integration.60d3692fcab9ca5f83919aab/event_type%3Dusers.behaviors.app.FirstSession

the lambda failed with this error:

    error: No Configuration Found for com.hoopladigital.brazecurrentsstaging/StagingCurrentFull/dataexport.prod-03.S3.integration.60d3692fcab9ca5f83919aab/event_type=*/date=*/399/prod-03

As the error message shows, the "event_type=*/date=*" portion of the path was rewritten as though we were using the Hive partitioning-style wildcards (https://github.com/awslabs/aws-lambda-redshift-loader#hive-partitioning-style-wildcards): the event_type value was replaced with *.

We don't want to use this feature. I need the lambda to use the exact folder name that I provided in the prefix. Is there a way to configure the lambda not to use Hive partitioning wildcards?

Line 1584 of index.js:

    inputInfo.prefix = inputInfo.bucket + '/' + searchKey.transformHiveStylePrefix();

Line 78 of index.js:

    transformHiveStylePrefix()
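
To make the request concrete, here is a rough sketch of the behavior I am seeing and the kind of opt-out I am asking for. This is not the loader's actual code: the standalone transformHiveStylePrefix function below is my own approximation of what the method at line 78 appears to do based on the error message, and resolvePrefix and the hiveWildcards option are hypothetical names that do not exist in the project today. The bucket and key in the usage example are shortened placeholders.

```javascript
// Approximation (assumed, not copied from index.js): replace the value of
// every "name=value" path segment with "*", leaving other segments as-is.
function transformHiveStylePrefix(prefix) {
  return prefix
    .split("/")
    .map((segment) => {
      const eq = segment.indexOf("=");
      // Only rewrite segments that have a non-empty name before "=".
      return eq > 0 ? segment.slice(0, eq + 1) + "*" : segment;
    })
    .join("/");
}

// Hypothetical opt-out: "hiveWildcards" is an invented option name.
// It defaults to true so that existing behavior would be unchanged.
function resolvePrefix(bucket, searchKey, options = {}) {
  const useWildcards = options.hiveWildcards !== false;
  const keyPart = useWildcards ? transformHiveStylePrefix(searchKey) : searchKey;
  return bucket + "/" + keyPart;
}

// Usage with placeholder values:
const key = "export/event_type=users.behaviors.app.FirstSession/date=2021-06-23/399/prod-03";
console.log(resolvePrefix("my-bucket", key));
console.log(resolvePrefix("my-bucket", key, { hiveWildcards: false }));
```

With the flag off, the configuration lookup would use the literal folder names, so a configuration stored under the exact "event_type=users.behaviors.app.FirstSession" prefix could be found.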
