Convert tab-delimited files to CSV format
Karen is a data analyst with a wide range of data files stored on Amazon S3 in different formats. Since some files are tab-delimited and others are CSV, she's constantly updating her scripts to handle different formats. She wants to convert all existing tab-delimited files to a CSV format and set up a process to automatically convert newly added files to a CSV format going forward.
Create a pipe to do batch processing of tab-delimited files. Select a folder in Amazon S3 that contains the source files and a separate folder in S3 as the destination. Add a file selection step that filters input files to only include the ones in a tab-delimited format. Add a convert step to change the file format to CSV, and output the converted files to the destination folder.
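As a rough illustration of what the convert step does to a single file's contents, here is a minimal sketch using only Python's standard `csv` module. The function name `tsv_to_csv` is hypothetical and not part of the product; reading from and writing to S3 is left out, and the sketch assumes the file fits in memory as text.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-delimited text to comma-separated text (hypothetical helper)."""
    # newline="" lets the csv module handle line endings itself.
    reader = csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    # The default QUOTE_MINIMAL quoting wraps any field that contains a comma,
    # so values such as "Austin, TX" survive the conversion intact.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    print(tsv_to_csv("name\tcity\nKaren\tAustin, TX\n"))
```

Going through a CSV reader and writer, rather than replacing tabs with commas directly, keeps fields that already contain commas from being split into extra columns.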
- Select and authenticate input connection
- Select and authenticate output connection
- Select source folders with the files to process
- Select destination folder where the converted files will be stored
- Add a file filter step to select only files in a tab-delimited format
- Add a convert step to change the file format from tab-delimited to CSV
- Run a test of the pipe to verify the file conversion works
- Schedule the pipe to run periodically
- Turn the pipe "on" to put it into production
- Add file selection filter functionality, if this doesn't currently exist
- Add "Test" button to confirm file transfer will work
- Add control to update pipe scheduling options
- Provide command-level feedback on whether the pipe steps are running successfully
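
One possible way to approach the file selection filter is a content-based check on a sample of each file. The sketch below is a heuristic under stated assumptions, not the actual filter design: the name `looks_tab_delimited` and the `min_rows` parameter are hypothetical, and it assumes fields do not contain embedded tabs or line breaks.

```python
def looks_tab_delimited(sample: str, min_rows: int = 2) -> bool:
    """Heuristically decide whether a text sample is tab-delimited (hypothetical helper)."""
    # Ignore blank lines so a trailing newline does not affect the result.
    lines = [line for line in sample.splitlines() if line.strip()]
    if len(lines) < min_rows:
        return False
    # Assumption: every row of a tab-delimited file has the same,
    # non-zero number of tab characters.
    tab_counts = {line.count("\t") for line in lines}
    return len(tab_counts) == 1 and tab_counts.pop() > 0


if __name__ == "__main__":
    print(looks_tab_delimited("a\tb\n1\t2\n"))
    print(looks_tab_delimited("a,b\n1,2\n"))
```

A filter based on file extension alone would be simpler, but files in a shared folder are not always named consistently, which is why this sketch inspects the content instead.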