Database migration


Please note that this is an open source project which is not officially supported by Exasol. We will try to help you as much as possible, but can't guarantee anything since this is not an official Exasol product.

Table of Contents

  1. Overview
  2. Migration source
  3. Post-load optimization
  4. Delta import

Overview

This project contains SQL scripts for automatically importing data from various data management systems into Exasol.

You'll find SQL scripts that you can execute on Exasol to load data from certain databases or database management systems. The scripts try to extract the metadata from the source system and create the appropriate IMPORT statements automatically, so you don't have to worry about table names, column names, and types.
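
To give an idea of the output: the generated statements are ordinary Exasol IMPORT statements, typically one per source table. A minimal sketch of their shape (schema, table, column, and connection names are hypothetical):

IMPORT INTO TARGET_SCHEMA.CUSTOMERS (ID, NAME, CREATED_AT)
FROM JDBC AT my_source_connection
STATEMENT 'SELECT id, name, created_at FROM source_schema.customers';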

If you want to optimize existing scripts or create new scripts for additional systems, we would be very glad if you share your work with the Exasol user community.

Migration source

CSV

The method of importing a CSV file depends on the location of the file.

  • Import a file stored on your local machine via EXAplus:
IMPORT INTO <table> FROM LOCAL CSV FILE '<filename>' <options>;

Example:

IMPORT INTO MY_SCHEMA.MY_TABLE
FROM LOCAL CSV FILE 'C:\Users\my_user\Downloads\data.csv'
COLUMN SEPARATOR = ',' 
COLUMN DELIMITER = '"' 
ROW SEPARATOR = 'CRLF' -- LF when the file was created on a Unix system, CRLF when created on Windows
SKIP = 1 -- skip the header
;
  • Import from HDFS: See SOL-322

  • Import from S3: See Exasol-1774 for single-file import; for importing multiple files, scroll down to S3

For more details on IMPORT, see section 2.2.2 of the Exasol User Manual. For further help with typical CSV formatting issues, consult the Exasol knowledge base.
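
If an import aborts because of a few malformed rows, Exasol's error clause can help narrow the problem down: offending rows are recorded instead of failing the whole statement. A minimal sketch based on the example above (the error table name is hypothetical):

IMPORT INTO MY_SCHEMA.MY_TABLE
FROM LOCAL CSV FILE 'C:\Users\my_user\Downloads\data.csv'
COLUMN SEPARATOR = ','
SKIP = 1
ERRORS INTO MY_SCHEMA.IMPORT_ERRORS (CURRENT_TIMESTAMP)
REJECT LIMIT 100;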

DB2

See script db2_to_exasol.sql

Exasol

See script exasol_to_exasol.sql

MySQL

Create a connection:

CREATE CONNECTION <name_of_connection>
TO 'jdbc:mysql://192.168.137.5:3306'
USER '<user>'
IDENTIFIED BY '<password>';

Test your connection:

SELECT * FROM
(
IMPORT FROM JDBC AT <name_of_connection>
STATEMENT 'SELECT 42 FROM DUAL'
);

Then you're ready to use the migration script: mysql_to_exasol.sql
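
The migration scripts are run with Exasol's EXECUTE SCRIPT statement. The script name and parameter list below are assumed for illustration only; check the header comment of mysql_to_exasol.sql for the actual signature and the meaning of each parameter:

-- Hypothetical invocation: script name and parameters are placeholders,
-- not taken from the script itself.
EXECUTE SCRIPT database_migration.MYSQL_TO_EXASOL(
    'my_mysql_connection'  -- the connection created above
  , '%'                    -- schema filter (illustrative)
  , '%'                    -- table filter (illustrative)
);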

Oracle

When importing from Oracle, you have two options: you can import via JDBC or via the native Oracle interface (OCI).

  • OCI: Log in to EXAoperation and go to Configuration -> Software. Download the Instant Client from Oracle and select it as the Software Update File. Click Submit to upload it.

    Create a connection:

    CREATE CONNECTION <name_of_connection>
        TO '192.168.99.100:1521/xe'
        USER '<user>'
        IDENTIFIED BY '<password>';
  • JDBC: If you are using the community edition, you need to upload a JDBC driver in EXAoperation before you can establish a connection; see SOL-179.

    Create a connection:

    CREATE CONNECTION <name_of_connection>
        TO 'jdbc:oracle:thin:@//192.168.99.100:1521/xe'
        USER '<user>'
        IDENTIFIED BY '<password>';

Test your connection:

SELECT * FROM
(
IMPORT FROM <conn_type> AT <name_of_connection>
STATEMENT 'SELECT 42 FROM DUAL'
);

<conn_type> is either JDBC or ORA, depending on the type of connection you created.

Then you're ready to use the migration script: oracle_to_exasol.sql

PostgreSQL

See script postgres_to_exasol.sql

Redshift

See script redshift_to_exasol.sql

S3

The script s3_to_exasol.sql looks different from the other import scripts: it is designed to load data from S3 in parallel and needs some preparation before you can use it. See SOL-594 for detailed instructions. If you just want to import a single file, see the CSV section above.

SQL Server

See script sqlserver_to_exasol.sql

Teradata

The first thing you need to do is add the Teradata JDBC driver to Exasol. The driver can be downloaded from Teradata's download site; you need to register first, which is free. Make sure you download the JDBC driver version that matches the version of your Teradata database.

The downloaded package contains two files:

  • terajdbc4.jar contains the actual Java classes of the driver
  • tdgssconfig.jar includes configuration information for Teradata's Generic Security Services (GSS)

Both files need to be uploaded when you add the Teradata JDBC driver to Exasol. To do this, log in to EXAoperation, then select Software and open the JDBC Drivers tab.

Click Add, then specify the following details:

  • Driver Name: Teradata (or something similar)
  • Main Class: com.teradata.jdbc.TeraDriver
  • Prefix: jdbc:teradata:
  • Comment: Version 15.10 (or something similar)

After clicking Apply, you will see the newly added driver's details in the top section of the driver list. Select the Teradata driver (the radio button in the first column), then upload terajdbc4.jar and do the same with tdgssconfig.jar. When done, both .jar files should be listed in the Files column for the Teradata driver.

The next step is to test connectivity. First, create a connection to the remote Teradata database:

    CREATE OR REPLACE CONNECTION <name_of_connection>
        TO 'jdbc:teradata://<host_name_or_ip_address>'
        USER '<td_username>'
        IDENTIFIED BY '<td_password>';

The user creating the connection needs the CREATE CONNECTION privilege. Additional JDBC connection parameters (such as CHARSET) might need to be specified in the connection string/URL; see the Teradata JDBC driver documentation for details.
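
For example, forcing UTF-8 could look like the sketch below; the parameter choice is illustrative, and Teradata's JDBC driver expects such parameters after a slash in the URL, separated by commas:

    CREATE OR REPLACE CONNECTION <name_of_connection>
        TO 'jdbc:teradata://<host_name_or_ip_address>/CHARSET=UTF8'
        USER '<td_username>'
        IDENTIFIED BY '<td_password>';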

Now, test the connectivity with a simple query:

    SELECT *
    FROM   (
               IMPORT FROM JDBC AT <name_of_connection>
               STATEMENT 'SELECT 1'
           );

For the actual data migration, see script teradata_to_exasol.sql

Vectorwise

See script vectorwise_to_exasol.sql

Vertica

See script vertica_to_exasol.sql

Google BigQuery

See script bigquery_to_exasol.sql

Upload the following files into a bucket called "bqmigration" in the default BucketFS service:

  • JSON key file for your BigQuery service account
  • jar files of the BigQuery JDBC driver (tested with Simba v1.1.6)

The migration script creates DDL and IMPORT statements together with a CREATE CONNECTION statement. In order to perform these imports, the JDBC driver also has to be installed in EXAoperation; see SOL-194 for details.

Post-load optimization

This folder contains scripts that can be used after importing data from another database via the scripts above. What they do (a sketch follows the list):

  • Optimize the column's datatypes to minimize storage space on disk
  • Import primary keys from other databases
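
A rough sketch of the kind of statements these scripts generate (schema, table, column, and type choices are hypothetical; the actual scripts derive them from the imported data and the source metadata):

-- Shrink an oversized column to a datatype that still fits all values
ALTER TABLE my_schema.customers MODIFY (comment_text VARCHAR(200));

-- Recreate a primary key that existed in the source database
ALTER TABLE my_schema.customers ADD CONSTRAINT customers_pk PRIMARY KEY (customer_id);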

Delta import

This folder contains a script that can be used if you want to import data on a regular basis. What it does:

  • Import only data that hasn't been imported yet by performing a delta import based on a given column (further explanation inside the folder; see the sketch below for the idea)
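
The core idea as a minimal sketch (connection, table, and column names are hypothetical; the inner SELECT is executed on the source system and must use its SQL dialect):

-- Load only rows newer than the newest row already in the target table;
-- in practice the cutoff value is determined from the target table first.
IMPORT INTO my_schema.orders
FROM JDBC AT my_source_connection
STATEMENT 'SELECT * FROM shop.orders WHERE last_modified > ''2024-01-01 00:00:00''';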
