This Terraform project is designed to simplify (where possible) the process of setting up an SFTP server on GCP. Once in place, the SFTP server can be used with GA4's Data Import function. Running these scripts will set up the server as a VM instance on Compute Engine, and the files will be stored in a Cloud Storage bucket. The server will also be provisioned with a service account to use when connecting with the GCS bucket.
The user running this Terraform file should have sufficient GCP permissions for managing Compute Engine and Cloud Storage as well as for creating service accounts.
The SFTP server can support a fixed hostname. If this is desired, you will also need a domain and the ability to update its DNS configuration.
Terraform will also need to be installed on your device.
1. Create a bucket in Google Cloud Storage for the Terraform state file. Bucket names must be globally unique, so name it your GCP Project ID plus `-ga4-data-import-sftp-tf-state`.
   - For example, an ID of `adswerves-cool-project` should result in a bucket name of `adswerves-cool-project-ga4-data-import-sftp-tf-state`.
2. In the `backend.tf` file, update the value of `bucket` to contain the bucket name set in step 1.
3. Optionally, choose a custom domain to use for the SFTP server. Otherwise, the server can be accessed by its fixed IP address.
4. Optionally, choose a username for the user who will be uploading and managing files. Otherwise, the default username of `sftpuser` will be used.
5. Generate a public key file for the user in step 4, name it `id_sftp.pub`, and save it within the parent folder of these scripts.
6. If using a custom domain, follow the steps under Using a Custom Domain. Otherwise, jump to Using a Fixed IP Address.
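As a minimal sketch of the naming rule in step 1, the snippet below derives the state bucket name from a project ID. The project ID is the example value used above. The commented-out `gcloud storage buckets create` line is only one possible way to create the bucket, and it assumes the gcloud CLI is installed and authenticated, which these instructions do not otherwise require.

```shell
# Example project ID from step 1; replace with your own.
PROJECT_ID="adswerves-cool-project"

# State bucket name = project ID + the fixed suffix from step 1.
STATE_BUCKET="${PROJECT_ID}-ga4-data-import-sftp-tf-state"
echo "${STATE_BUCKET}"

# Optional (assumes the gcloud CLI is installed and authenticated):
# gcloud storage buckets create "gs://${STATE_BUCKET}" --project="${PROJECT_ID}"
```

The printed name is the value to set for `bucket` in `backend.tf`.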
Using a Custom Domain

7. Create the SFTP data source in GA4's admin UI. The Server Url will be `sftp://` + the domain from step 3 + the file path `/uploads/ga4data.csv`. The username for GA4 will be `ga4-importer`.
   - For example, if you had chosen `ga4sftp.example.com`, the Server Url value would be `sftp://ga4sftp.example.com/uploads/ga4data.csv`.
8. Get the public key file generated by GA4 following step 7, name it `ga4_sftp.pub`, and place it into the parent folder of these scripts.
9. Update the Configuration Settings file.
10. In a terminal within this parent folder, run `terraform init` to complete the initial setup of Terraform.
11. In the same terminal, run `terraform plan` to preview the planned changes.
12. In the same terminal, run `terraform apply`, and enter `yes` when prompted in order to complete the changes.
13. Once complete, Terraform will return an IPv4 address, which will need to be added to the DNS config for the domain chosen in step 3.
14. After the DNS configuration has been updated, you should be able to upload files to the SFTP server.
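The Server Url described in these steps is plain string concatenation, which the sketch below spells out. The helper name `ga4_server_url` is hypothetical, and the default file name simply mirrors the `ga4data.csv` example used in these steps.

```shell
# Hypothetical helper: build the GA4 Server Url from a host (domain or IP
# address) and an optional file name (defaults to the example ga4data.csv).
ga4_server_url() {
  host="$1"
  file="${2:-ga4data.csv}"
  printf 'sftp://%s/uploads/%s\n' "$host" "$file"
}

# Prints: sftp://ga4sftp.example.com/uploads/ga4data.csv
ga4_server_url "ga4sftp.example.com"
```

The same helper works with an IP address in place of the domain.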
Using a Fixed IP Address

7. Create an empty file named `ga4_sftp.pub`, and place it into the parent folder of these scripts.
8. Update the Configuration Settings file.
9. In a terminal within this parent folder, run `terraform init` to complete the initial setup of Terraform.
10. In the same terminal, run `terraform plan` to preview the planned changes.
11. In the same terminal, run `terraform apply`, and enter `yes` when prompted to complete the changes.
12. Once complete, Terraform will return an IPv4 address. Take note as it is needed for the next step (and for connecting to the server).
13. Create the SFTP data source in GA4's admin UI. The Server Url will be `sftp://` + the IP address from step 12 + the file path `/uploads/ga4data.csv`. The username for GA4 will be `ga4-importer`.
    - For example, if the IP address is `1.1.1.1`, the Server Url value would be `sftp://1.1.1.1/uploads/ga4data.csv`.
14. Get the public key file generated by GA4 following step 13, and copy its contents into the `ga4_sftp.pub` file from step 7, saving the updated file.
15. In a terminal within this parent folder, run `terraform plan` again. You should see a note along the lines of `google_compute_instance.sftp_server must be replaced`.
16. In the same terminal, run `terraform apply`, and enter `yes` when prompted in order to update GA4's public key file on the server.
17. After completing these steps, you should be able to upload files to the SFTP server.
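Because the fixed IP flow starts with an empty `ga4_sftp.pub` and fills it in later, it can help to confirm that the file really contains a key before running `terraform plan` the second time. The check below is only a sketch: the helper name `key_file_ready` is hypothetical, and the pattern assumes the general OpenSSH public key layout (a key type such as `ssh-rsa`, a space, then base64 data), not anything specific to GA4.

```shell
# Hypothetical helper: succeeds only if the file is non-empty and its first
# line looks like an OpenSSH public key (e.g. "ssh-rsa AAAA...").
key_file_ready() {
  [ -s "$1" ] && head -n 1 "$1" | grep -Eq '^(ssh-|ecdsa-)[A-Za-z0-9@.-]+ [A-Za-z0-9+/=]+'
}

if key_file_ready ga4_sftp.pub; then
  echo "ga4_sftp.pub contains a key; re-run terraform plan and terraform apply."
else
  echo "ga4_sftp.pub is missing, empty, or not a public key yet."
fi
```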
The `config.auto.tfvars` file should be updated prior to running these Terraform scripts. See the table below for details about each variable and whether it requires an update:
| Variable | Update? | Description |
|---|---|---|
| `project_id` | Required | The ID of the GCP project that will contain the SFTP server and its related infrastructure. |
| `name` | Optional | A string used when naming resources set up in GCP. Updating will be necessary if configuring multiple parallel SFTP servers in the same GCP project. |
| `server_hostname` | Required | Set to an empty string to skip setting up a custom domain. Otherwise, this will be the custom domain assigned to the SFTP server. Recommended format is `ga4sftp.<your website's eTLD+1>`. |
| `username` | Optional | The username for the file uploader/manager accessing the SFTP server. |
| `machine_type` | Optional | The type of Compute Engine machine to use for the SFTP server. |
| `os_family` | Optional | The Operating System used by the virtual machine running the SFTP server. |
| `compute_region` | Optional | The region where the Compute Engine machine will run. Make sure that this is aligned with `compute_zone`. |
| `compute_zone` | Optional | The zone where the Compute Engine machine will run. Make sure that this is aligned with `compute_region`. |
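To make the table concrete, the sketch below writes an illustrative file showing only the variables most likely to need a change. All values are examples taken from elsewhere in this document, and the `.example` file name is hypothetical; it is used so that Terraform does not auto-load the file and so that the shipped `config.auto.tfvars` is not overwritten. Leave the remaining variables in the real file as they are unless you have a reason to change them.

```shell
# Write an illustrative settings file next to the real one; values are examples.
cat > config.auto.tfvars.example <<'EOF'
project_id      = "adswerves-cool-project" # required: your GCP project ID
server_hostname = "ga4sftp.example.com"    # or "" to use the fixed IP address
username        = "sftpuser"               # the default uploader username
EOF

cat config.auto.tfvars.example
```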
Two users will be given access to the SFTP server:
- one for GA4's data import connection
- one for the user managing/uploading files

Both users will authenticate using a public key instead of a password, so two `.pub` files will need to be added to the parent folder prior to running these Terraform scripts:
- `ga4_sftp.pub` (for the GA4 user)
- `id_sftp.pub` (for the file managing user)

Note: `.gitignore` has been set to ignore these files.
Your device probably already has the OpenSSH version of `ssh-keygen` available, so follow these steps:
1. Open up a new Terminal.
2. Run the command `ssh-keygen`.
3. Press `Enter` to accept the default file location (`/Users/<username>/.ssh/`) or, optionally, input your own preferred location.
4. Optionally, input a password to encrypt the private key file or press `Enter` to skip this step. A second confirmation will be needed either way.
5. Find the private and public key files in the location from step 3.

Note: the file ending in `.pub` is the public key file that will be registered with the SFTP server. The other file is your private key that you will provide when authenticating with the server.
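The interactive steps above can also be approximated non-interactively. The sketch below is one possible variant, not the required procedure: it assumes you are in the parent folder of these scripts, keeps the private key in `~/.ssh` (the `KEY_DIR` variable is hypothetical), and copies only the public half into the current folder as `id_sftp.pub`. The `-N ""` option sets an empty passphrase; drop it if you would rather be prompted for one.

```shell
# Where to keep the key pair (hypothetical variable; defaults to ~/.ssh).
KEY_DIR="${KEY_DIR:-$HOME/.ssh}"
mkdir -p "$KEY_DIR" && chmod 700 "$KEY_DIR"

# Generate the pair only if it does not exist yet, so nothing is overwritten.
if [ ! -f "$KEY_DIR/id_sftp" ]; then
  ssh-keygen -q -N "" -f "$KEY_DIR/id_sftp"
fi

# Only the public key belongs in the project folder.
cp "$KEY_DIR/id_sftp.pub" ./id_sftp.pub
```

Keeping the private key outside the project folder avoids relying on `.gitignore` to protect it.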
If something goes wrong when setting up the SFTP server or trying to connect to it, you can get more information for debugging by connecting to the server and running some commands. You can easily connect to the server by logging in to GCP, then navigating to `Compute Engine > VM instances`. Find the instance with a name matching the `name` value set in your `config.auto.tfvars` file plus "-server", and click `SSH` toward the right side of the same row. Follow the prompts to open and authenticate via GCP's SSH-in-browser.
Once in a command line interface, you can:
- see the logs related to the startup script (see also the `startup.sh` file) by running: `sudo grep "startup-script" /var/log/syslog`
- check and monitor the authentication logs by running: `sudo tail -f /var/log/auth.log`
  - Note: stop monitoring the logs by pressing `Ctrl` + `c` on your keyboard.
- check the contents of the public key file by running: `sudo cat /home/<username>/.ssh/authorized_keys`
  - Note: replace `<username>` with the appropriate value (e.g., `sftpuser` for the default file managing user)
No, that value isn't hardcoded anywhere in these Terraform scripts. Just make sure that the file name matches what you've told GA4 to look for when configuring the Data Import data source.
Yes and no. The names are hardcoded, but if you really feel the need to change them, you can modify the file names being passed as options to `metadata_startup_script` in the `main.tf` file. However, it's not recommended to modify any of the `.tf` files or the `startup.sh` file.
Yes, it should be able to. Use a different file name when configuring the new data source in GA4, but keep the other details the same, including the username.
When GA4 returns the public key, it should match what you already have saved in the file `ga4_sftp.pub`. Compare the two values to confirm this.
If you're strictly using SFTP, the easiest way to upload files is by using an SFTP client like FileZilla, which is free and open-source. If you prefer the command line, you could also connect via `sftp`.
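For the command-line route, a minimal sketch using OpenSSH's `sftp` in batch mode follows. The helper names are hypothetical, the host, username, and key path in the example call are placeholders, and the remote `uploads/` path assumes the user lands in the directory that contains the uploads folder, which matches the `/uploads/ga4data.csv` path given to GA4 but is not confirmed here. File names containing spaces are not handled.

```shell
# Hypothetical helper: the batch command that uploads one local file.
sftp_batch_line() {
  printf 'put %s uploads/\n' "$1"
}

# Hypothetical helper: run the upload.
# Arguments: host (domain or IP), username, private key path, local file.
upload_via_sftp() {
  sftp_batch_line "$4" | sftp -i "$3" -b - "$2@$1"
}

# Example call (placeholder values):
# upload_via_sftp ga4sftp.example.com sftpuser "$HOME/.ssh/id_sftp" ga4data.csv
```

Passing `-b -` tells `sftp` to read its batch commands from standard input, so no temporary batch file is needed.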
That said, the only hard requirement is that GA4 can access the files via SFTP. Once the setup is complete, Terraform automatically mounts the server's `/var/sftp/uploads` folder to a Google Cloud Storage bucket, named `<project_id>-<name>`, using the variable values set in the `config.auto.tfvars` file.
If you're comfortable with GCP, you can upload files directly to this bucket or set up an automated process to feed files into it.
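As a sketch of the direct-to-bucket route, the snippet below derives the bucket name from the two variables and shows where an uploaded file would go. The `NAME` value is hypothetical (use the `name` from your own `config.auto.tfvars`), the commented-out `gcloud storage cp` line assumes the gcloud CLI is installed and authenticated, and placing the file at the bucket root assumes the bucket root corresponds to the `uploads` folder.

```shell
# Example values; replace with the ones from your config.auto.tfvars.
PROJECT_ID="adswerves-cool-project"
NAME="ga4-sftp" # hypothetical value for the `name` variable

# Bucket name follows the <project_id>-<name> pattern.
UPLOAD_BUCKET="${PROJECT_ID}-${NAME}"
echo "gs://${UPLOAD_BUCKET}/ga4data.csv"

# Optional (assumes the gcloud CLI is installed and authenticated):
# gcloud storage cp ga4data.csv "gs://${UPLOAD_BUCKET}/"
```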