Firestore Import is an Apify integration Actor that imports data from an Apify dataset into Firebase Firestore (a NoSQL cloud database built on Google Cloud infrastructure). It lets you configure options such as the target collection, how data conflicts are handled, and how each dataset item is transformed before it is imported into Firestore.
The Firestore Import Actor takes a dataset, applies transformations, and imports the data into a Firestore database. It is highly customizable and supports the following:
- Selecting the Firestore database and collection.
- Automatically generating document IDs or using a field from the dataset as the document ID.
- Handling document conflicts by overwriting, merging, or skipping documents whose ID already exists.
- Transforming data before it gets imported using a customizable JavaScript function.
- One dataset item can lead to multiple Firestore inserts/updates.
- Each document can have its own configuration, such as a custom collection or document ID.
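The three conflict modes mentioned in the list can be pictured with a small in-memory model. This is only an illustrative sketch: `applyWrite` and the `Map`-based store are hypothetical names invented for the example, not part of the Actor, and the merge here is a shallow one chosen for brevity (Firestore's own merge behavior for nested fields may differ).

```javascript
// Illustrative in-memory model of the three conflict modes; not the Actor's code.
function applyWrite(store, id, data, mode) {
    const existing = store.get(id);
    if (existing === undefined) {
        store.set(id, { ...data }); // no conflict: the document is created
        return "created";
    }
    if (mode === "skip") return "skipped"; // stored document stays untouched
    if (mode === "merge") {
        // Shallow merge for illustration: incoming fields win, other fields are kept.
        store.set(id, { ...existing, ...data });
        return "merged";
    }
    if (mode === "overwrite") {
        store.set(id, { ...data }); // fields missing from `data` are lost
        return "overwritten";
    }
    throw new Error(`Unknown conflict resolution: ${mode}`);
}

// Hypothetical existing document and incoming dataset item.
const store = new Map([["a1", { title: "Old", views: 10 }]]);
const incoming = { title: "New" };

const results = {};
for (const mode of ["overwrite", "merge", "skip"]) {
    const copy = new Map(store); // start each mode from the same state
    const outcome = applyWrite(copy, "a1", incoming, mode);
    results[mode] = { outcome, doc: copy.get("a1") };
}
console.log(results);
```

In this model, overwrite drops the `views` field, merge keeps it while updating `title`, and skip leaves the stored document as it was.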
The Actor requires several input fields to work correctly. Below is a detailed description of each input field:
Field Name | Type | Description |
---|---|---|
serviceAccountKey | string (secret, required) | Service account key in JSON format. You can get it from Firebase Console -> Project Settings -> Service accounts -> Generate new private key. Paste the whole JSON string here. This is a secret input, so the value is stored in encrypted form. |
datasetId | string (required) | ID of the Apify dataset to import data from. |
collection | string (required) | Firestore collection to import data to. If it doesn't exist, it will be created. Note: you can customize the collection for each record by using the transformFunction input. This can be useful when you want to import data to sub-collections. |
databaseName | string (optional) | Name of the Firestore database. If not provided, the default database ("(default)") will be used. |
idField | string (optional) | Field in the dataset item that will be used as the Firestore document ID. It must be a string or a number. If not provided, all documents are created with a random ID generated by Firestore (in that case the value of documentConflictResolution is ignored). Setting this field is useful when you want to update existing documents in Firestore. Note: you can customize the ID for each document independently using the transformFunction input field. |
documentConflictResolution | enum: overwrite, merge, skip (required) | How to handle conflicts when importing data to Firestore. overwrite: replace existing Firestore documents with the same ID. merge: merge data from the dataset items with existing Firestore documents. skip: documents with existing IDs will be skipped. |
transformFunction | string (javascript, optional) | JavaScript function that transforms each item from the dataset before importing it to Firestore. The function must return an object (or array of objects) with the data key that contains the transformed record and other optional fields. See examples below. |
batchSize | number (optional) | Number of items to import in a single batch. Lower values are safer but slower; see Firestore limits (10 MiB batch write). Note that the skip conflict resolution does not use batch writes and always imports one item at a time. Defaults to 500. |
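As a rough illustration of how the fields in the table fit together, the sketch below builds an example input object and checks that the required fields are present. Every value is a placeholder, and `validateInput`, `REQUIRED`, and `MODES` are hypothetical helpers written for this example; the Actor performs its own validation.

```javascript
// Hypothetical example input; every value below is a placeholder.
const input = {
    serviceAccountKey: "<contents of the service account JSON key>",
    datasetId: "<your-dataset-id>",
    collection: "products",
    databaseName: "(default)",
    idField: "sku",
    documentConflictResolution: "merge",
    batchSize: 500,
};

// Required fields and allowed enum values, as listed in the table above.
const REQUIRED = ["serviceAccountKey", "datasetId", "collection", "documentConflictResolution"];
const MODES = ["overwrite", "merge", "skip"];

// Returns a list of human-readable problems; an empty list means the input looks complete.
function validateInput(candidate) {
    const errors = REQUIRED
        .filter((key) => !candidate[key])
        .map((key) => `Missing required field: ${key}`);
    const mode = candidate.documentConflictResolution;
    if (mode && !MODES.includes(mode)) {
        errors.push(`Invalid documentConflictResolution: ${mode}`);
    }
    return errors;
}

console.log(validateInput(input));
```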
The optional transformFunction input field allows you to transform each dataset item before importing it to Firestore.
The field accepts a JavaScript function that takes one dataset item as a parameter and returns an object (or array of objects) with the following keys:
- data (required): transformed document that will be imported to Firestore.
- id (optional): custom document ID. If not provided, the idField input field is used to resolve the document ID. If idField is not provided either, the document is created with a random ID generated by Firestore.
- collection (optional): custom collection name. If not provided, the collection input field will be used.
- documentConflictResolution (optional): custom conflict resolution for the document. If not provided, the documentConflictResolution input field will be used.
(item) => {
    return {
        data: item, // transformed document
        id: item.id, // custom document ID
        collection: "customCollection", // custom collection name
        documentConflictResolution: "merge", // custom conflict resolution
    };
}
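The fallback rules for the optional keys can be sketched as follows. This is a minimal illustration of the behavior described above, not the Actor's actual implementation: `resolveDocuments` is a hypothetical helper, and an `id` of `undefined` stands for "let Firestore generate a random ID".

```javascript
// Hypothetical helper: applies a transform function and fills in missing keys
// from the Actor input, following the fallback order described above.
function resolveDocuments(item, input, transform) {
    const result = transform ? transform(item) : { data: item };
    const records = Array.isArray(result) ? result : [result];
    return records.map((record) => ({
        data: record.data,
        // Per-document id first, then the idField input; undefined means a random Firestore ID.
        id: record.id ?? (input.idField ? item[input.idField] : undefined),
        collection: record.collection ?? input.collection,
        documentConflictResolution:
            record.documentConflictResolution ?? input.documentConflictResolution,
    }));
}

// Placeholder input and dataset item, invented for this example.
const actorInput = { collection: "products", idField: "sku", documentConflictResolution: "overwrite" };
const datasetItem = { sku: "A-1", name: "Lamp" };

const resolved = resolveDocuments(datasetItem, actorInput, (item) => [
    { data: { name: item.name } }, // relies on the input defaults
    { data: { sku: item.sku }, id: "custom", collection: "audit", documentConflictResolution: "skip" },
]);
console.log(resolved);
```

The first returned document inherits its ID, collection, and conflict resolution from the input fields, while the second one overrides all three.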
- Simple transformation function:
  The function below sets newField to the value of oldField plus 1 and removes the unused field from the dataset item.
  (item) => {
      item.newField = item.oldField + 1;
      delete item.unused;
      return { data: item };
  }
- Nested objects:
  The function below transforms the dataset item into a Firestore document with nested objects. It updates the subdocument.field field and overwrites the whole author sub-document.
  (item) => {
      return {
          data: {
              title: item.title,
              "subdocument.field": item.name, // update single field of subdocument
              author: item.author // overwrite whole subdocument
          },
      };
  }
- Field value functions:
  The function below demonstrates how to use Firestore FieldValue functions. It adds new IDs to the existing ids array, removes values from the values array, increments the count field, and deletes the old field.
  (item) => {
      return {
          data: {
              ids: FieldValue.arrayUnion(item.ids), // add new ids to existing ids array
              values: FieldValue.arrayRemove(item.values), // remove new values from existing array
              count: FieldValue.increment(item.count), // increment existing count field by provided value
              old: FieldValue.delete(), // removes field
          },
      };
  }
- Data types:
  The function below demonstrates how to create Firestore data types such as Timestamp, Vector, GeoPoint, and DocumentReference.
  (item) => {
      return {
          data: {
              updatedAt: Timestamp.fromDate(new Date(item.date)), // create Timestamp data type from a Date object
              vector: FieldValue.VectorValue(item.values), // create vector data type
              position: GeoPoint(item.lat, item.lon), // create geopoint data type
              reference: DocumentReference("collection", "referenceDocId"), // create reference type
          },
      };
  }
- Subcollection:
  The function below demonstrates how to import data to sub-collections. It returns an array where the first item is the main document and the remaining items are documents for the sub-collection.
  (item) => {
      const subDocuments = item.items.map((subItem) => ({
          id: subItem.id,
          collection: `records/${item.customId}/items`,
          documentConflictResolution: "skip",
          data: {
              weight: subItem.weight,
              length: subItem.length,
              name: subItem.name,
          },
      }));
      return [
          {
              id: item.customId,
              collection: "records",
              documentConflictResolution: "merge",
              data: {
                  title: item.title,
                  description: item.description,
              },
          },
          ...subDocuments,
      ];
  }
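To make the result of the sub-collection function more concrete, the sketch below runs the same function on a made-up dataset item and prints the document path each returned record would target. The sample item and the name `toDocuments` are invented for this example, and joining collection and id into a path is only a way to visualize the output.

```javascript
// Same logic as the sub-collection transform function, bound to a name for testing.
const toDocuments = (item) => {
    const subDocuments = item.items.map((subItem) => ({
        id: subItem.id,
        collection: `records/${item.customId}/items`,
        documentConflictResolution: "skip",
        data: { weight: subItem.weight, length: subItem.length, name: subItem.name },
    }));
    return [
        {
            id: item.customId,
            collection: "records",
            documentConflictResolution: "merge",
            data: { title: item.title, description: item.description },
        },
        ...subDocuments,
    ];
};

// Hypothetical dataset item, invented for this example.
const sampleItem = {
    customId: "r-42",
    title: "Order",
    description: "Two parts",
    items: [
        { id: "p1", weight: 1.5, length: 10, name: "Bolt" },
        { id: "p2", weight: 0.2, length: 3, name: "Nut" },
    ],
};

// One dataset item produces one main document and one document per sub-item.
const paths = toDocuments(sampleItem).map((doc) => `${doc.collection}/${doc.id}`);
console.log(paths);
// → [ 'records/r-42', 'records/r-42/items/p1', 'records/r-42/items/p2' ]
```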
The Actor outputs statistics about the import to the Key-Value store under the key Statistics, with the following structure:
- imported: total number of processed Firestore documents (either created, updated or skipped).
- skipped: number of skipped Firestore documents.
- overwritten: number of overwritten Firestore documents.
- merged: number of merged Firestore documents.
- created: number of created Firestore documents (includes documents written when documentConflictResolution is skip).
- failed: number of failed writes to Firestore documents.
- itemsProcessed: total number of processed dataset items (including failed items).
- itemsFailed: number of failed dataset items.
- executionTimeMs: time in milliseconds it took to import the data.
- startTime: timestamp when the import started.
- endTime: timestamp when the import ended.
{
    "imported": 59278,
    "skipped": 0,
    "overwritten": 0,
    "merged": 59278,
    "created": 0,
    "failed": 0,
    "itemsProcessed": 1136,
    "itemsFailed": 0,
    "executionTimeMs": 19725,
    "startTime": "2025-02-26T17:56:22.652Z",
    "endTime": "2025-02-26T17:56:42.377Z"
}
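A short script can sanity-check a Statistics record like the one above. The sketch assumes that imported equals the sum of skipped, overwritten, merged, and created; that relationship is inferred from the field descriptions and holds for this example, but it is not a documented guarantee. `checkStatistics` is a hypothetical helper.

```javascript
// The example Statistics record shown above.
const stats = {
    imported: 59278,
    skipped: 0,
    overwritten: 0,
    merged: 59278,
    created: 0,
    failed: 0,
    itemsProcessed: 1136,
    itemsFailed: 0,
    executionTimeMs: 19725,
    startTime: "2025-02-26T17:56:22.652Z",
    endTime: "2025-02-26T17:56:42.377Z",
};

// Hypothetical helper that cross-checks the counters and timestamps.
function checkStatistics(record) {
    const elapsedMs = Date.parse(record.endTime) - Date.parse(record.startTime);
    // Assumption: every imported document falls into exactly one of these four counters.
    const outcomes = record.skipped + record.overwritten + record.merged + record.created;
    return {
        timeMatches: elapsedMs === record.executionTimeMs,
        countsMatch: outcomes === record.imported,
        documentsPerItem: record.imported / record.itemsProcessed,
    };
}

console.log(checkStatistics(stats));
```

For this record both checks pass, and the ratio of about 52 documents per dataset item reflects a transform function that returns several documents for each item.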