Bulk Updates on Firebase

Once upon a time, there was a developer who needed to update over 60,000 documents in a Firebase collection.

The developer, however, did not wish to make 60,000 individual calls to Firebase.

So, what did the developer do?

They opted for batch commits, reducing the 60,000 calls to just 130.

Now, let's delve into the developer's strategy.

In Firestore, a single batch commit can update at most 500 documents. To execute batch commits, however, you need the IDs of the documents you want to update.
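
If you have never used a batched write before, here is a minimal sketch of the pattern. The collection name, document IDs, and field values are placeholders, and it assumes the Firebase app has already been initialized (as shown in the scripts below).

from firebase_admin import firestore

db = firestore.client()  # assumes firebase_admin.initialize_app(...) was already called

batch = db.batch()
for doc_id in ["id_1", "id_2", "id_3"]:  # placeholder IDs; a batch holds at most 500 writes
    doc_ref = db.collection("your_collection").document(doc_id)
    batch.update(doc_ref, {"status": "active"})  # placeholder field and value
batch.commit()  # all writes in the batch are committed in a single call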

The first step is to gather the IDs of the documents that need updating and store them somewhere, in this case a JSON file.

import json

import firebase_admin
from firebase_admin import credentials, firestore

# Replace with your Firebase project credentials JSON file
cred = credentials.Certificate("your_key.json")
firebase_admin.initialize_app(cred)

# Replace 'your_collection' with the name of your Firebase collection
collection_name = "your_collection"

def get_document_ids():
    db = firestore.client()
    # Query only the documents that actually need updating
    query = db.collection(collection_name).where("status", "!=", "inactive")

    documents = query.stream()

    # Keep each document's ID plus any fields you'll need when building the update
    document_data_list = [
        {
            "userId": doc.id,
            "<field_name>": doc.get("<field_you_wanna_fetch>"),
            "<field_name_2>": doc.get("<field_you_wanna_fetch>"),
        }
        for doc in documents
    ]
    print(len(document_data_list))

    # Dump everything to a JSON file so the update step can run separately
    with open("bure_bure.json", "w") as output_file:
        json.dump(document_data_list, output_file)

    return document_data_list
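
If you are running this as a one-off, a single call is all it takes; it writes bure_bure.json next to the script and prints how many documents matched the query:

get_document_ids()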

Now that you have stored the document IDs in a single place, you are ready to make batch commits.

import firebase_admin
from firebase_admin import credentials, firestore
import json

# Replace with your Firebase project credentials JSON file
cred = credentials.Certificate("your_key.json")
firebase_admin.initialize_app(cred)

# Replace 'your_collection' with the name of your Firebase collection
collection_name = "your_collection"

def update_from_json(json_file_path, batch_size=500):
    db = firestore.client()

    # Read data from the JSON file
    with open(json_file_path, "r") as file:
        user_data_list = json.load(file)

    total_documents = len(user_data_list)
    batches = (
        total_documents + batch_size - 1
    ) // batch_size  # Calculate number of batches

    for i in range(batches):
        start_index = i * batch_size
        end_index = (i + 1) * batch_size

        current_batch = user_data_list[start_index:end_index]

        # Create a batch to perform batch updates
        batch = db.batch()

        # Update each document in the batch
        for user_data in current_batch:
            user_id = user_data.get("userId")
            # Your logic to wrangle and update data
            updated_dict = {"field_name": "field_value"}
            user_ref = db.collection(collection_name).document(user_id)
            batch.update(user_ref, updated_dict)

        # Commit the batch to update the current batch of documents
        batch.commit()

        print(f"Processed batch {i + 1}/{batches}")

    print(f"Updated {total_documents} documents from JSON file.")

Please don't judge the code quality; it's a script to get shit done. :')

And folks, that is how you make bulk updates on a Firebase collection without pulling your hair out.