Branching and Merging Data: Collaborative Workflows

Collaborative data development often feels like navigating a minefield. How do multiple data engineers, analysts, or developers work on the same database schema or data simultaneously without overwriting each other’s changes or causing production outages? This is where Dolt’s Git-for-Data paradigm truly shines.

In this chapter, we’ll dive deep into the fundamental Git concepts of branching and merging, but applied directly to your SQL database. You’ll learn how to create isolated environments for data experimentation, safely integrate changes, and resolve conflicts when parallel work diverges. By the end, you’ll be equipped to enable robust, auditable, and collaborative data workflows using Dolt, setting the stage for more advanced team coordination.

Before we begin, ensure you have a working Dolt installation (version 1.x or later, checked as of 2026-06-06) and a basic understanding of Dolt’s CLI commands (dolt init, dolt sql) from previous chapters. We’ll primarily use the Dolt CLI, whose SQL dialect is MySQL-compatible. Doltgres, the PostgreSQL-compatible sibling project, ships as a separate binary (doltgres) and is not enabled through a flag on dolt sql-server.

Core Concepts: The Git-for-Data Paradigm in Action

Dolt’s power lies in bringing Git-style version control directly to your SQL tables. This means that every table, every row, and every schema definition can be branched, committed, diffed, and merged just like source code. This paradigm fundamentally changes how teams interact with and evolve data.

Branches: Isolated Workspaces for Your Data

Imagine you’re developing a new feature that requires significant changes to your product catalog data. You don’t want to mess up the current production data while you’re experimenting. This is where branches come in.

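What is a branch? A branch in Dolt is an independent line of development for your database. It behaves like a complete copy of your database taken at a specific point in time, so you can make changes without affecting other branches. Under the hood, a branch is only a named pointer to a commit, so creating one is fast and does not duplicate your data.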

Why use branches for data?

  • Isolation: Work on new features, experiments, or bug fixes without impacting main (your production branch).
  • Parallel Development: Multiple teams or individuals can work on different data changes simultaneously.
  • Experimentation: Try out new data models or transformations without fear of corrupting your core dataset.
  • Hotfixes: Quickly create a branch to fix critical issues in production data.

Think of a branch as a personal sandbox for your database. You can build, break, and rebuild within it, and only when you’re satisfied, do you integrate those changes back into the main line of development.
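To make the sandbox idea concrete, here is a minimal Python sketch that models branches as named pointers to immutable commits. It is a conceptual model only, not Dolt's storage engine; the Commit and Repo classes and their methods are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Commit:
    """An immutable snapshot: table name -> {primary key -> row dict}."""
    message: str
    tables: dict
    parent: Optional["Commit"] = None


@dataclass
class Repo:
    """Hypothetical toy repository: each branch is a name pointing at a commit."""
    branches: dict = field(default_factory=dict)
    current: str = "main"

    def commit(self, message, tables):
        # The new commit's parent is whatever the current branch pointed at.
        parent = self.branches.get(self.current)
        self.branches[self.current] = Commit(message, tables, parent)

    def branch(self, name):
        # Creating a branch copies a pointer, not the data.
        self.branches[name] = self.branches[self.current]

    def checkout(self, name):
        self.current = name


repo = Repo()
repo.commit("Initial catalog", {"products": {1: {"name": "Laptop Pro", "price": 1200.00}}})
repo.branch("feature/pricing-update")
repo.checkout("feature/pricing-update")
repo.commit("Raise price", {"products": {1: {"name": "Laptop Pro", "price": 1250.00}}})

main_price = repo.branches["main"].tables["products"][1]["price"]
feature_price = repo.branches["feature/pricing-update"].tables["products"][1]["price"]
print(main_price, feature_price)
```

Committing on the feature branch moves only that branch's pointer, which is why main keeps its original price in this model.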

How to Manage Branches

Let’s start by initializing a Dolt database and creating our first branch.

First, ensure Dolt is installed. For the latest stable version (e.g., 1.20.0 as of early 2026), you can often download binaries or use Docker. Refer to the official DoltHub documentation for precise instructions.

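The examples in this chapter use the Dolt CLI, which works directly against the database directory, so a running server is optional. Note that Dolt itself speaks the MySQL dialect and wire protocol. PostgreSQL compatibility comes from Doltgres, a separate product with its own doltgres binary, not from a flag on dolt sql-server. If you also want to connect with a SQL client, start Dolt’s server from inside the database directory once you have created it below:

dolt sql-server

This command starts a MySQL-compatible server listening on port 3306 by default. You can connect to it using any MySQL client (like the mysql command-line client or DBeaver).

Run the Dolt CLI commands below in a separate terminal window.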

# Create a directory for the database and enter it
mkdir my_product_catalog
cd my_product_catalog

# Initialize a new Dolt database in the current directory
dolt init

# Create a sample table for products (Dolt uses MySQL syntax, so the UUID key is a VARCHAR(36) that defaults to UUID())
dolt sql -q "CREATE TABLE products (id VARCHAR(36) PRIMARY KEY DEFAULT (UUID()), name TEXT, price DECIMAL(10, 2), stock_quantity INT);"

# Insert some initial data
dolt sql -q "INSERT INTO products (name, price, stock_quantity) VALUES ('Laptop Pro', 1200.00, 50), ('Wireless Mouse', 25.00, 200);"

# Check the status of your changes
dolt status

You’ll see output indicating products is an untracked table. Now, let’s commit these initial changes.

# Add all changes to the staging area
dolt add .

# Commit the changes with a descriptive message
dolt commit -m "Initial product catalog setup"

Now, let’s create a new branch for a pricing update feature.

# Create a new branch named 'feature/pricing-update'
dolt branch feature/pricing-update

# Switch to the new branch
dolt checkout feature/pricing-update

# Verify your current branch
dolt branch

The output of dolt branch will show an asterisk next to feature/pricing-update, indicating you are now on that branch. Any changes you make to the database now will only affect this branch until you switch back or merge.

Commits: Snapshots of Your Data’s History

Just like in Git, a commit in Dolt is an atomic snapshot of your database at a particular point in time. It captures all changes (data and schema) that have been staged.

What is a commit? A commit is a record of changes to your database, along with metadata like the author, timestamp, and a commit message. Each commit has a unique identifier (a hash).

Why commit frequently?

  • Granular History: Creates a detailed, auditable history of every change.
  • Easy Rollback: Allows you to revert to any previous state of your database.
  • Context: Descriptive commit messages explain why changes were made, aiding collaboration and debugging.
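The idea that each commit has a unique identifier derived from what it records can be sketched in a few lines of Python. This is an illustration under loud assumptions: Dolt's real hashing scheme and storage format differ, and the commit_id function, the JSON encoding, and the use of SHA-256 are all choices made for this example only.

```python
import hashlib
import json


def commit_id(parent_id, snapshot, author, message):
    """Derive an identifier from a commit's contents (illustrative only)."""
    payload = json.dumps(
        {"parent": parent_id, "snapshot": snapshot, "author": author, "message": message},
        sort_keys=True,
    )
    # Truncated to 32 hex characters purely to keep the example readable.
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:32]


root = commit_id(
    None,
    {"products": [["Laptop Pro", "1200.00"]]},
    "dev-a",
    "Initial product catalog setup",
)
child = commit_id(
    root,
    {"products": [["Laptop Pro", "1250.00"]]},
    "dev-a",
    "Raise Laptop Pro price",
)
print(root != child)
```

Because the parent's identifier feeds into the child's, changing any earlier commit would change every identifier after it, which is what makes the history auditable.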

Making and Committing Changes

Let’s update some prices on our feature/pricing-update branch.

# Update prices on the current branch
dolt sql -q "UPDATE products SET price = 1250.00 WHERE name = 'Laptop Pro';"
dolt sql -q "UPDATE products SET price = 28.00 WHERE name = 'Wireless Mouse';"

# Check the changes
dolt diff

The dolt diff command will show you the changes you’ve made to the products table on your current branch. It’s similar to git diff for code, but it shows row-level data differences.

Now, commit these changes:

# Stage the changes
dolt add .

# Commit with a message (single quotes keep the shell from expanding $50 and $3 as variables)
dolt commit -m 'Increased Laptop Pro price by $50 and Wireless Mouse by $3'

You’ve successfully made and committed changes on an isolated branch. The main branch remains untouched.

Merging: Bringing Data Changes Together

Once changes on a feature branch are complete and tested, they need to be integrated back into the main line of development. This process is called merging.

What is merging? Merging combines the history of one branch into another. Dolt intelligently applies the changes from the source branch to the target branch.

Why merge?

  • Integration: Incorporate new features or fixes into the main database.
  • Synchronization: Keep different development lines up-to-date with each other.

Performing a Merge

Let’s switch back to main and merge our pricing updates.

# Switch back to the main branch
dolt checkout main

# Verify the prices on main (they should be the original ones)
dolt sql -q "SELECT name, price FROM products;"

You’ll see the original prices. Now, merge the feature/pricing-update branch into main.

# Merge the feature branch into main
dolt merge feature/pricing-update

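Dolt will perform the merge. Because main has no new commits since the branch was created, this is a fast-forward: main simply moves to the tip of the feature branch, so no conflicts are possible.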

# Verify the prices on main again
dolt sql -q "SELECT name, price FROM products;"

Now, main reflects the updated prices. The feature/pricing-update branch can now be deleted if no longer needed, or kept for future iterations.
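How much work a merge involves depends on how the two branches relate in the commit graph. In the example above, main had no commits of its own after the branch point, so its pointer could just move forward. The Python sketch below is a simplified model of that classification; the merge_kind function and the commit names are hypothetical and do not reflect Dolt's internals.

```python
def ancestors(commit, parents):
    """Return `commit` plus every commit reachable from it through `parents`."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def merge_kind(target, source, parents):
    """Classify a merge of branch tip `source` into branch tip `target`."""
    if source in ancestors(target, parents):
        return "already up to date"  # target already contains source's history
    if target in ancestors(source, parents):
        return "fast-forward"        # target only needs to move its pointer
    return "three-way"               # both sides have commits the other lacks


# Hypothetical commit graph: commit -> list of parent commits.
parents = {"c1": [], "c2": ["c1"], "m2": ["c1"]}

print(merge_kind("c1", "c2", parents))  # main at c1, feature at c2
print(merge_kind("m2", "c2", parents))  # main advanced to m2 in the meantime
```

Only the three-way case can produce conflicts, because it is the only one where both branches changed data independently.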

Diffs: Seeing What Changed

dolt diff is your best friend for understanding what has changed between commits, branches, or even specific tables. It provides a clear, row-level view of additions, deletions, and modifications.

Why diff?

  • Auditing: Understand precisely what data changed and when.
  • Code Review for Data: Review data changes before merging.
  • Debugging: Pinpoint when and how a specific data point changed.
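A row-level diff can be pictured as comparing two snapshots of a table keyed by primary key. The following Python sketch shows the idea; diff_rows and the sample keys are hypothetical, and real dolt diff output is formatted differently.

```python
def diff_rows(old, new):
    """Compare two snapshots of one table, each a dict of primary key -> row dict."""
    added = {pk: new[pk] for pk in new.keys() - old.keys()}
    removed = {pk: old[pk] for pk in old.keys() - new.keys()}
    modified = {}
    for pk in old.keys() & new.keys():
        # Record only the columns whose values differ, as (old, new) pairs.
        changed = {
            col: (old[pk].get(col), new[pk].get(col))
            for col in old[pk].keys() | new[pk].keys()
            if old[pk].get(col) != new[pk].get(col)
        }
        if changed:
            modified[pk] = changed
    return added, removed, modified


before = {
    "p1": {"name": "Laptop Pro", "price": 1200.00},
    "p2": {"name": "Wireless Mouse", "price": 25.00},
}
after = {
    "p1": {"name": "Laptop Pro", "price": 1250.00},
    "p2": {"name": "Wireless Mouse", "price": 28.00},
    "p3": {"name": "USB-C Hub", "price": 49.99},
}

added, removed, modified = diff_rows(before, after)
print(added, removed, modified)
```

Reporting changes per cell rather than per row is what makes a data diff reviewable in the same way as a code diff.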

Using dolt diff

We already saw dolt diff to view uncommitted changes. You can also diff between branches or commits:

# Diff your working set against the tip of 'main' (run on the feature branch before the merge, this shows the price changes)
dolt diff main

# Diff a specific table between two branches
dolt diff main feature/pricing-update products

# View the history of commits
dolt log

dolt log is like git log, showing commit IDs, authors, dates, and messages. You can use these commit IDs with dolt diff to compare any two points in history.

Conflict Resolution: When Data Diverges

Merge conflicts occur when two branches modify the same piece of data in different ways. Dolt, like Git, cannot automatically decide which change to keep, so it flags the conflict for manual resolution.

What are conflicts? A merge conflict is a situation where Dolt cannot automatically reconcile divergent changes to the same data cell.

Why do they occur? Multiple users or processes edit the same row/column concurrently on different branches.

How Dolt handles them: Dolt identifies conflicts at the cell level. This means if two branches modify different columns in the same row, it’s often a clean merge. Conflicts arise when the exact same cell (row and column) is modified differently.
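The cell-level rule described above is a three-way comparison against the common ancestor. Here is a minimal Python sketch of that rule for a single row; merge_row is a hypothetical helper written for this chapter, not Dolt's implementation, and it assumes all three versions share the same columns.

```python
def merge_row(base, ours, theirs):
    """Three-way merge of one row (dicts of column -> value).

    Returns (merged_row, conflicts), where conflicts maps column -> (ours, theirs).
    """
    merged, conflicts = {}, {}
    for col in base:
        b, o, t = base[col], ours[col], theirs[col]
        if o == t:        # both sides agree, or neither changed the cell
            merged[col] = o
        elif o == b:      # only theirs changed this cell
            merged[col] = t
        elif t == b:      # only ours changed this cell
            merged[col] = o
        else:             # both changed the same cell to different values
            merged[col] = o  # keep ours as a placeholder until resolved
            conflicts[col] = (o, t)
    return merged, conflicts


base = {"name": "Laptop Pro", "price": 1200.00, "stock_quantity": 50}
ours = dict(base, price=1250.00)

# Different columns of the same row: merges cleanly.
clean, clean_conflicts = merge_row(base, ours, dict(base, stock_quantity=45))

# The same cell changed two ways: reported as a conflict.
_, price_conflicts = merge_row(base, ours, dict(base, price=1300.00))
print(clean, clean_conflicts, price_conflicts)
```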

Strategies for Resolution

When a merge conflict occurs, Dolt stops before creating the merge commit and reports the affected tables. You’ll typically follow these steps:

  1. dolt status: Identify which tables have conflicts.
  2. dolt conflicts cat <table_name>: Inspect the conflicting rows. Dolt shows the common ancestor (“base”), “ours” (your current branch’s version), and “theirs” (the incoming branch’s version).
  3. Resolve:
    • Manual Edit: Update the conflicting cells to the desired state with dolt sql, then clear the recorded conflicts by deleting their rows from the dolt_conflicts_<table_name> system table.
    • dolt conflicts resolve --ours <table_name>: Accept your current branch’s version for every conflict in the table.
    • dolt conflicts resolve --theirs <table_name>: Accept the incoming branch’s version for every conflict in the table.
  4. dolt add .: Stage the resolved tables.
  5. dolt commit -m "Resolved merge conflict": Complete the merge.

If you would rather back out, dolt merge --abort returns the branch to its state before the merge.

This process is very similar to how you’d resolve code conflicts in Git.

flowchart TD
    Start[Start] --> CreateBranch[Create Branch]
    CreateBranch --> EditCommit[Edit Commit]
    EditCommit --> AttemptMerge[Attempt Merge]
    AttemptMerge --> Conflicts{Conflicts}
    Conflicts -->|Yes| Resolve[Resolve]
    Resolve --> CommitResolved[Commit Merge]
    CommitResolved --> End[End]
    Conflicts -->|No| End
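The choice between keeping ours, keeping theirs, or writing a custom value can also be expressed programmatically. The Python sketch below models that decision for a set of conflicting cells; the resolve function and the shape of the conflicts dictionary are assumptions made for illustration and are not a Dolt API.

```python
def resolve(conflicts, strategy="ours", custom=None):
    """Pick a final value for each conflicting cell.

    conflicts: dict of (primary_key, column) -> {"base": ..., "ours": ..., "theirs": ...}
    strategy:  "ours", "theirs", or "custom" (then `custom` is called with the versions dict)
    """
    resolved = {}
    for cell, versions in conflicts.items():
        if strategy == "custom":
            resolved[cell] = custom(versions)
        elif strategy in ("ours", "theirs"):
            resolved[cell] = versions[strategy]
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return resolved


conflicts = {
    ("laptop-pro", "price"): {"base": 1200.00, "ours": 1250.00, "theirs": 1300.00},
}

keep_ours = resolve(conflicts, "ours")
keep_theirs = resolve(conflicts, "theirs")
# A custom rule: take whichever side proposed the lower price.
lowest = resolve(conflicts, "custom", custom=lambda v: min(v["ours"], v["theirs"]))
print(keep_ours, keep_theirs, lowest)
```

A custom rule like this is only safe when the team has agreed on it in advance; otherwise a manual review of each conflict is the better default.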

Step-by-Step Implementation: Building a Versioned Product Catalog

Let’s put these concepts into practice with a hands-on scenario. We’ll simulate two developers working on a product catalog, introducing and resolving a conflict.

The steps below use only the Dolt CLI, so a running dolt sql-server is optional. If you do run one, keep it in its own terminal and use another terminal for the dolt CLI commands.

Scenario Setup: Initializing the Catalog

If you followed the earlier examples, the products table and the feature/pricing-update branch already exist, so start from a fresh database. Uncomment the lines below and run them from the parent directory of my_product_catalog:

# Start fresh: remove the existing database, then create and initialize a new one
# rm -rf my_product_catalog
# mkdir my_product_catalog
# cd my_product_catalog
# dolt init

# Create the products table with a UUID primary key (MySQL syntax: VARCHAR(36) defaulting to UUID())
dolt sql -q "CREATE TABLE products (id VARCHAR(36) PRIMARY KEY DEFAULT (UUID()), name TEXT NOT NULL, description TEXT, price DECIMAL(10, 2) NOT NULL, stock_quantity INT DEFAULT 0);"

# Insert some initial data
dolt sql -q "INSERT INTO products (name, description, price, stock_quantity) VALUES ('Laptop Pro', 'High-performance laptop for professionals.', 1200.00, 50), ('Wireless Mouse', 'Ergonomic mouse with long battery life.', 25.00, 200), ('Mechanical Keyboard', 'Tactile switches for an optimal typing experience.', 150.00, 75);"

# Stage and commit the initial catalog
dolt add .
dolt commit -m "Initial product catalog with Laptop Pro, Mouse, Keyboard"

Step 1: Feature Branch for Pricing Update

Developer A needs to adjust pricing for some products. They’ll create a feature branch.

# Create a new branch for pricing adjustments
dolt branch feature/pricing-update

# Switch to the new branch
dolt checkout feature/pricing-update

# Update prices
dolt sql -q "UPDATE products SET price = 1250.00 WHERE name = 'Laptop Pro';"
dolt sql -q "UPDATE products SET price = 28.00 WHERE name = 'Wireless Mouse';"

# Stage and commit the price updates (single quotes keep the shell from expanding $50 and $3)
dolt add .
dolt commit -m 'Feature: Increased Laptop Pro price by $50 and Wireless Mouse by $3'

Step 2: Parallel Work - Adding a New Product (on main)

While Developer A works on pricing, Developer B (or you, switching roles) adds a new product to the main catalog.

# Switch back to the main branch
dolt checkout main

# Add a new product
dolt sql -q "INSERT INTO products (name, description, price, stock_quantity) VALUES ('USB-C Hub', 'Compact hub with multiple ports.', 49.99, 150);"

# Stage and commit the new product
dolt add .
dolt commit -m "Main: Added new product 'USB-C Hub'"

Step 3: Merging the Pricing Update into main

Developer A finishes their pricing work and wants to merge it into main.

# Ensure you are on the main branch to perform the merge
dolt checkout main

# Merge the feature/pricing-update branch
dolt merge feature/pricing-update

Dolt will successfully merge these changes because the feature/pricing-update branch only modified existing rows, while main added a new row. These are distinct changes, so no conflict occurs.

# Verify all products and their prices on main
dolt sql -q "SELECT name, price FROM products;"

You should see Laptop Pro at $1250.00, Wireless Mouse at $28.00, and USB-C Hub at $49.99, along with the Mechanical Keyboard.
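The reason this merge is clean can be checked with a small three-way merge over whole rows. The Python sketch below replays the scenario's data; merge_tables and the short row keys are hypothetical, and the model works at row granularity rather than Dolt's finer cell granularity.

```python
def merge_tables(base, ours, theirs):
    """Three-way merge of a table (dict of primary key -> row tuple).

    A key conflicts only if both sides changed it to different results.
    Returns (merged, conflicting_keys).
    """
    merged, conflicts = {}, []
    for pk in base.keys() | ours.keys() | theirs.keys():
        b, o, t = base.get(pk), ours.get(pk), theirs.get(pk)
        if o == t:
            winner = o
        elif o == b:          # only theirs touched this row
            winner = t
        elif t == b:          # only ours touched this row
            winner = o
        else:
            winner = o
            conflicts.append(pk)
        if winner is not None:  # None means the row was deleted
            merged[pk] = winner
    return merged, conflicts


base = {
    "laptop": ("Laptop Pro", 1200.00),
    "mouse": ("Wireless Mouse", 25.00),
    "keyboard": ("Mechanical Keyboard", 150.00),
}
# main ("ours") added a product; the feature branch ("theirs") changed two prices.
ours = dict(base, hub=("USB-C Hub", 49.99))
theirs = dict(base, laptop=("Laptop Pro", 1250.00), mouse=("Wireless Mouse", 28.00))

merged, conflicts = merge_tables(base, ours, theirs)
print(sorted(merged.items()), conflicts)
```

Each row was changed by at most one side, so every key falls into one of the non-conflicting cases.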

Step 4: Introducing and Resolving a Conflict

Now, let’s intentionally create a conflict to learn how to resolve it.

Conflict Scenario:

  • Developer A (on a new bugfix/description branch) expands the Laptop Pro description.
  • Developer B (on main) simultaneously updates the Laptop Pro description to something else.

# Create a new branch for a description bugfix
dolt branch bugfix/description

# Switch to the bugfix branch
dolt checkout bugfix/description

# Developer A's change: Expand the Laptop Pro description
dolt sql -q "UPDATE products SET description = 'High-performance laptop for professionals, optimized for creative tasks.' WHERE name = 'Laptop Pro';"

# Stage and commit Developer A's change
dolt add .
dolt commit -m "Bugfix: Improved Laptop Pro description with creative tasks detail"

Now, switch back to main and make a different change to the same cell.

# Switch back to main
dolt checkout main

# Developer B's change: Update Laptop Pro description for general audience
dolt sql -q "UPDATE products SET description = 'A powerful laptop designed for everyday productivity and advanced applications.' WHERE name = 'Laptop Pro';"

# Stage and commit Developer B's change
dolt add .
dolt commit -m "Main: Updated Laptop Pro description for broad appeal"

Now, try to merge bugfix/description into main:

# Ensure you are on main
dolt checkout main

# Attempt to merge the bugfix branch
dolt merge bugfix/description

You will get a merge conflict! Dolt will report something like:

Automatic merge failed; fix conflicts and then commit the result.

Resolving the Conflict

  1. Check Status: See which tables are in conflict.

    dolt status

    You’ll see products reported as unmerged (for example, marked “both modified”).

  2. Inspect Conflicts: Use dolt conflicts cat to see the conflicting versions.

    dolt conflicts cat products

    Dolt lists each conflicting row in three versions: the base (the common ancestor of both branches), ours (the current branch, main), and theirs (the incoming bugfix/description branch). For Laptop Pro, only the description cell differs:

    base:   High-performance laptop for professionals.
    ours:   A powerful laptop designed for everyday productivity and advanced applications.
    theirs: High-performance laptop for professionals, optimized for creative tasks.

    (Note: this listing is abridged to the conflicting cell. The actual output is a table that includes every column, and the id UUIDs will be different in your output.)

    In a real scenario, you’d discuss with your team which description is more accurate, or combine them. For this exercise, let’s keep a combined version.

  3. Resolve Manually (Example): You can directly edit the table using dolt sql. Find the conflicting row and update the description to your desired final version.

    dolt sql -q "UPDATE products SET description = 'A powerful, high-performance laptop for professionals, optimized for both productivity and creative tasks.' WHERE name = 'Laptop Pro';"

  4. Stage the Resolution: After manually updating, clear the recorded conflict so Dolt knows it is resolved, then stage the table.

    dolt sql -q "DELETE FROM dolt_conflicts_products;"
    dolt add .

  5. Commit the Merge: Finally, commit the merge with a message.

    dolt commit -m "Merged bugfix/description and resolved conflict for Laptop Pro description"

    The merge is now complete, and your main branch contains the pricing updates, the new product, and the resolved description for Laptop Pro.

📌 Key Idea: Cell-Level Conflict Resolution

Dolt’s ability to detect conflicts at the cell level is incredibly powerful. Because Dolt tracks granular changes to your data rather than only schema migrations or whole-database snapshots, edits to different cells of the same row merge automatically, and only genuine overlaps need manual attention.

Mini-Challenge: Schema and Data Evolution During Merges

This challenge will test your understanding of how Dolt handles both schema and data changes during merges.

Challenge:

  1. Ensure you’re on the main branch.
  2. Create a new branch named experiment/category-feature.
  3. On experiment/category-feature:
    • Add a new column category TEXT to the products table.
    • Update the category for existing products (e.g., ‘Electronics’, ‘Peripherals’).
    • Commit these changes.
  4. Switch back to the main branch.
  5. On main:
    • Add a new product with a name, description, price, stock_quantity, but without a category (since main doesn’t have that column yet).
    • Commit this new product.
  6. Try to merge experiment/category-feature into main.

What to observe/learn: What kind of conflict, if any, does Dolt report? How do you resolve it? Pay close attention to dolt status and dolt diff after the merge attempt. This scenario highlights how Dolt intelligently handles schema evolution alongside data changes.

Hint: Dolt is smart about schema changes. If a column is added on one branch and a row is added on another, the merge will often succeed, with the new column being NULL for the rows added on the branch without the column. However, if both branches modified the same cell or the schema change introduces a NOT NULL constraint that conflicts with existing data, you might encounter conflicts.
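The behavior the hint describes can be modeled in Python before you try it against a real database. This sketch assumes the simplest case only: one branch adds a column and fills it in, while main only adds rows. The merge_added_column function and the Webcam row are hypothetical, and the model says nothing about constraint violations or overlapping edits.

```python
def merge_added_column(main_rows, branch_rows, merged_columns):
    """Union rows from both sides; columns a row lacks become None (SQL NULL).

    Assumes the branch only added a column and its values, and main only added rows.
    """
    merged = {}
    for pk in sorted(main_rows.keys() | branch_rows.keys()):
        source = branch_rows[pk] if pk in branch_rows else main_rows[pk]
        merged[pk] = {col: source.get(col) for col in merged_columns}
    return merged


# experiment/category-feature: new 'category' column populated for existing rows.
branch_rows = {
    1: {"name": "Laptop Pro", "category": "Electronics"},
    2: {"name": "Wireless Mouse", "category": "Peripherals"},
}
# main: no 'category' column yet, plus one newly added (hypothetical) product.
main_rows = {
    1: {"name": "Laptop Pro"},
    2: {"name": "Wireless Mouse"},
    3: {"name": "Webcam"},
}

merged = merge_added_column(main_rows, branch_rows, ["name", "category"])
print(merged[3])
```

Compare this prediction with what dolt status and a SELECT on products show after your own merge attempt.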

Common Pitfalls & Troubleshooting

  1. Forgetting dolt add . before dolt commit: Just like Git, Dolt requires you to stage your changes (dolt add .) before they can be committed. If you try to commit without adding, Dolt will tell you “nothing to commit”. Always check dolt status first!
  2. Ignoring Merge Conflicts: If a merge results in conflicts, Dolt will stop and require manual intervention. Don’t ignore the conflict messages. If you try to commit without resolving, Dolt will prevent it. Use dolt status and dolt conflicts cat <table_name> to understand the conflict, or dolt merge --abort to back out of the merge.
  3. Lack of a Clear Branching Strategy: While Dolt gives you the tools, your team needs a strategy. Will you use feature branches? Release branches? Hotfix branches? Without a clear plan, your Dolt history can become messy and hard to navigate.
  4. Performance with Large Diff/Merge Operations: For extremely large tables (millions or billions of rows), dolt diff or complex merges can take time. This is because Dolt is comparing actual data. Consider committing smaller, more frequent changes rather than massive, infrequent ones, and pass table names to dolt diff (dolt diff <table_name>) to focus on specific tables.

Summary

In this chapter, you’ve mastered the critical concepts of branching and merging within Dolt, applying Git’s powerful version control directly to your data:

  • Branches provide isolated sandboxes for developing new features, running experiments, or fixing bugs without affecting your main dataset.
  • Commits create atomic, auditable snapshots of your database’s state, enabling granular history and easy rollbacks.
  • Merging allows you to safely integrate changes from different branches, bringing divergent data histories together.
  • Diffs are indispensable for reviewing, auditing, and understanding precisely what data and schema changes have occurred.
  • Conflict Resolution equips you to handle situations where parallel data development diverges, ensuring data integrity through manual or programmatic reconciliation at the cell level.

By leveraging these capabilities, you can establish highly collaborative, auditable, and resilient data workflows, moving beyond the limitations of traditional database management.

Next, we’ll explore how to extend these collaborative workflows to remote repositories and DoltHub, enabling seamless synchronization and team-wide data versioning across distributed environments.

