Introduction: The “Who, What, When” of Your Data
Imagine trying to debug an issue in a traditional database. A critical value changed, but when? Who changed it? And what was it before? These questions often lead to digging through application logs, backups, or worse, shrugging your shoulders in frustration. Traditional databases often lack the built-in capabilities to answer these fundamental questions about data evolution.
This is where Dolt shines. By bringing Git-style version control to your SQL database, Dolt fundamentally changes how you interact with data history. In this chapter, we’ll dive into the core Dolt commands that allow you to track, inspect, and even “time travel” through your database’s past. This knowledge is crucial for auditing, debugging, and understanding data evolution in any production environment.
You’ll learn to:
- Understand and use
dolt diffto see precise changes in your data and schema. - Navigate your database’s commit history with
dolt log. - Master time travel queries to retrieve data from any point in the past.
We’ll build directly upon the foundational Dolt concepts and setup from previous chapters. Get ready to unlock a superpower for your data!
The Essence of Data Versioning: dolt diff
At the heart of version control lies the ability to see what changed. In Git, you use git diff to compare file versions. Dolt brings this exact capability to your database tables, offering unparalleled transparency into data modifications.
What is dolt diff?
dolt diff is a command that shows you the differences between two states of your Dolt database. These states could be:
- Your current working set of changes (uncommitted modifications) versus the last committed version.
- Two different commits in your history.
- A specific commit versus another commit.
- Changes within a specific table.
It’s like a magnifying glass for your data, highlighting exactly which rows were added, deleted, or modified, and which cells within those rows changed.
Why is dolt diff Crucial?
dolt diff is indispensable for several reasons, making it a cornerstone of data governance and robust data pipelines:
- Auditing: Easily see who changed what and when (assuming commit messages are descriptive). This is vital for compliance, accountability, and regulatory requirements.
- Debugging: Pinpoint the exact data change that might have introduced a bug or unexpected behavior in an application.
- Data Review: Just as developers review code changes, data professionals can review data changes before merging them into a main branch or production environment.
- Understanding Data Evolution: Get a clear, granular picture of how your data has transformed over time, which is invaluable for data analysis and lineage tracking.
How dolt diff Works (Conceptually)
When you run dolt diff, Dolt intelligently compares the primary key values of rows between the two states you specify.
- If a primary key exists in one state but not the other, it’s identified as an added or deleted row.
- If a primary key exists in both, Dolt then compares the values in each column for that specific row. Any differing values are highlighted as modifications. This cell-level tracking is key to Dolt’s power.
Practical Application: Seeing Your First Data Diff
Let’s imagine you have a Dolt database set up (from Chapter 1) and have committed some initial data (from Chapter 2). If not, quickly create a new Dolt database and populate it as follows:
# Initialize a new Dolt database
dolt init my_inventory_db
cd my_inventory_db
# Create a products table
dolt sql -q "CREATE TABLE products (id INT PRIMARY KEY, name VARCHAR(255), price DECIMAL(10, 2));"
dolt add .
dolt commit -m "Initial schema for products table"
# Insert initial product data
dolt sql -q "INSERT INTO products (id, name, price) VALUES (1, 'Laptop', 1200.00), (2, 'Mouse', 25.00);"
dolt add .
dolt commit -m "Added initial products"Now, let’s make a change and see the diff.
Modify a product’s price:
dolt sql -q "UPDATE products SET price = 1250.00 WHERE id = 1;"Run
dolt diff: This command shows you the changes between your working set (what you just updated) and the last commit.dolt diffYou should see output similar to this (exact formatting may vary slightly with Dolt versions as of 2026-06-06):
--- a/products +++ b/products @@ -1,2 +1,2 @@ | id | name | price | |----|--------|---------| -|- 1 | Laptop | 1200.00 | +|- 1 | Laptop | 1250.00 | |- 2 | Mouse | 25.00 |--- a/products: Represents theproductstable in the old state (before your change).+++ b/products: Represents theproductstable in the new state (with your change).- Lines starting with
-show data that was removed from the old state. - Lines starting with
+show data that was added in the new state. - In this case, the row for
Laptopwithprice = 1200.00was “removed” (conceptually, from the old state), and a row withprice = 1250.00was “added” (to the new state). This indicates an update on an existing row.
Add a new product:
dolt sql -q "INSERT INTO products (id, name, price) VALUES (3, 'Keyboard', 75.00);"Run
dolt diffagain:dolt diffNow you’ll see both the price update and the new row:
--- a/products +++ b/products @@ -1,2 +1,3 @@ | id | name | price | |----|----------|---------| -|- 1 | Laptop | 1200.00 | +|- 1 | Laptop | 1250.00 | |- 2 | Mouse | 25.00 | ++|- 3 | Keyboard | 75.00 |The
++indicates a completely new row was added. This level of detail is exactly what makesdolt diffso powerful for understanding data changes.
Exploring Data History with dolt log
Once you start making changes and using dolt diff, the next logical step is to track these changes over time in a structured, auditable way. This is where dolt log comes in, providing a complete history of your database’s evolution.
What is a “Commit” in Dolt?
Just like in Git, a “commit” in Dolt is a permanent snapshot of your entire database (both schema and data) at a specific point in time. Each commit is a distinct record with several key pieces of information:
- A unique identifier: A cryptographic hash (e.g.,
f00dcafe...) that uniquely identifies this state. - An author: The user who made the commit.
- A timestamp: The exact date and time the commit was made.
- A commit message: A descriptive text explaining the purpose of the changes.
- Pointers to its parent commit(s): This forms a chronological history chain, allowing Dolt to reconstruct any past state.
Why dolt log?
dolt log displays the commit history of your Dolt database. It’s your window into the entire evolution of your data and schema, providing a comprehensive audit trail. You can use it to:
- Trace Changes: See precisely who made changes and when, across the entire history of the database.
- Review Context: Quickly review commit messages to understand the purpose and business context of each change.
- Identify States: Locate specific commits to investigate further with
dolt diffor for time travel queries. - Data Lineage: Understand the lineage of your data, seeing how it transformed from its initial state to its current form.
dolt log is analogous to git log, providing a chronological list of all commits, making database history as navigable as code history.
Practical Application: Logging Our Changes
Let’s commit the changes we made in the previous section (the price update and new product) and then view the log.
Commit your changes:
dolt add . dolt commit -m "Updated Laptop price and added Keyboard"dolt add .stages all changes (both schema and data) in the current working directory.dolt committhen saves them as a new, immutable commit.View the commit log:
dolt logYou’ll see output similar to this, showing your recent commit along with the initial ones. The commit hashes will be unique to your system.
commit f00dcafe1234567890abcdef12345678 Author: Your Name <your.email@example.com> Date: Mon, 06 Jun 2026 10:30:00 -0700 Updated Laptop price and added Keyboard commit beefcafe1234567890abcdef12345678 Author: Your Name <your.email@example.com> Date: Mon, 06 Jun 2026 10:25:00 -0700 Added initial products commit deadbeef1234567890abcdef12345678 Author: Your Name <your.email@example.com> Date: Mon, 06 Jun 2026 10:20:00 -0700 Initial schema for products tableEach
commitentry shows the hash, author, date, and commit message. This history is immutable and provides a complete audit trail, a critical feature for compliance and debugging.⚡ Quick Note: You can customize
dolt logoutput. For example,dolt log -pshows the diff for each commit directly in the log, anddolt log -n 1shows only the latest commit.
Time Travel Queries: Accessing Past Data
This is where Dolt truly extends the traditional database paradigm. With conventional databases, querying historical data often means complex backups, snapshots, or custom audit tables. With Dolt, it’s a built-in feature, as simple as adding a clause to your standard SQL query.
What are Time Travel Queries?
Time travel queries allow you to query the state of your database (or specific tables) at any past commit. Instead of just seeing the current data, you can ask, “What did this table look like last week?” or “What was the price of ‘Laptop’ two commits ago?” This capability is fundamental to building robust, auditable data systems.
Why are Time Travel Queries Powerful?
Time travel queries unlock a new dimension of data interaction, making them incredibly powerful for various use cases:
- Auditing and Compliance: Easily reconstruct the exact state of data for regulatory checks or to prove data integrity at any given point.
- Historical Analysis: Analyze trends or changes in your data over time without needing separate data warehousing solutions for historical views. This simplifies complex analytical tasks.
- Debugging and Rollback: Quickly identify the data state before an error was introduced, or even revert to it, significantly reducing recovery time.
- AI/ML Model Training: Versioning data for machine learning models is crucial for reproducibility and debugging model performance. Time travel queries allow you to retrieve the exact dataset used for a specific model training run, ensuring model integrity.
- A/B Testing Analysis: Compare key metrics from different periods, correlating them with changes in your application or data, providing precise insights into experiment outcomes.
Syntax for Time Travel Queries
Dolt extends standard SQL with a powerful AS OF clause.
SELECT * FROM <table_name> AS OF <commit_ref>;<commit_ref> can be highly flexible, allowing you to specify the exact point in time or state you wish to query:
- A full or partial commit hash (e.g.,
'f00dcafe','f00dcafe123'). - A branch name (e.g.,
'main','feature/new-product'). - A tag name (e.g.,
'v1.0','release-2026-05-01'). 'HEAD'(the latest commit on the current branch).'WORKING'(your uncommitted changes, useful for previewing).'STAGED'(your staged changes, also for previewing).- A timestamp (e.g.,
'2026-06-05 14:30:00').
Practical Application: Querying Our Past
Let’s use the commit hashes from our dolt log output to query past states.
Find your commit hashes: Run
dolt logand copy the second to last commit hash (the one for “Added initial products”). Let’s assume it wasbeefcafe1234567890abcdef12345678(you’ll use your actual hash).Query the
productstable as of that commit: Replace'beefcafe123'with a partial or full commit hash from yourdolt logoutput.dolt sql -q "SELECT * FROM products AS OF 'beefcafe123';"You should see only the initial products, with the Laptop’s original price:
+----+--------+---------+ | id | name | price | +----+--------+---------+ | 1 | Laptop | 1200.00 | | 2 | Mouse | 25.00 | +----+--------+---------+Notice the “Keyboard” is missing, and “Laptop” is at its original price. This is the database state before our last commit.
Query the current
HEAD(latest committed state):dolt sql -q "SELECT * FROM products AS OF 'HEAD';"This will show the latest committed state:
+----+----------+---------+ | id | name | price | +----+----------+---------+ | 1 | Laptop | 1250.00 | | 2 | Mouse | 25.00 | | 3 | Keyboard | 75.00 | +----+----------+---------+🧠 Important: When you run a standard
SELECT * FROM products;query withoutAS OF, Dolt queries theHEADof your current branch by default, providing the most current committed data.
Step-by-Step Implementation: Tracking Our Inventory Data Through Time
Let’s walk through a complete scenario to solidify these concepts. This hands-on exercise will demonstrate the power of dolt diff, dolt commit, dolt log, and time travel queries in a practical workflow.
Ensure you’re in your
my_inventory_dbdirectory. If not,cd my_inventory_db. We’ll assume your database is in the state after adding the Keyboard and committing.Update an existing product’s name and price:
dolt sql -q "UPDATE products SET name = 'Gaming Laptop', price = 1500.00 WHERE id = 1;"Add a new product:
dolt sql -q "INSERT INTO products (id, name, price) VALUES (4, 'Webcam', 49.99);"Delete an old product:
dolt sql -q "DELETE FROM products WHERE id = 2;"See all pending changes with
dolt diff: Before committing, always check your changes.dolt diffObserve the output carefully. You should see:
- The
Laptoprow updated toGaming Laptopand1500.00(represented by a-and+pair). - The
Mouserow marked as deleted (-). - The
Webcamrow marked as added (++).
This single
dolt diffcommand summarizes all your uncommitted changes across the entire database, providing a clear pre-commit review.- The
Commit these changes: Stage all changes and commit them with a descriptive message.
dolt add . dolt commit -m "Renamed Laptop to Gaming Laptop, updated price, added Webcam, deleted Mouse."Review the entire history with
dolt log: Check that your new commit is at the top of the history.dolt logYou’ll now see your latest, detailed commit message at the top of the log, providing a complete audit trail.
Time travel to before the deletion of the ‘Mouse’: Find the commit hash before your last commit (the one that added “Keyboard”). Let’s assume it was
beefcafe123(use your actual hash fromdolt log).dolt sql -q "SELECT * FROM products AS OF 'beefcafe123';"You should see the
Mousestill present, theLaptopat its previous price, and theKeyboardpresent, but noGaming LaptoporWebcam. This demonstrates retrieving a specific historical state.Time travel to your very first product commit: Find the commit hash for “Added initial products”. Let’s say it was
deadbeef123(use your actual hash).dolt sql -q "SELECT * FROM products AS OF 'deadbeef123';"This will show only
Laptop(at its original price) andMouse, effectively rewinding your database to its early state.
This sequence of dolt diff, dolt commit, dolt log, and AS OF queries demonstrates the powerful auditing and historical analysis capabilities Dolt provides right out of the box, making data history an integral part of your workflow.
Mini-Challenge: The Price Fluctuation
Your turn! Let’s simulate a common scenario: a product’s price changes frequently. This will reinforce your understanding of granular data versioning.
Challenge:
- Add a new product:
id = 5,name = 'Headphones',price = 99.99. - Commit this change with a descriptive message like “Added Headphones at initial price”.
- Update the
Headphonesprice to89.99. - Commit this change with a message like “Reduced Headphones price for sale”.
- Update the
Headphonesprice to109.99. - Commit this change with a message like “Adjusted Headphones price to new MSRP”.
- Using
dolt logto find the relevant commit hashes, thendolt diff, prove the entire price history forHeadphones. Specifically, usedolt diff <old_commit_hash> <new_commit_hash> -- data headphonesto show the price change between each step. - Finally, use a time travel query to find the price of ‘Headphones’ after its first price update (i.e., when it was
89.99).
Hint:
- Remember to
dolt add .before eachdolt committo stage your changes. dolt logwill be your primary tool to retrieve the necessary commit hashes fordolt diffandAS OFqueries.- The
-- data <table_name>flag fordolt diffis useful for focusing on changes within a specific table.
What to Observe/Learn: You’ll see how each price change is a distinct, auditable event in Dolt’s history. This granularity is invaluable for understanding business decisions, data integrity, and for reconstructing past states for analysis or debugging.
Common Pitfalls & Troubleshooting
Even with powerful tools like Dolt, there are a few common stumbling blocks that new users encounter. Understanding these can save you significant time and frustration.
Forgetting to
dolt add .beforedolt commit: This is perhaps the most common mistake. If you make SQL changes (e.g.,INSERT,UPDATE,DELETE) but then rundolt commitwithoutdolt add .first, Dolt will tell you there are no changes to commit. Your changes are in the “working set” but not yet “staged” for commit. Solution: Always rundolt add .(to stage all changes) ordolt add <table_name>(to stage changes in a specific table) to prepare your modifications beforedolt commit. For convenience, if you want to stage and commit all changes in one go, usedolt commit -a -m "Your message".Misinterpreting
dolt diffoutput: Thediffformat, while standard for Git, can sometimes be confusing when applied to tabular data. Remember:- Lines starting with
-indicate data that was present in the old state but removed in the new state. - Lines starting with
+indicate data that was added in the new state. - For row updates (where a primary key exists in both states but column values differ), you’ll often see a pair of
-and+lines representing the old and new versions of the same row. Solution: Pay close attention to the--- a/and+++ b/headers to understand which state is “old” and which is “new.” Focus on the primary key to identify which row is being affected by an update, deletion, or addition.
- Lines starting with
Performance with large historical queries: While Dolt is highly optimized for versioning, querying very large tables
AS OFa very old commit, especially with complex joins or aggregations, can sometimes be slower than queryingHEAD. This is because Dolt might need to reconstruct the table state at that specific commit by applying the history of changes. Solution: For critical performance paths in production, consider if you truly need historical data or if the latest state suffices. When querying historical data for analytical purposes, ensure your SQL queries are well-indexed. Dolt’s architecture is designed for efficient diffs and merges, but extensive time travel on massive datasets still involves computational overhead. For example, retrieving the state of a 100 million-row table from 50,000 commits ago will naturally take longer than querying the current state.
Summary
In this chapter, you’ve gained a fundamental understanding of how Dolt empowers you to track and inspect every change to your database, bringing a robust version control paradigm to your data. This capability transforms how you manage, audit, and understand your database’s evolution.
Here are the key takeaways:
dolt diffprovides a granular, line-by-line view of changes between any two database states, essential for auditing, debugging, and data review processes.dolt logpresents an immutable, chronological history of all commits, detailing who, what, and when changes were made, forming a complete audit trail.- Time travel queries with the
AS OFclause allow you to query your database at any historical commit or timestamp, a powerful feature for historical analysis, compliance, AI/ML reproducibility, and debugging.
You now have the tools to observe your database’s evolution with unprecedented clarity and control. This foundation is critical for building reliable and auditable data systems. In the next chapter, we’ll build on this by exploring Dolt’s branching capabilities, allowing you to experiment with data, develop features, and manage schema changes in isolation, just like you would with application code.
References
- DoltHub Documentation:
dolt diffcommand - DoltHub Documentation:
dolt logcommand - DoltHub Documentation: Time Travel Queries (SQL
SELECTwithAS OF) - DoltHub Blog: Getting Started with Dolt – A comprehensive introduction
This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.