Dev Tools · March 28, 2026

Text Diff Guide: How to Compare Text Files and Find Differences (2026)

By The hakaru Team · Last updated March 2026

Quick Answer

  • A text diff compares two versions of a file line by line and reports what was added, removed, or changed.
  • The unified diff format is the standard output: lines starting with - were removed, lines starting with + were added.
  • Git and GNU diff both use the Myers diff algorithm (1986) by default, which searches for the smallest set of edits between two files.
  • Over 420 million pull requests were merged on GitHub in 2024, each one involving a diff comparison.

What Is a Text Diff?

A text diff is a structured comparison of two versions of a text file. It identifies exactly which lines (or characters) changed between the two versions — what was added, what was removed, and what stayed the same.

The word "diff" comes directly from the Unix diff utility, created by Doug McIlroy at Bell Labs in 1974. Fifty years later, it remains one of the most widely used Unix utilities. Every time you open a pull request, review a document revision, or run git diff, something resembling that original utility is at work.

Diffs solve a simple problem: when a file changes, you want to know what changed, not just that it changed. Reading the full file twice and mentally comparing is impractical. A diff does that work for you and presents only the relevant deltas.

How the Unified Diff Format Works

The unified diff format is the standard output format you will encounter in Git, GitHub, GitLab, and most diff tools. Here is what it looks like:

--- original.txt
+++ modified.txt
@@ -1,3 +1,4 @@
 Line 1 (unchanged)
-Line 2 (removed)
+Line 2 modified (added)
 Line 3 (unchanged)
+Line 4 added

Breaking it down:

  • --- original.txt: the original (old) file
  • +++ modified.txt: the modified (new) file
  • @@ -1,3 +1,4 @@: a "hunk header" showing which line numbers the following block covers in each file (-1,3 means the hunk starts at line 1 and covers 3 lines of the original; +1,4 means it starts at line 1 and covers 4 lines of the modified file)
  • Lines starting with a space: unchanged context lines (shown for readability)
  • Lines starting with -: lines that exist in the original but not the modified version (deletions)
  • Lines starting with +: lines that exist in the modified version but not the original (additions)

By default, most tools show 3 lines of context around each change. This helps you understand where in the file the change occurred without showing the entire document.
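
As a rough illustration of where these markers and counts come from, the sketch below generates a unified diff with Python's standard difflib module. The file names and line contents are made up for the example. Note that difflib uses its own matching algorithm rather than Myers, so on harder inputs its output may differ from what git diff or GNU diff would print.

```python
import difflib

# Hypothetical file contents, one list entry per line (newlines included).
original = ["Line 1\n", "Line 2\n", "Line 3\n"]
modified = ["Line 1\n", "Line 2 modified\n", "Line 3\n", "Line 4 added\n"]

# n=3 asks for three context lines around each change, the usual default.
diff_lines = list(
    difflib.unified_diff(
        original,
        modified,
        fromfile="original.txt",
        tofile="modified.txt",
        n=3,
    )
)

# Each entry already ends with a newline, so join them without a separator.
print("".join(diff_lines), end="")
```

The hunk header printed for this input is @@ -1,3 +1,4 @@: the hunk covers 3 lines of the original (two context lines plus one removal) and 4 lines of the modified file (two context lines plus two additions).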

The Myers Diff Algorithm

Computing a diff is not as simple as iterating through both files in parallel. Two files can diverge and converge in many ways, and there are often multiple valid diffs. The goal is to find the minimum edit script: the smallest set of insertions and deletions that transforms one file into the other.

Eugene Myers solved this problem elegantly in his 1986 paper "An O(ND) Difference Algorithm and Its Variations." His algorithm runs in O(ND) time, where N is the combined number of lines in both files and D is the size of the shortest edit script, that is, the number of inserted plus deleted lines. When files are mostly similar (small D), the algorithm is extremely fast.

The Myers algorithm is the default in Git, GNU diff, and most modern diff tools. It produces human-readable diffs by finding the shortest edit path, which tends to group related changes together rather than scattering deletions and insertions across the output.
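
To make the O(ND) idea concrete, here is a minimal sketch of the greedy forward search from Myers' paper. It only computes D (the number of insertions plus deletions) and does not recover the edit script itself. The function name is ours, and production tools layer further refinements on top of this basic loop, so treat it as an illustration rather than a description of how Git or GNU diff are implemented.

```python
def myers_edit_distance(a, b):
    """Return D: insertions plus deletions in a shortest edit script from a to b."""
    n, m = len(a), len(b)
    max_d = n + m
    # furthest_x[k] is the furthest x reached so far on diagonal k, where k = x - y.
    # x counts consumed items of a, y counts consumed items of b.
    furthest_x = {1: 0}
    for d in range(max_d + 1):
        # With exactly d edits, only diagonals -d, -d+2, ..., d are reachable.
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and furthest_x[k - 1] < furthest_x[k + 1]):
                # Step down from diagonal k+1: insert one item from b.
                x = furthest_x[k + 1]
            else:
                # Step right from diagonal k-1: delete one item from a.
                x = furthest_x[k - 1] + 1
            y = x - k
            # Follow the "snake": matching items cost no edits.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            furthest_x[k] = x
            if x >= n and y >= m:
                return d
    return max_d


old_lines = ["Line 1", "Line 2", "Line 3"]
new_lines = ["Line 1", "Line 2 modified", "Line 3", "Line 4 added"]
print(myers_edit_distance(old_lines, new_lines))
```

The outer loop runs D + 1 times and each pass touches at most D + 1 diagonals, which is why nearly identical files finish after only a few iterations regardless of their length.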

Git also supports alternative algorithms via --diff-algorithm (for example, git diff --diff-algorithm=patience): patience and histogram are popular alternatives for certain code patterns. The patience algorithm anchors on lines that appear exactly once in each file, so it often handles files with many identical lines (like closing braces or repeated function signatures) more cleanly than Myers.

Types of Diff: Line, Word, and Character

Not all diffs operate at the line level. The right granularity depends on what you are comparing.

Diff Type | Unit of Comparison | Best For
Line diff | Entire lines | Code review, config files
Word diff | Individual words | Prose editing, documentation
Character diff | Individual characters | Legal documents, precise text

Git supports word diff natively: run git diff --word-diff to highlight changed words within a line rather than flagging the entire line as changed. This is particularly useful for documentation commits where a single word changed in a long paragraph.

Character-level diff is the most precise but also the noisiest. It is most useful when reviewing legal contracts or regulatory filings where a single changed character can have significant meaning.
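
The difference in granularity is easy to see in code. The sketch below is a simplified word-level diff built on Python's difflib.SequenceMatcher; it borrows the [-removed-] and {+added+} markers from git's plain word-diff output, but it is our own approximation (it splits on whitespace only) and not git's implementation. Passing the two strings directly instead of word lists would give a character-level comparison.

```python
import difflib


def word_diff(old, new):
    """Return a one-line word diff, marking removals as [-...-] and additions as {+...+}."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            parts.append(" ".join(old_words[i1:i2]))
            continue
        # A "replace" is reported as a removal followed by an addition.
        if tag in ("delete", "replace"):
            parts.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            parts.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(parts)


result = word_diff(
    "The server listens on port 8080",
    "The server listens on port 9090",
)
print(result)
```

A line diff would flag the whole sentence as changed; here only the port number is marked.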

Use Cases for Text Diff

Use Case | Tool | Why
Code review | git diff, GitHub PR | Track changes between versions
Legal docs | Online diff tool | Compare contract versions
Content editing | Word Track Changes | Editorial review
Config files | diff, VS Code compare | Server configuration management
Plagiarism detection | Specialized tools | Academic integrity
Data validation | CSV diff tools | Spreadsheet comparison

Code Review

Code review is the dominant use case. According to a 2023 JetBrains developer survey, 43% of developers perform code review daily. GitHub reported over 420 million pull requests merged in 2024 — each one centered around a diff. Git, which uses diff internally, is used by approximately 98% of professional developers according to Stack Overflow’s 2024 Developer Survey.

Legal Document Comparison

Contract negotiations produce multiple versions of a document. A diff tool makes it immediately clear which clauses were added, removed, or modified between drafts. Online diff tools handle this well since they work with plain text pasted directly in — no need to have both files on the same machine.

Configuration Management

Server configuration files change over time. Diffing the current version against a known-good backup can quickly surface unauthorized changes or deployment errors. Tools like Ansible and Puppet use diff internally to show what would change before applying a configuration update.

Content Editing and Plagiarism Detection

Editorial workflows use diff to track changes between drafts. Academic integrity tools use diff-like algorithms to measure similarity between submitted work and reference material — though specialized plagiarism detection systems go further by handling paraphrasing and sentence reordering.

Diff Tools: Your Options

git diff

The most widely used diff tool in the world, built into Git. Run git diff to see unstaged changes, git diff --staged to see staged changes, or git diff HEAD~1 to compare against the previous commit. Git diff outputs unified diff format by default.

GNU diff (command line)

A diff command ships with virtually every Unix, Linux, and macOS system (GNU diff on most Linux distributions). Basic usage:

diff original.txt modified.txt

For unified format (the standard readable output):

diff -u original.txt modified.txt

VS Code Compare

VS Code has a built-in diff viewer. Right-click a file in the Explorer and select Select for Compare, then right-click the second file and choose Compare with Selected. VS Code renders a side-by-side view with additions in green and deletions in red. The inline diff view is also available for a unified single-pane experience.

vimdiff

For terminal users, vimdiff file1.txt file2.txt opens both files side by side with differences highlighted. Navigation between changes uses ]c and [c. It is a powerful option for those already comfortable in Vim.

Online Diff Tools

Online tools like our Text Diff Checker are ideal when you do not have both files locally, when you want a quick visual comparison without setting up a development environment, or when you are comparing non-code text like emails, article drafts, or legal documents. Paste both versions and the tool does the rest.

How to Read a Diff Output

When you first encounter diff output, it can look like noise. Here is a practical example: suppose you have a configuration file and you changed a port number and added a new setting.

--- config.yml
+++ config.yml.new
@@ -3,4 +3,5 @@
 server:
-  port: 8080
+  port: 9090
   host: localhost
+  timeout: 30
 database:

Reading this: line 4 changed from port: 8080 to port: 9090, and a new line timeout: 30 was inserted after line 5. Everything else stayed the same. The @@ -3,4 +3,5 @@ header tells you this hunk starts at line 3 in both files and spans 4 lines in the original and 5 lines in the modified version (one line was added). Real tools would normally show up to 3 context lines on each side of the changes; the example is trimmed for brevity.
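
If you ever need to check a hunk header by hand, the arithmetic is simple: the old-file count is the number of context plus removed lines, and the new-file count is the number of context plus added lines. The sketch below shows one way to parse a header and recount a hunk body in Python. The helper names and the sample hunk are hypothetical, chosen only to exercise the logic.

```python
import re

# Matches headers such as "@@ -10,3 +10,4 @@"; an omitted count means 1 line.
HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(header):
    """Return (old_start, old_count, new_start, new_count) from a hunk header."""
    match = HUNK_HEADER.match(header)
    if match is None:
        raise ValueError(f"not a unified diff hunk header: {header!r}")
    old_start, old_count, new_start, new_count = match.groups()
    return (
        int(old_start),
        int(old_count) if old_count is not None else 1,
        int(new_start),
        int(new_count) if new_count is not None else 1,
    )


def count_hunk_lines(body):
    """Count how many lines a hunk body covers in the old and the new file."""
    old_count = sum(1 for line in body if line.startswith((" ", "-")))
    new_count = sum(1 for line in body if line.startswith((" ", "+")))
    return old_count, new_count


# Hypothetical hunk used only for illustration.
header = "@@ -10,3 +10,4 @@"
body = [" alpha", "-beta", "+BETA", " gamma", "+delta"]

old_start, old_count, new_start, new_count = parse_hunk_header(header)
print(old_start, old_count, new_start, new_count)
print(count_hunk_lines(body) == (old_count, new_count))
```

When the recount disagrees with the header, the hunk was probably edited by hand or truncated, and tools such as patch will usually reject it.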

Applying a Patch

A diff file can double as a patch file. If you save unified diff output to a file, you can apply it to the original to reproduce the modified version (note that patch rewrites original.txt in place):

diff -u original.txt modified.txt > my.patch
patch original.txt < my.patch

This is how open-source projects historically distributed small fixes before Git became universal — maintainers would accept patch files via email. The patch utility is still commonly used in build systems and package management.
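
To show what applying a patch involves, here is a deliberately small Python sketch that replays a single-file unified diff against the original lines. It is not a substitute for the patch utility: the function name is ours, and it assumes the diff applies exactly, with none of the offset searching or fuzzy matching that patch performs. The sample content is hypothetical, and the diff is generated with difflib so the example is self-contained.

```python
import difflib
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(original_lines, diff_lines):
    """Apply a single-file unified diff and return the patched lines."""
    result = []
    position = 0      # index of the next unconsumed line in original_lines
    in_hunk = False   # the ---/+++ file headers come before the first hunk

    for line in diff_lines:
        match = HUNK_HEADER.match(line)
        if match:
            in_hunk = True
            old_start = int(match.group(1))
            old_count = int(match.group(2)) if match.group(2) is not None else 1
            # With a zero count, old_start names the line before the insertion.
            hunk_index = old_start if old_count == 0 else old_start - 1
            # Copy the untouched lines between the previous hunk and this one.
            result.extend(original_lines[position:hunk_index])
            position = hunk_index
        elif not in_hunk or line.startswith("\\"):
            continue  # skip file headers and "\ No newline at end of file"
        elif line.startswith("+"):
            result.append(line[1:])
        elif line.startswith((" ", "-")):
            # Context and removed lines must match the original exactly.
            if position >= len(original_lines) or original_lines[position] != line[1:]:
                raise ValueError(f"hunk does not match original at line {position + 1}")
            if line.startswith(" "):
                result.append(line[1:])
            position += 1

    # Copy whatever follows the last hunk.
    result.extend(original_lines[position:])
    return result


original = ["alpha\n", "beta\n", "gamma\n"]
modified = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]
patch_lines = list(difflib.unified_diff(original, modified, "original.txt", "modified.txt"))

patched = apply_unified_diff(original, patch_lines)
print(patched == modified)
```

The context lines are what make this safe: if the original has drifted from what the diff expects, the mismatch is detected instead of silently producing a wrong file.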

Frequently Asked Questions

What is a text diff?

A text diff is the result of comparing two versions of a text file to identify what changed between them. It shows which lines were added, removed, or modified. The term comes from the Unix diff utility, created at Bell Labs in 1974, which outputs a structured summary of differences between two files.

How does git diff work?

Git diff compares versions of files stored in Git’s object database. When you run git diff, Git retrieves the two versions being compared and applies the Myers diff algorithm to find the minimum set of edits — additions and deletions — that transforms one version into the other. Output is shown in unified diff format with lines prefixed by + (added), - (removed), or a space (unchanged).

What is the unified diff format?

Unified diff format is the output format that Git uses by default and that the diff utility produces with the -u flag. It shows both the original and modified versions of a file in a single block. Lines starting with --- indicate the original file, +++ the modified file, @@ headers show line numbers, lines starting with - are removals, lines starting with + are additions, and lines with a leading space are unchanged context lines.

How do I compare two files in VS Code?

In VS Code, right-click any file in the Explorer panel and choose Select for Compare. Then right-click the second file and choose Compare with Selected. VS Code opens a side-by-side diff view highlighting additions in green and deletions in red. You can also use the command palette (Ctrl+Shift+P / Cmd+Shift+P) and search for File: Compare Active File With...

What is the difference between character diff and line diff?

A line diff compares files line by line, reporting which entire lines were added or removed. A character diff goes further and highlights the exact characters or words within a changed line that are different. Line diff is standard for code review; character diff is more useful when reviewing prose, legal documents, or configuration values where small in-line changes matter. Git supports word-level diff via git diff --word-diff.

What is the Myers diff algorithm?

The Myers diff algorithm, published by Eugene Myers in 1986, finds a shortest edit script between two sequences of lines: the fewest possible insertions and deletions that transform one text into another. It runs in O(ND) time, where N is the combined number of lines in both files and D is the number of differences. Git, GNU diff, and most modern diff tools use Myers as their default algorithm because it is fast on similar files and produces small diffs. Myers’ paper is titled An O(ND) Difference Algorithm and Its Variations (1986).