jguillaumesio

AI-powered git branch diff summary: give your AI agent real context

How to generate a smart summary of a branch diff that highlights deleted files and renames, giving AI agents the context they need to review, refactor, or document code.

Raw git diff vs. AI-friendly branch summary

You ask an AI agent to review your branch. It reads the diff. It misses that you deleted three files, refactored a core module, and changed the database schema. It gives you feedback on the 50 lines of visible changes and ignores the structural shifts.

The problem: raw git diff is a terrible context format for AI agents. It shows line changes but buries the important structural information.

What AI Agents Need from a Branch Diff

When an AI reviews a branch, it needs to know:

  1. What files were deleted, and what their role was
  2. What files were renamed (structural reorganization matters)
  3. What modules changed, not just line counts
  4. What the intent was (commit messages and PR description)
  5. What the blast radius is (which other files depend on the changed ones)

A raw git diff main...feature surfaces none of this. Deletions and renames are buried among line changes, and intent and dependencies are not in the diff at all.
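The script in the next section doesn’t cover item 5, the blast radius. One rough way to approximate it, assuming that dependents mention the changed file’s base name somewhere in their source (an import, a require, a config entry), is to search tracked files with git grep. The helper names git_grep_files and find_dependents are made up for this sketch, and the match is purely textual, so expect some false positives.

```python
import subprocess
from pathlib import PurePosixPath
from typing import Callable


def git_grep_files(needle: str) -> list[str]:
    """Return tracked files that contain `needle` as a fixed string."""
    result = subprocess.run(
        ["git", "grep", "-l", "-F", "-e", needle],
        capture_output=True, text=True,
    )
    # git grep exits with status 1 when nothing matches, so no check=True here
    return [line for line in result.stdout.splitlines() if line]


def find_dependents(
    changed_path: str,
    search: Callable[[str], list[str]] = git_grep_files,
) -> list[str]:
    """Approximate the blast radius: files that mention the file's base name."""
    # "src/middleware/cookie-auth.ts" -> "cookie-auth"
    stem = PurePosixPath(changed_path).stem
    # Exclude the changed file itself in case it still exists in the work tree
    return sorted(f for f in search(stem) if f != changed_path)
```

Calling find_dependents("src/middleware/cookie-auth.ts") from inside the repository lists every tracked file that still mentions cookie-auth. The search parameter exists only so the function can be exercised without a real repository.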

Building a Smart Diff Summarizer

Here’s a script that generates an AI-friendly branch summary:

#!/usr/bin/env python3
"""branch-summary.py — Generate an AI-friendly summary of a branch diff."""

import subprocess
import json
import sys
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileChange:
    path: str
    status: str          # added, modified, deleted, renamed
    additions: int = 0
    deletions: int = 0
    old_path: Optional[str] = None  # for renames
    summary: str = ""


@dataclass
class BranchSummary:
    branch: str
    base: str
    commits: list[str] = field(default_factory=list)
    files: list[FileChange] = field(default_factory=list)
    total_additions: int = 0
    total_deletions: int = 0
    deleted_files: list[FileChange] = field(default_factory=list)
    renamed_files: list[FileChange] = field(default_factory=list)
    new_files: list[FileChange] = field(default_factory=list)
    modified_files: list[FileChange] = field(default_factory=list)


def run_git(*args) -> str:
    result = subprocess.run(
        ["git", *args],
        capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def get_summary(branch: str, base: str = "main") -> BranchSummary:
    summary = BranchSummary(branch=branch, base=base)

    # Get commits on this branch (not in base)
    log = run_git("log", f"{base}..{branch}", "--oneline", "--reverse")
    summary.commits = [line.strip() for line in log.split("\n") if line.strip()]

    # Get per-file stats. -M turns on rename detection; with -z, fields are
    # NUL-separated and a rename is emitted as "adds<TAB>dels<TAB>" followed
    # by the old path and the new path as two separate fields.
    diff_stat = run_git("diff", f"{base}...{branch}", "--numstat", "-M", "-z")
    tokens = iter(diff_stat.split("\0"))
    for token in tokens:
        parts = token.split("\t")
        if len(parts) >= 3:
            additions = int(parts[0]) if parts[0] != "-" else 0
            deletions = int(parts[1]) if parts[1] != "-" else 0
            path = parts[2]

            # Determine status
            status = "modified"
            old_path = None

            if not path:
                # Rename: the next two fields hold the old and new paths
                status = "renamed"
                old_path = next(tokens)
                path = next(tokens)
            else:
                status_check = run_git("diff", f"{base}...{branch}", "--name-status", "-M", "--", path)
                if status_check.startswith("A"):
                    status = "added"
                elif status_check.startswith("D"):
                    status = "deleted"

            fc = FileChange(
                path=path, status=status,
                additions=additions, deletions=deletions,
                old_path=old_path,
            )
            summary.files.append(fc)
            summary.total_additions += additions
            summary.total_deletions += deletions

            if status == "deleted":
                summary.deleted_files.append(fc)
            elif status == "renamed":
                summary.renamed_files.append(fc)
            elif status == "added":
                summary.new_files.append(fc)
            else:
                summary.modified_files.append(fc)

    return summary


def format_for_ai(summary: BranchSummary) -> str:
    """Format the summary as a prompt-friendly context block."""
    lines = [
        f"## Branch: {summary.branch} (vs {summary.base})",
        f"",
        f"**Stats:** +{summary.total_additions}/-{summary.total_deletions} lines across {len(summary.files)} files",
        f"**Commits:** {len(summary.commits)}",
        f"",
    ]

    # Commits
    lines.append("### Commits")
    for commit in summary.commits:
        lines.append(f"- {commit}")
    lines.append("")

    # Deleted files (critical context!)
    if summary.deleted_files:
        lines.append("### ⚠️ Deleted Files")
        lines.append("These files were removed. Consider what depended on them:")
        for f in summary.deleted_files:
            lines.append(f"- `{f.path}` ({f.deletions} lines removed)")
        lines.append("")

    # Renamed files
    if summary.renamed_files:
        lines.append("### 🔄 Renamed Files")
        for f in summary.renamed_files:
            lines.append(f"- `{f.old_path}` → `{f.path}`")
        lines.append("")

    # New files
    if summary.new_files:
        lines.append("### ✨ New Files")
        for f in summary.new_files:
            lines.append(f"- `{f.path}` (+{f.additions} lines)")
        lines.append("")

    # Modified files grouped by directory
    if summary.modified_files:
        lines.append("### 📝 Modified Files")
        by_dir: dict[str, list[FileChange]] = {}
        for f in summary.modified_files:
            dir_path = "/".join(f.path.split("/")[:-1]) or "(root)"
            by_dir.setdefault(dir_path, []).append(f)

        for dir_path, files in sorted(by_dir.items()):
            lines.append(f"**{dir_path}/**")
            for f in sorted(files, key=lambda x: -(x.additions + x.deletions)):
                lines.append(f"  - `{f.path.split('/')[-1]}` (+{f.additions}/-{f.deletions})")
        lines.append("")

    return "\n".join(lines)


if __name__ == "__main__":
    branch = sys.argv[1] if len(sys.argv) > 1 else "HEAD"
    base = sys.argv[2] if len(sys.argv) > 2 else "main"

    summary = get_summary(branch, base)
    print(format_for_ai(summary))

Usage

# Summarize current branch vs main
python branch-summary.py HEAD main

# Summarize a specific branch
python branch-summary.py feature/user-auth develop

# Feed it directly to an AI
python branch-summary.py HEAD main | \
  llm "Review this branch for potential issues and suggest improvements"
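The script imports json without using it. If your agent prefers structured input over Markdown, one option (an assumption of this sketch, not something the script does) is to serialize the summary with dataclasses.asdict. The dataclasses below are trimmed copies of the ones in the script, and to_json is a hypothetical helper name.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class FileChange:
    path: str
    status: str          # added, modified, deleted, renamed
    additions: int = 0
    deletions: int = 0
    old_path: Optional[str] = None  # for renames


@dataclass
class BranchSummary:
    branch: str
    base: str
    commits: list[str] = field(default_factory=list)
    files: list[FileChange] = field(default_factory=list)


def to_json(summary: BranchSummary) -> str:
    """Serialize the summary; asdict() recurses into the nested FileChange list."""
    return json.dumps(asdict(summary), indent=2)
```

In the real script you would pass the object returned by get_summary to to_json, for example behind a command-line flag of your choosing.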

Example Output

## Branch: feature/user-auth (vs main)

**Stats:** +347/-182 lines across 12 files
**Commits:** 5

### Commits
- a1b2c3d Add password reset flow
- d4e5f6a Refactor auth middleware
- g7h8i9j Add user session model
- j0k1l2m Remove legacy cookie auth
- m3n4o5p Update API routes for new auth

### ⚠️ Deleted Files
These files were removed. Consider what depended on them:
- `src/middleware/cookie-auth.ts` (89 lines removed)
- `src/utils/token-legacy.ts` (34 lines removed)

### 🔄 Renamed Files
- `src/auth/handler.ts` → `src/auth/oauth-handler.ts`

### ✨ New Files
- `src/auth/password-reset.ts` (+124 lines)
- `src/models/session.ts` (+67 lines)

### 📝 Modified Files
**src/auth/**
  - `middleware.ts` (+45/-23)
  - `oauth-handler.ts` (+30/-56)
**src/routes/**
  - `api.ts` (+28/-12)
**src/models/**
  - `user.ts` (+15/-8)

Why This Matters for AI Context

Compare what the AI sees:

Raw diff (what most people paste):

-const cookieAuth = require('./cookie-auth');
+const oauthAuth = require('./oauth-auth');
@@ -45,12 +45,8 @@
-  legacyTokenCheck(req);
+  sessionCheck(req);

Structured summary (what the AI actually needs):

  • “This branch replaces cookie-based auth with OAuth + sessions”
  • “Two auth-related files were fully deleted”
  • “The auth handler was renamed to reflect its new purpose”
  • “Password reset is a new feature (+124 lines)”

The structured summary lets the AI reason about what changed and why, not just which lines moved.
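Most of the “why” lives in commit messages, and --oneline keeps only the subject line. Here is a minimal sketch, assuming you also want commit bodies in the context: ask git log for the short hash, subject, and body with explicit separator bytes, then split them apart. The function names and the choice of separators are assumptions of this sketch, not part of the script above.

```python
import subprocess

FIELD_SEP = "\x00"   # between hash, subject, and body
RECORD_SEP = "\x01"  # after each commit


def parse_commit_log(raw: str) -> list[dict[str, str]]:
    """Split separator-delimited git log output into one dict per commit."""
    commits = []
    for record in raw.split(RECORD_SEP):
        record = record.strip("\n")  # git adds a newline after each record
        if not record:
            continue
        sha, subject, body = record.split(FIELD_SEP, 2)
        commits.append({"sha": sha, "subject": subject, "body": body.strip()})
    return commits


def get_commit_messages(branch: str, base: str = "main") -> list[dict[str, str]]:
    """Collect full commit messages for commits on branch that are not in base."""
    # %h = short hash, %s = subject, %b = body, %xNN = literal byte
    fmt = "--format=%h%x00%s%x00%b%x01"
    result = subprocess.run(
        ["git", "log", f"{base}..{branch}", "--reverse", fmt],
        capture_output=True, text=True, check=True,
    )
    return parse_commit_log(result.stdout)
```

Each body can then be printed under its subject in the Commits section, so the agent sees the stated reason for a change next to the files it touched.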

Integration with AI Agents

Add this to your agent’s pre-read step:

# In your agent's context preparation
import subprocess

def prepare_branch_context(branch: str, base: str = "main") -> str:
    """Generate context for AI review of a branch."""
    result = subprocess.run(
        ["python", "scripts/branch-summary.py", branch, base],
        capture_output=True, text=True, check=True  # fail loudly if the script errors
    )
    return result.stdout

# Agent prompt (`ai` is a placeholder for whatever LLM client you use)
context = prepare_branch_context("feature/user-auth")
response = ai.complete(f"""
{context}

Based on this branch summary:
1. What are the potential risks?
2. What should be tested?
3. Are there any files that need updating but weren't changed?
""")

The Bottom Line

AI agents are only as good as the context you give them. Handing an agent a raw git diff is like asking someone to review a book from its edit history without telling them what the chapters are about.

A structured branch summary (with deleted files highlighted, renames called out, and changes grouped by module) gives the AI the structural understanding it needs to give useful feedback.

The script above takes about five minutes to set up and gives every AI code review the structural context that a raw diff leaves out.