Improve diff output by alex-slynko · Pull Request #163 · github/KustoSchemaTools

alex-slynko · 2026-04-13T11:01:48Z

Do only changed columns for tables and join changes for same database/table and different clusters

sample output:

https://github.com/github/warehouse-config/pull/4231#issuecomment-4236107408

Instead of showing full .create-merge table lines, show only added/changed/live-only columns. Uses ColumnDiffHelper to compare column dictionaries and render a concise diff. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

When multiple clusters have the same changes, show them once with a combined header instead of repeating the full diff for each cluster. Uses a canonical fingerprint (markdown + comments + validity) to detect identical diffs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

This PR improves the generated diff markdown by (1) showing more focused, column-level diffs for table .create-merge changes and (2) grouping clusters that have identical diff output into a single combined markdown section.

Changes:

Added cluster grouping logic in markdown generation by computing a per-cluster fingerprint and merging identical outputs.
Introduced ColumnDiffHelper and updated ScriptCompareChange to render column-level diffs for CreateMergeTable changes (fallback to line diff otherwise).
Added unit tests for column diff rendering and cluster fingerprint grouping.

Show a summary per file

File	Description
KustoSchemaTools/KustoSchemaHandler.cs	Computes per-cluster fingerprints and groups identical cluster diffs into a combined markdown header/output.
KustoSchemaTools/Changes/ScriptCompareChange.cs	Adds a column-diff rendering path for `CreateMergeTable` and refactors line-diff rendering into a helper method.
KustoSchemaTools/Changes/ColumnDiffHelper.cs	New helper that produces a concise column-level diff (added/type-changed + note for live-only columns).
KustoSchemaTools.Tests/Changes/ColumnDiffHelperTests.cs	Unit tests validating column diff output for added/type-changed/removed/no-change cases.
KustoSchemaTools.Tests/Changes/ClusterGroupingTests.cs	Unit tests validating cluster fingerprint equality/inequality behavior across scenarios.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Files reviewed: 5/5 changed files
Comments generated: 4

Copilot · 2026-04-13T11:05:30Z

KustoSchemaTools/KustoSchemaHandler.cs

+            foreach (var comment in comments.OrderBy(c => c.Kind).ThenBy(c => c.Text))
+            {
+                sb.Append($"comment:{comment.Kind}:{comment.FailsRollout}:{comment.Text};");
+            }


BuildClusterFingerprint constructs the grouping key by concatenating raw comment.Text (and later change.Markdown) with :/; delimiters. If any comment/markdown contains these delimiter sequences (e.g. ";change:"), two different diffs can produce the same fingerprint and get incorrectly grouped. Consider generating the fingerprint from a structured representation (e.g., JSON) and hashing it (SHA256), or length-prefix/escape the fields before concatenation.

Copilot · 2026-04-13T11:05:31Z

KustoSchemaTools/KustoSchemaHandler.cs

+            foreach (var change in changes)
+            {
+                sb.Append($"change:{change.Markdown};");
+            }


BuildClusterFingerprint appends full change.Markdown into the fingerprint string. For large diffs and/or many clusters this can create very large strings and increase memory/CPU usage during grouping. A more efficient approach is to hash each change/comment payload (or stream into a hasher) and group on the fixed-size hash.

Copilot · 2026-04-13T11:05:31Z

KustoSchemaTools/Changes/ColumnDiffHelper.cs

+                .Where(c => !oldColumns.ContainsKey(c.Key))
+                .ToList();
+
+            var typeChanged = newColumns
+                .Where(c => oldColumns.ContainsKey(c.Key) && !string.Equals(oldColumns[c.Key], c.Value, StringComparison.OrdinalIgnoreCase))
+                .Select(c => new { Name = c.Key, OldType = oldColumns[c.Key], NewType = c.Value })
+                .ToList();
+
+            var removedFromYaml = oldColumns
+                .Where(c => !newColumns.ContainsKey(c.Key))


BuildColumnDiff enumerates newColumns/oldColumns directly when rendering added, typeChanged, and removedFromYaml. Since these are Dictionary instances, enumeration order can vary depending on how the dictionaries were constructed, which can make the rendered diff (and cluster grouping fingerprint) non-deterministic. Sorting by column name before rendering (and sorting removed columns) would make the output stable across runs/clusters.

Suggested change

.Where(c => !oldColumns.ContainsKey(c.Key))

.ToList();

var typeChanged = newColumns

.Where(c => oldColumns.ContainsKey(c.Key) && !string.Equals(oldColumns[c.Key], c.Value, StringComparison.OrdinalIgnoreCase))

.Select(c => new { Name = c.Key, OldType = oldColumns[c.Key], NewType = c.Value })

.ToList();

var removedFromYaml = oldColumns

.Where(c => !newColumns.ContainsKey(c.Key))

.Where(c => !oldColumns.ContainsKey(c.Key))

.OrderBy(c => c.Key, StringComparer.Ordinal)

.ToList();

var typeChanged = newColumns

.Where(c => oldColumns.ContainsKey(c.Key) && !string.Equals(oldColumns[c.Key], c.Value, StringComparison.OrdinalIgnoreCase))

.OrderBy(c => c.Key, StringComparer.Ordinal)

.Select(c => new { Name = c.Key, OldType = oldColumns[c.Key], NewType = c.Value })

.ToList();

var removedFromYaml = oldColumns

.Where(c => !newColumns.ContainsKey(c.Key))

.OrderBy(c => c.Key, StringComparer.Ordinal)

Copilot · 2026-04-13T11:05:31Z

KustoSchemaTools/Changes/ScriptCompareChange.cs

+                    int last = 0;
+                    for (int i = 0; i < relevantLines.Count; i++)
+                    {
+                        var b = i - 1 > 0 ? relevantLines[i - 1] : null;


In RenderLineDiff, the previous-line lookup uses i - 1 > 0, which skips index 0 when i == 1. This makes the neighbor-context logic inconsistent (line 0 is never considered as the previous line for line 1). Use i > 0 (or i - 1 >= 0) for the boundary check.

Suggested change

var b = i - 1 > 0 ? relevantLines[i - 1] : null;

var b = i > 0 ? relevantLines[i - 1] : null;

alex-slynko and others added 2 commits April 13, 2026 11:50

Copilot AI review requested due to automatic review settings April 13, 2026 11:01

Copilot started reviewing on behalf of alex-slynko April 13, 2026 11:02 View session

Copilot AI reviewed Apr 13, 2026

View reviewed changes

mike-mones approved these changes Apr 13, 2026

View reviewed changes

alex-slynko merged commit 9587667 into main Apr 13, 2026
9 checks passed

alex-slynko deleted the improve-diff-output branch April 13, 2026 13:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve diff output#163

Improve diff output#163
alex-slynko merged 2 commits intomainfrom
improve-diff-output

alex-slynko commented Apr 13, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Copilot AI Apr 13, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

	var b = i - 1 > 0 ? relevantLines[i - 1] : null;
	var b = i > 0 ? relevantLines[i - 1] : null;

Conversation

alex-slynko commented Apr 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Copilot's findings

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

alex-slynko commented Apr 13, 2026 •

edited

Loading