When writing long sentences in documentation repositories, git tends to show
really unhelpful diffs. They are unreadable because long lines aren’t broken,
which hides edits happening towards the end of a line. A colleague of mine asked
me if git couldn’t be configured to make this sort of thing more obvious.
Challenge accepted!
Kaushal Modi’s blog post on git diff for minified JS and CSS inspired this idea
for all you prose lovers. Essentially we’ll tell git to preprocess files with a
command that splits text by sentences before running git diff.
To do this, we first create a script to replace period+whitespace with newlines. This is a good enough heuristic to distinguish sentences, but feel free to come up with a more appropriate one (fellow Americans, I heard you might want two spaces after your full stops).
#!/bin/sh
sed -r -e 's/\. +/.\n/g' "$@"
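To get a feel for what the heuristic does before wiring it into git, here is a minimal sketch that wraps the same sed expression in a shell function and feeds it one long line. The function name break_sentences and the sample text are made up for illustration, and the "\n" in the replacement assumes GNU sed.

```shell
#!/bin/sh
# Hypothetical helper wrapping the same sed expression as the script above.
break_sentences() {
    # -r enables extended regular expressions in GNU sed.
    # A full stop followed by one or more spaces becomes a full stop plus a newline.
    sed -r -e 's/\. +/.\n/g' "$@"
}

# Illustrative input: three sentences on one line, one of them followed by two spaces.
sample='First sentence. Second sentence.  Third one.'

# With no file arguments the function reads from standard input.
result=$(printf '%s\n' "$sample" | break_sentences)
printf '%s\n' "$result"
```

Each sentence ends up on its own line, whether it was followed by one space or two. The same rule also splits after abbreviations such as "e.g. ", which is one place a more careful heuristic could do better.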
Once the script is saved as breaksentences, made executable, and added to $PATH
(check it by running breaksentences myfile.txt), it can be added as a “diff
driver” in git config (either globally in ~/.gitconfig or for only a specific
repo via .git/config). Once the driver is defined, it can be used in a
.gitattributes file. All thanks to the magic of gitattributes(5) and
git-config(5), and their concept of “diff drivers”.
# ~/.gitconfig or .git/config
[diff "sentences"]
	textconv = breaksentences
# .gitattributes
*.md diff=sentences
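If you would rather not edit the config files by hand, the same two settings can be written from the command line. The sketch below does this in a throwaway repository (the temporary directory and the README.md path are only for illustration) and then asks git which diff driver it would pick for a Markdown file.

```shell
#!/bin/sh
set -e
# Throwaway repository so nothing in a real checkout is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Register the driver for this repository only (written to .git/config).
# Adding --global would write it to ~/.gitconfig instead.
git config diff.sentences.textconv breaksentences

# Route Markdown files through the driver.
printf '%s\n' '*.md diff=sentences' > .gitattributes

# Ask git which diff driver applies to a path; the file does not need to exist.
attr=$(git check-attr diff -- README.md)
printf '%s\n' "$attr"
```

If the attribute is picked up, git check-attr reports sentences as the value of the diff attribute for that path.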
Feel free to edit the wildcard to match more than just md files! Your diffs
should now be looking nice!
Remember that our modification doesn’t apply any changes to the files, only to
the diff tool, and be aware that diff drivers can interfere with interactive
tools using diff output to stage files. This can be a deal-breaker for some,
but it’s still neat to learn about git magic. Powerful stuff.
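When the driver gets in the way, git diff accepts --no-textconv to skip it for a single invocation. The following end-to-end sketch sets everything up in a temporary directory and compares the two views of the same edit. The file name notes.md, the sample sentences, and the demo identity are placeholders, and the script again assumes GNU sed.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# Install a copy of the sentence-splitting script where git can find it.
mkdir "$work/bin"
cat > "$work/bin/breaksentences" <<'EOF'
#!/bin/sh
sed -r -e 's/\. +/.\n/g' "$@"
EOF
chmod +x "$work/bin/breaksentences"
PATH="$work/bin:$PATH"
export PATH

# Throwaway repository using the driver and attribute described earlier.
mkdir "$work/repo"
cd "$work/repo"
git init -q
git config diff.sentences.textconv breaksentences
printf '%s\n' '*.md diff=sentences' > .gitattributes

# Commit one long line, then edit only its last sentence.
printf '%s\n' 'One. Two. Three.' > notes.md
git add notes.md
git -c user.name=demo -c user.email=demo@example.com commit -q -m 'add notes'
printf '%s\n' 'One. Two. Three changed.' > notes.md

# Default: the driver splits sentences, so only the edited one is flagged.
with_driver=$(git diff -- notes.md)
# --no-textconv bypasses the driver and shows the raw line-based diff.
without_driver=$(git diff --no-textconv -- notes.md)

printf '%s\n' "$with_driver" "$without_driver"
```

The first diff marks only the changed sentence, while the second marks the whole line as removed and re-added, which is the behaviour this post set out to avoid.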