When writing long sentences in documentation repositories, git tends to show
really unhelpful diffs. They are unreadable because long lines aren’t broken,
which hides edits happening towards the end of a line. A colleague of mine asked
me if git couldn’t be configured to make this sort of thing more obvious.
Challenge accepted!
Kaushal Modi’s blog post on git diff for minified JS and CSS inspired this idea
for all you prose lovers. Essentially we’ll tell git to preprocess files with a
command that splits text into sentences before running the diff.
To do this, we first create a script to replace period+whitespace with newlines. This is a good enough heuristic to distinguish sentences, but feel free to come up with a more appropriate one (fellow Americans, I heard you might want two spaces after your full stops).
#!/bin/sh
# Put each sentence on its own line; "$@" keeps file names with spaces intact.
sed -r -e 's/\. +/.\n/g' "$@"
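To sanity-check the heuristic before involving git, the same substitution can be tried on a line of text piped through stdin. This is only a sketch: it assumes GNU sed (where `\n` in the replacement means a newline), wraps the one-liner in a shell function for convenience, and the `breaksentences_wide` variant for two-space typists is a made-up name, not part of the original script.

```shell
# Same substitution as the script, as a function so it can be tried inline.
# Assumes GNU sed: "\n" in the replacement is interpreted as a newline.
breaksentences() {
    sed -E -e 's/\. +/.\n/g' "$@"
}

# Hypothetical variant: only break where a period is followed by 2+ spaces.
breaksentences_wide() {
    sed -E -e 's/\. {2,}/.\n/g' "$@"
}

# With no file arguments, sed reads from stdin.
printf 'First sentence. Second sentence. Third one.\n' | breaksentences
printf 'A.  B. C.\n' | breaksentences_wide
```

The first command prints three lines, one per sentence. The second breaks only after `A.`, because the single space after `B.` no longer counts as a sentence boundary.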
Once the script is on your $PATH (check it by running
breaksentences myfile.txt), it can be added as a “diff driver” in your git
config (either globally in ~/.gitconfig or for a specific repo only via
.git/config). Once the driver is defined, it can be used in a .gitattributes
file. All thanks to the magic of gitattributes(5) and git-config(5), and their
concept of “diff drivers”.
[diff "sentences"]
    textconv = breaksentences
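For those who would rather not edit config files by hand, the same wiring can be sketched from the command line. The scratch repository, the `*.txt` pattern, and the `notes.txt` file name below are assumptions made for illustration; pick whatever pattern suits your own files.

```shell
set -e

# Work in a throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init -q .

# Equivalent to writing the [diff "sentences"] section into .git/config.
# Add --global to put it in ~/.gitconfig instead.
git config diff.sentences.textconv breaksentences

# Route matching files through the driver. *.txt is an assumed pattern.
echo '*.txt diff=sentences' > .gitattributes

# Ask git which diff driver it would use for a given path.
git check-attr diff -- notes.txt
```

The last command should report `notes.txt: diff: sentences`, which confirms that the attribute is picked up even before the file exists.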
Feel free to edit the wildcard so that it matches the files you care about!
Your diffs should now be looking nice!
Remember that our modification doesn’t apply any changes to the files, only to
the output of the diff tool, and be aware that diff drivers can interfere with
interactive tools that use diff output to stage files. This can be a
deal-breaker for some, but it’s still neat to learn about git magic. Powerful stuff.
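If you ever need to see the raw lines again, `git diff --no-textconv` skips the driver for a single invocation. The end-to-end sketch below shows both views side by side and that the file on disk is untouched. Everything in it is a throwaway assumption: the temp directories, the demo identity, the `*.txt` pattern, and the sample text. It also assumes GNU sed.

```shell
set -e
work="$(mktemp -d)"
mkdir "$work/bin" "$work/repo"

# Install a copy of the script into a scratch directory on PATH.
cat > "$work/bin/breaksentences" <<'EOF'
#!/bin/sh
sed -E -e 's/\. +/.\n/g' "$@"
EOF
chmod +x "$work/bin/breaksentences"
PATH="$work/bin:$PATH"
export PATH

# Scratch repository with the driver wired up.
cd "$work/repo"
git init -q .
git config user.name "Demo"
git config user.email "demo@example.com"
git config diff.sentences.textconv breaksentences
echo '*.txt diff=sentences' > .gitattributes

# Commit one long line, then change only its last sentence.
printf 'One. Two. Three.\n' > notes.txt
git add .
git commit -q -m "initial"
printf 'One. Two. Four.\n' > notes.txt

git diff                  # driver applied: only the changed sentence shows up
git diff --no-textconv    # raw view: the whole line is marked as changed
cat notes.txt             # still a single line on disk
```

The first diff shows just `-Three.` and `+Four.`, while the second shows the whole line removed and re-added.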