remdiff -- a remote diff'er

Want to see the changes between two files on your computer? "diff" is a wonderful command.

But what if one or both of those files are on a different machine? The usual response is to copy the file(s) to your local machine, then run the diff. And how often have you accidentally overwritten your local copy because your fingers got ahead of your brain?

And that's why I wrote "remdiff". I consider this a pretty important script in my toolkit. You want this on your disk-to-disk backup system. Really, you do.

If you would rather directly download the file, you can do that here. Download it, and rename to get rid of the .txt on the end. (The .txt is there so my webserver will easily display it.)

Note: as of September 2024, remdiff got a lot "bigger" (almost 200 lines). You may prefer the utter simplicity of the old version; you can see that here. Still, I really think you will like this newer version much better.

========================================================
#!/bin/sh

# REMDIFF: like "diff", except one or both files being "diff"ed can be
# remote from the system where remdiff is run.
#
# This does require that the user running "remdiff" has logins on the remote
# machines, and key logins would save a lot of password typing.
# 
# Change history:
# 2024-09-01: Added IBS file comparison features.
# 2024-08-28: fixed the long-standing "no colons in file name" limitation.

 #
 # Copyright (c) 2012,2016,2024 Nick Holland (nick@holland-consulting.net)
 #
 # Permission to use, copy, modify, and distribute this software for any
 # purpose with or without fee is hereby granted, provided that the above
 # copyright notice and this permission notice appear in all copies.
 #
 # THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
 # WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
 # MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
 # ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
 # WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
 # ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
 # OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
 # 

# code notes:
# you will note I did not use "mktemp", but rather used the deprecated
# "$$" to generate a pseudo-unique file name.  This is improper, but one
# of my target platforms is AIX, which lacks a mktemp command.
#
# Code readability and portability are goals here; it is likely some
# error conditions are not handled properly.
#
# There's an issue if a file has a ":" in its name: it is ambiguous
# whether "fi:le" is a filename or a file "le" on server "fi".
# scp handles this: "Local file names can be made explicit using absolute
# or relative pathnames to avoid scp treating file names containing ':'
# as host specifiers."  Hopefully, we do that properly here.
# Admittedly lightly tested, as I try to avoid : in file names.
# Maybe flaw(?): .path:/file.  Treated as local, as I'm having trouble
# imagining a hostname with a leading period.




yell() { echo "$0: $*" >&2; }
die() { yell "$*"; exit 111; }
try() { "$@" || die "cannot $*"; }

### CHANGE THE FOLLOWING if you have a different root for IBS, i.e., "/bu"
IBSBASE=/ibs

# If the system running remdiff is running OpenSSH 9.0 or later, scp will
# transport the files using the SFTP protocol.  If you will be hitting
# systems that have SFTP disabled, change "SCPOPTS" below to "-O" to force
# the use of the old scp protocol instead of SFTP.
SCPOPTS="" 

if [[ -z "$2" ]]; then
    cat <<- _ENDUSAGE
	usage:
	    $0 [-i] file1 file2
	where either or both can be remote files.
	If file2 is just a hostname, then the path and file
	name of the first is used.

	-i causes remdiff to create a new command line based on certain
	  IBS assumptions, and guesses of what you are trying to do and then
	  remdiff calls itself to process that new line.
	  It only makes sensible assumptions if running on an IBS server
	  with an IBSBASE that matches what is coded into the script.

	Examples:
	   remdiff -i file
	      compares "file" in IBS to the corresponding file on the machine
	      it was backed up from.
	   remdiff -i file hostname:
	      compares "file" in IBS to the corresponding file on another host.
	      This host need not be backed up by IBS, but must be accessible.
	   remdiff -i file yyyy-mm-dd
	      compares "file" to the IBS backup of the same file on the
	      specified date. (not actually a remote operation, but saves
	      some typing).

	_ENDUSAGE
    exit
fi

## IBS specific code.
# This block of code builds a new remdiff command line for various common
# IBS specific tasks.  
# Advertisement: IBS is Nick Holland's Incremental Backup System, a
# disk-to-disk based backup tool that is very useful to system administrators
# in a Unix environment.  More info here:
# https://holland-consulting.net/scripts/ibs/
# 
# This is kinda a cheat -- rather than nicely integrating this code into
# the rest of remdiff, it "pre-processes" the provided options to remove
# the IBS specific parts and turn it into an absolute invocation of remdiff.
# I started off thinking this was a lazy hack of a way to do it, but as I
# wrote it and tested the code, I realized that this was a good way to keep
# the IBS code separate AND perhaps there would be other types of
# preprocessing that might be desired for other purposes.
# 
# If you are not using IBS, this entire block of code can be deleted.
if [[ $1 == "-i" ]]; then
    IBS="y"
    shift
    # Assumption: IBS path names are:
    #     /ibs/HOST/DATE/PATH

    # convert relative path (no leading slash) to absolute path, if needed
    if [[ $1 = /* ]]; then
	FILE1=$1
    else
	FILE1=$PWD/$1
    fi
    ORGPATH=/$(echo $FILE1|cut -d/ -f5-)   
    ORGHOST=$(echo $FILE1|cut -d/ -f3)

    #echo "FILE1=$FILE1 ORGPATH=$ORGPATH ORGHOST=$ORGHOST"

    if [[ -z $2 ]]; then
	## no second param; compare to original.
	TARG2=$ORGHOST:$ORGPATH
	echo $0 $1 $TARG2
    elif echo $2|grep -q "^..*:$"; then
	## second param is hostname
	TARG2=$2$ORGPATH
    elif echo $2 |grep -q "20[0-9][0-9]-[01][0-9]-[0-3][0-9]*" ; then
	## second parameter is a date.
	TARG2=$IBSBASE/$ORGHOST/$2*/$ORGPATH
    else
	die "not sure how to handle these options"
    fi
    $0 $1 $TARG2
    exit
fi
### End of IBS specific code


## Actual remdiff 
# Process first parameter
if echo "$1" | grep -q "^[^/.][^/]*:"; then # remote? (no "/" before the first ":")
    # $1 is on a remote machine, copy the file.
    HOST1=$(echo $1 | cut -f1 -d:)
    FILEPATH1=$(echo $1 | cut -f2- -d:)
    FILENAME1=$(basename $FILEPATH1)
    DFILE1=/tmp/$HOST1:$FILENAME1.$$
    try scp $SCPOPTS -q $1 $DFILE1
else
    # $1 is a local file.
    HOST1=""
    FILEPATH1=$1
    FILENAME1=$(basename $FILEPATH1)
    DFILE1=$1
fi

# Process second parameter
if echo "$2" | grep -q "^[^/.][^/]*:"; then # remote? (no "/" before the first ":")
    # $2 is on a remote machine, copy the file.
    HOST2=$(echo $2 | cut -f1 -d:)
    if [[ "$HOST2:" == "$2" ]]; then
	# no path name given, assume $1's path.
	FILEPATH2=$FILEPATH1
    else
	FILEPATH2=$(echo $2 | cut -f2- -d:)
    fi
    FILENAME2=$(basename $FILEPATH2)
    DFILE2=/tmp/$HOST2:$FILENAME2.$$
    try scp $SCPOPTS -q $HOST2:$FILEPATH2 $DFILE2
else
    # $2 is local
    DFILE2=$2
fi
 
diff -u $DFILE1 $DFILE2
 
rm -f /tmp/*.$$
========================================================
Put this in your path.

Usage

$ remdiff localfile server:/path/to/file
or
$ remdiff server:/path/to/file localfile
or
$ remdiff server1:/path/to/file1 server2:/path/to/file2

How it works:

An effort is made to figure out whether each file is specified in host:path/file format. If so, the file is copied with scp to a local tmp file (/tmp/host:filename.$$). The same takes place for the other file. Once both files are in a known, local place, a diff -u is run and the tmp files are removed.
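
The host:path test can be sketched as a small shell function. This is an illustration only, not the exact test the script uses (the script greps for a pattern), and the name is_remote is made up for this example. It assumes, as scp does, that anything with a "/" before the first ":" is a local path, and it also treats a leading "." as local.

```shell
# is_remote ARG
# Hypothetical helper: succeeds (status 0) if ARG looks like host:path,
# fails (status 1) if ARG looks like a local file name.
is_remote() {
    prefix=${1%%:*}                      # text before the first ":"
    [ "$prefix" != "$1" ] || return 1    # no ":" at all, so local
    case $prefix in
        ''|/*|.*|*/*) return 1 ;;        # empty, absolute, dot-leading, or contains "/": local
    esac
    return 0                             # looks like a host (or user@host) prefix
}

# Illustration:
#   is_remote server:/etc/fstab && echo remote
#   is_remote ./fi:le           || echo local
```

A bare name like fi:le still counts as remote under this rule; write it as ./fi:le to force it to be treated as local.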

Tested on OpenBSD (pdksh), AIX (ksh88) and Linux (bash), and will probably work on most Unixes as-is. There's probably some Linux shell it will fail with.

Limitations:

Only limited error checking is done using the TryYellDie code.
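
For reference, this is how the three helpers behave. The definitions are copied from the script above; the commands in the comments are only an illustration.

```shell
yell() { echo "$0: $*" >&2; }            # print a message to stderr
die()  { yell "$*"; exit 111; }          # print a message, then exit with status 111
try()  { "$@" || die "cannot $*"; }      # run a command; die if it fails

# Illustration (run in a subshell so the exit does not end your session):
#   ( try true )     exits 0 and prints nothing
#   ( try false )    prints "<script name>: cannot false" to stderr, exits 111
```

In remdiff only the scp calls are wrapped in try, so a failed copy stops the script before diff is run.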

Files with a ':' in the file name may be handled correctly. This is not super-well tested. Let me know if you find a failure, and the command to reproduce it.

The files should be modest in size -- don't fill your /tmp partition!

It is possible to do this without a tmp file at all, but I haven't figured out a way to easily handle TWO remote files without tmp files in a POSIX-compliant shell. The one remote, one local trick is this: cat the remote file to stdout, and have diff use '-' as one of its input files:

$ ssh $REMOTE cat $FILE | diff -u $LOCALFILE -
$ ssh $REMOTE cat $FILE | diff -u - $LOCALFILE
I find it more useful to be able to diff two remote files than huge files I don't want to copy locally. Your needs may vary.
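
For the two-remote case, one possible approach is a named pipe (FIFO) for the first file and stdin for the second. This is a rough sketch, not something remdiff does: the function name diff_two_remote is made up, it assumes key logins (two ssh sessions run at once, so password prompts would collide), and it assumes your diff will read from a FIFO. The FIFO is still a name in /tmp, but it holds no data; whether diff itself spools a non-regular input somewhere is up to the diff implementation.

```shell
# diff_two_remote HOST1 FILE1 HOST2 FILE2
# Hypothetical helper, not part of remdiff.
# File names are passed through the remote shell, so names with spaces
# or shell metacharacters will need extra quoting.
diff_two_remote() {
    fifo=${TMPDIR:-/tmp}/remdiff-fifo.$$
    mkfifo "$fifo" || return 2
    # Stream the first remote file into the FIFO in the background.
    ssh "$1" cat "$2" </dev/null >"$fifo" &
    # The second remote file arrives on stdin; diff reads the FIFO and "-".
    ssh "$3" cat "$4" | diff -u "$fifo" -
    rc=$?                                # exit status of diff
    wait                                 # reap the background ssh
    rm -f "$fifo"
    return $rc
}

# Illustration:
#   diff_two_remote server1 /path/to/file1 server2 /path/to/file2
```

The diff header will show the FIFO name and "-" rather than the real file names, which is one more reason I prefer the tmp file approach.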

Yes, I hard-coded the -u for unified diff format. I consider unified diffs the One True diff format; if you disagree, you are probably a good enough coder to find it and "fix" it your way.
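
If you do want to "fix" it, one way is to let an environment variable override the default. This is a sketch only: the variable name REMDIFF_DIFFOPTS and the function run_diff are made up for this example and do not exist in the script.

```shell
# run_diff FILE1 FILE2
# Hypothetical replacement for the "diff -u" line near the end of the script.
# REMDIFF_DIFFOPTS is an assumed variable name; if unset or empty, -u is used.
run_diff() {
    # Left unquoted on purpose so several options split into separate words.
    diff ${REMDIFF_DIFFOPTS:--u} "$1" "$2"
}

# Illustration:
#   run_diff old.conf new.conf                        unified diff (default)
#   REMDIFF_DIFFOPTS=-c run_diff old.conf new.conf    context diff
```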

IBS usage

Productive use of IBS often involves comparing files against a backup within IBS, but those paths can get clumsy to type, so I added IBS-aware expansions of remdiff command lines. In all cases, the IBS-aware expansions require the IBS backup file to be the first parameter; I don't think this is a limitation. You can always expand the paths yourself if there's a case not covered here.

So here are some examples:

# remdiff -i /ibs/host/2022-04-21/etc/fstab
# cd /ibs/host/2022-04-21/etc/ ; remdiff -i fstab
Both commands should generate the same results. Compares the backed up file with host:/etc/fstab.
Expands to remdiff /ibs/host/2022-04-21/etc/fstab host:/etc/fstab
# remdiff -i /ibs/host/2022-04-21/etc/fstab host2:
# cd /ibs/host/2022-04-21/etc/ ; remdiff -i fstab host2:
Again, both commands should generate the same results. Compares the IBS backup file with the same file on a DIFFERENT host.
Expands to remdiff /ibs/host/2022-04-21/etc/fstab host2:/etc/fstab
# remdiff -i /ibs/host/2024-07-21/etc/fstab 2024-07-04
# cd /ibs/host/2024-07-21/etc/ ; remdiff -i fstab 2024-07-04
This is an IBS cheat, but it is useful -- compares this file to a different date of the same backup. This is purely local, but remdiff expands out the path names appropriately.
Expands to remdiff /ibs/host/2024-07-21/etc/fstab /ibs/host/2024-07-04/etc/fstab
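
The three expansions above can be sketched as one small function. This is an illustration, not the code in the script: the name ibs_target is made up, the date is matched exactly rather than with the script's looser pattern and wildcard, and it assumes IBSBASE is a single directory level such as /ibs (the script's cut field numbers make the same assumption).

```shell
IBSBASE=/ibs

# ibs_target FILE [HOST:|YYYY-MM-DD]
# Hypothetical helper: prints the second argument that "remdiff -i" would
# build for the given IBS backup file.
ibs_target() {
    file=$1
    arg=$2
    case $file in
        /*) ;;                           # already absolute
        *)  file=$PWD/$file ;;           # make a relative path absolute
    esac
    host=$(printf '%s\n' "$file" | cut -d/ -f3)      # /ibs/HOST/...
    path=/$(printf '%s\n' "$file" | cut -d/ -f5-)    # everything after the date
    case $arg in
        '')  printf '%s:%s\n' "$host" "$path" ;;     # compare to the original host
        ?*:) printf '%s%s\n' "$arg" "$path" ;;       # compare to another host
        20[0-9][0-9]-[01][0-9]-[0-3][0-9])
             printf '%s/%s/%s%s\n' "$IBSBASE" "$host" "$arg" "$path" ;;  # another backup date
        *)   return 1 ;;                             # not understood
    esac
}

# Illustration:
#   ibs_target /ibs/host/2022-04-21/etc/fstab host2: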
 
 


Copyright 2020, 2024, Nick Holland, Holland Consulting.