I probably could have done this in Perl and it would have been much faster…but I could say that about a lot of things. My problem is that Perl is not my command line. Perl can be a REPL, but bash always is a REPL.
Ever look at a patch file?
If you’re a programmer who deals with patch files, you know there are some parts of a patch file that you might not want to keep.
So, we start off by reading the patch file using while-read < filename.
Then we add a case statement, looking for lines that begin with diff. Those lines indicate a change of file. We can read the file name by chopping off the diff --git a/ prefix from the line and splitting what is left into an array, so the first element is the path.
We end up with:
while read L ; do case "$L" in (diff*) : ;; *) continue;; esac; M=(${L#diff --git a/}); echo $L; echo ${M[0]}; sleep 1 ; done < ~/Documents/jbr_rb.patch
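To see what the parameter expansion in that loop does on its own, here is a small sketch. The sample header line and the variable name rest are made up for illustration; they are not from the actual patch.

```shell
#!/usr/bin/env bash
# Hypothetical sample header; real ones look like "diff --git a/<path> b/<path>".
L='diff --git a/client/app.js b/client/app.js'

# ${L#pattern} removes the shortest match of pattern from the front of $L.
rest=${L#diff --git a/}

# Unquoted expansion inside ( ) is word-split on whitespace into an array.
M=($rest)

echo "${M[0]}"   # the old-side path
echo "${M[1]}"   # the new-side path, still carrying its b/ prefix
```

Because the split happens on whitespace, this trick assumes the paths contain no spaces.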
Then, to get only the files we really want, we test the directory like this:
while read L ; do case "$L" in (diff*) : ;; *) continue;; esac; M=(${L#diff --git a/}); echo $L; [[ $M == client/* ]] && echo ${M[0]}; sleep 1 ; done < ~/Documents/jbr_rb.patch
And we see a noisy result here:
How many files were changed? Only 102.
> while read L ; do \
    case "$L" in (diff*) : ;; *) continue;; esac; \
    M=(${L#diff --git a/}); \
    [[ $M == client/* ]] && echo ${M[0]}; \
  done < ~/Documents/jbr_rb.patch \
  > ~/Documents/jbr_insteresting.txt
> wc -l ~/Documents/jbr_insteresting.txt
102 /home/jreynolds/Documents/jbr_insteresting.txt
And this way we can scan that big patch file for files in the client subdirectory that were pertinent to our task.
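If I wanted to keep this around instead of retyping it, the same idea could be wrapped in a function. This is only a sketch: the name changed_files_under and its two arguments are my own invention, and it assumes git-style headers and paths without spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the paths under a directory that a git-style patch touches.
# Usage: changed_files_under <patch-file> <dir>
changed_files_under() {
    local patch=$1 dir=$2 line rest path
    while IFS= read -r line; do
        # Only "diff --git a/..." header lines name a changed file.
        [[ $line == 'diff --git a/'* ]] || continue
        rest=${line#diff --git a/}   # drop the prefix
        path=${rest%% *}             # keep everything before the first space
        [[ $path == "$dir"/* ]] && printf '%s\n' "$path"
    done < "$patch"
    return 0
}
```

Something like changed_files_under ~/Documents/jbr_rb.patch client | wc -l would then give the count directly. Using read -r and quoting the variables keeps backslashes and odd characters in the patch from being mangled, which the quick one-liner does not bother with.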