Thursday, May 08, 2014

How to grep git commits

From the There's-got-to-be-a-better-way Department, comes this tale of subterranian Git spelunking...

It all started because I lost some code. I had done my duty like a good little coder. I found a bug, wrote a test that reproduced it and filed an issue noting the name of the failing test. Then, some travel and various fire drills intervened. Memory faded. Finally, getting back to my routine, I had this bug on my docket. Already had the repro; should be a simple fix, right?

But where was my test? I asked Sublime to search for it, carefully turning off the annoying find-in-selection feature. Nothing. Maybe I stashed it. Do git stash list. Nothing that was obviously my test. Note to self: Use git stash save <message> instead of just git stash. I did git show to a several old stashes, to no avail. How am I going to find this thing?

A sensible person would have stopped here and rewritten the test. But, how much work are we willing to do to avoid doing any work? Lots, apparently.

So, I wrote this ridiculous bit of hackery instead. I call it grep_git_objects:

import os
import sys
import subprocess

if len(sys.argv) < 2:
    print "Usage: python grep_git_objects.py <target>"
    sys.exit()

target = sys.argv[1]

hashes = []
for i in range(0,256):
    dirname = ".git/objects/%02x" % i
    #print dirname
    for filename in os.listdir(dirname):
        h = "%02x%s" % (i, filename)
        hashes.append(h)

        cmd = "git cat-file -p %s" % h
        output = subprocess.check_output(cmd.split(' '))
        if target in output:
            print "found in object: ", h
            print output

Gist: grep_git_objects.py

Run it in the root of a repo. It searches through the .git/objects hierarchy to find your misplaced tests or anything else you've lost.

You'd think git grep would do exactly that. Maybe it does and I'm just clueless. I hope so, 'cause otherwise, why would you have a git grep command? In any case, git grep didn't find my missing test. The above bit of Python code did.

Once I found it, I had the hash code for a Git object. Objects are the leaves of Git trees. So, I grepped again for the hash code of the object and found the tree node it belonged to, working my way up a couple more levels to the commit.

With commit in hand, I could see by the "WIP on develop" message that is was indeed a stash, which I must have dropped by mistake. Anyway, that was fun... sorta.