If you’re on a Unix variant of most any sort, this command will find you every file whose name contains foo
anywhere on your machine:
find / -name '*foo*'
It will also run impossibly slowly, because it’s scanning all of the (maybe millions of) files on your machine. A better option is to periodically generate a database of all files, then do something akin to
select distinct filename from my_filesystem where name like '%foo%'
As luck would have it, most Linux distributions that I’m familiar with come with a tool called locate(1)
that does this for you, which allows you to write
locate foo
and get what you want. It requires periodically running a tool called updatedb(8)
to refresh the database. A quick check of a Linux box I have on hand says that updatedb(8)
runs once a day. That’s probably fine, but works less well if you’re changing a lot of filenames; then locate(1)
will be behind the times. Which is sad.
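To make the index-then-search idea concrete, here is a toy sketch of the split between updatedb(8) and locate(1). It is not how the real tools work internally (they keep their own database formats); build_index and search_index are names I made up for illustration, and the index here is just a text file with one path per line, so a path containing a newline would confuse it.

```shell
#!/bin/bash
# Toy version of the updatedb/locate split: build the index once,
# then search it as often as you like without touching the disk tree.
# build_index and search_index are hypothetical names, not real tools.

# Write every path under directory $1 into the index file $2.
build_index() {
    local root=$1 index=$2
    find "$root" -print 2>/dev/null > "$index"
}

# Print every indexed path containing the fixed string $2, ignoring case
# (roughly what `locate -i` does with a plain pattern).
search_index() {
    local index=$1 pattern=$2
    grep -i -F -- "$pattern" "$index"
}

# Example use (assumed paths):
#   build_index "$HOME" /tmp/home.idx   # slow, run it occasionally
#   search_index /tmp/home.idx foo      # fast, run it whenever
```

The trade-off is the same one described above: searches are cheap, but the answers are only as fresh as the last build_index run.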
Macs have it quite a lot better. They are constantly running a tool called Spotlight, which indexes everything all the time. As I understand it, the Spotlight DB is updated whenever you modify anything about a file, including its contents or its name. There’s a vast Spotlight architecture lurking underneath everything you do on a Mac. At the command line, the mdfind(1)
tool lets you search for files whose Spotlight metadata matches any number of criteria, and mdls(1)
lists the metadata for a specified file. I’m a Spotlight-metadata novice at this point, but here are some good resources:
- This is a good set of tips for accessing the full variety of metadata available to you through the md* tools.
- The documentation on Uniform Type Identifiers (which are one part of the Spotlight metadata).
I wrote a tiny little script whose job is to invoke mdfind(1)
when you’re on a Mac, and invoke locate(1)
on any other Unix. This is handy to me, since I copy my ~/bin
directory to every Unix machine I use, and would very much like to use the same locate
command on every machine. (You should see my ~/.bashrc
, he said sexily.) Here it is:
(15:32 -0400) slaniel@laptop:~$ cat ~/bin/locate
#!/bin/bash
pattern=$1
os_ver=$(uname)
if [ "${os_ver}" == "Darwin" ]; then
    mdfind "kMDItemFSName=\"*${pattern}*\"c"
else
    /usr/bin/locate -i "$@"
fi
That mdfind
line initially said
mdfind "kMDItemContentType=\"public*\" && kMDItemFSName=\"*${pattern}*\"c"
until I realized that not every file I cared about was in the public
domain: public*
would only catch files whose content type itself begins with public (e.g., public.jpeg, public.png, public.plain-text). Plenty of ordinary files have a vendor-specific type instead. For instance, here’s what mdls(1)
says about a .DOCX file:
kMDItemContentType = "org.openxmlformats.wordprocessingml.document"
kMDItemContentTypeTree = (
    "org.openxmlformats.wordprocessingml.document",
    "org.openxmlformats.openxml",
    "public.zip-archive",
    "com.pkware.zip-archive",
    "public.data",
    "public.item",
    "public.archive",
    "public.composite-content",
    "public.content"
)
[…]
kMDItemKind = "Microsoft Word document (.docx)"
That last item (kMDItemKind
) is interesting. It leads to a quick bit of command-line hackery to find every Microsoft-format file in your filesystem:
(15:53 -0400) slaniel@AKAMAI_laptop:~$ mdfind "kMDItemKind == '*Microsoft*'" | tr '\n' '\0' | xargs -0 mdls | grep kMDItemKind | grep -o '[^"]\+"$' | sed 's#"$##' | sort | uniq -c | sort -nr
542 Microsoft Word 97 - 2004 document (.doc)
534 Microsoft Excel Workbook (.xlsx)
307 Microsoft Word document (.docx)
145 Microsoft Word 97 - 2004 document
145 Microsoft Excel 97-2004 Workbook (.xls)
33 Microsoft Excel template
29 Microsoft PowerPoint presentation
28 Microsoft Excel workbook
28 Microsoft Excel Macro-Enabled Workbook (.xlsm)
28 Microsoft Excel 97-2004 workbook
27 Microsoft Word document
20 Microsoft PowerPoint 97-2004 presentation
15 Microsoft personal dictionary
9 Microsoft Word 97 - 2004 template
8 Microsoft Outlook document
8 Microsoft Graph preferences
3 Microsoft Word Macro-Enabled template (.dotm)
1 Microsoft Word MHTML document (.mht)
1 Microsoft Word 97 - 2004 template (.dot)
1 Microsoft PowerPoint toolbar
1 Microsoft PowerPoint 97 - 2004 Template
1 Microsoft Outlook signatures
1 Microsoft Office Theme
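The two grep calls and the sed in that pipeline are all doing one job: pulling the quoted value out of each kMDItemKind line. Here is a sketch that folds them into a single sed expression. count_kinds is a made-up name, and I am assuming mdls prints each attribute as name, optional padding, an equals sign, then a double-quoted value on one line, as in the output shown earlier.

```shell
#!/bin/bash
# Read mdls-style output on stdin and print how often each kMDItemKind
# value occurs, most common first. count_kinds is a hypothetical helper.
count_kinds() {
    # Keep only kMDItemKind lines and strip everything but the quoted value.
    sed -n 's/^kMDItemKind[[:space:]]*=[[:space:]]*"\(.*\)"$/\1/p' |
        sort | uniq -c | sort -nr
}

# Example use on a Mac, reusing the search from the pipeline above:
#   mdfind "kMDItemKind == '*Microsoft*'" | tr '\n' '\0' | xargs -0 mdls | count_kinds
```

Because the function only reads standard input, you can try it on a saved copy of mdls output without rerunning the search.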
There’s a whole world of Spotlight metadata here, and I’m only skimming the surface of it. Suffice it to say that it’s an extraordinarily powerful tool to have in your kit.
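Coming back to the wrapper script for a moment: one possible variation, offered as an untested-across-machines sketch rather than a replacement, is to check whether mdfind exists instead of checking the output of uname. The function names spotlight_name_query and locate_any are hypothetical; the query string is the same one the script uses.

```shell
#!/bin/bash
# Build the Spotlight query from the wrapper script: a case-insensitive
# substring match on the file name. (Hypothetical helper name.)
spotlight_name_query() {
    printf 'kMDItemFSName="*%s*"c' "$1"
}

# Use mdfind where it is installed; otherwise fall back to the system
# locate, called by full path so a ~/bin/locate wrapper does not recurse.
locate_any() {
    if command -v mdfind >/dev/null 2>&1; then
        mdfind "$(spotlight_name_query "$1")"
    else
        /usr/bin/locate -i "$@"
    fi
}

# Example use:
#   locate_any foo
```

Keeping the query in its own function also makes it easy to print and inspect before handing it to mdfind.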