If you’re on a Unix variant of almost any sort, this command will find every file on your machine whose name contains foo:
find / -name '*foo*'
It will also run impossibly slowly, because it’s scanning all of the (maybe millions of) files on your machine. A better option is to periodically generate a database of all files, then do something akin to
select distinct filename from my_filesystem where filename like '%foo%'
As luck would have it, most Linux distributions that I’m familiar with come with a tool called locate(1) that does this for you, which allows you to write
locate foo
and get what you want. It requires periodically running a tool called updatedb(8) to refresh the database. A quick check of a Linux box I have on hand says that updatedb(8) runs once a day. That’s probably fine, but works less well if you’re changing a lot of filenames; then locate(1) will be behind the times. Which is sad.
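To make the index-then-query idea concrete, here is a minimal sketch that fakes it with find(1) and grep(1). It is not how updatedb(8) and locate(1) actually store their database; the function names index_tree and search_index and the sample file names are made up for illustration. It also shows the staleness problem: a file created after the index was built stays invisible until the next rebuild.

```shell
#!/bin/bash
# Toy stand-in for updatedb(8) + locate(1): walk the tree once, then grep the saved list.
# index_tree and search_index are hypothetical names, not real tools.

index_tree() {      # usage: index_tree ROOT DBFILE
    find "$1" -print 2>/dev/null > "$2"
}

search_index() {    # usage: search_index DBFILE PATTERN
    # -i: ignore case, -F: treat PATTERN as a plain string rather than a regex
    grep -iF -- "$2" "$1"
}

demo_root=$(mktemp -d)
demo_db=$(mktemp)
touch "$demo_root/zzfoo-report.txt" "$demo_root/unrelated.txt"

index_tree "$demo_root" "$demo_db"      # the slow walk, done once
search_index "$demo_db" zzfoo           # fast lookup against the saved list

touch "$demo_root/zzfoo-later.txt"      # created after the index was built
search_index "$demo_db" zzfoo-later || echo "zzfoo-later.txt is not in the index yet"

rm -rf "$demo_root" "$demo_db"
```

The second search comes up empty until index_tree runs again, which is the same lag you get between nightly updatedb(8) runs.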
Macs have it quite a lot better. They are constantly running a tool called Spotlight, which indexes everything all the time. As I understand it, the Spotlight DB is updated whenever you modify anything about a file, including its contents or its name. There’s a vast Spotlight architecture lurking underneath everything you do on a Mac. At the command line, the mdfind(1) tool lets you search for files whose Spotlight metadata matches any number of criteria, and mdls(1) lists the metadata for a specified file. I’m a Spotlight-metadata novice at this point, but here are some good resources:
- This is a good set of tips for accessing the full variety of metadata available to you through the md* tools.
- The documentation on Uniform Type Identifiers (which are one part of the Spotlight metadata).
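As a rough illustration of how mdfind(1) and mdls(1) fit together, the sketch below finds a few files by name and prints two metadata attributes for each. The helper name spotlight_peek, the five-result limit, and the default of searching under $HOME are assumptions made for this example; the flags used (-onlyin and -name for mdfind, -name for mdls) are standard ones. On a machine without Spotlight it just prints a notice.

```shell
#!/bin/bash
# spotlight_peek is a hypothetical helper name, not part of macOS.
spotlight_peek() {      # usage: spotlight_peek NAME_FRAGMENT [DIRECTORY]
    local fragment=$1 dir=${2:-$HOME}
    if ! command -v mdfind >/dev/null 2>&1; then
        echo "mdfind not found; the Spotlight tools only exist on macOS" >&2
        return 0
    fi
    # mdfind -name searches on file names only; -onlyin limits the search to one tree.
    mdfind -onlyin "$dir" -name "$fragment" | head -n 5 |
    while IFS= read -r path; do
        printf '%s\n' "$path"
        # mdls -name prints only the requested attributes instead of the full list.
        mdls -name kMDItemContentType -name kMDItemKind "$path"
    done
}

spotlight_peek foo
```

Running mdls on a file with no -name options dumps every attribute Spotlight has recorded for it, which is a good way to discover what there is to query.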
I wrote a tiny little script whose job is to invoke mdfind(1) when you’re on a Mac, and invoke locate(1) on any other Unix. This is handy to me, since I copy my ~/bin directory to every Unix machine I use, and would very much like to use the same locate command on every machine. (You should see my ~/.bashrc, he said sexily.) Here it is:
(15:32 -0400) slaniel@laptop:~$ cat ~/bin/locate
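#!/bin/bash
# Search file names with Spotlight on a Mac, and with locate(1) on any other Unix.
pattern=$1
os_name=$(uname)
if [ "${os_name}" = "Darwin" ]; then
    # The trailing c makes the Spotlight match case-insensitive.
    mdfind "kMDItemFSName=\"*${pattern}*\"c"
else
    # -i does the same for locate; "$@" keeps arguments containing spaces intact.
    /usr/bin/locate -i "$@"
fi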
That mdfind line initially said
mdfind "kMDItemContentType=\"public*\" && kMDItemFSName=\"*${pattern}*\"c"
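until I realized that not every file I cared about has a content type in the public.* namespace: that clause only matches files whose kMDItemContentType is itself one of the public.* Uniform Type Identifiers (e.g., public.jpeg or public.plain-text). Plenty of files have a vendor-specific type and mention public.* only further up their type tree. For instance, here’s what mdls(1) says about a .DOCX file: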
kMDItemContentType = "org.openxmlformats.wordprocessingml.document"
kMDItemContentTypeTree = (
"org.openxmlformats.wordprocessingml.document",
"org.openxmlformats.openxml",
"public.zip-archive",
"com.pkware.zip-archive",
"public.data",
"public.item",
"public.archive",
"public.composite-content",
"public.content"
)
[…]
kMDItemKind = "Microsoft Word document (.docx)"
That last item (kMDItemKind) is interesting. It leads to a quick bit of command-line hackery to find every Microsoft-format file in your filesystem:
(15:53 -0400) slaniel@AKAMAI_laptop:~$ mdfind "kMDItemKind == '*Microsoft*'" | tr '\n' '\0' | xargs -0 mdls | grep kMDItemKind | grep -o '[^"]\+"$' |sed 's#"$##' | sort | uniq -c | sort -nr
542 Microsoft Word 97 - 2004 document (.doc)
534 Microsoft Excel Workbook (.xlsx)
307 Microsoft Word document (.docx)
145 Microsoft Word 97 - 2004 document
145 Microsoft Excel 97-2004 Workbook (.xls)
33 Microsoft Excel template
29 Microsoft PowerPoint presentation
28 Microsoft Excel workbook
28 Microsoft Excel Macro-Enabled Workbook (.xlsm)
28 Microsoft Excel 97-2004 workbook
27 Microsoft Word document
20 Microsoft PowerPoint 97-2004 presentation
15 Microsoft personal dictionary
9 Microsoft Word 97 - 2004 template
8 Microsoft Outlook document
8 Microsoft Graph preferences
3 Microsoft Word Macro-Enabled template (.dotm)
1 Microsoft Word MHTML document (.mht)
1 Microsoft Word 97 - 2004 template (.dot)
1 Microsoft PowerPoint toolbar
1 Microsoft PowerPoint 97 - 2004 Template
1 Microsoft Outlook signatures
1 Microsoft Office Theme
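The counting half of that pipeline is ordinary text processing, so it can be tried on any Unix. Here is a small sketch that pulls the quoted kMDItemKind values out of mdls-style lines and tallies them. The function name count_kinds and the sample input lines are invented for illustration, and a single sed expression stands in for the grep -o and sed pair used above.

```shell
#!/bin/bash
# count_kinds is a hypothetical helper: it reads mdls-style lines on stdin and
# prints each distinct kMDItemKind value with its count, most common first.
count_kinds() {
    sed -n 's/^kMDItemKind *= *"\(.*\)"$/\1/p' | sort | uniq -c | sort -nr
}

# Made-up sample lines in the same shape that mdls prints.
printf '%s\n' \
    'kMDItemKind = "Microsoft Word document (.docx)"' \
    'kMDItemFSName = "notes.docx"' \
    'kMDItemKind = "Microsoft Excel Workbook (.xlsx)"' \
    'kMDItemKind = "Microsoft Word document (.docx)"' |
    count_kinds
```

On a Mac, the real input would come from the front half of the command above: mdfind "kMDItemKind == '*Microsoft*'" | tr '\n' '\0' | xargs -0 mdls | count_kinds.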
There’s a whole world of Spotlight metadata here, and I’m only scratching the surface of it. Suffice it to say that it’s an extraordinarily powerful tool to have in your kit.
