Linux commands: find

I’m not a Linux guru. I’ve always known just enough to get the job done, but was never focused on getting deep into the command line for its own sake. Ever since joining the devops team at TripAdvisor almost two years ago, though, and more recently taking responsibility for operations at Scratch, I’ve had to learn a much broader array of commands – some things can only be done with a specialized command, of course, but even common tasks frequently go much faster when you have a bigger toolbox.

With that in mind, I thought I would start a regular feature on this blog, in which I go through one or more of my favorite Linux commands – things I use all the time, and that make my life dramatically easier. All of these will fit into the “advanced beginner to intermediate” category – while there’s always something new and exciting to discover about old standbys (for instance, cd - will return you to your previous location, and tail -F will continue working after a file has been replaced with a new file), I’m going to focus on commands a couple of rungs up the ladder from ls.

find

I’m going to start with find, one of the commands I use most often. find is useful in so many situations, in so many different ways, that it’s hard to remember what I did before I figured out how to use it. In a nutshell, find returns a list of files that match some set of criteria, and optionally does something with them. Here’s the basic form, which searches for files named “filename” in the current directory and its subdirectories:

find . -name filename

You can easily change this to another directory:

find /tmp -name filename

Or add a wildcard to the name (note the quotes, which stop the shell from expanding the wildcard before find sees it):

find . -name "filename*"

You can direct it to return only files, or only directories:

find . -name "filename*" -type f
find . -name "filename*" -type d
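
To see these filters in action without touching real files, here is a small sketch that builds a throwaway directory and runs both variants against it. The directory and file names are made up for illustration; only find, mktemp, mkdir and touch are assumed to be available.

```shell
# Build a throwaway sandbox so nothing real is touched (all names are hypothetical).
demo=$(mktemp -d)
mkdir "$demo/filename_dir"
touch "$demo/filename1.txt" "$demo/filename2.log" "$demo/other.txt"

# -type f keeps regular files only; -type d keeps directories only.
files=$(find "$demo" -name "filename*" -type f | sort)
dirs=$(find "$demo" -name "filename*" -type d)

echo "files:"; echo "$files"
echo "dirs:";  echo "$dirs"

# Clean up the sandbox.
rm -rf "$demo"
```

The pattern matches filename_dir as well as the two files, so the -type test is what separates them.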

You can limit the depth to which find should traverse (GNU find expects -maxdepth to come before tests such as -name, and prints a warning otherwise):

find . -maxdepth 2 -name "filename*"

Or search based on file age or size:

find . -mtime +3  # modified more than 3 days ago
find . -size +2G  # larger than 2 GiB (without the +, only files of exactly that size match)
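
The age and size tests are easier to trust once you have watched them work, so here is a minimal sketch in a scratch directory. The file names are invented, and I'm assuming a find that accepts the k size suffix (GNU and BSD find both do). Note that tests listed together are ANDed.

```shell
# Scratch directory with one old file, one new file, and one 4 KiB file (names are hypothetical).
demo=$(mktemp -d)
touch -t 200001010000 "$demo/old.log"   # mtime set to 1 Jan 2000
touch "$demo/new.log"                   # mtime is now
dd if=/dev/zero of="$demo/big.bin" bs=1024 count=4 2>/dev/null

old=$(find "$demo" -type f -mtime +3)             # only old.log
big=$(find "$demo" -type f -size +2k)             # only big.bin
both=$(find "$demo" -type f -mtime +3 -size +2k)  # both tests must hold: no file qualifies

echo "old:  $old"
echo "big:  $big"
echo "both: ${both:-<none>}"

rm -rf "$demo"
```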

One of the most useful things you can do with find is to run a command on every result. So, if you wanted to delete everything under the current directory that is more than three days old, you could say:

find . -mtime +3 -exec rm -rf {} \;

The key section here comes after -exec: {} is replaced with the filename, and the command has to end with \; (the backslash escapes the semicolon, so that the shell passes it through to find instead of treating it as a delimiter between commands on the same command line).
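
A related form is worth knowing: ending the -exec clause with + instead of \; tells find to pass as many filenames as will fit to a single invocation of the command, rather than running it once per file. The sketch below counts invocations of echo in a scratch directory to show the difference; the file names are made up.

```shell
# Three sample files in a scratch directory (names are hypothetical).
demo=$(mktemp -d)
touch "$demo/a.txt" "$demo/b.txt" "$demo/c.txt"

# \; runs echo once per match, so each path lands on its own line.
per_file=$(find "$demo" -name "*.txt" -exec echo {} \; | grep -c .)

# + hands all matches to one echo call, so every path lands on a single line.
batched=$(find "$demo" -name "*.txt" -exec echo {} + | grep -c .)

echo "invocations with \\; : $per_file"
echo "invocations with +  : $batched"

rm -rf "$demo"
```

For commands that accept many filenames at once (rm, gzip, chmod), the + form is usually noticeably faster on large trees.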

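Note: you can also use -delete to delete files, but that wouldn’t illustrate the point. Also be aware that rm -rf removes a matched directory along with everything inside it, however recent; add -type f if you only want to remove old files.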

An alternate way to run a command on all results is to use a bash for-loop (this can be a simple way of including multiple commands, or including a pipe). Because the loop splits find’s output on whitespace, it is only safe when none of the names contain spaces:

for i in $(find . -mtime +3); do echo "Deleting $i" && rm -rf "$i"; done
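
When names might contain spaces, one alternative is to let find hand the names to a small inline script as arguments, so each name arrives as a single word. This is a sketch of that idea rather than the only way to do it; the scratch directory and file names are invented for the demonstration.

```shell
# Two old files (one with a space in its name) and one fresh file; names are hypothetical.
demo=$(mktemp -d)
touch -t 200001010000 "$demo/old report.txt" "$demo/old.log"
touch "$demo/fresh.txt"

# The matches become the inline script's positional parameters ("$@"),
# so several commands can run per file without any word splitting.
find "$demo" -type f -mtime +3 -exec sh -c '
  for f in "$@"; do
    echo "Deleting $f"
    rm -f -- "$f"
  done
' sh {} +

remaining=$(ls "$demo")
echo "left behind: $remaining"

rm -rf "$demo"
```

The extra sh after the script is just a placeholder for $0, so that the first filename lands in $1.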

Some examples:

# gunzip all files under the current directory
find . -name "*.gz" -exec gunzip {} \;

# gzip all .txt files under the current directory
find . -name "*.txt" -exec gzip {} \;

# non-destructively gzip all .txt files (two approaches)
for i in $(find . -name "*.txt"); do gzip -c "$i" > "$i.gz"; done
find . -name "*.txt" -exec sh -c 'gzip -c "$1" > "$1.gz"' sh {} \;

As with most Linux commands, there are many, many more options, and I recommend taking a minute or two to look through the man pages. You may never use most of the more obscure options, or even remember their syntax, but it’s helpful to know they exist (and where to find information on them), for that odd moment when you need something a bit more specialized.

Of course, just reading this blog isn’t going to make you an expert in anything. The only way you’ll learn is by doing, so I recommend trying to use find as much as possible for at least a week – even when not strictly necessary. Soon enough, you’ll wonder how you ever got along without it.
