Like peanut butter and chocolate, cut and tr are two great commands that go great together. cut provides an easy way to grab one or more fields from a line of text, and tr allows you to prepare the line first. Let’s start with cut.
Consider the following directory listing:
# ls -l
total 5392
-rw-r--r--. 1 root root  480112 Jul 15  2015 greenhopper.log
-rw-r--r--. 1 root root   70154 Jan 13 12:07 jira-incoming-mail.log
-rw-r--r--. 1 root root 4595402 Jan 14 09:58 jira.log
-rw-r--r--. 1 root root  152594 Aug 15 21:49 jira-outgoing-mail.log
-rw-r--r--. 1 root root  196573 Jan  5 04:54 jira-security.log
What if, for some reason, you only wanted the file name (column 9) and size (column 5)? cut lets you select and print only the columns you want.
# ls -l | cut -d' ' -f5,9
480112 greenhopper.log
70154 jira-incoming-mail.log
4595402 jira.log
152594 jira-outgoing-mail.log
196573 jira-security.log
cut’s -d parameter specifies the delimiter (in this case, a space), and the -f parameter specifies which fields to grab. -f can take a single field (e.g., -f3), a range (e.g., -f3-5), or a combination of both (e.g., -f1,3-5,9).
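To see the three forms of -f side by side, here is a minimal sketch that runs cut against a made-up, colon-delimited line. The sample text and the variable names are illustrative assumptions, not part of the directory listing.

```shell
# Hypothetical sample line with nine colon-separated fields.
line='a:b:c:d:e:f:g:h:i'

# A single field.
single=$(printf '%s\n' "$line" | cut -d: -f3)
# A contiguous range of fields.
range=$(printf '%s\n' "$line" | cut -d: -f3-5)
# A mix of single fields and a range.
mixed=$(printf '%s\n' "$line" | cut -d: -f1,3-5,9)

# Prints: c, then c:d:e, then a:c:d:e:i (one per line).
printf '%s\n' "$single" "$range" "$mixed"
```

Note that the selected fields come out joined by the input delimiter, and always in the order they appear in the line, regardless of the order given to -f.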
Unfortunately, it turns out that I lied a little. Instead of the above output, running this command actually returned this:
# ls -l | cut -d' ' -f5,9
13
4595402 jira.log
21:49
5
The problem is that there is sometimes more than one space between the group name and file size, and between the month and the day. cut (like String.split() in Java and JavaScript) treats every individual delimiter character as the start of a new field, so two adjacent spaces produce an empty field between them and push everything after it one column to the right. If only there were a way to compress the spaces…
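The shifted columns are easy to reproduce on a short string. The sketch below assumes a hypothetical fragment with two spaces before the size; the variable names are illustrative.

```shell
# Hypothetical sample: two spaces between "root" and "70154".
line='root root  70154 Jan'

# Field 3 is the empty string between the two adjacent spaces.
third=$(printf '%s\n' "$line" | cut -d' ' -f3)
# The size has been pushed out to field 4.
fourth=$(printf '%s\n' "$line" | cut -d' ' -f4)

# Prints "field 3: []" and "field 4: [70154]".
printf 'field 3: [%s]\nfield 4: [%s]\n' "$third" "$fourth"
```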
Enter tr. tr gives a very simple way to translate or delete characters in a string. Its -s (squeeze) option collapses each run of a repeated character into a single occurrence, so to compress the spaces, you simply pipe ls’s output through tr -s ' '.
# ls -l | tr -s ' '
total 5392
-rw-r--r--. 1 root root 480112 Jul 15 2015 greenhopper.log
-rw-r--r--. 1 root root 70154 Jan 13 12:07 jira-incoming-mail.log
-rw-r--r--. 1 root root 4595402 Jan 14 09:58 jira.log
-rw-r--r--. 1 root root 152594 Aug 15 21:49 jira-outgoing-mail.log
-rw-r--r--. 1 root root 196573 Jan 5 04:54 jira-security.log
At this point, piping the result through cut -d' ' -f5,9 will yield the size-and-name output shown earlier.
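Put together, the whole pipeline can be tried without a real directory. This is a minimal sketch that assumes two sample lines modeled on the listing, with the kind of column padding ls adds; the variable names are illustrative.

```shell
# Two sample lines modeled on the listing, including ls-style padding.
listing='-rw-r--r--. 1 root root   70154 Jan 13 12:07 jira-incoming-mail.log
-rw-r--r--. 1 root root  196573 Jan  5 04:54 jira-security.log'

# Squeeze runs of spaces first, then select the size and name fields.
result=$(printf '%s\n' "$listing" | tr -s ' ' | cut -d' ' -f5,9)

# Prints "70154 jira-incoming-mail.log" and "196573 jira-security.log".
printf '%s\n' "$result"
```

One caveat with this approach: a file name that itself contains spaces would be split across several fields, so -f9 would only capture its first word.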
tr can also be used to delete characters. Let’s say you have an access log, in which each line starts with a timestamp surrounded by brackets:
# cat access.log
[14/Jan/2016:06:47:56 +0000] - - "GET ...
[14/Jan/2016:06:48:00 +0000] - - "GET ...
[14/Jan/2016:06:48:34 +0000] - - "GET ...
[14/Jan/2016:06:48:36 +0000] - - "GET ...
...
If we wanted to get a list of just the dates and times, we could use cut like so:
# cat access.log | cut -d' ' -f1
[14/Jan/2016:06:47:56
[14/Jan/2016:06:48:00
[14/Jan/2016:06:48:34
[14/Jan/2016:06:48:36
...
I’d like to get rid of that first bracket. I suppose I could use cut again in this specific situation, with the bracket as the delimiter, but it’s more elegant to just delete it using tr with the -d parameter:
# cat access.log | cut -d' ' -f1 | tr -d '['
14/Jan/2016:06:47:56
14/Jan/2016:06:48:00
14/Jan/2016:06:48:34
14/Jan/2016:06:48:36
...
Or, suppose I just wanted the date, without the slashes:
# cat access.log | cut -d: -f1 | tr -d '[' | tr -d /
14Jan2016
14Jan2016
14Jan2016
14Jan2016
...
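Because tr -d accepts a set of characters rather than a single one, the two deletions can also be done in one pass. A minimal sketch, assuming a sample line modeled on the access log (the variable names are illustrative):

```shell
# Sample line modeled on the access log.
entry='[14/Jan/2016:06:47:56 +0000] - - "GET ...'

# Take everything before the first colon, then delete both '[' and '/' at once.
# Quoting the set keeps the shell from treating '[' as the start of a glob pattern.
date_only=$(printf '%s\n' "$entry" | cut -d: -f1 | tr -d '[/')

# Prints "14Jan2016".
printf '%s\n' "$date_only"
```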
Lastly, you can use tr to map from one set of characters to another. If, for instance, you wanted to change everything to uppercase, you could map a-z to A-Z by passing the two sets as arguments:
# cat access.log | cut -d: -f1 | tr -d '[' | tr a-z A-Z
14/JAN/2016
14/JAN/2016
14/JAN/2016
14/JAN/2016
...
Or, if you wanted to convert these dates into a CSV, just change newlines into commas:
# head -n 4 access.log | cut -d: -f1 | tr -d '[' | tr '\n' ','
14/Jan/2016,14/Jan/2016,14/Jan/2016,14/Jan/2016,
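The same two mappings can be tried on inline sample data. The sketch below uses the POSIX character classes [:lower:] and [:upper:] in place of a-z and A-Z, which is an alternative spelling rather than what the commands above use; the sample dates and variable names are made up for illustration.

```shell
# Hypothetical sample input: three dates, one per line.
dates='14/Jan/2016
15/Jan/2016
16/Jan/2016'

# Map every lowercase letter to its uppercase counterpart.
upper=$(printf '%s\n' "$dates" | tr '[:lower:]' '[:upper:]')

# Turn every newline into a comma, including the final one,
# which is why the result ends with a trailing comma.
csv=$(printf '%s\n' "$dates" | tr '\n' ',')

printf '%s\n' "$upper" "$csv"
```

If the trailing comma is unwanted, it has to be stripped in a later step, since tr maps every newline it sees.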
cut and tr aren’t flashy or complicated (like find, sed, or awk), but once you start using them, they quickly become indispensable tools for transforming the output of other tools into more useful forms.