If you only use awk when you need to select specific fields from lines of text, you might be missing out on a lot of other services that the command can provide. In this post, we’ll look at this simple use along with many other things that awk can do for you with enough examples to show you that the command is a lot more flexible than you might have imagined.
Plucking out columns of data
The easiest and most commonly used service that awk provides is selecting specific fields from files or from data that is piped to it. With the default of using white space as a field separator, this is very simple:
$ echo one two three four five | awk ‘{print $4}’
four $ who | awk '{print $1}' jdoe fhenry
White space is any sequence of blanks and tabs. In the commands shown above, awk is extracting just the fourth and first fields from the data provided.
[Get regularly scheduled insights by signing up for Network World newsletters.]
Awk can also pull text from files by just adding the name of the file after the awk command.
$ awk '{print $1,$5,$NF}' HelenKellerQuote The beautiful heart.
In this case, awk has picked out the first, fifth, and last words in the single line of text.
The $NF specification in the command picks the last piece of text on each line. That is because NF represents the number of fields in a line (23) while $NF represents the value of that last field (“heart.”). The period is included in the output because it’s part of the final string.
Fields can be printed in any order that you might find useful. In this example, we are rearranging the fields in date command output.
$ date | awk '{print $4,$3,$2}' 2019 Nov 22
If you omit the commas between the field designators in an awk command, the output will be pushed into a single string.
$ date | awk '{print $4 $3 $2}' 2021Aug2
If you replace the usual commas with hyphens, awk will attempt to subtract one field from another–probably not what you intended. It doesn’t take the hyphens as characters to be inserted into the print output. Instead, it puts some of its mathematical prowess into play.
$ date | awk '{print $4-$3-$2}' 2019
In this case, it’s subtracting 2 (the day of the month) from the year (2021) and simply ignoring “Aug”.
If you want your output to be separated by something other than white space, you can specify your output separator with OFS (output field separator) like this:
$ date | awk '{OFS="-"; print $4,$3,$2}' 2021-Aug-02
If your date command output looks like what you see below, you need to change the fields to $7,$2,$3 in the commands shown above since the year is listed last.
$ date Mon Aug 2 02:26:34 PM EDT 2021
Printing simple text
You can also use awk to simply display some text. Of course, if all you want to do is print a line of text, you’d be better off using an echo command. On the other hand, as part of an awk script, printing some relevant text can be very useful. Here’s a practically useless example:
$ awk 'BEGIN {print "Hello, World" }' Hello, World
Here’s a more sensible example in which adding a line of text to label your data can help identify what you’re looking at:
$ who | awk 'BEGIN {print "Current logins:"} {print $1}' Current logins: shs nemo
Specifying a field separator
Not all input is going to be separated by white space. If your text is separated by some other character (e.g., commas, colons or semicolons), you can inform awk by using the -F (input separator) option as shown here:
$ cat testfile a:b:c,d:e $ awk -F : '{print $2,$3}' testfile b c,d
Here’s a more useful example – pulling a field from the colon-separated /etc/passwd file:
$ awk -F: '{print $1}' /etc/passwd | head -11 root bin daemon adm lp sync shutdown halt mail operator games
Evaluating content
You can also evaluate fields using awk. If you, for example, want to list only user accounts in /etc/passwd, you can include a test for the 3rd field. Here we’re only going after UIDs that are 1000 and above:
$ awk -F":" ' $3 >= 1000 ' /etc/passwd nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin shs:x:1000:1000:Sandra Henry-Stocker,,,:/home/shs:/bin/bash nemo:x:1001:1001:Nemo,,,:/home/nemo:/usr/bin/zsh dory:x:1002:1002:Dory,,,:/home/dory:/bin/bash ...
If you want to add a title for your listing, you can add a BEGIN clause:
$ awk -F":" 'BEGIN {print "user accounts:"} $3 >= 1000 ' /etc/passwd user accounts: nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin shs:x:1000:1000:Sandra Henry-Stocker,,,:/home/shs:/bin/bash nemo:x:1001:1001:Nemo,,,:/home/nemo:/usr/bin/zsh dory:x:1002:1002:Dory,,,:/home/dory:/bin/bash
If you want more than one line in your title, you can separate your intended output lines with “n” (newline characters).
$ awk -F":" 'BEGIN {print "user accountsn============="} $3 >= 1000 ' / etc/passwd user accounts ============= nobody:x:65534:65534:Kernel Overflow User:/:/sbin/nologin shs:x:1000:1000:Sandra Henry-Stocker:/home/shs:/bin/bash nemo:x:1001:1001::/home/nemo:/bin/bash bugfarm:x:1006:1006::/home/bugfarm:/bin/bash dbell:x:1002:1002::/home/dbell:/bin/bash dorothy:x:1008:1008::/home/dorothy:/bin/bash eel:x:1005:1005::/home/eel:/bin/bash gijoe:x:1017:1017::/home/gijoe:/bin/bash gino:x:1016:1016::/home/gino:/bin/bash lola:x:1007:1007::/home/lola:/bin/bash myself:x:1013:1013::/home/myself:/bin/bash newuser:x:1014:1014::/home/newuser:/bin/bash shark:x:1003:1003::/home/shark:/bin/bash tadpole:x:1004:1004::/home/tadpole:/bin/bash jadep:x:1012:1012::/home/jadep:/bin/bash snakey:x:1018:1018::/home/snakey:/bin/bash monsterfromthedeepbluesea:x:1019:1019::/home/monsterfromthedeepbluesea:/bin/bash
Doing math with awk
awk provides a surprising mathematical ability and can calculate square roots, logs, tangents, etc.
Here are a couple examples:
$ awk 'BEGIN {print sqrt(2021)}'
44.9555 $ awk 'BEGIN {print log(2019)}' 7.61036
For more details on awk‘s mathematical skills, check out Doing math with awk.
Writing awk scripts
You can also write standalone scripts with awk. Here’s an example that mimics one of the examples provided earlier, but also counts the number of users with accounts on the system.
#!/usr/bin/awk -f # This line is a comment BEGIN { printf "%sn","User accounts:" print "==============" FS=":" n=0 } # Now we'll run through the data { if ($3 >= 1000) { print $1 n ++ } } END {
print "==============" print n " accounts" }
Notice how the BEGIN section, which is run only when the script starts, provides a heading, dictates the field separator, and sets up a counter to start with 0. The script also includes an END section which only runs after all the lines in the text provided to the script have been processed. It displays the final count of lines that meet the specification in the middle section (third field is 1,000 or larger).
Use the script like this:
$ ./list_users /etc/passwd User accounts: ============== nobody shs nemo bugfarm dbell dorothy eel gijoe gino lola myself newuser shark tadpole jadep snakey monsterfromthedeepbluesea ============== 17 accounts
Searching for text in a file
You can use awk to select lines from a file that contain a specified word or string. The examples below illustrate both with an allowance for a lowercase or uppercase “B” in the third.
$ awk '/harmony/ {print}' Happy_Quotes "Happiness is when what you think, what you say, and what you do are in harmony." —Mahatma Gandhi $ awk '/present moment/ {print}' Happy_Quotes
"The present moment is filled with joy and happiness. If you are attentive, you will see it." —Thich Nhat Hanh
$ awk '/[Bb]e happy/ {print}' Happy_Quotes "Be happy for this moment. This moment is your life." — Omar Khayyam
Selecting a portion of a file
To select a section of text from a file, you need to include text from the start and end lines that identify the portion to be extracted. Here’s an example:
$ awk /LAST/,/END/ Happy_Quotes LAST QUOTE "Action may not always bring happiness, but there is no happiness without action." —William James THE END
Replacing text in a file
To change text that’s in a file using awk, you can use syntax like that shown below. Just keep in mind that this changes the text that you see, but does not modify the file contents. To save the changes, redirect the output to a temporary file and then use it to replace the original.
$ awk '{gsub(/joy/,"fear"); print}' test11 "The present moment is filled with fear and happiness. If you are attentive, you will see it." —Thich Nhat Hanh
To replace multiple strings, use more than one gsub command:
$ awk '{gsub(/joy/,"fear"); gsub(/happiness/,"misery"); print}' test11 "The present moment is filled with fear and misery. If you are attentive, you will see it." —Thich Nhat Hanh
Counting lines or words in a file
To print the number of lines in a file using awk, do this:
$ awk 'END {print NR}' Happy_Quotes 12
The inclusion of END in the command means output is provided after lines have been processed. NR (number of records) represents the number of lines in the file.
To print the number of words (or strings) on each line of a file, you can use a command like this:
$ awk '{print NF}' Happy_Quotes 18 15 12 16 11 12 17 18 20 2 15 2
You can use a script like that shown below to count the words and provide just the total.
$ cat count_words #!/usr/bin/awk -f BEGIN { num=0 } { print NF num=num + NF } END {print num}
This script runs through the target file a line at a time and adds the word count for each line to the total. This works because NF represents the number of fields in each line.
Alternately, you can use awk commands like these to get the overall and the per-line plus overall counts:
$ awk 'BEGIN {num=0}; {num=num+NF}; END {print num}' Happy_Quotes 158 $ awk 'BEGIN {num=0}; {print NF;num=num+NF}; END {print num}' Happy_Quotes 18 15 12 16 11 12 17 18 20 2 15 2 158
$ count_words Happy_Quotes 158
Determining the most heavily used commands
You can also use awk along with a number of other commands to view which commands you have used most frequently within the life span of your current history file.
$ history | awk '{print $2}' | sort | uniq -c | sort -nr | head -5 191 sudo 91 ls 80 vi 57 systemctl 51 man
If you have added date and time fields to your history file, use $4 instead of $2 in the command above.
Wrap-Up
A long-standing Unix command, awk still provides very useful services and remains one of the reasons that I fell in love with Unix many decades ago. While some of awk‘s functions (like counting lines) can be more easily performed by other commands (like wc and wc -l), it’s still useful to know what awk can do, especially if you get into writing awk scripts and need to use many of its functions.
Copyright © 2021 IDG Communications, Inc.