Programmable Text Processing with awk
The awk utility scans one or more files and performs an action on all of the lines that match a particular condition. The actions and conditions are described by an awk program and range from the very simple to the complex. awk got its name from the combined first letters of its authors' surnames: Aho, Weinberger, and Kernighan.
It borrows its control structures and expression syntax from the language C.
awk programs are easy to develop incrementally: start with a few lines and keep adding until the program does what you want.
An awk program is a sequence of statements of the form
    pattern { action }
The action is performed on every line that matches the pattern (or condition, in other words). If the pattern is omitted, the action is performed on every line. If the action is omitted, all matching lines are simply sent to standard output. Because both patterns and actions are optional, actions must be enclosed in braces to distinguish them from patterns. The statements in an awk program may be indented and formatted using spaces, tabs, and newlines.
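The three forms described above (pattern only, action only, pattern with action) can be sketched as follows. The sample data and the helper name sample are hypothetical, chosen only for illustration; the records hold a name, an hourly rate, and hours worked.

```shell
# Hypothetical sample input: name, hourly rate, hours worked.
sample() {
  printf '%s\n' 'Beth 4.00 0' 'Kathy 4.00 10' 'Mark 5.00 20'
}

# Pattern only: every matching line is sent to standard output unchanged.
sample | awk '$3 > 0'

# Action only: the action runs on every input line.
sample | awk '{ print $1 }'

# Pattern and action: the action runs only on the matching lines.
sample | awk '$3 > 0 { print $1, $2 * $3 }'
```

With this input, the first command prints the Kathy and Mark lines, the second prints all three names, and the third prints each working employee's name followed by rate times hours.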
Prof. Andrzej (AJ) Bieszczad Email: andrzej@csun.edu Phone: 818-677-4954 4
Special awk Patterns: BEGIN, END
BEGIN and END provide a way to gain control before and after processing, for initialization and wrap-up.
BEGIN: actions are performed before the first input line is read.
END: actions are performed after the last input line has been processed.
    BEGIN { print "List of html files:" }
    /\.html$/ { print }
    END { print "There you go!" }
An action may include arithmetic and string expressions, assignments, and output to multiple streams.
NR - number of records processed so far
NF - number of fields in the current record
FILENAME - name of the current input file
FS - field separator, space or TAB by default
OFS - output field separator, space by default
ARGC/ARGV - argument count and argument value array, used to get arguments from the command line
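A minimal sketch of these built-in variables in use. The colon-separated sample data, the helper name sample, and the arguments alpha and beta are hypothetical.

```shell
# Hypothetical colon-separated input, in the style of /etc/passwd.
sample() {
  printf '%s\n' 'root:x:0:0' 'daemon:x:1:1:extra'
}

# FS and OFS are set in BEGIN, before the first record is read.
# Each output line shows the record number, the field count,
# the first field, and the last field.
sample | awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1, $NF }'

# ARGC/ARGV: list the command-line arguments. A program with only a
# BEGIN action never reads input, so alpha and beta need not exist as files.
awk 'BEGIN { for (i = 1; i < ARGC; i++) print i, ARGV[i] }' alpha beta
```

The loop starts at 1 because ARGV[0] holds the name of the awk command itself, which varies between installations.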
The field separator could be any other regular expression.
Special variable RS: record separator (newline by default); it can be changed in a BEGIN action.
Special variable NR holds the number of the current record.
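As an illustration of changing the separators in a BEGIN action, here is a small sketch. The input strings are made up for the example.

```shell
# Records separated by semicolons instead of newlines.
# NR counts these records, and $2 is the second field of each one.
printf 'a b;c d;e f' | awk 'BEGIN { RS = ";" } { print NR ": " $2 }'

# A field separator given as a regular expression: comma or semicolon.
printf 'x,y;z\n' | awk 'BEGIN { FS = "[,;]" } { print NF, $3 }'
```

The first command prints three numbered lines (b, d, f); the second reports three fields and prints z.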
$0 is the entire line
$1 is the first field, $2 is the second field, ..., $NF is the last field
Only fields begin with $; variables are unadorned
awk: Simple Output from awk
Printing Every Line
If an action has no pattern, the action is performed on all input lines
{ print }
will print all input lines to standard output
{ print $0 }
will do the same thing
{ print $1, $3 }
expressions separated by a comma are, by default, separated by a single space when output
{ print $(NF-2) }
prints the third-to-last field
{ print $1, $2 * $3 }
{ print NR, $0 }
will print each line prefixed with its line number
printf(format, val1, val2, val3, ...)
{ printf("total pay for %s is $%.2f\n", $1, $2 * $3) }
when using printf, formatting is under your control so no automatic spaces or newlines are provided by awk. You have to insert them yourself.
awk Operators:
=  assignment operator; sets a variable equal to a value or string
==  equality operator; returns TRUE if both sides are equal
!=  inequality operator
&&  logical AND
||  logical OR
!  logical NOT
<, >, <=, >=  relational operators
+, -, /, *, %, ^  arithmetic operators
String concatenation is written by placing expressions next to each other, with no explicit operator
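A short sketch that exercises several of these operators. The sample data and the helper name sample are hypothetical (name, hourly rate, hours worked).

```shell
# Hypothetical sample input: name, hourly rate, hours worked.
sample() {
  printf '%s\n' 'Beth 4.00 0' 'Kathy 4.00 10' 'Mark 5.00 20' 'Susie 4.25 18'
}

# Relational and logical operators in a pattern:
# select anyone paid at least 4.25 or with zero hours.
sample | awk '$2 >= 4.25 || $3 == 0 { print $1 }'

# Assignment and string concatenation (adjacent expressions).
sample | awk '{ tag = $1 "-" NR; print tag }'

# Arithmetic: remainder, exponentiation, division.
awk 'BEGIN { print 7 % 3, 2 ^ 10, 7 / 2 }'    # prints: 1 1024 3.5
```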
awk: Arithmetic and Variables Examples
Counting is easy to do with awk
    $3 > 15 { emp = emp + 1 }    # work hours are in the third field
    END { print emp, "employees worked more than 15 hrs" }
Computing sums and averages is also simple
    { pay = pay + $2 * $3 }
    END { print NR, "employees"
          print "total pay is", pay
          print "average pay is", pay/NR }
    { names = names $1 " " }
    END { print names }
Printing the Last Input Line
Although NR retains its value after the last input line has been read, $0 is not guaranteed to in every awk implementation, so the line should be saved in a variable as it is read.
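One way to print the last input line is to save each line in a variable as it is read and print that variable in the END action. This is a sketch; the variable name last and the three input lines are illustrative.

```shell
# Each record overwrites "last"; after the final record, END prints it
# together with NR, which still holds the total number of records.
printf '%s\n' 'first' 'second' 'third' |
  awk '{ last = $0 } END { print NR ": " last }'    # prints: 3: third
```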
String
length, substitution, find substrings, split strings
Output
print, printf, print and printf to file
Special
system - executes a Unix command
e.g., system("clear") to clear the screen. Note the double quotes around the Unix command.
exit - stop reading input and go immediately to the END pattern-action pair if it exists, otherwise exit the script
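The string functions and exit can be tried out with a sketch like the following. The helper name summarize, the variable names, and the input lines are hypothetical.

```shell
# Hypothetical helper wrapping a small awk program.
summarize() {
  awk '
    $1 == "stop" { exit }              # stop reading input; END still runs
    {
      n = split($0, parts, " ")        # n is the number of pieces
      # line length, position of the first "a", first 3 chars of field 2
      print length($0), index($0, "a"), substr($2, 1, 3), n
      count++
    }
    END { print count, "lines processed" }'
}

printf '%s\n' 'alpha beta' 'gamma delta' 'stop here' 'never seen' | summarize

# Substitution: sub replaces the first match, gsub replaces every match.
printf 'alpha beta\n' | awk '{ sub(/a/, "A"); gsub(/b/, "B"); print }'
```

The first command reports on two lines and then stops at the third, so the fourth line is never processed; the second prints Alpha Beta.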
Example
    # reverse - print input in reverse order by line
    { line[NR] = $0 }    # remember each line
    END { for (i = NR; i > 0; i = i - 1) { print line[i] } }
$ cat test          --> look at the input file.
1.1 a
2.2 at
3.3 eat
4.4 beat
$ cat awk8          --> look at the awk script.
{
  printf "$1 = %g ", $1
  printf "exp = %.2g ", exp($1);
  printf "log = %.2g ", log($1);
  printf "sqrt = %.2g ", sqrt($1);
  printf "int = %d ", int($1);
  printf "substr(%s,1,2) = %s\n", $2, substr($2,1,2);
}
$ awk -f awk8 test  --> execute the script.
$1 = 1.1 exp = 3 log = 0.095 sqrt = 1 int = 1 substr(a,1,2) = a
$1 = 2.2 exp = 9 log = 0.79 sqrt = 1.5 int = 2 substr(at,1,2) = at
$1 = 3.3 exp = 27 log = 1.2 sqrt = 1.8 int = 3 substr(eat,1,2) = ea
$1 = 4.4 exp = 81 log = 1.5 sqrt = 2.1 int = 4 substr(beat,1,2) = be
$ _
awk challenge