AWK Programming
Introduction
“Computer users spend a lot of time doing simple, mechanical data
manipulation - changing the format of data, checking its validity, finding items with
some property, adding up numbers, printing reports, and the like. All of these jobs
ought to be mechanized, but it’s a real nuisance to have to write a special-purpose
program in a standard language like C or Pascal each time such a task comes up.
“Awk is a programming language that makes it possible to handle simple,
mechanical data manipulation tasks with very short programs, often only one or two
lines long. An awk program is a sequence of patterns and actions that tell what to
look for in the input data and what to do when it’s found.”
Aho, Kernighan and Weinberger. 1988. “The AWK Programming Language”
Lining Up Fields
printf statement form
printf(format, value1, value2, …, valuen)
where format is a string that contains text to be printed verbatim, interspersed with specifications of
how each of the values is to be printed.
A specification is a % followed by a few characters that control the format of a value.
Task: Use printf to print the total pay for every employee.
awk '{ printf("total pay for %s is $%.2f\n", $1, $2 * $3) }' emp.data
Animal Science 562
Task: Print each employee’s name and pay, lined up in columns.
awk '{ printf("%-8s $%6.2f\n", $1, $2 * $3) }' emp.data
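To see what the width and alignment specifications do, the following sketch runs the command above on a small sample file. The six rows (name, hourly rate, hours worked) are an assumption modeled on the emp.data file this handout refers to; the handout itself does not list the file's contents. Note that the sketch creates emp.data in the current directory.

```shell
# Sample data (assumed layout: name, hourly rate, hours worked).
cat > emp.data <<'EOF'
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
EOF

# %-8s  : name, left-justified in a field 8 characters wide
# %6.2f : pay, right-justified in 6 characters with 2 decimal places
out=$(awk '{ printf("%-8s $%6.2f\n", $1, $2 * $3) }' emp.data)
printf '%s\n' "$out"
```

With these rows, the names fill a fixed-width column and the pay amounts line up on the decimal point, for example "Mark     $100.00" and "Susie    $ 76.50".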
Selection by Comparison
Task: A comparison pattern to select the records of employees who earn $5.00 or more per hour.
awk '$2 >= 5' emp.data
Selection by Computation
Task: Print the pay of those employees whose total pay exceeds $50.
awk '$2 * $3 > 50 { printf("$%.2f for %s\n", $2 * $3, $1) }' emp.data
Combinations of Patterns
• Patterns can be combined with parentheses and the logical operators &&, ||, and !, which stand
for AND, OR, and NOT, respectively.
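A minimal sketch of the three operators, assuming the same kind of sample rows (name, hourly rate, hours worked) as elsewhere in this handout; the rows and the thresholds of $4 per hour and 20 hours are illustrative assumptions, not values taken from the course data.

```shell
# Sample data (assumed layout: name, hourly rate, hours worked).
cat > emp.data <<'EOF'
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
EOF

# AND: rate is at least $4 and hours are at least 20
both=$(awk '$2 >= 4 && $3 >= 20 { print $1 }' emp.data)

# OR: rate is at least $4 or hours are at least 20
either=$(awk '$2 >= 4 || $3 >= 20 { print $1 }' emp.data)

# NOT: negate the whole OR condition (parentheses group it)
neither=$(awk '!($2 >= 4 || $3 >= 20) { print $1 }' emp.data)

printf 'AND:\n%s\nOR:\n%s\nNOT:\n%s\n' "$both" "$either" "$neither"
```

With these rows the AND pattern selects Mark and Mary, the OR pattern selects everyone except Dan, and the NOT pattern selects only Dan.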
Data Validation
• Awk is an excellent tool for checking that data has reasonable values and that it is in the right
format.
Task: Use comparison patterns to apply five plausibility tests to each line of emp.data.
awk 'NF != 3 { print $0, "number of fields is not equal to 3" }' emp.data
awk '$2 < 3.35 { print $0, "rate is below minimum wage" }' emp.data
awk '$2 > 10 { print $0, "rate exceeds $10 per hour" }' emp.data
awk '$3 < 0 { print $0, "negative hours worked" }' emp.data
awk '$3 > 60 { print $0, "too many hours worked" }' emp.data
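The five tests can also be combined into a single awk program, so the file is read once and each line is checked against every pattern. The sketch below assumes a deliberately flawed file, here called bad.data (a hypothetical name with made-up rows), so that each test has something to report.

```shell
# Hypothetical test file: every row except the first breaks one rule.
cat > bad.data <<'EOF'
Beth 4.00 0
Dan 3.00 10
Kathy 12.50 10
Mark 5.00 20 extra
Mary 5.50 -4
Susie 4.25 75
EOF

# One program, five pattern-action statements; each input line is
# tested against all five patterns in order.
out=$(awk '
NF != 3   { print $0, "number of fields is not equal to 3" }
$2 < 3.35 { print $0, "rate is below minimum wage" }
$2 > 10   { print $0, "rate exceeds $10 per hour" }
$3 < 0    { print $0, "negative hours worked" }
$3 > 60   { print $0, "too many hours worked" }
' bad.data)
printf '%s\n' "$out"
```

Each offending line is printed followed by its message, and the valid first line produces no output. If nothing is printed at all, the data passed every test.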
Task: Use BEGIN to print a heading. (Note: This is a multi-line program and must be run from
a file.)
BEGIN { print "Name Rate Hours"; print "" }
{ print }
• You can put several statements on a single line if you separate them by semicolons.
Handling Text
• One strength of awk is its ability to handle strings of characters as conveniently as most
languages handle numbers.
Task: Find the employee who is paid the most per hour.
$2 > maxrate { maxrate = $2; maxemp = $1 }
END { print "highest hourly rate:", maxrate, "for", maxemp }
String Concatenation
Task: Create new strings by combining old ones
{ names = names $1 " " }
END { print names }
Built-in Functions
• Awk provides built-in variables that maintain frequently used quantities, such as the number of
fields (NF) and the input line number (NR).
• Awk also provides built-in functions for computing:
square root (sqrt)
logarithms (log)
random numbers (rand)
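A short sketch of the built-in variables NR and NF together with the functions sqrt and log (the natural logarithm). The two input lines are made-up values chosen for illustration; rand is left out because its result differs from run to run.

```shell
# Two illustrative input lines, fed through a pipe.
out=$(printf '16 2\n100 3 7\n' | awk '{
    # NR is the number of the current input line, NF its field count
    printf("line %d has %d fields; sqrt(%d) = %.2f; log(%d) = %.2f\n",
           NR, NF, $1, sqrt($1), $1, log($1))
}')
printf '%s\n' "$out"
```

The first line of output reads "line 1 has 2 fields; sqrt(16) = 4.00; log(16) = 2.77".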
Control-Flow Statements (Note: These constructs are available in gawk but not in awk at
ISU.)
If-Else Statement
$2 > 6 { n = n + 1; pay = pay + $2 * $3 }
END { if (n > 0)
print n, "employees, total pay is", pay,
"average pay is", pay/n
else
print "no employees are paid more than $6/hour"
}
While Statement
Task: Show how the value of an amount of money invested at a particular interest rate grows over
a number of years, using the formula
value = amount (1 + rate)^years
# interest1 - compute compound interest
# input: amount rate years
# output: compounded value at the end of each year
{ i = 1
while (i <= $3) {
printf("\t%.2f\n", $1 * (1 + $2) ^ i)
i = i + 1
}
}
Try it by running the program and typing input lines (amount, rate, years):
gawk -f interest1
1000 .06 5
1000 .12 5
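As a worked check of the while loop, the sketch below saves the program as interest1 and feeds it the two input lines through a pipe instead of the keyboard. It calls awk, on the assumption that the installed awk supports while; substitute gawk where it does not.

```shell
# Save the compound-interest program as the file interest1.
cat > interest1 <<'EOF'
# interest1 - compute compound interest
#   input: amount rate years
#   output: compounded value at the end of each year
{   i = 1
    while (i <= $3) {
        printf("\t%.2f\n", $1 * (1 + $2) ^ i)
        i = i + 1
    }
}
EOF

# Each input line produces one output line per year.
out=$(printf '1000 .06 5\n1000 .12 5\n' | awk -f interest1)
printf '%s\n' "$out"
```

For the first input line the yearly values are 1060.00, 1123.60, 1191.02, 1262.48, and 1338.23. For the second they are 1120.00, 1254.40, 1404.93, 1573.52, and 1762.34. Each value is indented by a tab.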
References
Aho, A. V., B. W. Kernighan, and P. J. Weinberger. 1988. The AWK Programming Language.
Addison-Wesley. New York.
Dougherty, D. 1990. Sed and Awk: UNIX Power Tools. O’Reilly and Associates, Inc. California.