These notes are intended for use by students in CS1621 at the University of Pittsburgh and no one else. These notes are provided free of charge and may not be sold in any shape or form. Material from these notes is obtained from various sources, including, but not limited to, the textbooks:
Concepts of Programming Languages, Seventh Edition, by Robert W. Sebesta (Addison Wesley)
Programming Languages, Design and Implementation, Fourth Edition, by Terrence W. Pratt and Marvin V. Zelkowitz (Prentice Hall)
Compilers: Principles, Techniques, and Tools, by Aho, Sethi and Ullman (Addison Wesley)
Expressions
Things to consider:
Precedence and associativity
Order of operand evaluation
Side-effects of evaluation
Overloadings and coercions
Most languages have similar precedence for the standard operators: * and / first, then + and -
But the programmer needs to understand precedence and associativity for all operators, especially those that may be unusual
In Pascal, the boolean operators have higher precedence than the relational operators (opposite of C++)
if x < y then writeln('Less');
if x < y and y < z then writeln('Middle');
The second statement is an error in Pascal, since the first subexpression evaluated would be y and y
In C++
if (x < y && y < z) cout << "Middle" << endl;
This is fine in C++
Output? See plusplus.cpp and try it on different platforms
http://www.cppreference.com/operator_precedence.html
See problem in Assignment 3
Compare to plusplus.java and plusplus.pl
In some cases, expression is ambiguous and compiler will not let you do it, or warn you about it
Ex: A ** B ** C in Ada
Must have parentheses
Sometimes you could probably figure it out, but you're better off not trying
Ex: If more than one coercion can occur in C++
May have defined constructor and conversion fn
Sometimes you don't think you should care about precedence and associativity, but you should
In math, addition and multiplication are associative and commutative. On a computer, overflow can cause this to not always be the case:
floats A = 1e+30, B = 1.0/1e+30, C = 1e+30
A*B*C ~= 1e+30
A*C*B = infinity
See Overflow.cpp
F1.add(F2); F2.add(F1) -- If F1 and F2 are from different classes, the operations may be different or perhaps not even legal
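The float example can be tried with a small sketch like the one below. It is not the course's Overflow.cpp; the helper name mulLeftToRight is made up here, and the behavior described assumes IEEE 754 single-precision floats.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: multiplies three floats strictly left to right.
// The volatile intermediates force each product to be stored as a
// 32-bit float, so an overflow in the first product is not hidden.
float mulLeftToRight(float a, float b, float c) {
    volatile float first = a * b;
    volatile float result = first * c;
    return result;
}
```

With A = 1e30f, B = 1.0f / 1e30f and C = 1e30f, mulLeftToRight(A, B, C) stays finite, while mulLeftToRight(A, C, B) overflows in the first product and stays at infinity.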
Without side-effects, the results are the same, but if f(X) changes the value of X, the results could be different
Most languages allow reference parameters with functions
These can cause logic errors if used improperly
See side.cpp
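A minimal sketch of the problem, assuming an expression of the form X + f(X) where f changes X through a reference parameter. It is not the course's side.cpp. The two helper functions spell out the two possible operand orders explicitly, since C++ itself does not say which order a compiler uses for X + f(X).

```cpp
#include <cassert>

// Hypothetical function with a side effect: it doubles its reference
// parameter and returns the new value.
int f(int& x) {
    x = x * 2;
    return x;
}

// X + f(X) with the left operand fetched first.
int leftOperandFirst(int x) {
    int left = x;        // value of X before f changes it
    int right = f(x);
    return left + right;
}

// X + f(X) with the function call performed first.
int rightOperandFirst(int x) {
    int right = f(x);    // f changes X ...
    int left = x;        // ... so this reads the modified value
    return left + right;
}
```

Starting from X = 5, the first order gives 5 + 10 and the second gives 10 + 10.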
Best advice is to program in such a way as to either avoid all side-effects, or to only allow them in cases where they will not affect expression evaluation
Operator Overloading
Used in many newer high-level languages
is clearer than
if (A.compareTo(B) < 1)
Bad:
Can harm readability if used incorrectly
Ex: + defined to do multiplication
But methods could be improperly named as well
Function calls are not obvious, especially if other versions of the function exist
In C++ we could have a member function + and also a friend function +: which is used?
Some languages like C++ and Ada allow programmer-defined operator overloading
However, often the operators and functions used are defined for only a single type. In this case, to allow mixed expressions to be used, some types must be converted to other types
The differences in languages are whether these conversions should be IMPLICIT or EXPLICIT
Explicit conversion
In this case the language allows few or no mixed expressions in the code
To allow mixing of data types, the programmer must convert through an operation or function call
Ex: Ada does not even allow mixing of floats and integers
Good:
Everything is clear: no uncertainty or ambiguity
Programmer can more easily verify correctness of programs
Easier to avoid logic errors
Bad:
Makes language very wordy
Can be annoying, especially when the types are similar (ex. addition of integers and floats)
Implicit conversion (coercion)
Good:
Less wordy: makes programs shorter and sometimes easier to write
Bad:
Programs are harder to verify for correctness
It is not always clear which coercion is being done, especially when programmer-defined coercions are allowed
Can lead to logic errors in programs
Ex: In C++ expressions are always coerced if they can be
Standard rules of promotion for predefined types can be easily remembered
However, programmer can also define functions that will be used for coercion
Constructors for classes and conversion functions are both implicitly called if necessary
Now the rules are less clear and can lead to ambiguity and logic errors
Consider A = B + C where A, B and C are all of different types
Any/all of the following could exist:
+ operator with two type B arguments
+ operator with two type C arguments
Constructor for type B with argument type C
Constructor for type C with argument type B
Coercion function from C to B
Coercion function from B to C
Constructor for type A with argument type B
Constructor for type A with argument type C
How does programmer know which will be used? Should NOT assume any particular coercion will occur in this case
Here explicit coercion should be used to remove ambiguity
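One way to picture the advice is the sketch below. The types B and C are hypothetical stand-ins (B holds an int, C holds a double), and every conversion constructor is marked explicit so that a mixed expression such as b + c is rejected and the programmer must name the intended conversion.

```cpp
#include <cassert>

struct C;  // forward declaration

// Hypothetical type B: integer-like.
struct B {
    int v;
    explicit B(int x) : v(x) {}
    explicit B(const C& c);               // conversion C -> B, defined below
};

// Hypothetical type C: real-valued.
struct C {
    double v;
    explicit C(double x) : v(x) {}
    explicit C(const B& b) : v(b.v) {}    // conversion B -> C
};

B::B(const C& c) : v(static_cast<int>(c.v)) {}   // truncates the fraction

B operator+(const B& l, const B& r) { return B(l.v + r.v); }
C operator+(const C& l, const C& r) { return C(l.v + r.v); }

// The two explicit readings of "b + c" give different answers.
int    sumAsB(const B& b, const C& c) { return (b + B(c)).v; }
double sumAsC(const B& b, const C& c) { return (C(b) + c).v; }
```

For b = 2 and c = 1.75, converting to B first yields 3, while converting to C first yields 3.75, which is why guessing the coercion is risky.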
Boolean expressions
Expressions that evaluate to TRUE or FALSE
Short-Circuit Evaluation
Important note (that we may not have emphasized earlier):
Operator precedence and associativity are for OPERATORS, not OPERANDS
The operators simply indicate how the operands are combined/utilized, NOT the order in which they are accessed/determined
For example: A + B + C + D
We know we first add A and B, then add C, then add D
But the VALUES for A, B, C and D could be obtained in ANY ORDER
Done to optimize execution (ex. in parallel)
2) An operand may not even be valid if a previous operand evaluates in a certain way
Ex: if ((X != 0) && (Y/X < 1)) cout << "rational";
Considering the && operator, if the first operand evaluates to FALSE, evaluating the second operand causes a run-time error
Now if the compiler would try to do these in parallel it could cause problems
Solution is SHORT-CIRCUIT EVALUATION (SSE)
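A small sketch of the same test in C++, with hypothetical function names. The first version relies on && skipping its second operand; the second shows one way the guard could be written if the language gave no such guarantee.

```cpp
#include <cassert>

// Relies on short-circuit evaluation: y / x is evaluated only when x != 0.
bool ratioBelowOne(int x, int y) {
    return (x != 0) && (y / x < 1);
}

// The same test without relying on short-circuit evaluation:
// the division is guarded by an explicit nested if.
bool ratioBelowOneNested(int x, int y) {
    if (x != 0) {
        if (y / x < 1) {
            return true;
        }
    }
    return false;
}
```

Both versions return false for x = 0 without ever dividing.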
Ex:
Without SSE, how would we have to write this to prevent possible run-time error?
Do on board
Drawbacks of SSE?
Now computer must evaluate operands sequentially
Slows down program execution, especially in environments with multiple CPUs
C++ and Java use SSE for && and || but arbitrary evaluation for bitwise & and |
Assignment
Central to Imperative Languages
Variations
Some languages allow multiple targets
Concern also must be given for overloading the assignment operator (legal in C++ and Ada)
It is possible to cause it to behave differently from what is normally expected
Care has to be taken so that it works in all cases
If we want to use this assignment as with other assignments, we need to return the assigned result as the result of the assignment
In C++ this is typically a reference return value, so that we can cascade the operator effectively
A = (B = C);
(A = B) = C;
In the first statement, when the assignment B = C is finished, we need the rvalue of the result
In the second statement, when the assignment A = B is finished, we need the lvalue of the result
Reference allows both (even though the second seems silly to do)
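A minimal sketch of an overloaded assignment that returns a reference. The class Counter is hypothetical and exists only to show the return type.

```cpp
#include <cassert>

// Hypothetical class used only to illustrate the return type of operator=.
class Counter {
public:
    explicit Counter(int v = 0) : value(v) {}

    // Returning *this by reference makes the result usable both as an
    // rvalue, as in A = (B = C), and as an lvalue, as in (A = B) = C.
    Counter& operator=(const Counter& other) {
        if (this != &other) {
            value = other.value;
        }
        return *this;
    }

    int get() const { return value; }

private:
    int value;
};
```

With this definition, a = (b = c) leaves a and b equal to c, and (a = b) = c ends with a equal to c while b is unchanged.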
One issue that you may not normally consider: How is the rvalue evaluated?
For statically typed languages, there is usually no ambiguity: expression result type must match the type of the variable
But for dynamically typed languages, it is no longer clear
Ex: in Prolog A=5+3
Since A is not necessarily an integer, 5 + 3 could be taken as a string just as reasonably as it could be taken as an arithmetic expression
See assig.pl
Control Statements
Iteration
Repeat an action 0 or more times
Selection
One-way selection
if statement exists in virtually every imperative language
Idea here is that we either execute a statement or do not
In modern languages this is achieved using an if without the optional else
Two-way selection
Now we incorporate the else with the if
Typical syntax:
if <condition> <statement> else <statement>
Interesting issues:
1) Form of condition?
2) What kinds of statements are allowed?
3) Is nesting allowed and how is it interpreted?
1) Form of condition
Most languages require a boolean expression (true or false only)
C/C++ are exceptions: int values are allowed
2) Kinds of statements
Original FORTRAN and BASIC allowed only a single statement
This is not conducive to good programming techniques
Only way to have multiple statements is by using an unconditional branch, i.e. GO TO
3) Nesting
It logically follows that a statement within an if clause or else clause could be another if statement
Remember orthogonality
Multiple Selection
Idea is to choose from many possible options
However, in some situations, the options are based on different result values of a single expression:
Ex: Menu in which user chooses an option from 1 to 5; each option causes a different action
C, C++ and Java do not automatically break out after the selection has been executed
This is good and bad (as usual)
Adds flexibility
If the execution for one selection is a superset of another, it makes sense to allow the flow to continue within the selection statement
C, C++, Java, Ada, Turbo Pascal, BASIC also provide a default choice
Good idea to always use so you can detect an out of range value without causing a runtime or logic error
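The fall-through and default points can be illustrated with a sketch like this one. The menu options and the log strings are invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical menu handler: options 1 to 3 are valid.
// Option 1 does everything option 2 does plus one extra step,
// so it deliberately continues into case 2.
std::string handleOption(int option) {
    std::string log;
    switch (option) {
        case 1:
            log += "backup;";
            // fall through
        case 2:
            log += "save;";
            break;
        case 3:
            log += "quit;";
            break;
        default:
            log += "invalid;";   // catches out-of-range values
            break;
    }
    return log;
}
```

Option 1 runs both steps, option 2 runs only the second, and any value outside 1 to 3 lands in the default branch instead of silently doing nothing.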
Iteration
Three primary types of iterative loops: conditional loops, counting loops and arbitrary loops
1) Conditional (logically controlled) loops
Number of iterations is determined by a boolean condition, and (usually) cannot be precalculated
Note that we cannot predict when this condition will become false
Two versions are provided for convenience: we can always simulate one loop with the other (plus some conditionals)
See loops.cpp
Clearly the difference is where each is more appropriate
Conditional loops are the most general kind of loops, and are really all that is needed in an imperative programming language
However, many looping applications deal with arrays and sequences of values
For convenience and efficiency it is prudent to provide a looping structure geared toward these applications
We can (usually) precalculate the number of iterations based on the initial value, terminal value and increment
Ex: for (int i = 3; i <= N; i+=2) {
i obtains values 3, 5, 7, ..., N (or N - 1 if N is even)
For N = 31, the number of iterations equals CEILING((TERM - INIT + 1)/INCR) or CEILING((N - 3 + 1)/2) = CEILING((31 - 3 + 1)/2) = 15
Precalculation is nice because it allows the computer to base the loop on an iteration count (if it chooses to do so) which can be executed more quickly than conditional testing each time
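The precalculation can be sketched as below. The function names are hypothetical, and the formula assumes a positive increment and a loop test of the form i <= term.

```cpp
#include <cassert>

// Number of iterations of: for (i = init; i <= term; i += incr),
// computed up front as CEILING((term - init + 1) / incr).
// Assumes incr > 0; returns 0 when the loop body would never run.
int tripCount(int init, int term, int incr) {
    assert(incr > 0);
    if (term < init) return 0;
    return (term - init + 1 + incr - 1) / incr;   // integer ceiling
}

// Counts the iterations by actually running the loop, for comparison.
int countByLooping(int init, int term, int incr) {
    int count = 0;
    for (int i = init; i <= term; i += incr) {
        ++count;
    }
    return count;
}
```

For init = 3, term = 31 and incr = 2, both functions give 15, matching the worked example.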
Machine can use a register for the iteration count and not have to worry about obtaining operands for the comparisons at each iteration of the loop, something that must be done with a conditional loop
To allow precalculation and iteration counts to work, some restrictions must be made on the loop
Loop control variable cannot be altered by the programmer within the loop body
Terminal value must be calculated only one time, when loop is first entered
It will also speed things up if the loop control variable is an integer (or integral type) so no float operations are necessary
Pascal and Ada also do not allow an increment other than 1 or -1, and do not carry the value of the control variable past the end of the loop
In Pascal, the value is officially undefined, but in any Pascal implementation it will typically be one of two things:
1) The terminal value of the loop or
2) The terminal value + 1 or - 1.
1) typically indicates that iteration counts are being used
In Ada, the loop control variable is implicitly declared in the loop header, and becomes really undefined at the end of the loop: accessing it afterward will cause an undeclared variable error
This is now generally accepted as a good idea, since it reduces side-effect problems of using loop control variables that were declared and assigned elsewhere. C++ and Java both allow (but do not require) this as well
Attitude in Pascal and Ada is that if you want more complex iteration (ex. increment other than 1 or -1, option of changing number of iterations during the loop's execution) you should use a while loop
Now really anything goes, and the pre-test-expr and post-body-expr are evaluated for each iteration of the loop
Can certainly be used for a counting loop, as most of you have used it
Can also be used as an arbitrary loop to do more or less whatever programmer wants it to do
Added flexibility, with added danger
The usual for C, C++: see for.cpp
"foreach" loop
Newer languages also have included a "foreach" loop to iterate through data
Key difference between "for" and "foreach"
"for" iterates through indexes (typically), which can be used to access an array / collection if desired
Loop control variable is typically an integer
We can iterate over a collection without having to know the implementation details of the collection
Allows for data hiding and improves error prevention We will likely discuss this more when we discuss object-oriented programming
Disadvantage
When accessing an array, we may want or need the index value
Ex: What if we want to change the data in the array or reorganize it
Ex: Sorting would be difficult using "foreach"
Control Statements
3) Arbitrary Loops
Now the loop is basically an infinite loop, with the programmer expected to break out of it explicitly at some point
Ada allows this with the
loop
  ...
end loop;
The exit statement will break out of the loop, and can be put into an if statement
Thus we can break out of the loop from more than one place
Although C, C++ and Java do not explicitly have this construct, you can certainly build it by making a while or for loop an infinite loop and using the break statement to break out
while (1) // C
{
}
while (true) // Java
{
}
Again this feature adds flexibility, but makes code less readable and harder to debug
Unconditional Branching
Transfer execution from one section of code to another section of code
Commonly known as the goto
Used extensively in early languages which lacked block control structures
Ex. early FORTRAN and BASIC programs relied heavily on the goto
It was necessary then, but most modern languages contain block control structures
Even then computer scientists were aware of how problematic they could be
Spaghetti code that results is very difficult to read
Modification of one code segment can significantly impact many parts of the program: programmer must be aware of all places that can go to that code segment
Debugging is very difficult: it is hard to find and fix logic errors since all possible execution paths are difficult to trace
Unrestricted goto allows code segments that normally have only one entry and exit point to have many
Ex: What happens if you jump into the middle of a procedure (what about parameters?) or a while loop (condition is skipped)
Subprograms
Semi-independent blocks of code with the following basic characteristics:
Only one entry point: the beginning of the subprogram
Execute only when called:
Parameter information is passed to subprogram
Caller execution is temporarily suspended, and subprogram executes
When subprogram terminates, caller execution resumes at point directly following the subprogram call
Once defined, these can be used anywhere they are needed in a program
In order to have an effect on the overall program, a procedure needs to act on something other than just the variables local to the procedure. This can be done through:
Outputting data to the display or to a file
Altering a (relatively) global variable that will be accessed/used later by a different part of the program
Altering formal parameters such that the actual parameters in the caller are modified
This will be discussed in more detail soon
Functions can be thought of as code segments that calculate and return a single result
Modeled after math functions
Used within expressions, where result value is substituted for the call
The effect of functions on the overall program is the value returned by them. Thus, from an ideal (and mathematical) point of view, functions should have NO OTHER effect on the overall program
Should NOT modify global variables
Should NOT alter actual parameters
C/C++/Java
Only have functions, no procedures
Local variables
How/when are they allocated?
Stack-dynamic:
Default in most modern imperative languages
Required for recursive calls, since memory must be associated with each call, not each subprogram
Ex: Binary Search
mid = (left + right)/2;
Many different values for mid must be able to coexist, one for each call on the run-time stack
Could not do it if memory were statically allocated
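A minimal recursive binary search, written here only to show where the per-call locals live. The function name and signature are assumptions, not the course's code.

```cpp
#include <cassert>
#include <vector>

// Recursive binary search over a sorted vector; returns the index of
// key, or -1 if it is absent. Each call gets its own left, right and
// mid in its own activation record on the run-time stack.
int binarySearch(const std::vector<int>& a, int key, int left, int right) {
    if (left > right) return -1;
    int mid = left + (right - left) / 2;   // midpoint, written to avoid int overflow
    if (a[mid] == key) return mid;
    if (a[mid] < key) return binarySearch(a, key, mid + 1, right);
    return binarySearch(a, key, left, mid - 1);
}
```

A call such as binarySearch(a, 7, 0, a.size() - 1) may have several activations on the stack at once, each holding a different mid.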
Overhead is time for allocation and deallocation each time a subprogram is called
May not seem like a lot of time is needed, but it can add up if many calls are made in a program
Access must be indirect since actual memory location of variable will not be known until a subprogram call is made
Location in run-time stack depends upon calls made prior to current one, which can differ from run to run
Also adds some time overhead
Static:
Used in languages that do not support recursion (ex. older FORTRAN)
Also optional in other languages, such as C and C++
Allow variables to retain values from call to call
Remember the lifetime is the duration of the program
Ex: In CS1501 LZW algorithm, when writing codewords to a file, the bit buffer is static
The leftover bits are kept in the buffer for the next call
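The difference between a static local and an ordinary local can be seen in a sketch like this one. The function is hypothetical and much simpler than the LZW bit buffer mentioned above.

```cpp
#include <cassert>

// Hypothetical example: the static local keeps its value between calls,
// while the ordinary local is re-created on every call.
int callsSoFar() {
    static int calls = 0;   // allocated once, lifetime is the whole program
    int fresh = 0;          // stack-dynamic, new on each call
    ++calls;
    ++fresh;
    return calls * 10 + fresh;   // tens: static count, ones: always 1
}
```

Three successive calls return 11, 21 and 31: the static counter keeps growing while the stack-dynamic variable starts over each time.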
Parameters
Parameters are vital to subprograms
When writing subprograms, programmer decides which is required for a given subprogram
Then programmer utilizes syntax/rules in language being used to achieve the desired option
Sometimes the syntax/rules of the language do not fit exactly with the 3 use options given
In these cases programmer must be careful to use the parameters as he/she intends
Some definitions:
Formal Parameter:
Parameter specified in the subprogram header
Only exists during duration of subprogram execution
Sometimes called just "parameter"
Actual Parameter:
Parameter specified in call of the subprogram
May exist outside of the scope of the procedure
Sometimes called just "argument"
Pass-by-Value
Pass-by-Reference
Pass-by-Result
Pass-by-Value-Result
Pass-by-Name
You should be familiar with Pass-by-Value and Pass-by-Reference
Others may be new to you
We'll discuss each
Pass-by-Value
Formal parameter is a copy of the actual parameter
i.e. get r-value of actual parameter and copy it into the formal parameter
Benefit is that actual parameters cannot be altered through manipulation of the formals
Also useful in some recursive calls, since a new copy is made with each call
Problem is that copying a parameter can be quite expensive, both in terms of time and memory
Ex: Consider an object with an array of 1000 floats
Object is copied with each call to the function
If, for example, recursive calls are made, a lot of memory can be consumed very quickly
Implementation:
Using a run-time stack, this is straightforward
When subprogram is called, copy of actual parameter is placed into a local variable, which is stored on the run-time stack (in the activation record for the subprogram)
During subprogram execution, formal parameter is used like any other local variable for the subprogram
Only difference is that it is initialized via the actual parameter
Pass-by-Reference
Formal parameter is a reference to (or address of) the actual parameter variable
get l-value of actual param and copy it into the formal param, then access the actual param indirectly through the formal param
Used in Pascal (var parameters), in C (using explicit pointers) and C++ and PHP (&)
Most appropriate for IN and OUT parameter passing, but can be used for all
Actual param usually restricted to a variable
Benefit is that we can change or not change the actual parameter using the formal: it is up to the programmer
Also good that memory is saved: only an address is copied
Problem is that we can miss logic errors if we accidentally alter an actual parameter through the formal parameter
Also some applications (ex: some recursion) don't work as well
We may not want change at one call to affect another call
Implementation:
Using run-time stack, address of actual is stored in activation record
Actual is accessed indirectly in sub through its address
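For comparison, here is a short sketch of pass-by-value next to the two C++ ways of getting pass-by-reference behavior. The function names are made up for the example.

```cpp
#include <cassert>

// Pass-by-value: the formal is a copy, so the caller's variable is untouched.
void incrementByValue(int n) { n = n + 1; }

// Pass-by-reference, C++ style: the formal is an alias for the actual.
void incrementByReference(int& n) { n = n + 1; }

// Pass-by-reference, C style: the address is passed explicitly and the
// actual is reached indirectly through it.
void incrementByPointer(int* n) { *n = *n + 1; }
```

Starting from x = 5, incrementByValue(x) leaves x at 5, incrementByReference(x) makes it 6, and incrementByPointer(&x) makes it 7.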
Pass-by-Result
Reference parameters are not an exact fit for out parameters
Ex: A procedure designed to read data from a file into an object
Here we don't care about what used to be in the object: we just want to be sure that at the end the appropriate value is assigned
With reference parameters we COULD access the old value and use it if we wanted to (or by mistake)
Pass-by-Result prevents this
In Pass-by-Result, actual parameter is not actually passed to the subprogram: it only waits to have a value passed back to it
Formal parameter is a local variable
During life of subprogram its value does not affect actual parameter at all
At end of subprogram its value is passed back to the actual parameter
// Note: This is NOT real code
int A[8];
for (int i = 0; i < 8; i++) A[i] = i;
global int j = 2;
foo(A[j]);
output(A[]);
sub foo(int param) {
  int temp = 25;
  j = 5;
  param = temp;
}
------------------------------------------------
Output: 0 1 25 3 4 5 6 7 // if address obtained at call
Output: 0 1 2 3 4 25 6 7 // if address obtained at ret.
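C++ has no pass-by-result, but the two outcomes of the pseudocode can be simulated by hand. In this sketch the function names are hypothetical, the formal is an ordinary local, and the copy-out step is written explicitly.

```cpp
#include <array>
#include <cassert>

// Address of the actual A[j] is computed when the call is made.
std::array<int, 8> resultAddressAtCall() {
    std::array<int, 8> A{};
    for (int i = 0; i < 8; i++) A[i] = i;
    int j = 2;
    int* target = &A[j];   // address fixed at the call, while j is 2
    int param;             // the formal: a local, not initialised from A[j]
    int temp = 25;
    j = 5;
    param = temp;
    *target = param;       // copy-out at return
    return A;
}

// Address of the actual A[j] is computed only when the subprogram returns.
std::array<int, 8> resultAddressAtReturn() {
    std::array<int, 8> A{};
    for (int i = 0; i < 8; i++) A[i] = i;
    int j = 2;
    int param;
    int temp = 25;
    j = 5;
    param = temp;
    A[j] = param;          // j is 5 by now, so a different element is written
    return A;
}
```

The first function writes 25 into A[2] and the second writes it into A[5], matching the two output lines of the pseudocode.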
If used, address is typically obtained at call
Ada 83 out parameters for simple types are ALMOST this, but the formal parameter value cannot be accessed within the sub (so it is not really a local variable)
Ada 95 changed out parameters to allow them to be accessed, fitting the Pass-By-Result model more closely
Implementation:
At sub call, actual param address is calculated and stored in run-time stack, as is the formal param (as a local)
Final result of formal is copied back to actual address at end of sub
Pass-by-Value-Result
Now the actual parameter's value is passed to the formal parameter when the subprogram is called, being stored and used as a local variable
At the end of the subprogram the value is passed back to the actual parameter
As the name indicates, this is a combination of Pass-by-Value and Pass-by-Result
Used for IN and OUT parameters
If aliasing is NOT allowed/used, and if no exceptions occur in the subprogram the effect of value-result and reference is the same
Precondition: Actual parameter has value obtained previous to call
Idea is that language creators did not want to require the params to be passed in any specific way
They just wanted to require the in-out effect
If the result could differ based on whether params are value-result or reference, then the program is erroneous
Up to programmer to NOT use aliases
Implementation:
Value + Result
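The effect of aliasing on the two mechanisms can be seen in a small simulation. Both functions below are hypothetical; the value-result version copies in, works on locals, and copies out by hand.

```cpp
#include <cassert>

// Pass-by-reference: both formals alias whatever the caller passes.
void bumpBothByReference(int& a, int& b) {
    a = a + 1;
    b = b + 1;
}

// Pass-by-value-result, simulated by hand: copy in, work on locals, copy out.
void bumpBothByValueResult(int& actualA, int& actualB) {
    int a = actualA;   // copy-in
    int b = actualB;
    a = a + 1;
    b = b + 1;
    actualA = a;       // copy-out
    actualB = b;
}
```

With two distinct variables the results agree. When the same variable is passed for both parameters, the reference version adds 2 to it while the value-result version adds only 1, which is the kind of program the notes call erroneous.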
Pass-by-Name
Definitely wackiest way of param passing
Thus the parameter value or address could change based on where/when in the subprogram it is evaluated
However, the referencing environment used is that of the CALLER, not of the subprogram
So only changes within the subprogram that have a global effect will change its evaluation
This also makes implementation more difficult
But it gets wacky when array elements or variable expressions are passed
Now changes within the subprogram can affect the index of the array or a variable within the expression
Can cause evaluation to differ in different parts of the subprogram
global int i = 0, var = 11, n = 5;
global int A[2] = {4, 8};
foo(var, 2*n, A[i]); // all pass by name
void foo(int x, int y, int z) {
  x = x + 1;   output(var);
  output(y);   n = n + 1;   output(y);
  output(z);   z = z + 1;   output(z);
  i = i + 1;   z = z + 1;   output(z);
}
Implementation:
It is not trivial to allow a macro to be evaluated and reevaluated in the environment of the caller
Parameterless subprograms called thunks are used
Thunk evaluates parameter in current state of caller's referencing environment
Returns the resulting address or value
Overhead and confusing results are why this is not used in newer languages
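Thunks can be imitated in C++ with parameterless lambdas. The sketch below is modeled on the pass-by-name pseudocode in these notes; the statement order inside foo is an assumption about that pseudocode, the outputs are collected in a vector instead of being printed, and std::function is only a convenient stand-in for a real thunk mechanism.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Globals playing the role of the caller's referencing environment.
int i = 0, var = 11, n = 5;
int A[2] = {4, 8};

// Each parameter is a thunk: calling it re-evaluates the actual
// parameter expression in the current state of the globals.
std::vector<int> foo(std::function<int&()> x,
                     std::function<int()> y,
                     std::function<int&()> z) {
    std::vector<int> out;
    x() = x() + 1;          out.push_back(var);
    out.push_back(y());     n = n + 1;          out.push_back(y());
    out.push_back(z());     z() = z() + 1;      out.push_back(z());
    i = i + 1;              z() = z() + 1;      out.push_back(z());
    return out;
}

// Resets the globals and makes the call foo(var, 2*n, A[i]) by name.
std::vector<int> runPassByNameDemo() {
    i = 0; var = 11; n = 5; A[0] = 4; A[1] = 8;
    return foo([]() -> int& { return var; },
               []() { return 2 * n; },
               []() -> int& { return A[i]; });
}
```

In this sketch the second output of y differs from the first because n changed in between, and the last two updates of z touch different array elements because i changed in between.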
Subprograms as Parameters
We allow variables as parameters so that we can access their values (or addresses) from within a subprogram
Why not allow subprograms so that we can execute them from within a subprogram?
Some languages do allow this (ex. Pascal, C++, PHP)
Can the parameter subprogram arguments differ in form from each other?
If so, how to type check and even check the number of arguments when the subprogram is actually called?
Easiest solution is to require the arguments to all have the same form
Header of parameter subprogram must be given within the header of the subprogram it is being passed to
Scope is also an issue: what is the referencing environment of the subprogram that is being passed as a parameter?
Three reasonable possibilities exist:
1) The referencing environment in which the parameter subprogram is CALLED: shallow binding
2) The referencing environment in which the parameter subprogram is DEFINED: deep binding
3) The referencing environment in which the parameter subprogram is PASSED as an argument: ad hoc binding
Note that shallow binding fits well with dynamic scoping and deep binding fits well with static scoping
Pascal and C++ both use deep binding
Shallow binding is used by SNOBOL, which also uses dynamic scoping
Ad hoc binding has never been used
See fnparams.cpp
Enables programmer to use the same name for similar functions that take different argument types
Use: Make it easier for the programmer to use consistent names for subprograms
Without overloading: Programmer must make up different but similar names for subprograms that do similar things but for different types
Ex: abs(int) fabs(float) labs(long)
Ex: ISort(int * A) FSort(float * A)
With overloading: Programmer uses the same name and the compiler decides which to use
Ex: abs(int) abs(float) abs(long)
Ex: Sort(int * A) Sort(float * A)
Operator Overloading is the same idea, but with symbols rather than identifiers
We discussed these issues previously
See Slide 12 of cs1621b.ppt
Generics
Parametric polymorphism
One or more parameters are passed to a subprogram when it is instantiated (i.e. when the code is generated) indicating the types that will be used for the parameters in the subprogram call
Can also be used in conjunction with packages (Ada) and classes (C++)
Thus a single subprogram declaration can be used to generate many different callable subprograms, all with the same functionality
Motivation:
Programmers often apply data structures and algorithms to more than one data type
Ex. Sorting, Searching algos
Ex. BST, PQ, Stack, Queue data structures
Even with overloading, the programmer must still write different (identical except for type) versions of the code
Generics simply transfer the job of making the different versions from the programmer to the compiler: this automates the overloading process
Note that DIFFERENT VERSIONS of the code MUST STILL BE generated
So the reason we have generics is to save the programmer some time (and perhaps some confusion)
In C++, template instantiations can be explicit or implicit
Implicit: generated automatically by the compiler when a call is seen with the appropriate arguments
Duplicate instantiations are merged into a single code segment
Coercion cannot be done, since the types won't match the template correctly
Saves programmer some typing
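A small function template shows both kinds of instantiation. The name maxOf and the helper functions are made up for this sketch.

```cpp
#include <cassert>

// A single template declaration; the compiler generates one version
// of the code for each type it is instantiated with.
template <typename T>
T maxOf(T a, T b) {
    return (a < b) ? b : a;
}

// Explicit instantiation: ask for the double version by name.
template double maxOf<double>(double, double);

// Implicit instantiation: T is deduced as int from the arguments.
int largestInt() { return maxOf(3, 7); }

// maxOf(3, 4.5) would not compile, because deduction finds two different
// types and no coercion is applied. Naming the type makes the call legal,
// and the int argument is then converted to double.
double largestMixed() { return maxOf<double>(3, 4.5); }
```

The int and double versions are separate pieces of generated code, even though only one declaration was written.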
Java Generics
In Java 1.5 "generics" were added to the language
The name is somewhat misleading, since generic abilities were always built into the Java language
Collections were defined in terms of class Object, which is the superclass to other Java classes
However, retrieving objects back from the collection required explicit casting to the actual type if we wanted full access to them
ArrayList A = new ArrayList();
A.add(new String("Wacky"));
String S = (String) A.remove(0);
Also any typing mistakes (mixing types in the collection unintentionally) could only be caught at run-time (via casting exceptions)
Overall not bad, but some people thought type parameters should be allowed
JDK 1.5 added syntax very similar to that for C++ templates
However, it is very different from C++ templates (and Ada generics as well)
It is not really adding any new generic abilities to the language
It is not creating new code for each version of the class or method
It is designed to make collections of objects more type-safe
See more details in the handout
Implementing Subprograms
What is involved when a subprogram is called, during its execution, and when it terminates?
This will differ depending on if recursion is allowed in a language or not
Most modern languages allow recursion, but original FORTRAN (up to FORTRAN 77) did not allow it
Return Value
If sub is a function
Local Variables
Static
Parameters
Return Address
Now multiple instances of an activation record can occur at the same time, so they must be created dynamically (at run-time), unlike in FORTRAN
Let's look at some of the contents of an activation record
Temporaries
Local Variables
Temps and local variables are allocated within the subprog. call. In Pascal, C and C++, the local variables must be of fixed size. In Ada, they can be variable size (ex. arrays)
Parameters
Dynamic Link to previous call
Static Link to NonLocals
Return Address
Parameters, links to non-Locals and the return address are placed into the AR by the caller of the subprogram, so they are lower in the record
See rtstack.cpp
int x, y[5]; // address of x is 162 + (other AR stuff)
float z;     // address of z is 162 + (other AR stuff) + 4 + 20
But the variable locations could be in different places on the run-time stack
How to find them?
2) Display
Static Links
Due to rules of static scope, if a subprogram is called, its textual parent subprogram MUST be active
sub foo {
  sub fum { }
}
main {
  // cannot call fum directly
}
However, textual parent does NOT have to be previous call on run-time stack
So dynamic link in AR is not enough (but would work for dynamic scoping)
sub foo {
  sub innerA { }
  sub innerB { innerA; }
  innerB;
}
main { foo; }
Run-time stack (top first): innerA, innerB, foo
Static links connect an AR to the AR of the sub's textual parent, no matter where previously on the RT stack it is
How is this used to access nonlocal variables?
Can be determined and maintained based on the nesting depths of the subprograms that are called
The difference in the nesting depths between the sub using a nonlocal variable and the sub in which the nonlocal is declared is equal to the number of static links that must be crossed to find the correct AR for the variable
Implementing Subprograms
This difference can be stored for each variable when the program is compiled, so that at run-time finding the variable is simple
sub parent {
    var X, Y
    sub child1 {
        var X, Z
        sub grand1 {
            var Z
        }
    }
    sub child2 {
        var Y
        call child1
    }
}
main {
    call parent
}
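The lookup described above can be sketched in Python. This is only an illustration under stated assumptions: the class ActivationRecord, the function lookup, the offsets, and the stored string values are all hypothetical, and the sketch assumes that parent has called child2, child2 has called child1, and child1 has then called grand1 (a call the example program does not actually show). Each reference is compiled to a pair (number of static links to cross, local offset).

```python
class ActivationRecord:
    """One AR: a list of local slots plus a static link to the textual parent's AR."""
    def __init__(self, name, static_link, slots):
        self.name = name
        self.static_link = static_link   # None for the outermost sub
        self.slots = list(slots)

def lookup(ar, hops, offset):
    """Cross `hops` static links starting at `ar`, then read the slot at `offset`."""
    for _ in range(hops):
        ar = ar.static_link
    return ar.slots[offset]

# Nesting depths: parent = 1, child1 = child2 = 2, grand1 = 3.
parent_ar = ActivationRecord("parent", None, ["parent.X", "parent.Y"])
child2_ar = ActivationRecord("child2", parent_ar, ["child2.Y"])
child1_ar = ActivationRecord("child1", parent_ar, ["child1.X", "child1.Z"])
grand1_ar = ActivationRecord("grand1", child1_ar, ["grand1.Z"])

# "Compile-time" table for references made inside grand1:
# hops = depth(grand1) - depth(declaring sub), offset = position in that sub's locals.
GRAND1_REFS = {
    "Z": (0, 0),   # grand1's own Z
    "X": (1, 0),   # declared in child1 (3 - 2 = 1 link)
    "Y": (2, 1),   # declared in parent (3 - 1 = 2 links)
}

# References made inside child1: Y comes from parent, not from the caller child2.
CHILD1_REFS = {"Y": (1, 1)}

x_seen_by_grand1 = lookup(grand1_ar, *GRAND1_REFS["X"])
y_seen_by_grand1 = lookup(grand1_ar, *GRAND1_REFS["Y"])
y_seen_by_child1 = lookup(child1_ar, *CHILD1_REFS["Y"])
```

Note that child1 sees parent's Y even though child2 (which declares its own Y) is its caller; that is the difference between following static links and following dynamic links.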
Implementing Subprograms
What actually happens when a sub is called?
AR for textual parent of sub must be located on the run-time stack, so that the static link can be linked to it.
A clear (but inefficient) way to do this is to follow dynamic links down the RTS until the AR for the parent sub is found.
A better way can take advantage of the fact that the calling sub and the called sub must be relatives in the declaration tree:
Calling sub could be parent of called sub (but not grandparent)
Calling sub could be called sub (direct recursion)
Calling sub could be a sibling of called sub
Calling sub could be a descendant of called sub (indirect recursion)
Calling sub could be a niece of called sub
Implementing Subprograms
So instead of following dynamic links, at compile-time we can pre-calculate the number of static links (from caller) to follow to find the appropriate textual parent AR
Always equal to: nesting_depth(calling sub) - nesting_depth(called sub) + 1
Calling sub could be parent of called sub
X - (X+1) + 1 = 0 static links (use caller's AR)
Calling sub could be a descendant of called sub (indirect recursion)
Calling sub could be a niece of called sub
Follow diff. in nesting depth + 1 static links
Implementing Subprograms
procedure Bigsub is
  procedure A(Flag: Boolean) is
    procedure B is
      ...
      A(false);
    end; -- B
  begin -- A
    if flag then B;
    else C;
  end; -- A
  procedure C is
    procedure D is
      -- here
    end; -- D
    ...
    D;
  end; -- C
begin -- Bigsub
  A(true);
end; -- Bigsub

Run-time stack when execution reaches "here" in D (top of stack first):
D:          dynamic link to C; static link to C; return addr. to C
C:          dynamic link to A; static link to Bigsub; return addr. to A
A:          param flag ( = false); dynamic link to B; static link to Bigsub; return addr. to B
B:          dynamic link to A; static link to A; return addr. to A
A:          param flag ( = true); dynamic link to Bigsub; static link to Bigsub; return addr. to Bigsub
Bigsub:     dynamic link to caller; static link; return addr.
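As a rough check of the static links in the Bigsub example, the call sequence can be simulated in Python. This is a sketch under assumptions, not the course's code: the class AR, the function call, and the DEPTH table are hypothetical, and Bigsub is arbitrarily given nesting depth 0 (only differences in depth matter). At each call the number of static links to follow from the caller is nesting_depth(caller) - nesting_depth(callee) + 1.

```python
# Nesting depths read off the program text (Bigsub taken as depth 0 by assumption).
DEPTH = {"Bigsub": 0, "A": 1, "C": 1, "B": 2, "D": 2}

class AR:
    """Activation record holding only the two links of interest."""
    def __init__(self, name, dynamic_link, static_link):
        self.name = name
        self.dynamic_link = dynamic_link   # AR of the caller
        self.static_link = static_link     # AR of the textual parent

def call(caller_ar, callee):
    """Create the callee's AR, finding its textual parent via the caller's static chain."""
    hops = DEPTH[caller_ar.name] - DEPTH[callee] + 1
    parent_ar = caller_ar
    for _ in range(hops):
        parent_ar = parent_ar.static_link
    return AR(callee, dynamic_link=caller_ar, static_link=parent_ar)

# Call sequence of the example: Bigsub -> A(true) -> B -> A(false) -> C -> D
bigsub = AR("Bigsub", None, None)
a_true = call(bigsub, "A")    # 0 - 1 + 1 = 0 links: parent is the caller
b = call(a_true, "B")         # 1 - 2 + 1 = 0 links
a_false = call(b, "A")        # 2 - 1 + 1 = 2 links: B -> A -> Bigsub
c = call(a_false, "C")        # 1 - 1 + 1 = 1 link:  A -> Bigsub
d = call(c, "D")              # 1 - 2 + 1 = 0 links

static_parents = [ar.static_link.name for ar in (a_true, b, a_false, c, d)]
dynamic_parents = [ar.dynamic_link.name for ar in (a_true, b, a_false, c, d)]
```

The two lists agree with the stack picture for the example: the static links are Bigsub, A, Bigsub, Bigsub, C and the dynamic links are Bigsub, A, B, A, C.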
Implementing Subprograms
Display
Uses a single array to store links to ARs at all relevant nesting depths
To access a nonlocal at a given nesting depth, we just follow the display entry for that depth, then the local_offset
Never more than one link to follow
Array is updated as subs are called and as they terminate
Generally faster than static links if many nesting levels are used
We will skip the details here; read the text
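Since the details are skipped, the following Python fragment is only one plausible sketch of a display, assuming the common scheme in which a call at nesting depth d saves the old entry display[d], installs its own AR there, and restores the saved entry on return. The names display, enter, leave and access, and the frame contents, are all hypothetical.

```python
MAX_DEPTH = 3
display = [None] * MAX_DEPTH   # display[d] refers to the AR of the active sub at depth d

def enter(depth, frame):
    """Called on subprogram entry: install `frame`, return the entry it replaced."""
    saved = display[depth]
    display[depth] = frame
    return saved

def leave(depth, saved):
    """Called on subprogram exit: restore the entry saved by enter()."""
    display[depth] = saved

def access(depth, offset):
    """Nonlocal access: one display lookup, then the local offset."""
    return display[depth][offset]

# main (depth 0) calls outer (depth 1), which calls inner (depth 2)
main_frame, outer_frame, inner_frame = [10], [20, 21], [30]
enter(0, main_frame)
enter(1, outer_frame)
enter(2, inner_frame)
seen_from_inner = (access(0, 0), access(1, 1))   # one link each, whatever the depth

# inner now calls outer again (indirect recursion): depth 1 gets a new AR
second_outer_frame = [40, 41]
saved = enter(1, second_outer_frame)
during_recursive_call = access(1, 0)
leave(1, saved)                                   # the recursive call returns
after_return = access(1, 0)
```

However deep the nesting, each access costs a single indexing step into the display, which is the "never more than one link" property.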
Implementing Subprograms
But it is actually much easier to handle, since block entry and exit is always the same
Parent block goes to child block
Implementing Subprograms
Simply push new block declarations onto run-time stack, and pop them when block terminates
But we only have one activation record, so no links are required
"Non-locals" can be accessed just like locals
Implementing Subprograms
Dynamic Scoping
When a non-local variable is accessed, we always follow the dynamic links until the correct declaration is found
Clearly could differ depending upon call sequence
But the mechanics are actually simple
ARs must store names of local variables so we know where to stop the search
In static scoping the names are not needed, just the offsets
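A minimal Python sketch of this search, assuming each AR stores its local names in a dictionary. The class AR, the function lookup, and the subprogram names main, f and g are hypothetical.

```python
class AR:
    """Activation record that keeps the names of its locals, as dynamic scoping needs."""
    def __init__(self, name, dynamic_link, local_values):
        self.name = name
        self.dynamic_link = dynamic_link   # AR of the caller
        self.locals = dict(local_values)

def lookup(ar, var):
    """Follow dynamic links from `ar` until an AR declaring `var` is found."""
    while ar is not None:
        if var in ar.locals:
            return ar.locals[var]
        ar = ar.dynamic_link
    raise NameError(var)

main_ar = AR("main", None, {"x": 1})

# Call sequence 1: main -> f -> g, where f declares its own x and g declares none
f_ar = AR("f", main_ar, {"x": 2})
g_called_from_f = AR("g", f_ar, {})
x_via_f = lookup(g_called_from_f, "x")

# Call sequence 2: main -> g directly
g_called_from_main = AR("g", main_ar, {})
x_via_main = lookup(g_called_from_main, "x")
```

The same reference to x inside g resolves to f's x in the first call sequence and to main's x in the second, which is the dependence on call sequence noted above.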
Data Abstraction
Data abstraction:
New type can be used without requiring detailed knowledge of how it is implemented
We don't need to know the details of how it is stored in memory
Data Abstraction
2) The representation of the objects is hidden from users of the ADT: DATA HIDING
Objects can only be manipulated via the provided interface
Data Abstraction
Ex: Stack
Data: something that can store and access multiple data values in the manner dictated by the operations
Operations:
Push: add new value to top of stack
Pop: remove top value from stack
Top: view top value (or a copy) without removing
Empty: is stack empty?
User of stack only needs to know the parameters and effect of each operation to use a stack correctly
Implementation could be an array, a linked-list, or maybe something different
Does not affect use
Implementer can hide these details from the user through private declarations
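The stack ADT above might look like this in Python. It is a sketch only: the class name Stack and the choice of a Python list as the representation are assumptions, and Python hides the representation only by naming convention (the leading underscore), not by enforced private declarations.

```python
class Stack:
    """Stack ADT: users rely only on push, pop, top and empty."""

    def __init__(self):
        self._items = []          # hidden representation; could be swapped for a linked list

    def push(self, value):
        """Add a new value to the top of the stack."""
        self._items.append(value)

    def pop(self):
        """Remove and return the top value."""
        if self.empty():
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def top(self):
        """Return the top value without removing it."""
        if self.empty():
            raise IndexError("top of empty stack")
        return self._items[-1]

    def empty(self):
        """Report whether the stack holds no values."""
        return not self._items

s = Stack()
s.push("a")
s.push("b")
top_before_pop = s.top()
popped = [s.pop(), s.pop()]
```

Code that uses only the four operations keeps working unchanged if the list is later replaced by another representation.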
Data Abstraction
The idea of data abstraction was not always supported by programming languages
Ex: FORTRAN, Pascal, C did not fully support either encapsulation or data hiding
When learning good programming style, users tried to "simulate" data abstraction
Logically group type definitions, procedures and functions together as a unit
Only access the data type via the procedures and functions
Naturally, this was at the programmer's discretion
See ADT.p
Data Abstraction
Encapsulation units that contain all details of the new type
Access modifiers that prevent access to internal details of the ADT from outside the encapsulation unit
Characteristics of OOP
1) Data abstraction: encapsulation + information-hiding
The operations for manipulating data are considered to be part of the data type (encapsulated)
The implementation details of the data type (both the structure of the data and the implementation of the operations) are separate from their specifications and (possibly) hidden from the user
As we discussed with ADTs
OOP
2) Inheritance
The characteristics of an ADT (data + operations) can be passed on to a subtype
Subtype can also add new data and operations
Allows programmer to build new (derived) types from old (parent) ones
Common data/operations do not have to be rewritten (or copied)
Operations that are slightly different in derived type can be rewritten (overridden) for that type
New data/operations tailor the derived type to the problem at hand
Parent type is unchanged and may (sometimes) be used together with derived type
OOP
3) Polymorphism
Variables of a parent class can also be assigned objects of a subclass (or subclass of a subclass)
Operations used with a variable are based upon the class of the object currently stored (could be a parent type object or a derived type object)
Operations may have been overridden in the derived class
Dynamic binding allows parent and derived objects to be used together in a logical way
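A small Python sketch of these points. The classes Person, Student and GradStudent and the method describe are hypothetical names chosen for illustration; in Python every method call is dynamically bound.

```python
class Person:
    def __init__(self, name):
        self.name = name

    def describe(self):
        return f"Person {self.name}"

class Student(Person):
    """Derived type: adds new data and overrides describe."""
    def __init__(self, name, gpa):
        super().__init__(name)
        self.gpa = gpa

    def describe(self):
        return f"Student {self.name} (GPA {self.gpa})"

class GradStudent(Student):
    """Subclass of a subclass: inherits Student's describe unchanged."""
    pass

# One collection holds parent and derived objects together.
people = [Person("Ann"), Student("Bob", 3.5), GradStudent("Cy", 3.9)]

# The operation that runs depends on the class of the object stored in each slot.
descriptions = [p.describe() for p in people]
```

The loop body never tests which kind of object it has; the object's class selects the version of describe.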
OOP
Polymorphism allows these different objects to be accessed consistently within the same array
Think about how you could do the code above in C or Pascal
It would not be easy!
OOP
One option: Make one giant struct or record to contain all of the data, including a union or variant
Base class would use only the core data items
Derived classes would use additional data items as provided in the union or variant
To do the operations, we would need a switch or case to test which type the variable is, so that it can be written out appropriately
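To make the cost of this option concrete, here is a rough Python imitation of the "one giant record plus switch" style. Python has no union or variant record, so an optional field stands in for the variant part; the names Record, kind, gpa and describe are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Record:
    kind: str                     # tag saying which variant this record is
    name: str                     # core data, used by every variant
    gpa: Optional[float] = None   # variant part, meaningful only when kind == "student"

def describe(rec):
    """The 'switch': every operation must test the tag itself."""
    if rec.kind == "person":
        return f"Person {rec.name}"
    elif rec.kind == "student":
        return f"Student {rec.name} (GPA {rec.gpa})"
    else:
        raise ValueError(f"unknown kind: {rec.kind}")

records = [Record("person", "Ann"), Record("student", "Bob", 3.5)]
descriptions = [describe(r) for r in records]
```

Adding a new variant means editing the record type and every such switch, which is the maintenance burden that dynamic binding removes.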
OOP
OO Languages
1) Smalltalk was the first language to fully support OOP (building on the class concept of Simula 67) and is the purest OOL
All data (even numeric literals) are objects, and are all descendants of class Object
Objects are all allocated from the heap, and implicitly deallocated (garbage collection)
Variables are references, with implicit dereferencing
Execution of a program (logically) involves objects sending messages to each other, executing methods, and responding back
So the data is driving the execution, not the control statements
OOP
This propagation of messages can sometimes lead to very short code, if variables are eliminated
OOP
Now we cascade the messages to allow fewer statements (also, the do: loop iterates through the characters in a string, so we don't need the loop counter)
((Prompter prompt: 'Enter your name' default: '') select: [ :c | c isLetter ]) size printNl.
Now the select: loop generates a string based on the condition in the block
OOP
Advantage: Derived class can likely implement its methods more efficiently with access to parent data
Disadvantage: Change in parent class implementation will likely require change in derived class implementation
Ex. Traversable stack
OOP
Variables have no types since they are only used to refer to objects, not to determine the messages an object can receive
Clearly some liabilities with this approach
Slows language down due to run-time overhead
Programmer type errors cannot be caught until execution time
OOP
student.cls as an example of a subclass
studentTest.st as an example showing polymorphic access
twodarry.cls as another subclass example
See twodTest.st
OOP
C++ Inheritance
Do not need a superclass (no Object base class for all other classes)
Multiple inheritance is allowed
Complex and difficult to use
OOP
C++ polymorphism
By default all functions are statically bound
Recall that this allows faster execution, a goal of the C++ language
However, true polymorphism cannot be utilized with statically bound functions
Dynamic binding is enabled by using virtual functions and pointers (or references)
This tells the compiler not to bind the function name to the code until run-time
See poly.cpp
OOP
Like C++:
Access can be private, public or protected
Static binding can optionally be used to improve run-time speed
Overall syntax for member data and function access
Variables are typed
OOP
Disadvantage: Can become complicated when interfaces and inheritance are both used
OOP
OOL Implementation
Data:
Typically a record/struct type of storage is used: Class Instance Record (CIR)
Data members are accessed by name, in the same way as records
Subclass adds extra data to CIR of parent class
Private access enforced by limiting visibility of the data
OOP
Subprograms:
Static binding
Subprograms that will be called are determined by the variable type
Variable types are known at compile time and code can be determined then
Dynamic Binding:
Subprograms that will be called are determined by the object's type, not the variable's type
Objects stored in a variable are determined at run time
Appropriate links must be stored with the object
But they are the same for all objects of that class
Virtual Method Table (VMT) used to store links to all pertinent subprograms
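The CIR and VMT idea can be imitated in Python with plain dictionaries and lists. This is a hand-built sketch for illustration, not how any particular compiler lays things out: the slot numbers DESCRIBE and GREET, the table names, and the helper functions are all hypothetical.

```python
# Slot numbers fixed "at compile time": every class uses the same slot for the same method.
DESCRIBE, GREET = 0, 1

def person_describe(obj):
    return "Person " + obj["name"]

def person_greet(obj):
    return "Hello, " + obj["name"]

def student_describe(obj):
    return "Student " + obj["name"]

# One VMT per class, shared by all objects of that class.
PERSON_VMT = [person_describe, person_greet]
STUDENT_VMT = [student_describe, person_greet]   # overrides slot 0, inherits slot 1

def new_person(name):
    """CIR for Person: a link to the class VMT plus the data members."""
    return {"vmt": PERSON_VMT, "name": name}

def new_student(name, gpa):
    """CIR for Student: the parent's fields plus the extra data member."""
    return {"vmt": STUDENT_VMT, "name": name, "gpa": gpa}

def call_virtual(obj, slot):
    """Dynamic binding: the code to run is found through the object, not the variable."""
    return obj["vmt"][slot](obj)

bob = new_student("Bob", 3.5)
cy = new_student("Cy", 3.9)
ann = new_person("Ann")
results = [call_virtual(o, DESCRIBE) for o in (ann, bob)]
inherited = call_virtual(bob, GREET)
```

Each object carries a single link to its class's table, so the per-object cost stays constant however many virtual subprograms the class has.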
Parallelism
We must have a mechanism that allows Task B to pause until the data is available
B could loop and keep checking for data
B could wait for some signal from A
Parallelism
Competition Synchronization
Both tasks are competing for the same shared resource
If one or both tasks modify the data, it could cause data inconsistencies
Ex: Task A and Task B are MAC (ATM) machine accesses of the same bank account
Task A checks the balance: $200
Task B checks the balance: $200
Task A withdraws $200
Task A updates balance to $0
Task B withdraws $200
Task B updates balance to $-200
We must have some mechanism that ensures MUTUAL EXCLUSION for CRITICAL DATA
We could have a LOCK on the data, or a similar mechanism allowing only one task to access it at a time
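A minimal Python sketch of the bank account example with a lock, using threading.Lock from the standard library. The names balance, balance_lock, withdraw and results are hypothetical. Because the check and the update happen while the lock is held, the two tasks cannot both see a balance of $200.

```python
import threading

balance = 200
balance_lock = threading.Lock()   # guards the critical data `balance`
results = []                      # (task, succeeded) pairs, appended while holding the lock

def withdraw(task, amount):
    global balance
    with balance_lock:            # mutual exclusion: one task at a time past this point
        if balance >= amount:     # check the balance ...
            balance -= amount     # ... and update it before any other task can check
            results.append((task, True))
        else:
            results.append((task, False))

threads = [threading.Thread(target=withdraw, args=(name, 200)) for name in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

outcomes = sorted(ok for _, ok in results)
```

Whichever task gets the lock first succeeds and the other is refused, so the balance ends at $0 rather than $-200.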
Parallelism
Synchronization Mechanisms
Semaphores
Devised by Dijkstra
Basically guards that are placed around code
P must succeed to gain access to code
Decrements a counter when it succeeds
V executes when critical section ends
Based on initial value of counter, we can control how many tasks are allowed to access the critical section at once
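The counting behavior can be tried out with Python's threading.Semaphore, where acquire plays the role of P and release the role of V. The sketch below is an illustration under assumptions: the limit of 2, the six tasks, and the bookkeeping variables inside and max_inside are all hypothetical.

```python
import threading
import time

MAX_IN_SECTION = 2
sem = threading.Semaphore(MAX_IN_SECTION)   # initial counter value = tasks allowed in at once
stats_lock = threading.Lock()               # protects the bookkeeping counters below
inside = 0
max_inside = 0

def task():
    global inside, max_inside
    sem.acquire()                 # P: blocks while the counter is 0, else decrements it
    try:
        with stats_lock:
            inside += 1
            max_inside = max(max_inside, inside)
        time.sleep(0.01)          # stand-in for work in the critical section
        with stats_lock:
            inside -= 1
    finally:
        sem.release()             # V: increments the counter, letting a waiter proceed

threads = [threading.Thread(target=task) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With an initial value of 1 the semaphore would act as a simple mutual exclusion lock.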
Parallelism
Monitors
Devised by Brinch Hansen and Hoare
Critical data section is part of a data object that allows only one task entry at a time
Better than semaphores for competition synchronization, because mechanism is built into the monitor
Harder for programmer to mess up
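Python has no built-in monitor construct, but the idea can be approximated by a class whose every method takes the object's own lock. This is a sketch under that assumption; the class name CounterMonitor and the worker function are hypothetical.

```python
import threading

class CounterMonitor:
    """Monitor-like object: the shared data and its lock live together."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0           # critical data, touched only inside the methods

    def increment(self):
        with self._lock:          # only one task can be inside the monitor at a time
            self._count += 1

    def value(self):
        with self._lock:
            return self._count

counter = CounterMonitor()

def worker():
    for _ in range(1000):
        counter.increment()       # callers never handle the lock themselves

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = counter.value()
```

Callers cannot forget a P or V here, because the locking is written once, inside the object.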
Parallelism
Message Passing
Proposed by Brinch Hansen and Hoare
Parallelism
Idea is that we know exactly where in the code both tasks will be when a rendezvous occurs
So even though tasks execute asynchronously, we synchronize them with respect to each other at a rendezvous
Ex: Ada
Parallelism
Deadlock
When a (shared) resource has restricted access, it can cause a task to stop execution
Wait in a semaphore queue
Wait in a monitor queue
Wait in an accept queue
Parallelism
Starvation
To combat deadlock, most languages allow a task to release a resource prematurely in some circumstances
Ex: If one of the Tasks in the previous example releases the semaphore, the other can proceed
Under these circumstances there is the possibility that a task may never acquire all of the resources that it needs at the time it needs them: starvation
Parallelism
Stop
Immediately frees locked objects
Can lead to data inconsistency
Prolog
If a given assertion succeeds, execution proceeds to the next one
If a given assertion fails, execution backtracks and attempts to re-satisfy the previous assertion
Prolog
Instantiated
Variable is associated with a value
Once a variable is instantiated, it keeps that value, and all occurrences of that variable within the same scope have that value
Cannot be re-assigned in sense of imperative languages
However, if execution backtracks past the point at which it was instantiated, it can again become uninstantiated
Prolog
If a subgoal in a rule fails at any point, we backtrack and attempt to resatisfy a previously satisfied subgoal
When resatisfying a subgoal, the db search resumes from the point at which it succeeded the first time
See recurse.pl
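The backtracking behavior described above can be imitated with Python generators, which remember where a search stopped and resume from that point. This is only an analogy under stated assumptions: the facts in PARENT and the rule grandparent(X, Z) :- parent(X, Y), parent(Y, Z) are hypothetical and are not taken from recurse.pl, and None stands in for an uninstantiated variable.

```python
# Hypothetical database of parent(X, Y) facts, in database order.
PARENT = [("tom", "liz"), ("tom", "bob"), ("bob", "ann")]

tried = []   # records each Y that the first subgoal instantiated

def parent(x=None, y=None):
    """Yield each fact matching the instantiated arguments, scanning in database order."""
    for px, py in PARENT:
        if (x is None or x == px) and (y is None or y == py):
            yield px, py          # a later resume continues scanning after this fact

def grandparent(x):
    """grandparent(X, Z) :- parent(X, Y), parent(Y, Z)."""
    for _, y in parent(x=x):      # first subgoal instantiates Y
        tried.append(y)
        for _, z in parent(x=y):  # second subgoal; no match means it fails
            yield z
        # falling out of the inner loop backtracks: the outer generator resumes
        # its scan and Y takes a new value

answers = list(grandparent("tom"))
```

Here Y is first instantiated to liz, the second subgoal fails, and the search for parent(tom, Y) resumes after the liz fact rather than starting over, giving Y = bob and then the answer ann.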
Prolog Lists