Definition:- A symbol table is an abstract data structure for storing information. A table
differs from other data structures in its method of access. Other data structures are
accessed by index or pointer, whereas a table is accessed by content, that is, an entry is found by its key.
Each table entry has the form:
Key | Associated Data
[Figure: table layout; the label "length" survives from the original figure.]
2. Ordered
In this method, the table is maintained in sorted form based on the variable names. In
such circumstances an insertion must be accompanied by a lookup procedure which
determines where in the symbol table the variable's attributes should be placed. The actual
insertion of a new entry may generate some additional overhead, primarily because other
entries may have to be moved to make room at the position of insertion.
To search for a particular key, we apply the binary search technique. Suppose
(K1,V1) .... (Kmid,Vmid) .... (Kn,Vn) are the entries in the table, sorted by key.
Here mid = n div 2.
The algorithm, which searches for a key k, is described below:
Find(low, high)
  while low < high do
  begin
    mid = (low + high) div 2;
    if k <= entry[mid].key then
      high = mid
    else
      low = mid + 1
  end
When the loop ends, low = high, and the key for which we are searching is at that position if it is present in the table.
The time complexity of this algorithm is O(log n).
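The ordered table and its binary search can be sketched in Python as follows. This is a minimal illustration, not the notes' own implementation: the class name `OrderedSymbolTable` and its two parallel lists are assumptions made for the example, and the standard `bisect` module is used to find the insertion position.

```python
from bisect import bisect_left


class OrderedSymbolTable:
    """Symbol table kept sorted by variable name (hypothetical helper class)."""

    def __init__(self):
        self.keys = []    # variable names, always in sorted order
        self.values = []  # attributes, parallel to self.keys

    def find(self, key):
        """Binary search: return the index of key, or None if it is absent."""
        low, high = 0, len(self.keys)
        while low < high:
            mid = (low + high) // 2
            if key <= self.keys[mid]:
                high = mid          # key, if present, is at mid or before it
            else:
                low = mid + 1       # key, if present, is after mid
        if low < len(self.keys) and self.keys[low] == key:
            return low
        return None

    def insert(self, key, value):
        """Look up the position first, then shift later entries to make room."""
        pos = bisect_left(self.keys, key)
        if pos < len(self.keys) and self.keys[pos] == key:
            self.values[pos] = value            # name already present: update
        else:
            self.keys.insert(pos, key)          # list.insert moves later entries
            self.values.insert(pos, value)
```

The search costs O(log n) comparisons, while `insert` still pays the linear cost of moving entries, which is the overhead described above.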
Methods of Sorting:
i. Array
We sort the entries in the table in some particular ..... with arrays.
With arrays, searching for a particular entry is very fast, but insertion is time
consuming.
To insert an entry, we first have to find the position where it belongs, and then all the
entries below that position are shifted down.
ii. Index
With this method, insertion is easy.
[Figure: table entries with a ptr field]
Here only the ptr field is manipulated; the table itself is left untouched.
So insertion is easier.
iii. Linked List
In this approach, we combine an array and linked lists.
[Figure: array entries A, B, E, J with linked lists]
Here the array is used for searching and the linked lists are used for insertion and deletion. There is
no fixed limit on the number of entries in the table.
To search for a key, the starting symbol is found by comparison, and then the entries can
be scanned to find the exact match.
3. Tree
In a binary-tree-structured symbol table, each node has the following format:
Left ptr | Key | Value | Right ptr
Here two new fields are present in the record structure: the left pointer
and the right pointer. Access to the tree is gained through the ... node. A search proceeds
down the structural links of the tree until the desired node is found or a NULL link field
is encountered.
Let's take an example of storing the string abcd in this format.
[Figure: tree nodes with LP, value and RP fields holding the characters b, c and d; some link fields are NULL.]
For a balanced binary tree, the time complexity of searching for a node among
n nodes is O(log2 n).
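A tree-structured table with the node format above might look like the following sketch. The names `Node`, `insert` and `search` are illustrative assumptions, and the tree is not rebalanced here, so the O(log2 n) bound holds only while the insertions happen to keep it balanced.

```python
class Node:
    """One symbol table record: left pointer, key, value, right pointer."""

    def __init__(self, key, value):
        self.left = None
        self.key = key
        self.value = value
        self.right = None


def insert(root, key, value):
    """Insert (key, value) below root and return the (possibly new) root."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        root.left = insert(root.left, key, value)
    elif key > root.key:
        root.right = insert(root.right, key, value)
    else:
        root.value = value  # key already present: update its attributes
    return root


def search(root, key):
    """Follow the links until the key is found or a None (NULL) link is met."""
    node = root
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None
```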
4. Hash
A hashing function, or key-to-address transformation, is defined as a mapping H:
K → A. That is, a hashing function H takes as its argument a variable name and
produces a table address at which the set of attributes for that variable is stored.
With this method the search time is essentially independent of the number of records
in the table.
[Figure: the hashing function H maps K to A; labels: table space, address space.]
Let n be the number of entries in the table. We define the load factor as
load factor = number of entries (n) / total address space (|A|)
If the load factor is high, it is difficult to manage the table.
In practice we have
|K| >> |A|
so more than one key may be assigned to the same address, which causes the problem of collision.
Preconditioning:
Each element of K usually contains characters which are numeric or alphabetic. The
individual characters of a name are not particularly amenable to arithmetical and
logical operations. The process of transforming a variable name to a form which can be
easily manipulated by a hashing function is called preconditioning.
Preconditioning can be handled most efficiently by using the numerically
coded internal representation (for example, ASCII or EBCDIC) of each character in the
name.
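One simple way to precondition a name is sketched below. Treating the character codes as digits of a base-256 number is an assumption chosen for illustration; the notes only require that the character codes be used, not this particular combination.

```python
def precondition(name):
    """Turn a variable name into one integer built from its character codes.

    Each character contributes its code (ord), and the codes are combined
    as digits of a base-256 number. This combination rule is illustrative.
    """
    x = 0
    for ch in name:
        x = x * 256 + ord(ch)
    return x
```

For example, `precondition("AB")` is 65 * 256 + 66 = 16706, a value that the hashing functions described next can operate on.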
There are a number of hashing functions that are applicable to symbol table
handling.
1) Division Method:-
The most widely accepted hashing function is the division method, which is
defined as H(x) = (x mod m) + 1 for a divisor m.
In mapping keys to addresses, the division method preserves, to a certain
extent, the uniformity that exists in a key set. Keys which are closely bunched
together are mapped to unique addresses.
In general, if many keys are congruent modulo d, and m is not relatively
prime to d, then using m as a divisor can result in poor performance of the
division method.
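The formula H(x) = (x mod m) + 1 translates directly into code. The sample keys and divisors below are made up for illustration.

```python
def division_hash(x, m):
    """Division method: map integer key x to an address in 1..m."""
    return (x % m) + 1


# Closely bunched keys land on distinct addresses.
clustered = [division_hash(k, 11) for k in range(1000, 1005)]

# Keys that are all congruent modulo d = 10 collide when m = 10 ...
bad = [division_hash(k, 10) for k in (10, 20, 30, 40)]

# ... but spread out when the divisor is relatively prime to d.
good = [division_hash(k, 7) for k in (10, 20, 30, 40)]
```

Here `clustered` is `[11, 1, 2, 3, 4]`, `bad` is `[1, 1, 1, 1]` and `good` is `[4, 7, 3, 6]`, which shows why the choice of divisor matters.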
2) Mid-square method:-
A second hashing function that performs reasonably well is the mid-square
method. In this method, a key is multiplied by itself and an address is
obtained by .... bits or digits at both ends of the product until the number of bits
or digits left is equal to the desired address length.
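A decimal version of the mid-square method can be sketched as follows. It assumes that the address is the run of digits left in the middle of the square once digits have been dropped from both ends; the function name and the decimal (rather than binary) representation are choices made for this example.

```python
def mid_square_hash(x, address_digits):
    """Square the key and keep the middle address_digits decimal digits."""
    product = str(x * x).zfill(address_digits)  # pad very small squares
    start = (len(product) - address_digits) // 2
    return int(product[start:start + address_digits])
```

For the key 1234 the square is 1522756, and keeping the middle three digits gives the address 227.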
3) Folding Method:-
For the folding method, a key is partitioned into a number of parts, each of
which has the same length as the required address, with the possible exception
of the last part. The parts are then added together, ignoring the final carry, to
form an address. If the keys are in binary form, the exclusive-OR operation
may be substituted for addition.
Folding is a hashing function which is useful for compressing multiword
keys so that other hashing functions can be used.
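Both variants of folding can be sketched briefly. The function names are illustrative; the first works on a key given as a string of decimal digits, the second on an integer key treated as a bit string. Note that the exclusive-OR version splits the key starting from the low-order end, so the part that may be shorter is the high-order one.

```python
def fold_add(key_digits, address_digits):
    """Split a decimal digit string into parts, add them, drop the final carry."""
    parts = [key_digits[i:i + address_digits]
             for i in range(0, len(key_digits), address_digits)]
    total = sum(int(p) for p in parts)
    return total % (10 ** address_digits)  # ignoring the final carry


def fold_xor(key, address_bits):
    """Split a binary key into address_bits-wide parts and XOR them together."""
    mask = (1 << address_bits) - 1
    address = 0
    while key:
        address ^= key & mask   # take the lowest remaining part
        key >>= address_bits
    return address
```

For example, `fold_add("123456789", 3)` adds 123 + 456 + 789 = 1368 and drops the carry to give 368.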
4) Length-dependent method:-
In this approach, the length of the variable name is used in conjunction
with some subpart of the name to produce either a table address directly or,
more commonly, an intermediate key. The function that produced the best
results summed the internal binary representation of the first and last
characters and the length of the variable name shifted left four binary places.
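The best-performing function described above is short enough to write out. The sketch assumes ASCII character codes and a non-empty name; the function name is hypothetical.

```python
def length_dependent_key(name):
    """Intermediate key: first char code + last char code + (length << 4)."""
    return ord(name[0]) + ord(name[-1]) + (len(name) << 4)
```

For the name COUNT this gives 67 + 84 + 80 = 231. The result is an intermediate key, so it would typically be passed to another hashing function, such as the division method, to obtain the final table address.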
A hashing function is a many-to-one mapping. That is, the name space K is in general
much larger than the address space A onto which K is mapped. Of course, two
records cannot occupy the same location, and therefore some method must be used to
resolve the collisions that can result.
Open Addressing:
To minimize the number of collisions, a hashing function should map the
variable names in a program to the address space as uniformly as possible.
With open addressing, if a variable name x is mapped to a storage location d,
and this location is already occupied, then other locations in the table are scanned
until a free record location is found for the new record. The locations are scanned
according to a sequence which can be defined in many ways. The simplest technique
for handling collisions is to use the following sequence:
d, d+1, ...., m-1, m, 1, 2, ...., d-1
A free record location is always found if at least one is available; otherwise the
search halts after scanning m locations. When a record is looked up, the same sequence
of locations is scanned until that record is located, or until an empty record position is
found. This method of collision resolution is called linear probing.
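Linear probing with the sequence d, d+1, ..., m, 1, ..., d-1 can be sketched as below. The class name, the character-sum preconditioning inside `_hash`, and the choice to raise `OverflowError` on a full table are assumptions made for this example. Addresses run from 1 to m to match the division method.

```python
class OpenAddressTable:
    """Hash table with linear probing over addresses 1..m (illustrative)."""

    def __init__(self, m):
        self.m = m
        self.slots = [None] * (m + 1)  # index 0 is unused

    def _hash(self, name):
        x = sum(ord(c) for c in name)  # simple preconditioning (assumption)
        return (x % self.m) + 1        # division method

    def _probe(self, name):
        """Yield d, d+1, ..., m, 1, ..., d-1: at most m locations."""
        d = self._hash(name)
        for i in range(self.m):
            yield (d - 1 + i) % self.m + 1

    def insert(self, name, attrs):
        for addr in self._probe(name):
            slot = self.slots[addr]
            if slot is None or slot[0] == name:
                self.slots[addr] = (name, attrs)
                return addr
        raise OverflowError("symbol table is full")

    def lookup(self, name):
        for addr in self._probe(name):
            slot = self.slots[addr]
            if slot is None:       # empty position: the name is not present
                return None
            if slot[0] == name:
                return slot[1]
        return None                # scanned all m locations without a match
```

With m = 5, the names `ab` and `ba` both hash to address 1, so the second one inserted is placed at address 2. Deletion is deliberately left out, since simply emptying a slot would break later lookups that need to probe past it.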
There are three main difficulties with open addressing:
1) When trying to locate an open location for record insertion, there is, in many
instances, the necessity to examine records that do not have the same initial
hash value.
2) A table-overflow situation cannot be satisfactorily handled using open
addressing. If an overflow occurs, the entire table must be reorganized.
3) Difficulty of physically deleting records.
Chaining:
Chaining can be used in a variety of ways to handle overflow records. This method
involves the chaining of colliding records into a special overflow area which is
separate from the prime area. A separate chain is kept for each set of colliding records,
and consequently a pointer field must accompany each record in a primary or an overflow
location.
[Figure: table with columns Variable, Value and Link; the Variable column holds ADD, Empty, B, Empty, Empty, and the Link column holds the values 1 and 3.]
The algorithm performs insertion and deletion by first examining the prime
area location determined by the hashing function, and then the overflow area if
necessary. Note that for explicit declarations, the algorithm can be improved by having
insertion performed at the front of the list of unordered overflow records. This... allows
for fast insertion; however, it does not guarantee that duplicate declarations will be
detected.
The disadvantage here is that additional storage is required to store the links,
but the performance and versatility of chaining are superior to those of open addressing. The open
addressing scheme is easier to implement and, because of its efficient utilization of
storage, should be considered when implementing the compiler on a small machine.
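Chaining with a separate overflow area, as discussed above, can be sketched like this. The record layout `[name, value, link]`, the class name, and the character-sum hash are assumptions for illustration. This version appends new records at the end of a chain, so it walks the whole chain on insertion and therefore does detect duplicate names, at the cost of slower insertion than the front-of-list variant.

```python
class ChainedTable:
    """Prime area plus overflow area; each record is [name, value, link]."""

    def __init__(self, m):
        self.m = m
        self.prime = [None] * m  # one record slot per hash address
        self.overflow = []       # colliding records; link = index in this list

    def _hash(self, name):
        return sum(ord(c) for c in name) % self.m  # illustrative hash

    def insert(self, name, value):
        d = self._hash(name)
        rec = self.prime[d]
        if rec is None:
            self.prime[d] = [name, value, None]
            return
        while True:
            if rec[0] == name:       # duplicate name: update in place
                rec[1] = value
                return
            if rec[2] is None:       # reached the end of the chain
                break
            rec = self.overflow[rec[2]]
        self.overflow.append([name, value, None])
        rec[2] = len(self.overflow) - 1  # link the new record into the chain

    def lookup(self, name):
        rec = self.prime[self._hash(name)]
        while rec is not None:
            if rec[0] == name:
                return rec[1]
            rec = self.overflow[rec[2]] if rec[2] is not None else None
        return None
```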
Symbol-Table Organization for Block-Structured Languages:
By a block-structured language, we mean a language in which a module can contain
nested submodules and each submodule can have its own set of locally declared variables.
A variable declared within a module A is accessible throughout A unless the same variable name is redefined within a
submodule of A. The redefinition of a variable holds throughout the scope of the submodule.
1. Stack Symbol Table
The insertion operation is very simple in a stack symbol table. New records are added
at the top location in the stack. Declarations involving duplicate names can exist in
block-structured languages, but they cannot occur in the same block.
The lookup operation involves a linear search of the table from the top to the
bottom. The search must be conducted in this order to guarantee that the latest
occurrence of a variable with a particular name is located first. ..... because sets of
symbol table records are discarded as blocks are terminated. The average length of
search for a stack symbol table will be less than for the corresponding unordered symbol
table.
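A stack symbol table for a block-structured language can be sketched as follows. The class and method names are hypothetical, and keeping a second list of block start positions is one possible way to know which records to discard when a block ends.

```python
class StackSymbolTable:
    """Stack of (name, attrs) records; the newest declaration is on top."""

    def __init__(self):
        self.stack = []         # all records of the currently open blocks
        self.block_starts = []  # stack height recorded at each block entry

    def enter_block(self):
        self.block_starts.append(len(self.stack))

    def exit_block(self):
        """Discard every record declared in the block being closed."""
        del self.stack[self.block_starts.pop():]

    def insert(self, name, attrs):
        start = self.block_starts[-1] if self.block_starts else 0
        if any(n == name for n, _ in self.stack[start:]):
            raise KeyError(f"duplicate declaration of {name!r} in this block")
        self.stack.append((name, attrs))  # new records go on top

    def lookup(self, name):
        """Search from top to bottom so the latest declaration is found first."""
        for n, attrs in reversed(self.stack):
            if n == name:
                return attrs
        return None
```

A name redeclared in an inner block hides the outer declaration until `exit_block` removes the inner records, after which the outer declaration becomes visible again.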
2. Stack-Implemented Tree-Structured Tables
In a block-structured language, when the compilation of a block is completed, the block
must be removed from the table. As a result, the problem of deleting table records must
be addressed. In a tree, the steps to delete a record are:
- Locate the position of the record in the tree.
- Remove the record from the tree by altering the structural links so as to bypass the
record.
- Rebalance the tree if the deletion of the record has left the tree unbalanced.
It should be observed that the symbol table is maintained as a stack. When a block is
entered during compilation, the value of TOS is updated. As declarations are
encountered, records are inserted on the top of the symbol table. The tree for a particular
block can be balanced as records are inserted.
For the lookup operation, some strategy is used to locate the latest occurrence of the desired
record. The search must begin at the tree structure for the last block to be entered and
proceed down to the tree for the first block entered.
3. Stack-Implemented Hash-Structured Symbol Table:
The insertion and deletion operations for stack-implemented hash symbol tables are
essentially the same as for a non-block-structured language, because local variables are
deleted as blocks are compiled in a block-structured language.
Back end of a compiler:
-