Sunteți pe pagina 1din 12

Symbol Tables

The job of the symbol table is to store all the names of the program and information about each name In block structured languages, roughly speaking, the symbol table collects information from declarations and uses that information whenever a name is used later in the program
this information could be part of the syntax tree, but is put into a table for efficient access to names If there are different occurrences of the same name, the symbol table assists in name resolution

Either the parser or the lexical analyzer can do the job of inserting names into the symbol table (as long as scope information is given to the lexer)
Information can be added throughout compilation
1

Symbol Table Entries: Different types of names


Variables (identifiers)
character string (lexeme), may have limits on number of characters data type, storage class (if not already implied by the data type), name and lexical level of block in which it is declared, other access information, if necessary, such as modifiability constraints, base address and memory offset, after allocation

Arrays
Also needs number of dimensions Upper and lower bounds of each dimension

Records and structures


List of fields Information about each field

Functions and Procedures


Number and types of parameters, Type of return value
2

Symbol Table Representation


The two main operations are
Insert ( name ) makes an entry for this name Lookup ( name ) finds the relevant occurrence of the name by searching the table

Lookups occur a lot more often than insert Hash tables most often chosen to represent symbol tables as they have good time complexity for lookup
0 1

name

Hash function

Scope Analysis
The scope of a name is tied to the idea of a block in the programming language
Standard blocks (statement sequences, sometimes if statement) Procedures and functions Program (global program level) Universe (predefined functions, etc.)

Names must be unique within the block in which they are declared (no two objects with the same name in one block)
There are some languages with exceptions for different types

Name resolution: a use of a name should refer to the most local enclosing block that has a declaration for that name.

Declaration before Use?


We are dealing primarily with languages in which there are declarations of names required
Names of variables, constants, arrays, etc. must be declared before use Names of functions and procedures vary C requires functions and procedures to also be declared before use, or at least given a prototype Java does not require this for methods

Scope of a name (in a statically scoped language):


The scope of a constant, variable, array, etc. is from the end of its definition to the end of the block in which it is declared The scope of a function or procedure name In a declaration before use language:
is from the beginning of its definition to the end of the block in which it is declared Allows recursive functions

Otherwise the scope is the entire block in which it is declared


5

Further Structure of Symbol Table


For nested scopes, we may use lists of hash tables, with one element of the list for each scope The lookup function will first search the current lexical level table and then continue on up the list, using the first occurrence of the name that it finds Level 0
Level 3 c x Level 2b a Level 2a x Level 1 c
w x

Parts of the table not currently active may be kept for future semantic analysis

More Symbol Table Functions


In addition to lookup and insert, the symbol table will also need
initializeScope (level) , when a block is entered to create a new hash table entry in the symbol table list finializeScope (level), on block exit put the current hash table into a background list Essentially makes a tree structure, where one child may be distinguised as the active block

The symbol tables shown so far are all for the program being compiled, also needed is a way to look up names in the universe
System-defined names

Alternate Representation
The lists of hash tables can be inefficient for lookup since the system has to search up the list of lexical levels
More names tend to be declared at level 0, thus making the most common occurrence be the most expensive

An optimization of the symbol table as lists of hash tables is to keep one giant hash table
Within that table each name will have a list of occurrences identified by lexical level

This representation keeps the (essentially) constant time lookup


But makes leaving a block more expensive as hash table must be searched to find all entries that need to be removed and stored elsewhere
8

Static Scope
The scoping system described so far assumes that the scope rules are for static scoping
The static problem layout of enclosing blocks determines the scoping of a name

There are also languages with dynamic scoping


The scoping of a name depends on the call structure of the program at run-time The name resolution will be to the closest block on the call stack of a block with a declaration of that name the most recently called function or block

Object-Oriented Scoping
Languages like Java must keep symbol tables for
The code being compiled Any external classes that are known and referenced inside the code The inheritance hierarchy above the class containing the code

One method of implementation is to attach a symbol table to each class with two nesting hierarchies
One for lexical scoping inside individual methods One to follow the inheritance hierarchy of the class

Resolving a name
First consult the lexically scoped symbol table If not found, search the classes in the inheritance hierarchy If not found, search the global name space Consists of the current package and any imported packages
10

Testing and Error Recovery


If a name is used, but the lookup fails to find any definition
Give an error but enter the name with a dummy type information so that further uses do not also trigger errors

If a name is defined twice,


Give an ambiguity error, choose which type to use in later analysis, usually the first

Testing cases
Include all types of correct declarations Incorrect cases may include Definition of an ambiguous name Definition without a name Meaningless recursive definitions (in some types of structures) References to an undefined name
11

References
Keith Cooper and Linda Torczon, Engineering a Compiler, Elsevier, 2004. Kenneth C. Louden, Compiler Construction: Principles and Practices, PWS Publishing, 1997. Per Brinch Hansen, On Pascal Compilers, Prentice-Hall, 1985. Out of print. Aho, Lam, Sethi, and Ullman, Compilers: Principles, Techniques, and Tools. Addison-Wesley, 2006. (The purple dragon book)

12

S-ar putea să vă placă și