Department of Computer Science
Stage: Second
Programming Languages Lecture

Topics
  Review
  Compilation
  Interpretation
  Assembler
  Compilation Phases
  Machine-dependent and Machine-independent
  Summary

Review: Compilation
A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language).

  [a = a + b;]  ->  Machine code [1000101001111]

Interpretation
Interpreter: a program that translates a user's high-level program (the source program) into a machine program (the target program) and executes it at the same time, statement by statement.

Assembler
A program that converts a program written in assembly language into machine code.

  Assembly language (low-level language)  ->  Assembler  ->  Machine code

Language Processing System

  Source Program
    -> Preprocessor          -> Modified Source Program
    -> Compiler              -> Target Assembly Program
    -> Assembler             -> Relocatable Machine Code
    -> Loader / Linker-editor (together with library and relocatable object files)
                             -> Absolute Machine Code

Example source program:

    #include <iostream>
    using namespace std;

    int main()
    {
        int a;
        cin >> a;
        cout << a;
        return 0;
    }

Compilation Phases

Phase 1 - Lexical Analysis
In a compiler, linear analysis is called lexical analysis or scanning. The characters of the source program are read from left to right and grouped into tokens.

Example: the statement

    position = initial + rate * 60;

would be grouped into the following tokens:

  Lexeme      Token
  position    the identifier position
  =           the assignment symbol
  initial     the identifier initial
  +           the plus sign
  rate        the identifier rate
  *           the multiplication sign
  60          the number 60
  ;           the semicolon symbol

Phase 2 - Syntax Analysis
Hierarchical analysis is called parsing or syntax analysis. It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize output. The grammatical phrases of the source program are represented by a parse tree.
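The grouping of characters into tokens described under Phase 1 can be sketched in C. This is a minimal illustration only: the names TokenKind, Token, and next_token are hypothetical, and the scanner recognizes just the lexemes that occur in the example statement (identifiers, unsigned integers, and the symbols = + * ;).

```c
#include <ctype.h>
#include <stddef.h>

/* Token kinds for the example statement only (illustrative names). */
typedef enum {
    TOK_IDENT, TOK_NUMBER, TOK_ASSIGN, TOK_PLUS,
    TOK_STAR, TOK_SEMI, TOK_UNKNOWN, TOK_END
} TokenKind;

typedef struct {
    TokenKind kind;
    const char *start;  /* first character of the lexeme */
    size_t length;      /* number of characters in the lexeme */
} Token;

/* Scan one token starting at *src and advance *src past it. */
Token next_token(const char **src)
{
    const char *p = *src;
    Token tok;

    while (isspace((unsigned char)*p))      /* skip blanks between lexemes */
        p++;

    tok.start = p;
    if (*p == '\0') {
        tok.kind = TOK_END;                 /* nothing left to scan */
    } else if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p))  /* letter followed by letters/digits */
            p++;
        tok.kind = TOK_IDENT;
    } else if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))  /* run of decimal digits */
            p++;
        tok.kind = TOK_NUMBER;
    } else {
        switch (*p++) {                     /* single-character symbols */
        case '=': tok.kind = TOK_ASSIGN;  break;
        case '+': tok.kind = TOK_PLUS;    break;
        case '*': tok.kind = TOK_STAR;    break;
        case ';': tok.kind = TOK_SEMI;    break;
        default:  tok.kind = TOK_UNKNOWN; break;
        }
    }

    tok.length = (size_t)(p - tok.start);
    *src = p;
    return tok;
}
```

Calling next_token repeatedly on the string "position = initial + rate * 60;" yields the eight tokens listed in the lexeme table, followed by TOK_END. A parser for Phase 2 would consume this token stream rather than the raw characters.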
Parse tree for: position = initial + rate * 60;

  assignment statement
    identifier: position
    =
    expression
      expression
        identifier: initial
      +
      expression
        expression
          identifier: rate
        *
        expression
          number: 60

Phase 3 - Semantic Analysis
The semantic analysis phase checks the source program for semantic errors and gathers type information for the subsequent code-generation phase. It uses the hierarchical structure determined by the syntax-analysis phase to identify the operators and operands of expressions and statements.

An important component of semantic analysis is type checking. Here the compiler checks that each operator has operands that are permitted by the source language specification. For example, many languages require the compiler to report an error when a real number is used to index an array. When a binary arithmetic operator is applied to an integer and a real, the compiler may instead need to convert the integer to a real.

  Before:  position = initial + rate * 60
  After:   position = initial + rate * inttoreal(60)    (that is, 60.0)

The conversion is inserted by semantic analysis.

Phase 4 - Intermediate Code Generator
After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. We can think of this intermediate representation as a program for an abstract machine. This intermediate representation should have two important properties: it should be easy to produce, and easy to translate into the target program.

In the code below, id1, id2, and id3 stand for position, initial, and rate.

    temp1 = inttoreal(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

Phase 5 - Code Optimization
The code optimization phase attempts to improve the intermediate code, so that faster-running machine code will result.

  Before optimization:

    temp1 = inttoreal(60)
    temp2 = id3 * temp1
    temp3 = id2 + temp2
    id1 = temp3

  After optimization:

    temp1 = id3 * 60.0
    id1 = id2 + temp1

Phase 6 - Code Generation
The final phase of the compiler is the generation of target code, consisting normally of relocatable machine code or assembly code. Memory locations are selected for each of the variables used by the program. Then, intermediate instructions are each translated into a sequence of machine instructions that perform the same task.

    MOVF id3, R2
    MULF #60.0, R2
    MOVF id2, R1
    ADDF R2, R1
    MOVF R1, id1

Additional Compilation Terminologies
  Load module (executable image): the user and system code together.
  Linking and loading: the process of collecting system program units and linking them to a user program.

Machine-dependent and Machine-independent
  Machine-dependent: a term for application software that runs only on a particular type of computer.
  Machine-independent: a term applied to software that is not dependent on the properties of a particular machine and can therefore be used on any machine. Also called cross-platform or portable.

Declarations and Types

Data Types
Computers manipulate sequences of bits. But most programs manipulate more general data:
  Numbers
  Strings
  Lists
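The point that the hardware only sees bit sequences, while the type decides what they mean, can be illustrated with a small C sketch. The function name float_bits is hypothetical, and the sketch assumes that float is a 32-bit IEEE 754 value, which holds on most current platforms but is not guaranteed by the C standard.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float as an unsigned 32-bit integer.
   Assumption: float is a 32-bit IEEE 754 single-precision value. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the bytes, no numeric conversion */
    return bits;
}
```

Under that assumption, float_bits(60.0f) returns 0x42700000, whereas the integer 60 is stored as 0x0000003C. The same 32 bits therefore denote different values depending on the type used to read them.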
Data Types
Programming languages provide data types that raise the level of abstraction from bits to data. But computer hardware only knows about bits!

What is a type?
  A type is a set of values.
  A type is either a built-in type or a composite type.

Classification of Types: Built-in Types
Built-in types:
  boolean
  character
  integer
  real (float)
Their implementation varies across languages.

Built-in Types: Numeric Types
Most languages support integers and floats.
Some languages support other numeric types:
  Complex numbers (e.g. Fortran)
  Signed and unsigned integers (e.g. C)
Some languages distinguish numeric types depending on their precision:
  Single vs. double precision numbers
  C's int (typically 4 bytes) and long (4 or 8 bytes, depending on the platform)

Classification of Types: Enumerations
Enumerations improve program readability and error checking.
They were first introduced in Pascal:

    type weekday = (sun, mon, tue, wed, thu, fri, sat);

They define an order, so they can be used in enumeration-controlled loops.
The same feature is available in C:

    enum weekday {sun, mon, tue, wed, thu, fri, sat};

Classification of Types: Composite Types
Composite types are composed of one or more simpler types. Examples:
  Records
  Variant records
  Arrays
  Sets
  Pointers
  Lists
  Files

Variables
A variable is an abstraction of a memory cell.
Variables can be characterized as a sextuple of attributes:
  Name
  Address
  Value
  Type
  Lifetime
  Scope

Variable Attributes
  Name: the name of the variable.
  Address: the memory address with which the variable is associated. A variable may have different addresses at different times during execution. If two variable names can be used to access the same memory location, they are called aliases.
  Type: determines the range of values of the variable and the set of operations that are defined for values of that type.
  Value: the contents of the location with which the variable is associated.
  Lifetime: the time during which the variable is bound to a particular memory cell.
  Scope: the range of statements over which the variable is visible.

Binding
A binding is an
association, such as between an attribute and an entity, or between an operation and a symbol.
  Binding is the association of attributes with program entities.
  Binding time is the time at which a binding takes place.

Visibility
Scope is used to define the extent of information hiding, that is, the visibility or accessibility of variables from different parts of the program.
Visibility levels:
  public
  private
  protected

Variable Initialization
The binding of a variable to a value at the time it is bound to storage is called initialization.
Initialization is often done in the declaration statement, e.g. in C#:

    int sum = 0;

Type Checking
Type checking is the process of ensuring that a program obeys the language's type compatibility rules. Put another way, it is the activity of ensuring that the operands of an operator are of compatible types. A compatible type is one that is legal for the operator.

Garbage Collection
Every modern programming language allows programmers to allocate new storage dynamically: new records, arrays, objects, etc.
Every modern language needs facilities for reclaiming and recycling the storage used by programs.
It is usually the most complex aspect of the run-time system for any modern language (Java, C#, ML, Lisp, Scheme, ...).

What is garbage?
Garbage collection is the reclaiming and recycling of storage that a program no longer needs. A value is garbage if it will not be used in any subsequent computation by the program.
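Whether a value will be used again cannot be decided exactly in general, so collectors commonly treat a value as garbage when it can no longer be reached from the program's variables. The sketch below illustrates that idea with a mark-and-sweep pass over a small fixed pool. The slides do not name this technique; it is an assumption chosen for illustration, and Cell, cell_alloc, and collect are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 8

/* A heap cell with at most one outgoing reference (hypothetical layout). */
typedef struct Cell {
    struct Cell *next;  /* reference to another cell, or NULL */
    bool in_use;        /* true while the cell is allocated */
    bool marked;        /* set during marking if the cell is reachable */
} Cell;

static Cell pool[POOL_SIZE];

/* Allocate a cell from the pool; returns NULL when the pool is full. */
Cell *cell_alloc(Cell *next)
{
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = true;
            pool[i].marked = false;
            pool[i].next = next;
            return &pool[i];
        }
    }
    return NULL;
}

/* Mark every cell reachable from root, then reclaim the unmarked ones.
   Returns the number of cells reclaimed. */
size_t collect(Cell *root)
{
    size_t reclaimed = 0;

    /* Mark phase: follow references; stop at NULL or at a marked cell (cycle). */
    for (Cell *c = root; c != NULL && !c->marked; c = c->next)
        c->marked = true;

    /* Sweep phase: anything allocated but unmarked is garbage. */
    for (size_t i = 0; i < POOL_SIZE; i++) {
        if (pool[i].in_use && !pool[i].marked) {
            pool[i].in_use = false;  /* storage can now be recycled */
            reclaimed++;
        }
        pool[i].marked = false;      /* reset for the next collection */
    }
    return reclaimed;
}
```

If cell b refers to cell a and a third cell is referenced by nothing, collect(b) reclaims only the third cell, and a later cell_alloc call can reuse its slot. Real collectors have many roots and objects with several references, which this sketch leaves out.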