
System Software and Compiler Lab

Token Separation

Aim
To write a program to perform token separation for a given subset of a language.

Description
Scanning (lexical analysis) is the first phase of a compiler. The source program is read character
by character and the characters are grouped into tokens. A token is a sequence of characters with
a collective meaning. Tokens may be identifiers, keywords, operators, punctuation marks,
constants, etc. The input is a program written in any high-level language and the output is a stream
of tokens. Regular expressions can be used to implement token separation.
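
As a rough illustration of how regular expressions can group characters into tokens, the sketch below scans a string with one combined pattern and Matcher.find(), so the input does not need spaces between tokens. This is only one possible approach; the class name RegexScanner, the method scan, the group names and the keyword list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration: scans a string with one combined
// regular expression instead of splitting it on spaces first.
class RegexScanner {

    // Alternatives are tried left to right, so Keyword is listed before
    // Identifier; \b stops "int" from matching the start of "integer".
    private static final Pattern TOKEN = Pattern.compile(
          "(?<Keyword>\\b(?:public|static|void|class|package|int|float|char|String)\\b)"
        + "|(?<Identifier>[a-zA-Z][a-zA-Z0-9]*)"
        + "|(?<Constant>[0-9]+)"
        + "|(?<Operator>[-+*/<>=])"
        + "|(?<Punctuation>[;,{}()])");

    private static final String[] KINDS =
        {"Keyword", "Identifier", "Constant", "Operator", "Punctuation"};

    static List<String> scan(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        // find() skips characters that match no alternative, such as spaces.
        while (m.find()) {
            for (String kind : KINDS) {
                if (m.group(kind) != null) {
                    tokens.add("<" + kind + "," + m.group() + ">");
                    break;
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String token : scan("int a=b+10;")) {
            System.out.println(token);
        }
    }
}
```

For the input int a=b+10; this prints <Keyword,int>, <Identifier,a>, <Operator,=>, <Identifier,b>, <Operator,+>, <Constant,10> and <Punctuation,;>, one per line.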

Algorithm
Step 1: Read the contents of the file using FileReader.

Step 2: Split the string into tokens on the space delimiter using StringTokenizer.

Step 3: Match each token against the patterns using regular expressions.

Step 4: Group the tokens as identifiers, keywords, operators, punctuation, etc., and display each
one in the format <token class,lexeme>, for example <Keyword,int>.
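
Steps 2 to 4 can be tried on an in-memory string before any file handling is involved. The following is a minimal sketch under that assumption; the class name TokenStepsDemo and the methods classify and tokenize are hypothetical names chosen for the example. It also shows why the keyword test has to come before the identifier test.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;
import java.util.regex.Pattern;

// Hypothetical demo class; the input is an in-memory string rather than a file.
class TokenStepsDemo {

    static final Pattern KEYWORD =
        Pattern.compile("public|static|void|class|package|int|float|char|String");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9]*");
    static final Pattern OPERATOR = Pattern.compile("[-+*/<>=]");
    static final Pattern CONSTANT = Pattern.compile("[0-9]+");
    static final Pattern PUNCTUATION = Pattern.compile("[;,{}()]");

    // Step 3: full-match the lexeme against each pattern. Keywords are tested
    // first because every keyword also matches the identifier pattern.
    static String classify(String lexeme) {
        if (KEYWORD.matcher(lexeme).matches()) return "Keyword";
        if (IDENTIFIER.matcher(lexeme).matches()) return "Identifier";
        if (OPERATOR.matcher(lexeme).matches()) return "Operator";
        if (CONSTANT.matcher(lexeme).matches()) return "Constant";
        if (PUNCTUATION.matcher(lexeme).matches()) return "Punctuation";
        return "Unknown";
    }

    // Steps 2 and 4: split on spaces, then format each token as <class,lexeme>.
    static List<String> tokenize(String code) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(code, " ");
        while (st.hasMoreTokens()) {
            String lexeme = st.nextToken();
            out.add("<" + classify(lexeme) + "," + lexeme + ">");
        }
        return out;
    }

    public static void main(String[] args) {
        tokenize("int count = 42 ;").forEach(System.out::println);
    }
}
```

Because the split is on spaces only, every token in the input must be separated by a space; a lexeme such as count=42 would be reported as Unknown by this sketch.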

Program
package compiler;

import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Token {


    public static void main(String[] args) throws IOException {
        System.out.println("Token Separation");
        String code = "";
        // Read the whole source file; a space between lines keeps tokens
        // on adjacent lines from running together.
        try (BufferedReader br = new BufferedReader(
                new FileReader("E://K7_Eclipse_ws//lab//src//compiler//samplein.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                code = code + line + " ";
            }
        } catch (FileNotFoundException ex) {
            System.out.println("File Does Not Exist");
            return;
        }
        // Split on spaces, so tokens in the input file must be space-separated.
        StringTokenizer st = new StringTokenizer(code, " ");
        String str;
        Pattern keypath = Pattern.compile("public|static|void|class|package|int|float|char|String");
        Pattern idpath = Pattern.compile("[a-zA-Z]([a-zA-Z]|[0-9])*");
        // Inside [...] the '|' is a literal character, so the classes list
        // the operator and punctuation characters directly.
        Pattern oppath = Pattern.compile("[-+*/<>=]");
        Pattern numpath = Pattern.compile("[0-9]+");
        Pattern punpath = Pattern.compile("[;,{}()]");
        Matcher kmatch;
        Matcher idmatch;
        Matcher opmatch;
        Matcher nummatch;
        Matcher punmatch;


        while (st.hasMoreTokens()) {
            str = st.nextToken();
            kmatch = keypath.matcher(str);
            idmatch = idpath.matcher(str);
            opmatch = oppath.matcher(str);
            nummatch = numpath.matcher(str);
            punmatch = punpath.matcher(str);
            // Keywords are tested before identifiers because every keyword
            // also matches the identifier pattern.
            if (kmatch.matches()) {
                System.out.println("<Keyword," + str + ">");
            } else if (idmatch.matches()) {
                System.out.println("<Identifier," + str + ">");
            } else if (opmatch.matches()) {
                System.out.println("<Operator," + str + ">");
            } else if (nummatch.matches()) {
                System.out.println("<Constant," + str + ">");
            } else if (punmatch.matches()) {
                System.out.println("<Punctuation," + str + ">");
            }
        }
    }
}


Input

Output


Result
Thus the program for token separation was written and executed successfully.

Ex. No. 1 | Token Separation
