COURSE-PROJECT REPORT
ON
Submitted by:
Abhilash Rejanair
Aditya Hosamani
Aniruddha Achar B P
1NT13CS003
1NT13CS007
1NT13CS016
CERTIFICATE
This is to certify that the Project Report
1NT13CS003
1NT13CS007
1NT13CS016
ACKNOWLEDGEMENT
This project was compiled for the Object Oriented Programming Course of 4th
Semester.
We would like to thank our Professor, Mrs. Vijaya Shetty for providing us with
the opportunity and daring us to come up with something new and creative. We
also thank her for her assistance and support, both moral and technical, in writing
this project.
We would also like to thank our respective parents and family members, all of
whom have been thoroughly supportive.
We are also grateful, in no small measure, to the internet for all the amazing
research material and ideas which inspired us to come up with something of our
own.
Last but not least, we are sincerely indebted to the author of the prescribed
textbook, Herbert Schildt, for providing us with a thorough reference guide, using
which we could solve the numerous technical problems we faced.
By mutual consensus among the team members, we have decided to release
the source code of this project to the public after the due evaluation is done,
effectively making the whole project open source.
The project and its code will soon be available on GitHub, under the MIT/Apache
license.
ABSTRACT
Contents
Chapter 1 Introduction
1.1 HTML
1.2 LaTeX [2]
1.3 Text processor
Chapter 2 The ALANG language
HTML
ALANG
2.1
2.2
2.3
Chapter 1 Introduction
Why develop a new language when there are thousands of other languages out there?
The answer is twofold. Firstly, different languages were developed for different purposes, and
each programming language has its limitations and advantages. Secondly, languages that are
widely used take time to change, and changes are rolled out slowly.
There is also the claim that programmer training is the dominant cost of a programming
language [1].
The cost of programming languages can therefore be reduced by developing languages closer to
natural language.
1.1 HTML
With the advent of the internet age, a new method for presenting documents universally
was invented, called HTML. It created a standard for formatting documents that could be
shared, linked and viewed. The use of tags for formatting made documents easy to
write and edit. HTML is the standard format for presentational mark-up on
the World Wide Web.
With the smartphone boom, web browsers are in everyone's hands, making HTML and its
formatting capability that much more important. But with the HTML tags came a
learning curve and the need to remember the tags for each of the properties. Also, the length
and correct sequence of tags needed to perform a task often hinder the fast and efficient
formatting of web-ready documents.
In HTML, we use <em>the em and strong tags</em> to add <strong>emphasis</strong>.
2014-2015
readers, the code in itself is highly readable. As a quick demonstration of this quality, consider
the following code segment which is equivalent to the HTML example given above:
We use ~the em and strong tags~ to add *emphasis*
ALANG supports the same basic formatting techniques as HTML and LaTeX but strives to
keep the syntax as light as possible.
Department of computer science
Chapter 2
HTML
<h1>this is head one</h1>
<p>this is a paragraph with <strong>bold</strong></p>
<p>this has <pre>code</pre> and <em>italics</em></p>
ALANG
#1this is head one
this is a paragraph with *bold*
this has `code` and ~italics~
The difference is striking: the code in ALANG is clear and does not clutter the actual content.
The user can read the code without giving much thought to what each tag means, which
increases readability.
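As a rough illustration of the comparison above, the heading and paragraph lines could be converted along the following lines. This is only a sketch under our own assumptions: the function name convertLine is hypothetical, and supporting levels 1 to 6 is assumed from HTML (the example above only shows #1).

```cpp
#include <string>

// Hypothetical helper (not the report's code): turns a line such as
// "#1this is head one" into "<h1>this is head one</h1>".
// Any other line is treated as a paragraph.
// Assumption: heading levels 1-6 are accepted, mirroring HTML's h1-h6.
std::string convertLine(const std::string& line) {
    if (line.size() >= 2 && line[0] == '#' && line[1] >= '1' && line[1] <= '6') {
        const std::string level(1, line[1]);
        return "<h" + level + ">" + line.substr(2) + "</h" + level + ">";
    }
    return "<p>" + line + "</p>";
}
```

Calling convertLine on the first ALANG line of the example yields the first HTML line of the example; inline markers such as *bold* are left untouched by this sketch.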
2.1
The author of Markdown, John Gruber, uses the same name to refer to his Markdown-to-HTML compiler, but
for the sake of clarity in this report, we shall talk only about the language Markdown and not the compiler.
Tables can be created by separating the columns with a pipe symbol (|) and
writing each row on a new line. To indicate the start and end of a table, the
table is enclosed between #t and %.
Tables in ALANG:
#t
|table one| table 2 | table 3 | table 4 |
|This is a test column |A second test column is here| Too many columns here |Last|
|Row 2 column 1|Row 2 column 2 |Row 2 column 3|Row 2 column 4|
%
Tables can also be used with headers, i.e., using the thead HTML tag. The syntax for
this is almost the same as that for normal tables, with one small exception: a row of
dashes separates the heading row from the rows that follow.
Tables with headers in ALANG:
#t
|table heading one| table heading 2 | table heading 3 | table heading 4 |
|----------------------|--------------------|---------------------|--------------------|
|This is a test column |A second test column is here| Too many columns here |Last|
|Row 2 column 1|Row 2 column 2 |Row 2 column 3|Row 2 column 4|
%
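The table rules described above can be sketched in C++ as follows. This is a minimal illustration and not the converter's actual code: the names splitRow, isSeparator and convertTable are hypothetical, and the function is assumed to receive only the rows found between #t and %.

```cpp
#include <string>
#include <vector>

// Split one "|a|b|c|" row into cell strings with surrounding spaces removed.
std::vector<std::string> splitRow(const std::string& row) {
    std::vector<std::string> cells;
    std::size_t start = row.find('|');
    while (start != std::string::npos) {
        std::size_t end = row.find('|', start + 1);
        if (end == std::string::npos) break;
        std::string cell = row.substr(start + 1, end - start - 1);
        std::size_t first = cell.find_first_not_of(' ');
        std::size_t last = cell.find_last_not_of(' ');
        cells.push_back(first == std::string::npos
                            ? std::string()
                            : cell.substr(first, last - first + 1));
        start = end;
    }
    return cells;
}

// True when every cell consists only of dashes (the heading separator row).
bool isSeparator(const std::vector<std::string>& cells) {
    if (cells.empty()) return false;
    for (const std::string& c : cells)
        if (c.empty() || c.find_first_not_of('-') != std::string::npos) return false;
    return true;
}

// Convert the rows between "#t" and "%" into an HTML table. If the second
// row is a row of dashes, the first row is emitted inside <thead>.
std::string convertTable(const std::vector<std::string>& rows) {
    const bool hasHeader = rows.size() > 1 && isSeparator(splitRow(rows[1]));
    std::string html = "<table>\n";
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (hasHeader && i == 1) continue;  // the dashes row produces no output
        const bool header = hasHeader && i == 0;
        const std::string tag = header ? "th" : "td";
        if (header) html += "<thead>\n";
        html += "<tr>";
        for (const std::string& cell : splitRow(rows[i]))
            html += "<" + tag + ">" + cell + "</" + tag + ">";
        html += "</tr>\n";
        if (header) html += "</thead>\n";
    }
    return html + "</table>\n";
}
```

The sketch does not emit a tbody element and does not convert inline markers inside cells; both would be straightforward additions.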
Emphasis is added by surrounding the text with * for bold or ~ for italics.
A part of the document can be marked as code by surrounding it with ` (grave accent). Ex:
This is a paragraph with *some bold text* followed by ~italic text~ and ended with
`code`.
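Because the opening and closing markers are the same character, a converter has to remember whether it is currently inside an element. A minimal sketch of that idea, using one flag per marker, is given below. The function name convertInline is hypothetical, and emitting the code element for the grave accent is our assumption (the comparison in Chapter 2 shows pre).

```cpp
#include <string>

// Hypothetical helper (not the report's code): converts *bold*, ~italics~
// and `code` markers in one line of ALANG to HTML. Each marker toggles a
// flag, so the same character opens the element the first time it is seen
// and closes it the next time.
std::string convertInline(const std::string& line) {
    bool bold = false, italic = false, code = false;
    std::string out;
    for (char c : line) {
        switch (c) {
        case '*':
            out += bold ? "</strong>" : "<strong>";
            bold = !bold;
            break;
        case '~':
            out += italic ? "</em>" : "<em>";
            italic = !italic;
            break;
        case '`':
            out += code ? "</code>" : "<code>";  // assumption: <code>, not <pre>
            code = !code;
            break;
        default:
            out += c;
        }
    }
    return out;
}
```

This simplified version still interprets * and ~ inside a code span and does not check that markers are balanced.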
In addition to all the above-mentioned elements, ALANG also supports images
and links. Both are surrounded by !. The syntax for images and links is as
given below:
Syntax for image:
!{image source}!
Syntax for link:
!(anchor text)[link]!
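The two patterns above could be recognised with a simple left-to-right scan, as sketched below. The function name convertMedia is hypothetical and the exact HTML attributes emitted are our assumption; copying the image files into the media folder, which the convertor also does, is left out of this sketch.

```cpp
#include <string>

// Hypothetical helper (not the report's code): rewrites !{src}! as an <img>
// element and !(text)[url]! as an <a> element. Input that matches neither
// pattern is copied through unchanged.
std::string convertMedia(const std::string& in) {
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        if (in[i] == '!' && i + 1 < in.size() && in[i + 1] == '{') {
            const std::size_t close = in.find("}!", i + 2);
            if (close != std::string::npos) {
                out += "<img src=\"" + in.substr(i + 2, close - i - 2) + "\">";
                i = close + 2;
                continue;
            }
        } else if (in[i] == '!' && i + 1 < in.size() && in[i + 1] == '(') {
            const std::size_t mid = in.find(")[", i + 2);
            const std::size_t close =
                (mid == std::string::npos) ? mid : in.find("]!", mid + 2);
            if (close != std::string::npos) {
                out += "<a href=\"" + in.substr(mid + 2, close - mid - 2) + "\">" +
                       in.substr(i + 2, mid - i - 2) + "</a>";
                i = close + 2;
                continue;
            }
        }
        out += in[i++];  // ordinary character, or an unmatched '!'
    }
    return out;
}
```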
$variable name:
Content of the section
$
With section support, we have added CSS to style the appropriate sections.
The CSS should be specified at the end of the document, and the CSS part of the
code is written within braces.
Syntax for CSS:
##variable name
{
CSS part here
}
With these additions we have tried to improve upon the already existing Markdown
language.
Chapter 4
The resultant HTML file is in no way read-only or restricted. The
code of the website is completely available to the user for modification, removal of credits, etc.
[Figure: block diagram of the conversion process. An ALANG file is passed to the ALANG HTML CONVERTER, which produces HTML + CSS. Surviving labels: "As soon as a tag is encountered, respective flag is set to 1."; "Conversion of Tags"; "File Output"; "Footer is requested."; "Style.css is generated." The remaining labels are incomplete.]
Chapter 4 Implementation
The implementation of the convertor was divided into three important segments.
These three segments were assigned to the members of the team. The development of the
project was done using a product design technique called the swift technique, in which each
member was required to produce a working iteration of the project every week. By the end of
the project, the swift cycle had been reduced to a single day.
Convertors that convert one form of document to another generally have a front-end
that interprets the input and transforms it into some kind of intermediate form, and a back-end
that generates the output. The front-end performs the lexical analysis, or scanning, while the
back-end does the code conversion. The front-end can perform the scanning and parsing of the
input code in a single pass instead of using multiple passes.
Three important parts of the project:
1. File handling (front-end)
2. Tokenization (front-end)
3. Conversion from ALANG to HTML (back-end)
The development of the language has been discussed in the previous chapters. In this chapter
we talk about the implementation of the convertor in C++. The following section
discusses file handling and the creation of the appropriate directories and files for scanning
and code conversion.
[Figure: directory structure of the generated website]
Website
    CSS: bootstrap.css, style.css
    JS: bootstrap.js, jquery.js
    Media
    index.html
detected, then converted, and next the element surrounding it, and so on. This is achieved using
a stack. The system stack is used implicitly to hold the scanned and tokenized string
until the end of the element is found. First the innermost, i.e. the smallest, element is converted;
the converted text replaces it in place, and control is transferred to the outer element to scan for
any other elements inside it. If none are found, the outer element that contains the
converted element is passed to the converting function to be converted to its HTML equivalent.
As the starting and ending tags (tokens) are the same for most of the elements, a flag is maintained
for each element to check whether a token marks the start or the end of the element.
The start_s() method has a switch case for each of the tags. Whenever a character matching one of the
tokens is encountered, the corresponding case is triggered and the flag for that element is set to
true, indicating that the start of the element was found. If the tag supports nested tags,
the next tag is then searched for. Otherwise the processed string, i.e. the string holding the
contents of the tag, is passed to a method of the convert class, which wraps the content in the
equivalent HTML tags and returns the result. This result then replaces the corresponding part of
the output string, which will hold the HTML equivalent of the code. In the case of images, along
with conversion, the source files, i.e. the media files mentioned in the code, are copied to a
folder named media, which will act as the source of the images.
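The innermost-first conversion described above can be sketched with a recursive scanner, where the call stack plays the role of the implicit stack. This is a simplified illustration and not the convertor's start_s() method: the names wrap, scan and convert are hypothetical, only the * and ~ tokens are handled, and the closing token is passed as a parameter instead of being tracked with flags.

```cpp
#include <string>

// Hypothetical stand-in for the convert class: wraps content in the HTML
// tag that corresponds to an ALANG token.
std::string wrap(char token, const std::string& content) {
    switch (token) {
    case '*': return "<strong>" + content + "</strong>";
    case '~': return "<em>" + content + "</em>";
    default:  return content;
    }
}

// Scans `in` from `pos` until the closing `token` is found (or until the end
// of the input when token == '\0'). An opening token inside the element
// triggers a recursive call, so the innermost element is converted first and
// its HTML is placed into the content of the surrounding element.
std::string scan(const std::string& in, std::size_t& pos, char token = '\0') {
    std::string content;
    while (pos < in.size()) {
        const char c = in[pos++];
        if (token != '\0' && c == token) return wrap(token, content);
        if (c == '*' || c == '~') content += scan(in, pos, c);
        else content += c;
    }
    // End of input: an element that was never closed is left unconverted.
    return token == '\0' ? content : std::string(1, token) + content;
}

// Convenience wrapper that scans a whole string from the beginning.
std::string convert(const std::string& in) {
    std::size_t pos = 0;
    return scan(in, pos);
}
```

For an input such as "a *b ~c~ d* e", the italic element is converted inside the recursive call before the bold element that surrounds it is wrapped.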
Once the processing is completed, the string is passed to the file handling classes, where
it is written into the index.html file.
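A minimal sketch of this final step is given below, assuming C++17 and the standard filesystem library. The function name writeSite is hypothetical, and the folder names CSS, JS and Media are taken from the directory structure of the generated website; the actual file handling classes may be organised differently.

```cpp
#include <filesystem>
#include <fstream>
#include <initializer_list>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper (not the report's code): creates the output folders
// under `root` and writes the converted HTML into root/index.html.
// Returns false if the file could not be opened or written.
bool writeSite(const fs::path& root, const std::string& html) {
    for (const char* dir : {"CSS", "JS", "Media"})
        fs::create_directories(root / dir);  // also creates `root` if missing
    std::ofstream out(root / "index.html");
    if (!out) return false;
    out << html;
    return static_cast<bool>(out);
}
```

Note that create_directories throws std::filesystem::filesystem_error on failure, so a caller may want to wrap the call in a try block.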
[Flow chart. Surviving node labels: "Start"; decision "Is spanning element" with YES and NO branches; "Make a recursive call to start to find the next element."; "Store content until the end of the token"; "Send the content of the element to convert methods".]
FIGURE 4 FLOW CHART FOR SCANNING INPUT
Chapter 5
The main focus of the convertor is to scan and convert the code. As writing to and
reading from files are hardware and software dependent, the analysis was done on the scanning
and code conversion algorithm. Readability was also treated as part of efficiency; the readability
of the code has been improved considerably, as seen in the previous chapters.
To find the efficiency of the algorithm developed, two methods were used. One was a
mathematical analysis of the code; the other used the system clock to verify the efficiency
of the program.
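The clock-based measurement can be sketched as follows. This is an illustration only: convertStub is a hypothetical stand-in for the conversion routine, timeSeconds is a hypothetical helper, and std::chrono::steady_clock is our choice of clock, which may differ from the one used for the measurements in this report.

```cpp
#include <chrono>
#include <string>

// Hypothetical stand-in for the convertor: a single pass over the input.
std::string convertStub(const std::string& in) {
    std::string out;
    for (char c : in) out += (c == '*') ? '_' : c;
    return out;
}

// Returns the wall-clock time, in seconds, taken by one call to fn(input).
template <typename Fn>
double timeSeconds(Fn fn, const std::string& input) {
    const auto start = std::chrono::steady_clock::now();
    // Storing the result size in a volatile keeps the call from being
    // optimised away.
    volatile std::size_t sink = fn(input).size();
    (void)sink;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Timing the same routine on inputs of increasing length and plotting the results gives the kind of curve used to check whether the running time grows linearly.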
[Figure: plot of time in seconds (vertical axis, 0 to 1.4) against a horizontal axis running from 0 to 60; the horizontal axis label did not survive.]
FIGURE 5 EFFICIENCY
The above graph indicates that the running time of the conversion algorithm grows linearly. Some of the
other parsers used to convert Markdown to HTML are exponential in nature, making this
convertor comparatively efficient and quick.
Chapter 6
Reference
[1] A. Aiken, "Cloud front," [Online]. Available:
https://d2bk0s8yylvsxl.cloudfront.net/stanford-compilers/slides/01-03-the-economy-ofprogramming-languages.pdf. [Accessed 26 March 2015].
[2] LaTeX, "LaTeX intro," LaTeX, 09 Feb 2008. [Online]. Available: http://latexproject.org/intro.html. [Accessed 26 March 2015].
[3] A. Ranta, "Specifying the lexer," in Implementing Programming Languages, 2012.
[4] Wikipedia, "Markdown," [Online]. Available: http://en.wikipedia.org/wiki/Markdown. [Accessed 26 March 2015].
[5] A. Levitin, Introduction to the Design and Analysis of Algorithms, Delhi: Pearson, 2009.