Ankit Gupta, Hamid Raza

Introduction, Project Description, Project Scope, Architecture, Our Role, Work Detail, Future Work, Reference

Paragraph summarization is the process of producing a summary of a paragraph that retains the words that matter most. The purpose of this report is to give a brief account of what we have done so far in our project. The project uses several methods to summarize a paragraph, so more than one summary is available. The advent of the WWW has created a large reservoir of data.
A short summary, which conveys the essence of
the document, helps in finding relevant information quickly.
Paragraph summarization also provides a way to cluster similar words and present a summary.

The intended user of this project is anyone who has some basic knowledge of computers. The project helps the user find a summary of any paragraph or document. Summarization produces more than one result, and the user selects the one that suits them best. A user who wants to customize the resulting summary is also able to do so. The scope of this project is to produce the best summaries of any paragraph, while the choice of the best one is left to the user.
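The workflow described above, in which several methods each produce a summary and the user picks one, might be sketched as follows. This is only an illustration: the two methods shown (`lead_summary` and `longest_sentences_summary`) and the helper `candidate_summaries` are hypothetical stand-ins, not the methods the project actually uses.

```python
import re


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def lead_summary(text, n=1):
    """Hypothetical method 1: keep the first n sentences."""
    return " ".join(split_sentences(text)[:n])


def longest_sentences_summary(text, n=1):
    """Hypothetical method 2: keep the n longest sentences, in original order."""
    sentences = split_sentences(text)
    keep = sorted(range(len(sentences)),
                  key=lambda i: len(sentences[i]), reverse=True)[:n]
    return " ".join(sentences[i] for i in sorted(keep))


def candidate_summaries(text, n=1):
    """Run every method and return the candidates by name.

    The caller (for example a GUI) shows all of them and lets the user choose.
    """
    methods = {"lead": lead_summary, "longest": longest_sentences_summary}
    return {name: method(text, n) for name, method in methods.items()}


paragraph = (
    "The web is large. "
    "Short summaries help readers find relevant information quickly. "
    "Users pick the summary they prefer."
)
candidates = candidate_summaries(paragraph)
for name, summary in candidates.items():
    print(f"[{name}] {summary}")
```

Keeping the methods in a dictionary means a further method can be registered without changing the code that displays the candidates.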
Nowadays the internet holds a vast number of documents, and we all want text in summarized form. A summary is a text that is produced from one or more texts, that contains a significant portion of the information in the original texts, and that is no longer than half of the original texts.
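The length criterion in this definition can be checked mechanically. The sketch below assumes that length is counted in whitespace-separated words, which the definition does not specify; `compression_ratio` and `is_short_enough` are hypothetical helper names.

```python
def compression_ratio(original, summary):
    """Summary length divided by original length, counted in words (an assumption)."""
    original_words = len(original.split())
    if original_words == 0:
        raise ValueError("original text is empty")
    return len(summary.split()) / original_words


def is_short_enough(original, summary, max_ratio=0.5):
    """True when the summary is no longer than half of the original."""
    return compression_ratio(original, summary) <= max_ratio


original = "The advent of the web has created a large reservoir of data for readers"
short_summary = "The web created a data reservoir"
long_summary = "The web created a large reservoir of data"

print(is_short_enough(original, short_summary))  # 6 of 14 words
print(is_short_enough(original, long_summary))   # 8 of 14 words
```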
Summaries may be classified as:
Extractive
Abstractive

In abstractive summarization, information from the source text is rephrased. Human beings generally write abstractive summaries (except when they do their assignments). Abstractive summarization has not reached a mature stage because allied problems such as semantic representation, inference and natural language generation are relatively harder.

Extractive summaries are created by reusing portions (words, sentences, etc.) of the input text verbatim. For example, search engines typically generate extractive summaries from webpages.
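A minimal sketch of extractive summarization, assuming a simple keyword-frequency scoring scheme: each sentence is scored by how often its non-stopword words occur in the whole text, and the top-scoring sentences are returned verbatim in their original order. The stopword list, the sentence splitter and the function name `extractive_summary` are illustrative assumptions, not the project's actual implementation.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "and", "or", "to", "in",
             "is", "are", "it", "that", "for", "on", "from"}


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def keywords(sentence):
    """Lowercased words of the sentence, with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extractive_summary(text, n=1):
    """Return the n highest-scoring sentences, verbatim, in original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in keywords(s))

    def score(sentence):
        # Average keyword frequency, so long sentences are not favoured
        # merely for containing more words.
        words = keywords(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:n]
    return " ".join(sentences[i] for i in sorted(ranked))


text = (
    "Extractive summaries reuse sentences. "
    "Search engines build extractive summaries from webpages. "
    "The weather was pleasant yesterday."
)
print(extractive_summary(text, n=1))
```

Because the output is assembled only from sentences already present in the input, nothing is rephrased, which is what distinguishes this approach from abstractive summarization.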
Most of the summarization research today is on extractive summarization.

So far, the project has reached the stage where it provides two summaries of any paragraph. We are now working on another method that should give a better summary of the paragraph. These methods involve different algorithms, and we are trying to work out which suits us best.

We have three members in this project, and we have divided the project into three parts:
Frontend
Backbone
Testing
Backbone refers to the different algorithms implemented in the project.

Some earlier work had already been done on this project. It is an extension of Text Summarization, in which text is summarized using different methods. Text summarization is a long-established process, but it is not very efficient. In our project, Paragraph Summarization, we use three different methods of summarization, which can summarize any paragraph in general.
We used an ‘extractive’ summarization method during implementation of the project. So far we have completed one third of the project: it provides summaries of a paragraph, but for now only two of them. We are currently working on the third method of summarization, in which we are trying to use a neural network so that the system performs better.

Our project is a simple desktop application for finding a summary of any paragraph or text. The concept of paragraph summarization is the same as that of text summarization. The difference in our project is that it also provides a related sub-summary that gives some answer to the text, so that any user can easily understand the text and also learn the related answer. We have to build effective software that meets all the requirements. We use Python to build the project, with a Tkinter GUI to make it more presentable.
Alongside the GUI we use some NLTK (Natural Language Toolkit) methods and try to optimize our results. We also use other Python modules to complete the project, such as sklearn, scipy and tensorflow.

We can add further features to this project so that it can also give a response in cases where the paragraph contains a question.
We can also add features such as having the system add words by itself to make a summary complete. This can help provide the best results for most theses and research papers.

Most of the current research is based on extractive multi-document summarization.
Current summarization systems are widely used to summarize news and other online articles. Keyword-based techniques rank sentences based on the occurrence of relevant keywords.

“word2vec - Tool for computing continuous distributed representations of words. - Google Project Hosting,” https://code.google.com/p/word2vec/, accessed: 2015-05-15.
https://.nltk.org/summary
G. Sizov, “Extraction-based automatic summarization: Theoretical and empirical investigation of summarization techniques,” 2010.