
CITATION ANALYSIS
GAURAV VARSHNEY (12PEB136)
SIDDHARTH GUPTA (12PEB542)
MOTIVATION
• The advent of the WWW has created a large reservoir of data.
• Where should I start my research work?
• How do I know whether a piece of research is important?
• What are the best journals in a particular field?
• How can I choose my research area?
MOTIVATION
It all starts with a citation…
• Author impact, e.g. h-index
• Journal impact, e.g. impact factor
INTRODUCTION
Citation analysis is defined as the
evaluation and interpretation of the citations
received by articles, scientists, universities,
countries, and other aggregates of scientific
activity, used as a measure of scientific
influence and productivity.
Citation count refers to the number of times a paper has been cited or referenced in other works.
How citation analysis helps researchers:
• understand the reach of their research
• identify patterns in the way their work is used
• benchmark themselves against their peers
• lend credibility to their resumes when applying for grants and promotions
• set objective targets for themselves and their publications
• guide other researchers in their work
Reasons To Cite Previous Work
• To guide other researchers in their work
• To help other researchers trace the genealogy of your ideas
• To direct readers to previously used methods and equipment
• To criticize or correct previous work
• To substantiate your claims and arguments with evidence
• To show that you have considered various opinions in framing your arguments
• To highlight the originality of the author's work in the context of previous work
CITATION NETWORKS

• In a citation network, information flows from one paper to another via the citation relation.
• Citation contexts capture the influence of one paper on another as well as the flow of information.
• Citation contexts, the short text segments surrounding a paper's mention, serve as "micro summaries" of the cited paper.
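
As a minimal sketch of the citation relation described above, a citation network can be stored as a mapping from each paper to the papers it cites. The paper identifiers (P1 to P4) are hypothetical toy data, not taken from the project.

```python
from collections import Counter

# Hypothetical toy network: each paper maps to the papers it cites.
cites = {
    "P1": ["P2", "P3"],
    "P2": ["P3"],
    "P3": [],
    "P4": ["P1", "P3"],
}

def citation_counts(cites):
    """Count how many times each paper is cited (its in-degree)."""
    # Start every known paper at zero so uncited papers still appear.
    counts = Counter({paper: 0 for paper in cites})
    for cited_list in cites.values():
        counts.update(cited_list)
    return dict(counts)

counts = citation_counts(cites)
most_cited = max(counts, key=counts.get)
```

Here `counts` is the citation count of each paper, and `most_cited` is the paper with the most incoming citations.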

Resources Available
• Three key databases/online resources are used as sources of citation data.
• Because their coverage differs, their citation counts also differ.

• Scopus (SCImago Journal Rank [SJR]): subscription-based database; over 21,000 journals; 5.5 million conference papers
• DBLP (Computer Science Bibliography): free online database; over 12,000 journals; 160,000 conference proceedings
• Google Scholar (My Citations/Google Scholar Metrics): free online resource; citation data is based on internet searching
DIFFERENT USES OF CITATION ANALYSIS
Citation analyses can be grouped into broad types based on who or what is being evaluated.

Citation-based metrics:
• Ranking journals
• Ranking researchers
• Extractive summary
• Ranking universities and countries
Ranking journals: Journals are ranked by counting the
number of times their papers are cited in other journals.
Journal-level metrics are generally meant to serve as an
indicator of journal prestige.
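
One common journal-level metric is the two-year impact factor: citations received in a given year by items the journal published in the two preceding years, divided by the number of citable items it published in those two years. The sketch below illustrates the arithmetic only; the function name and the numbers are hypothetical.

```python
def two_year_impact_factor(citations_this_year, citable_items):
    """Two-year impact factor for one journal and one year.

    citations_this_year: citations received this year by items the
        journal published in the two preceding years.
    citable_items: number of citable items published in those two years.
    """
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_this_year / citable_items

# Hypothetical journal: 150 citable items over two years, cited 300 times.
impact_factor = two_year_impact_factor(300, 150)
```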
Ranking researchers: Researchers are ranked by
counting the number of times their individual papers are
cited in other published studies. These metrics are also used
to evaluate researchers for hiring, tenure, and grant
decisions.
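
The h-index mentioned earlier is one such researcher-level metric: a researcher has index h if h of their papers have at least h citations each. A minimal sketch, with hypothetical citation counts:

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical researcher with five papers.
h = h_index([10, 8, 5, 4, 3])
```

In this example four papers have at least four citations each, but there are not five papers with at least five, so the index is 4.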
Extractive summary: Most summarization research today focuses on extractive summarization. Extractive summaries are created by reusing portions (words, sentences, etc.) of the input text verbatim.
Ranking universities and countries: There are
databases that rank universities and countries by
considering their overall research output through criteria
such as citable documents, citations per document, and
total citations.
FUTURE PROSPECT

Creation of a web app that provides an extractive summary of a research paper, and the important work from a particular paper, using citing sentences.
CONTINUED…
• The extracted summary will contain:
  • Sentences from the "Introduction" part, i.e. a brief overview related to the title.
  • Sentences from the "Implementation" part, i.e. the key work done in the implementation section.
  • Sentences from the "Experiment" part, giving the results obtained.
  • Finally, some part of the "Conclusion and Future" section.
  • Highly cited references (in detail).
PROJECT REQUISITE
CERMINE (Content ExtRactor and MINEr)
- API for extracting metadata from scientific articles
ECLIPSE
- A widely used Java Integrated Development Environment (IDE)
Gephi Tool
- Gephi is the leading visualization and exploration software for all kinds of graphs and networks. It is open-source and free.
IMPLEMENTATION OF OUR PROJECT
INITIAL PROJECT WORK

• Converted research papers from .pdf to .doc format.
• Extracted metadata (references, titles, authors, etc.) and other content from the papers.
• Created a citation graph using the references.
Continued…

• Extracted citing sentences from related research papers.
• Extracted the references to which the citing sentences belong.
• Counted the number of citing sentences for each reference.
• Created a paragraph from the citing sentences of the reference with the maximum number of citing sentences.
• Found the keywords in the paragraph using the RAKE algorithm.
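
The steps above can be sketched roughly as follows. This is not the project's code: it assumes numeric citation markers such as [2] or [2, 3], a naive sentence splitter, and a hypothetical input text.

```python
import re
from collections import defaultdict

# Assumption: citations appear as numeric markers like [2] or [2, 3].
MARKER = re.compile(r"\[(\d+(?:\s*,\s*\d+)*)\]")

def citing_sentences_by_reference(text):
    """Group citing sentences under each reference number they mention."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    grouped = defaultdict(list)
    for sentence in sentences:
        for match in MARKER.finditer(sentence):
            for ref in match.group(1).split(","):
                grouped[int(ref)].append(sentence)
    return dict(grouped)

def paragraph_for_most_cited(grouped):
    """Join the citing sentences of the reference cited most often."""
    ref = max(grouped, key=lambda r: len(grouped[r]))
    return ref, " ".join(grouped[ref])

# Hypothetical text from a citing paper.
text = (
    "TV white space has been measured in several countries [2]. "
    "Our method differs from earlier work [1]. "
    "Spectrum occupancy in Delhi was low [2, 3]."
)
grouped = citing_sentences_by_reference(text)
ref, paragraph = paragraph_for_most_cited(grouped)
```

Here reference 2 has the most citing sentences, so `paragraph` is built from its two sentences and can then be passed to keyword extraction.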


WHAT CERMINE GIVES…
EXTRACTED METADATA CONTENT
EXTRACTED METADATA FILE OF RESEARCH PAPER
TITLES AS NODES (EXCEL FILE)
REFERENCES AS EDGES (EXCEL FILE)
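
A minimal sketch of how titles and references might be turned into node and edge tables for a graph tool. It writes CSV text rather than Excel files, and the column names (Id/Label for nodes, Source/Target for edges) follow the convention Gephi's spreadsheet import is generally understood to use; treat them as an assumption to check against your Gephi version. The paper titles are hypothetical.

```python
import csv
import io

def build_tables(papers):
    """papers: dict mapping a paper title to the titles it references."""
    titles = sorted(set(papers) | {ref for refs in papers.values() for ref in refs})
    ids = {title: i for i, title in enumerate(titles, start=1)}
    nodes = [(ids[title], title) for title in titles]
    edges = [(ids[src], ids[dst]) for src, refs in papers.items() for dst in refs]
    return nodes, edges

def to_csv(header, rows):
    """Render a header and rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical input: two papers and their reference lists.
papers = {"Paper A": ["Paper B", "Paper C"], "Paper B": ["Paper C"]}
nodes, edges = build_tables(papers)
nodes_csv = to_csv(["Id", "Label"], nodes)
edges_csv = to_csv(["Source", "Target"], edges)
```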
CITATION GRAPH (FOR SMALL DATA)
CITATION GRAPH (RELATIVELY LARGE DATA)
CITING SENTENCES EXAMPLE
[Figure: citation sentences shown alongside their references]
CITATION COUNT RESULT
CITATION PARAGRAPH EXAMPLE 1
CITATION PARAGRAPH EXAMPLE 2
In the Indian context, single-day experiments at three
locations in urban and sub-urban Delhi have been
performed
Some of these countries have more than 90% of TV
bands unused at all times
indicated the opportunities of broad band distribution
over TV white space in Indian conditions
KEYWORD EXTRACTION RESULT
• Result of extracting keywords from the example paragraph:
• keyWordCandidates = {-urban delhi=4.0, tv white space=9.0, single-day experiments=4.0, tv bands unused=9.0, indian conditions=4.0, indian context=4.0, broad band distribution=9.0, countries=1.0, performed=1.0, 90%=1.0, times=1.0, opportunities=1.0, urban=1.0, locations=1.0}
• sortedKeyWordCandidates = {tv white space=9.0, tv bands unused=9.0, broad band distribution=9.0, -urban delhi=4.0}
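
The scores above appear consistent with the usual RAKE scoring rule: each word gets degree / frequency, where degree sums the lengths of the candidate phrases containing the word, and a phrase scores the sum of its word scores. The sketch below is a simplified illustration of that rule, not the project's implementation; the stopword set passed to `candidate_phrases` is a hypothetical stand-in for a full stopword list.

```python
import re
from collections import defaultdict

def candidate_phrases(text, stopwords):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    for fragment in re.split(r"[.,;:!?()]", text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                if current:
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(" ".join(current))
    return phrases

def score_phrases(phrases):
    """Score each phrase as the sum of degree/frequency over its words."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        words = phrase.split()
        for word in words:
            freq[word] += 1
            degree[word] += len(words)
    return {p: sum(degree[w] / freq[w] for w in p.split()) for p in phrases}

# Candidate phrases copied from the result above.
candidates = [
    "-urban delhi", "tv white space", "single-day experiments",
    "tv bands unused", "indian conditions", "indian context",
    "broad band distribution", "countries", "performed", "90%",
    "times", "opportunities", "urban", "locations",
]
scores = score_phrases(candidates)
top = sorted(scores, key=scores.get, reverse=True)[:3]
```

For instance, "tv" occurs in two three-word phrases, so its degree is 6 and its frequency is 2, giving a word score of 3; "tv white space" therefore scores 3 + 3 + 3 = 9.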
KEYWORD EXTRACTION
• Extracted the keywords from the paragraph created using citing sentences.
• EXAMPLES
  • From Paragraph 1, the authors talk about "smart,cities,iot,,internet"
  • From Paragraph 2, the authors talk about "bands,tv,indian,conditions,space"
CONCLUSION
• In the first phase, we built a citation graph using references extracted from scientific papers.
FUTURE WORK
• Rank sentences according to the sum of the weights of the topic-relevant terms in each sentence and the part of the research paper the sentence belongs to.
• Provide an extractive summary of the research paper.
• Based on the user's choice, provide more details of a particular section of the research paper, i.e. a dynamic summary.
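
One possible shape for the planned sentence ranking is sketched below. The slides do not say how term weights and section information are combined, so multiplying the term-weight sum by a per-section weight is an assumption, and all weights and sentences are hypothetical.

```python
def rank_sentences(sentences, term_weights, section_weights):
    """Rank (section, text) pairs by weighted topic-term score.

    Assumption: score = (sum of topic-term weights) * section weight.
    """
    scored = []
    for section, text in sentences:
        words = [w.strip(".,;:!?") for w in text.lower().split()]
        term_score = sum(term_weights.get(w, 0.0) for w in words)
        score = term_score * section_weights.get(section, 1.0)
        scored.append((score, section, text))
    return sorted(scored, key=lambda item: item[0], reverse=True)

def extractive_summary(sentences, term_weights, section_weights, k=2):
    """Return the k top-ranked sentences verbatim."""
    ranked = rank_sentences(sentences, term_weights, section_weights)
    return [text for _, _, text in ranked[:k]]

# Hypothetical weights and sentences.
term_weights = {"citation": 2.0, "summary": 1.5, "graph": 1.0}
section_weights = {"Introduction": 1.0, "Implementation": 1.2, "Conclusion": 1.5}
sentences = [
    ("Introduction", "Citation analysis measures scientific influence."),
    ("Implementation", "We build a citation graph from references."),
    ("Conclusion", "The weather was nice."),
]
summary = extractive_summary(sentences, term_weights, section_weights, k=2)
```

A dynamic summary could reuse `rank_sentences` after filtering `sentences` to the section the user selects.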
REFERENCES
• G. Parthasarathy, D. C. Tomar, "Sentiment Analyzer: Analysis of Journal Citations from Citation Databases", IEEE International Conference on Confluence The Next Generation Information Technology Summit (Confluence), 2014.
• Qamar Mahmood, Muhammad Abdul Qadir, Muhammad Tanvir Afzal, "Document Similarity Detection using Semantic Social Network Analysis on RDF Citation Graph", IEEE International Conference, 2013.
• Vahed Qazvinian, Dragomir R. Radev, "Scientific Paper Summarization Using Citation Summary Networks", Proceedings of the 22nd International Conference on Computational Linguistics (Coling), Manchester, pages 689–696.
• Filippo Galgani, Paul Compton, Achim Hoffmann, "Summarization based on bi-directional citation analysis", Information Processing and Management International Journal, 2015, pages 1-24.
• http://cermine.ceon.pl/about.html
• https://scholar.google.co.in/intl/en/scholar/citations.html
• http://guides.lib.umich.edu/c.php?g=282982&p=1887443