
Ontology Matching for Linked Open Data
Shashi Singh

Concept Introduction

What is Linked Data?


A set of best practices for publishing and connecting structured data on the Web. [1]

Linked Data Sets


Relationship between Linked Data, Semantic Web and Web of Data

Linked Data is the Semantic Web done right.

W3C Linking Open Data Project

A grassroots community effort to:
- publish existing open-license datasets as Linked Data on the Web
- interlink things between different data sources

Continued

A feel of how LOD is growing

Reviewed Papers

Jain, P., Hitzler, P., Sheth, A., Verma, K.: Ontology Alignment for Linked Open Data.
BLOOMS is a system for finding schema-level links between LOD datasets in the sense of ontology matching.

Volz, J., Bizer, C., Gaedke, M., Kobilarov, G.: Discovering and Maintaining Links on the Web of Data.

Nikolov, A., Uren, V., Motta, E.: Data Linking: Capturing and Utilising Implicit Schema-level Relations.

Ontology alignment for linked open data.

Main challenges:
- LOD datasets are interlinked, but these interlinks are mainly at the instance level (owl:sameAs).
- Schema-level information, that is, taxonomies built using rdfs:subClassOf, is relatively scarce.
- There is a lack of interlinks between the different schemas.
- Applications based on LOD face difficulties because the pieces of information are only loosely connected.
- There are no established benchmarks or available baselines for measuring precision and recall for LOD schema alignment.
- Most competitive state-of-the-art ontology alignment systems performed poorly on LOD schema datasets.
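To make the instance-level versus schema-level distinction concrete, here is a minimal sketch that counts both kinds of link in a handful of triples. The example.org URIs, the sample data and the function name count_link_levels are made up for illustration; only the owl:sameAs and rdfs:subClassOf property URIs are standard.

```python
from typing import Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def count_link_levels(triples: Iterable[Triple]) -> Dict[str, int]:
    """Count instance-level (owl:sameAs) and schema-level (rdfs:subClassOf) links."""
    counts = {"instance": 0, "schema": 0}
    for _subject, predicate, _obj in triples:
        if predicate == OWL_SAME_AS:
            counts["instance"] += 1
        elif predicate == RDFS_SUBCLASS_OF:
            counts["schema"] += 1
    return counts


# Hypothetical sample: three instance-level links, one schema-level link,
# and one unrelated triple that is ignored.
sample_triples = [
    ("http://example.org/a/Berlin", OWL_SAME_AS, "http://example.org/b/Berlin"),
    ("http://example.org/a/Paris", OWL_SAME_AS, "http://example.org/b/Paris"),
    ("http://example.org/a/Rome", OWL_SAME_AS, "http://example.org/c/Rome"),
    ("http://example.org/a/City", RDFS_SUBCLASS_OF, "http://example.org/b/Place"),
    ("http://example.org/a/Berlin", "http://example.org/a/population", "3500000"),
]

print(count_link_levels(sample_triples))
```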

Detailed Analysis

Results

BLOOMS Approach
- The chosen datasets give significant coverage of the LOD cloud and cover different domains such as Music, Publication and the Web.
- Some dataset providers, such as LinkedMDB, have not made their schema publicly available.
- There are no established benchmarks or available baselines for measuring precision and recall for LOD schema alignment, so human experts familiar with the domains created reference alignments.
- The experts identified all possible mappings expressed via a subclass or an equivalence relationship.
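Precision and recall against an expert-built reference alignment can be computed as in the sketch below. The mapping tuples and class names are invented for illustration and are not taken from the paper's reference alignments.

```python
from typing import Set, Tuple

# A mapping is (source_class, relation, target_class),
# where relation is "subclass" or "equivalence".
Mapping = Tuple[str, str, str]


def precision_recall(found: Set[Mapping], reference: Set[Mapping]) -> Tuple[float, float]:
    """Return (precision, recall) of a found alignment against a reference alignment."""
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference alignment created by domain experts.
reference = {
    ("a:Musician", "subclass", "b:Artist"),
    ("a:Record", "equivalence", "b:Album"),
    ("a:Person", "equivalence", "b:Person"),
    ("a:Label", "subclass", "b:Organisation"),
}

# Hypothetical output of an alignment system: two correct, two wrong.
found = {
    ("a:Musician", "subclass", "b:Artist"),
    ("a:Person", "equivalence", "b:Person"),
    ("a:Record", "equivalence", "b:Film"),
    ("a:Label", "equivalence", "b:Place"),
}

precision, recall = precision_recall(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A system that returns many mappings can find most of the correct ones and still score low on precision, because every wrong mapping enlarges the denominator.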

BLOOMS approach
1) Preprocessing of the input ontologies
2) Construction of the BLOOMS forest
3) Comparison of the constructed BLOOMS forest
4) Post Processing
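The shape of these four steps can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the tiny category hierarchy stands in for Wikipedia, each class gets a single tree rather than a forest, and the overlap measure and threshold are invented. It is not the method described by Jain et al.

```python
from typing import Dict, List, Set

# Hypothetical stand-in for a category hierarchy: category -> parent categories.
TOY_CATEGORIES: Dict[str, List[str]] = {
    "Jazz musicians": ["Musicians"],
    "Musicians": ["Artists", "Music people"],
    "Artists": ["People"],
    "Music people": ["People"],
    "Painters": ["Artists"],
    "Rivers": ["Bodies of water"],
    "Bodies of water": ["Geography"],
}


def preprocess(class_name: str) -> str:
    """Step 1 (assumed form): turn a class name into a lookup phrase."""
    return class_name.replace("_", " ").strip()


def build_tree(root: str, max_depth: int = 3) -> Set[str]:
    """Step 2 (simplified): collect the root and its ancestors up to max_depth levels."""
    nodes = {root}
    frontier = [root]
    for _ in range(max_depth):
        next_frontier = []
        for category in frontier:
            for parent in TOY_CATEGORIES.get(category, []):
                if parent not in nodes:
                    nodes.add(parent)
                    next_frontier.append(parent)
        frontier = next_frontier
    return nodes


def overlap(tree_a: Set[str], tree_b: Set[str]) -> float:
    """Step 3 (invented measure): fraction of tree_a's nodes that also occur in tree_b."""
    return len(tree_a & tree_b) / len(tree_a)


def is_candidate_match(class_a: str, class_b: str, threshold: float = 0.5) -> bool:
    """Step 4 (invented rule): keep the pair only if the overlap reaches the threshold."""
    tree_a = build_tree(preprocess(class_a))
    tree_b = build_tree(preprocess(class_b))
    return overlap(tree_a, tree_b) >= threshold


print(is_candidate_match("Jazz_musicians", "Musicians"))
print(is_candidate_match("Jazz_musicians", "Rivers"))
```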

Evaluation of Results
The authors compared more generic schemas and used Wikipedia to handle the diverse domains of LOD. Jain et al. report the following shortcomings of existing ontology alignment systems:

Ontology alignment system: Issues
- RiMOM: failed due to ontology size
- AROMA: unable to find any relevant relations
- OMViaUO: able to find only a few correct analogies
- Alignment API: able to find a few correct analogies, but found some wrong analogies as well
- S-Match: computed correct analogies, but returned many results overall, which resulted in low precision

Discovering and maintaining links on the web of data

The gap: there are tools available for publishing Linked Data on the Web, but there is still a lack of tools that support data publishers in setting RDF links to other data sources and in maintaining RDF links over time as data sources change.
The design goal of Silk was to fill this gap.

Silk - Linking Framework: a toolkit for discovering and maintaining data links between Web data sources

Components
1) A link discovery engine, which computes links between data sources based on a declarative specification of the conditions that entities must fulfill in order to be interlinked.
2) A tool for evaluating the generated data links in order to fine-tune the linking specification.
3) A protocol for maintaining data links between continuously changing data sources.

Silk Link Discovery engine

Main Features
- Supports the generation of owl:sameAs links as well as other types of RDF links
- Flexible, declarative language for specifying link conditions
- Can be employed in distributed environments without having to replicate datasets locally
- Can be used where terms from different vocabularies are mixed and where no consistent RDFS or OWL schemata exist

Link Specification Language
- Data Access

<DataSource> directive for data access

- Link Conditions

The <LinkCondition> section is the heart of a Silk link specification

<LinkCondition>
<AVG>
<MAX>
<Compare>
<Param>
- Pre-Matching
<PreMatchingDefinition sourcePath="?a/rdfs:label" hitLimit="10">
<Index targetPath="?b/rdfs:label" />
<Index targetPath="?b/drugbank:synonym" />
</PreMatchingDefinition>
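The idea behind pre-matching and link conditions can be sketched in Python as follows. This is not Silk's implementation. It assumes that pre-matching means looking up candidates in an index of target labels and synonyms, capped at a hit limit, and that AVG and MAX aggregate comparison scores by average and maximum, as their names suggest. The records, field names and the use of difflib as the string metric are all illustrative choices.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher
from typing import Dict, List, Set

Record = Dict[str, object]

# Hypothetical source entity and target dataset.
source: Record = {"label": "Aspirin", "formula": "C9H8O4"}
targets: Dict[str, Record] = {
    "t1": {"label": "Acetylsalicylic acid", "synonyms": ["Aspirin"], "formula": "C9H8O4"},
    "t2": {"label": "Ibuprofen", "synonyms": [], "formula": "C13H18O2"},
}


def build_index(target_records: Dict[str, Record]) -> Dict[str, Set[str]]:
    """Index every lowercase token of each label and synonym to the target ids using it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for target_id, record in target_records.items():
        for text in [record["label"], *record["synonyms"]]:
            for token in text.lower().split():
                index[token].add(target_id)
    return index


def candidates(source_label: str, index: Dict[str, Set[str]], hit_limit: int = 10) -> List[str]:
    """Pre-matching: return at most hit_limit target ids sharing a token with the label."""
    hits: Counter = Counter()
    for token in source_label.lower().split():
        for target_id in index.get(token, ()):
            hits[target_id] += 1
    return [target_id for target_id, _count in hits.most_common(hit_limit)]


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (difflib chosen for illustration)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_score(src: Record, tgt: Record) -> float:
    """Average (AVG) of a best-name-match score (MAX) and an exact formula comparison."""
    name_scores = [similarity(src["label"], tgt["label"])]
    name_scores += [similarity(src["label"], synonym) for synonym in tgt["synonyms"]]
    best_name = max(name_scores)
    formula = 1.0 if src["formula"] == tgt["formula"] else 0.0
    return (best_name + formula) / 2


index = build_index(targets)
for target_id in candidates(source["label"], index):
    print(target_id, link_score(source, targets[target_id]))
```

Only candidates returned by the index lookup are scored, so the expensive comparison runs on a short list instead of on every pair of entities.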

Evaluating Links
- Resource Comparison

Link Maintenance Protocol
- Link transfer to target
- Request for target change list
- Subscription to target changes

Silk Implementation
- Written in Python
- Runs from the command line
- The framework can be downloaded from Google Code (http://silk.googlecode.com)

Data Linking: Capturing and Utilising Implicit Schema-level Relations

Andriy Nikolov, Victoria Uren, Enrico Motta

Challenges:
- Schema-level heterogeneity represents an obstacle for automated discovery of coreference resolution links between individuals.

A brief introduction to the problem of co-reference resolution

Co-reference resolution is the determination of equivalent URIs referring to the same concept or entity.

A few suggested ways for co-reference resolution

- Using a third-party toolkit
  Silk - Linking Framework, a toolkit for discovering and maintaining data links between Web data sources. Silk consists of three components.
- Coreference Resolution Service (CRS)
  A CRS maintains "bundles" of URIs which are deemed to be equivalent.
- Newly published repositories are linked to hub repositories, e.g. DBpedia. Then, in order to obtain complete information about a certain entity, we need to compute a transitive closure of coreference links and gather all URIs used to represent this entity in different datasets. These transitive closures can be maintained in a centralised way, e.g. RKB Explorer.
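The transitive closure of coreference links can be computed with a union-find structure, which groups URIs into bundles of equivalents. The sketch below is one possible way to do it, not the approach of any particular service; the function name and the example.org URIs are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def coreference_bundles(links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group URIs into bundles that are connected by a chain of coreference links."""
    parent: Dict[str, str] = {}

    def find(uri: str) -> str:
        # Register unseen URIs as their own root, then walk up with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for left, right in links:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    bundles: Dict[str, Set[str]] = defaultdict(set)
    for uri in list(parent):
        bundles[find(uri)].add(uri)
    return list(bundles.values())


# Hypothetical links: two datasets each link their own URI to the same hub URI.
links = [
    ("http://example.org/new/berlin", "http://dbpedia.org/resource/Berlin"),
    ("http://example.org/other/city42", "http://dbpedia.org/resource/Berlin"),
    ("http://example.org/new/rome", "http://example.org/other/city7"),
]

for bundle in coreference_bundles(links):
    print(sorted(bundle))
```

The two example.org URIs for Berlin never link to each other directly, but they end up in the same bundle because both link to the hub URI.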

Ongoing research in Linked Data

Digital Enterprise Research Institute

- Publishing research theme
- Discovery research theme
- Application domains research theme
- Streamed Linked Data
- Linked Government Data
- Linked Enterprise Data

Motivation for the project

The application theme.

- There has been a recent effort to use ontology alignment systems for aligning ontologies on Linked Open Datasets. BLOOMS is a system for finding schema-level links between LOD datasets in the sense of ontology matching.
- I wanted to use AgreementMaker to align ontologies on Linked Open Data and compare the results, in order to suggest ways to improve the alignment.
- Human experts familiar with the domains created reference alignments.
- The experts identified all possible mappings expressed via a subclass or an equivalence relationship.

[1] Linked Data - The Story So Far
Christian Bizer, Freie Universität Berlin, Germany
Tom Heath, Talis Information Ltd, United Kingdom
Tim Berners-Lee, Massachusetts Institute of Technology, USA
This is a preprint of a paper to appear in: Heath, T., Hepp, M., and Bizer, C. (eds.). Special Issue on Linked Data, International Journal on Semantic Web

Concept Introduction

Examples of linked data sets in the Linked Open Data Project

- Advogato exports its users' profiles using FOAF.
- BBC Music: data about artists, releases and reviews, largely based upon MusicBrainz and the Music Ontology.
- BBC Programmes: data about TV and radio programmes broadcast by the BBC, interlinked with MusicBrainz and DBpedia.
- Bio2RDF: a Semantic Web atlas of post-genomic knowledge about human and mouse, which has published 27 biology-, gene- and medical-related data sets (altogether 2.3 billion triples, served up by Virtuoso instances).
