
Ontology Matching for Linked Open Data

Data and Web Semantics
University of Illinois at Chicago - Fall 2010

Concept Introduction
What is Linked Data?

Set of best practices for publishing and connecting structured
data on the Web. [1]

Relationship between Linked Data, the Semantic Web, and the Web of Data

"Linked Data is the Semantic Web done right"
( http://www.w3.org/2008/Talks/0617-lod-tbl/#%281%29 )

W3C Linking Open Data Project

Grassroots community effort to publish existing open-license
datasets as Linked Data on the Web and interlink things
between different data sources
( http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData )

A feel for how LOD is growing

Reviewed Papers

Jain, P., Hitzler, P., Sheth, A., Verma, K.:
Ontology Alignment for Linked Open Data
Julius Volz, Christian Bizer, Martin Gaedke, and Georgi Kobilarov:
Discovering and Maintaining Links on the Web of Data
Andriy Nikolov, Victoria Uren, Enrico Motta:
Data Linking: Capturing and Utilising Implicit Schema-level Relations

Ontology Alignment for Linked Open Data
Main Challenges

LOD datasets are interlinked. These interlinks are mainly at the
instance level (owl:sameAs).
Schema-level information, that is, taxonomies built using
rdfs:subClassOf, is relatively scarce.
There is a lack of interlinks between the different schemas.
Applications based on LOD face difficulties due to loosely
connected pieces of information.
There are no established benchmarks or available baselines for
measuring precision and recall for LOD schema alignment.
Most competitive state-of-the-art ontology alignment systems
performed poorly on LOD schema datasets.

Detailed Analysis

Results

Some background information:

The chosen datasets give significant coverage of the
LOD cloud. They cover separate domains such as
Music and Publication.
Some of the dataset providers, such as LinkedMDB,
have not made their schema publicly available.
There are no established benchmarks or available
baselines for measuring precision and recall for LOD
schema alignment.
Human experts familiar with the domains created reference
alignments, identifying all possible mappings via a subclass
or an equivalence relationship.

BLOOMS approach
Preprocessing of the input ontologies
Remove property restrictions
Tokenize composite class names to obtain a list of all
simple words contained within them
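The paper does not spell out the tokenizer; a minimal sketch in Python, assuming composite class names use camel case or underscore separators (the function name is mine):

```python
import re

def tokenize_class_name(name):
    """Split a composite class name such as 'MusicalArtist' or
    'record_label' into its simple lowercase words."""
    # Insert a space at every lower-to-upper camel-case boundary,
    # then split on any non-alphanumeric separator ('_', '-', ' ').
    spaced = re.sub(r'(?<=[a-z0-9])(?=[A-Z])', ' ', name)
    return [w.lower() for w in re.split(r'[^A-Za-z0-9]+', spaced) if w]

print(tokenize_class_name("MusicalArtist"))  # ['musical', 'artist']
print(tokenize_class_name("record_label"))   # ['record', 'label']
```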

Construction of the BLOOMS forest


The forest is built using information from Wikipedia

Comparison of the constructed BLOOMS forest


which yields alignment decisions, such as mappings between class names

Post Processing
Using a reasoner and Alignment API

Evaluation of Results
BLOOMS compares more generic schemas and uses Wikipedia to
handle the diverse domains of LOD. The following were the shortcomings of the other
ontology alignment systems, as reported by Jain et al.
Ontology Alignment System    Issues

RiMOM           Failed due to ontology size
AROMA           Unable to find any relevant relations
OMViaUO         Able to find only a few correct analogies
Alignment API   Able to find a few correct analogies, but found
                some wrong analogies as well
S-Match         Computed correct analogies, but in general returned
                many results, which led to low precision

Discovering and maintaining links on the web of data
The Gap

There are tools available for publishing Linked Data
on the Web, but there is still a lack of tools that
support data publishers in setting RDF links to
other data sources and in maintaining RDF links over
time as data sources change.

Silk Linking Framework

A toolkit for discovering and maintaining data links
between Web data sources

Discovering and maintaining links on the web of data
Components
A link discovery engine, which computes links
between data sources based on a declarative
specification of the conditions that entities must fulfill
in order to be interlinked.
A tool for evaluating the generated data links in order
to fine-tune the linking specification.
A protocol for maintaining data links between
continuously changing data sources.
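The declarative link condition the engine evaluates can be sketched in Python; the entities, properties, and the 0.9 threshold below are illustrative assumptions, and difflib stands in for Silk's similarity metrics:

```python
from difflib import SequenceMatcher

def string_similarity(a, b):
    # Normalised edit-based similarity in [0, 1]; stands in for
    # Silk's built-in string comparison metrics.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def avg(scores):
    # Analogue of an <AVG> aggregation: mean of the comparisons.
    return sum(scores) / len(scores)

# Hypothetical source/target entities with two comparable properties.
source = {"label": "Aspirin", "synonym": "acetylsalicylic acid"}
target = {"label": "Aspirin", "synonym": "Acetylsalicylic Acid"}

score = avg([
    string_similarity(source["label"], target["label"]),
    string_similarity(source["synonym"], target["synonym"]),
])
generate_link = score >= 0.9  # illustrative acceptance threshold
print(generate_link)  # True
```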

Silk Link Discovery engine


Main Features
Supports the generation of owl:sameAs links as well as other
types of RDF links
Flexible, declarative language for specifying link conditions
Can be employed in distributed environments without having to
replicate datasets locally
Capable of being used where terms from different vocabularies
are mixed and where no consistent RDFS or OWL schemata
exist.
Link Specification Language
Data Access
<DataSource> directive for data access
Silk Link Discovery engine


Main Features
Link Conditions
<LinkCondition> section is the heart of a Silk Link Specification
<LinkCondition>
<AVG>
<MAX>
<Compare>
<Param>

Pre-Matching
<PreMatchingDefinition sourcePath="?a/rdfs:label" hitLimit="10">
<Index targetPath="?b/rdfs:label" />
<Index targetPath="?b/drugbank:synonym" />
</PreMatchingDefinition>
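The pre-matching directive above limits detailed comparison to promising candidates; a rough Python analogue (the entity IDs and property names are made up for illustration):

```python
from collections import defaultdict

def build_index(targets, paths):
    """Index target entities by the (lowercased) values of the
    given properties, mimicking the <Index> directives above."""
    index = defaultdict(set)
    for entity_id, props in targets.items():
        for path in paths:
            for value in props.get(path, []):
                index[value.lower()].add(entity_id)
    return index

def prematch(source_label, index, hit_limit=10):
    """Return at most hit_limit candidate targets sharing a value,
    so that full comparison runs only on these candidates."""
    return sorted(index.get(source_label.lower(), set()))[:hit_limit]

# Made-up target entities keyed by ID.
targets = {
    "drug:1": {"rdfs:label": ["Aspirin"],
               "drugbank:synonym": ["Acetylsalicylic acid"]},
    "drug:2": {"rdfs:label": ["Ibuprofen"], "drugbank:synonym": []},
}
index = build_index(targets, ["rdfs:label", "drugbank:synonym"])
print(prematch("aspirin", index))  # ['drug:1']
```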

Evaluating Links
Resource Comparison

Link Maintenance Protocol


Link Transfer to target
Request for Target Change List
Subscription of Target Changes

Data Linking: Capturing and Utilizing Implicit Schema-level
Relations

Challenges:
The Web of Data is constantly growing [1], and the co-reference
links between data instances stored in different repositories
represent a major added value of the Linked Data approach.
(Co-reference resolution, or the determination of equivalent URIs
referring to the same concept or entity.)

Using an automatic co-reference resolution tool

Challenges arise for an automatic co-reference resolution tool in the light of
the heterogeneity of the schemas used by the repositories

Schema Matching and Co-reference Resolution in a Linked
Data Environment

Specific features of Linked Data Environment


Consider several interlinked datasets in combination
Involve information contained in third-party datasets
as background knowledge to support matching.
Exploit data patterns present in large volumes of
instance data
Develop methods to deal with relations like class
overlap and relation overlap rather than strict
equivalence
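One simple way to capture "class overlap rather than strict equivalence" is an overlap coefficient over instance sets; a sketch with made-up URIs (the metric choice is mine, not from the paper):

```python
def overlap_coefficient(a, b):
    """Degree of overlap between two instance sets, in [0, 1];
    1.0 means one class's instances are contained in the other's."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))

# Made-up instance URIs of two classes from different datasets.
actors = {"uri:1", "uri:2", "uri:3"}
starring = {"uri:2", "uri:3", "uri:4", "uri:5"}
print(overlap_coefficient(actors, starring))  # 2/3
```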

Goal of this approach and related past work

Utilize the features of Linked Data to perform
schema-level matching between repositories
and in turn facilitate instance co-reference
resolution
Related Work
SILK Linking Framework
A lot of user effort required

Hub Repository Approach


May lead to loss of some data

Explanation
Background
LinkedMDB repository describes movies from the IMDB database
DBPedia describes Wikipedia entries

Using Background Data for Ontology Matching

Infer schema-level relations
movie:music_contributor and dbpedia:Artist
movie:actor and dbpedia:starring

Infer data patterns


Identical movies will have an overlap in release year and set of
actors

Explanation
Inferring schema-level mappings
Data-level evidence
Schema-level evidence
Establishing a relation between movie:music_contributor and
dbpedia:Artist via MusicBrainz
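The idea of inferring a schema-level relation from instance-level evidence can be sketched as follows; the URIs and the 0.8 threshold are illustrative, not from the paper:

```python
def infer_subclass(instances_a, instances_b, sameas_pairs, threshold=0.8):
    """Propose 'A subClassOf B' when most instances of class A are
    linked by owl:sameAs (possibly through a background dataset
    such as MusicBrainz) to instances of class B."""
    if not instances_a:
        return False, 0.0
    linked = {a for a, b in sameas_pairs
              if a in instances_a and b in instances_b}
    ratio = len(linked) / len(instances_a)
    return ratio >= threshold, ratio

# Made-up instances of movie:music_contributor and dbpedia:Artist.
contributors = {"lmdb:c1", "lmdb:c2", "lmdb:c3"}
artists = {"dbp:a1", "dbp:a2", "dbp:a3"}
links = [("lmdb:c1", "dbp:a1"), ("lmdb:c2", "dbp:a2"),
         ("lmdb:c3", "dbp:a3")]
print(infer_subclass(contributors, artists, links))  # (True, 1.0)
```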

Inferring data patterns and refining the set of
existing mappings
{movie:actor, dbpedia:starring} (sim = 0.98)
{movie:initial_release_date, dbpedia:releaseDate} (sim = 0.96)

Test Results
Finding equivalence links between music_contributor individuals in
LinkedMDB and corresponding individuals in DBPedia (auxiliary
dataset: MusicBrainz; gold standard size: 942)
Steps
Instance-based schema-matching algorithm
The relations obtained in the above step were passed as input to the data-level
co-reference resolution tool KnoFuss to discover owl:sameAs links between
instances

The two sets of results

Baseline: involves computing the transitive closure of already existing links
Aligned: combined set of existing results and new results obtained by the
algorithm after schema alignment
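The baseline's transitive closure of existing owl:sameAs links can be sketched with a union-find; the URIs below are made up:

```python
def transitive_closure(links):
    """Cluster URIs connected by owl:sameAs links, treating the
    relation as symmetric and transitive (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    clusters = {}
    for x in list(parent):
        clusters.setdefault(find(x), set()).add(x)
    return list(clusters.values())

# Made-up chain of existing links across three datasets.
links = [("lmdb:p1", "dbpedia:A"), ("dbpedia:A", "mbz:X")]
print(transitive_closure(links))  # one cluster of all three URIs
```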

What Do I Intend to Do?

Linked Open Data ontology alignment on
AgreementMaker
Taking the BLOOMS results into consideration, align the
ontologies on Linked Open Data.
Evaluate the results with respect to BLOOMS and the
other ontology alignment systems used in BLOOMS.
Implement or suggest a solution for co-reference
resolution.

Reference Papers
