
Truth Finder: Multiple Conflicting Information Providers on the Web

Abstract
Introduction

System Analysis
System Requirements

System Design

Abstract
The World-Wide Web has become the most important information source for most of us. Because different websites often provide conflicting information, there is no guarantee that the data is correct. In this paper we propose a new problem called Veracity and introduce a new algorithm called "Truth Finder". Our experiments show that Truth Finder successfully finds true facts among conflicting information and identifies trustworthy websites better than popular search engines.

Introduction
The World-Wide Web has become a necessary part of our lives and may have become the most important information source for most people. When we want to know the answer to a question, we go to ask.com or google.com. Is the World-Wide Web always trustworthy? Unfortunately, the answer is "NO". Different websites often provide conflicting information, as shown in the following examples.

o Example 1
Height of Mount Everest:
Suppose a user is interested in how high Mount Everest is and queries Ask.com with "What is the height of Mount Everest?". Among the top 20 results, he or she will find the following facts: four websites (including Ask.com itself) say 29,035 feet, five websites say 29,028 feet, one says 29,002 feet, and another one says 29,017 feet. Which answer should the user trust?
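
As a point of comparison, the simplest way to settle the Everest question is a plain majority vote over the reported heights. The sketch below tallies the counts quoted in the example; the class and method names are hypothetical. It is a baseline only: every website counts equally, so the winning value is just the most repeated one, not necessarily the true one.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Naive majority voting over the Mount Everest heights quoted in Example 1. */
class HeightVote {

    /** Returns the value with the most votes; ties go to the first entry seen. */
    static String majority(Map<String, Integer> votes) {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> e : votes.entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    /** Vote counts from the example: height in feet -> number of websites. */
    static Map<String, Integer> everestVotes() {
        Map<String, Integer> votes = new LinkedHashMap<>();
        votes.put("29,035", 4);
        votes.put("29,028", 5);
        votes.put("29,002", 1);
        votes.put("29,017", 1);
        return votes;
    }

    public static void main(String[] args) {
        System.out.println(majority(everestVotes())); // prints 29,028
    }
}
```

Voting picks 29,028 feet here because five sites repeat it. Whether those five sites are reliable is exactly what a vote cannot tell us.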

o Example 2
Author of Books:
We tried to find out who wrote the book Rapid Contextual Design (ISBN: 0123540518). We found many different sets of authors from different online book stores.

Websites          Authors
A1 Books          Karen Holtzblatt, Jessamyn Burns Wendell, Shelley Wood
Powells books     Holtzblatt, Karen
Cornwall books    Holtzblatt-Karen, Wendell-Jessamyn Burns, Wood
Mellons books     Wendell, Jessamyn
Lakeside books    Wendell, JessamynHoltzblatt, KarenWood, Shelley

Conflicting Information on the Web


Different websites often provide conflicting information on a subject, e.g., the authors of Rapid Contextual Design:

Online Store       Authors
Powells books      Holtzblatt, Karen
Barnes & Noble     Karen Holtzblatt, Jessamyn Wendell, Shelley Wood
A1 Books           Karen Holtzblatt, Jessamyn Burns Wendell, Shelley Wood
Cornwall books     Holtzblatt-Karen, Wendell-Jessamyn Burns, Wood
Mellons books      Wendell, Jessamyn
Lakeside books     WENDELL, JESSAMYNHOLTZBLATT, KARENWOOD, SHELLEY
Blackwell online   Wendell, Jessamyn, Holtzblatt, Karen, Wood, Shelley
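
The author lists above differ in formatting as much as in content, so two answers can partly agree even when the strings do not match. One possible way to quantify that, shown here purely as an illustrative assumption and not as the method used by Truth Finder, is to split each list into lowercase name tokens and compute the Jaccard overlap. The class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Token-overlap comparison of author strings (illustrative assumption only). */
class AuthorOverlap {

    /** Lowercases the string and splits it into a set of alphabetic tokens. */
    static Set<String> tokens(String authors) {
        Set<String> out = new HashSet<>();
        for (String t : authors.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    /** Jaccard similarity: shared tokens divided by all distinct tokens. */
    static double similarity(String a, String b) {
        Set<String> ta = tokens(a);
        Set<String> tb = tokens(b);
        Set<String> union = new HashSet<>(ta);
        union.addAll(tb);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(ta);
        shared.retainAll(tb);
        return (double) shared.size() / union.size();
    }

    public static void main(String[] args) {
        String a1 = "Karen Holtzblatt, Jessamyn Burns Wendell, Shelley Wood";
        String cornwall = "Holtzblatt-Karen, Wendell-Jessamyn Burns, Wood";
        String powells = "Holtzblatt, Karen";
        System.out.println(similarity(a1, cornwall)); // 6 shared of 7 distinct tokens
        System.out.println(similarity(a1, powells));  // 2 shared of 7 distinct tokens
    }
}
```

Under this measure the Cornwall entry is much closer to the A1 Books entry than the Powells entry is. Entries with fused names, such as the Lakeside books row, would need extra cleaning before a token comparison is meaningful.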

Trustworthiness of the Web


The trustworthiness problem of the web. According to a survey on the credibility of web sites:

o 54% of Internet users trust news web sites most of the time
o 26% for web sites that sell products
o 12% for blogs

The problem of Veracity: conformity to truth

o Given a large amount of conflicting information about many objects, provided by multiple web sites
o How to discover the true fact about each object?

[Chart: Princeton Survey, 2005. News websites 54%, online shopping 26%, blogs 12%]

System Analysis:
o Existing System
o Proposed System
o Disadvantage
o Advantage

o Existing System:
PageRank and Authority-Hub analysis use hyperlinks to find pages with high authority. These two approaches identify important web pages that users are interested in. Unfortunately, the popularity of web pages does not necessarily lead to accuracy of information.

o Disadvantages:
The popularity of web pages does not necessarily lead to accuracy of information. Even the most popular website may contain many errors, whereas some comparatively less popular websites may provide more accurate information.

o Proposed System:
First, we formulate the Veracity problem: how to discover true facts from conflicting information.
Second, we propose a framework to solve this problem by defining the trustworthiness of websites, the confidence of facts, and the influences between facts. Finally, we propose an algorithm called TRUTHFINDER for identifying true facts using iterative methods.

o Advantages:
Our experiments show that TRUTHFINDER achieves very high accuracy in discovering true facts. It can select trustworthy websites better than authority-based search engines such as Google.

o Modules:
Collection Of Unrelated Data
Data Search
Truth Finder Algorithm
Result Calculation

o Module Description:
Collection Of Data:
First we collect the specific data about an object and store it in the related database. We create a table for each specific object and store the facts about that object in it.

Data Search:
This module searches for the data links related to the user's input. Here the user retrieves the specific data about an object.

Truth Finder Algorithm:
We design a general framework for the Veracity problem and introduce an algorithm called Truth Finder, which utilizes the relationships between web sites and their information: a web site is trustworthy if it provides many pieces of true information, and a piece of information is likely to be true if it is provided by many trustworthy web sites.
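
The mutual dependency described above can be sketched as a small iteration. The update rules here are one simplified reading and an assumption on our part: a fact's confidence is the probability that at least one of its providers is right (providers treated as independent), and a website's trust is the average confidence of its facts. The class name, the array layout, and the starting trust of 0.5 are hypothetical, and the sketch leaves out the influences between facts that the framework also defines.

```java
import java.util.Arrays;

/** Simplified trust/confidence iteration (a sketch under stated assumptions). */
class TruthFinderSketch {

    /**
     * provides[w] lists the indices of the facts that website w states.
     * Returns {trust per website, confidence per fact} after the given rounds.
     */
    static double[][] run(int[][] provides, int factCount, double initialTrust, int rounds) {
        double[] trust = new double[provides.length];
        double[] confidence = new double[factCount];
        Arrays.fill(trust, initialTrust);

        for (int r = 0; r < rounds; r++) {
            // Fact confidence: 1 - product over providers of (1 - trust).
            double[] allWrong = new double[factCount];
            Arrays.fill(allWrong, 1.0);
            for (int w = 0; w < provides.length; w++) {
                for (int f : provides[w]) {
                    allWrong[f] *= 1.0 - trust[w];
                }
            }
            for (int f = 0; f < factCount; f++) {
                confidence[f] = 1.0 - allWrong[f];
            }
            // Website trust: average confidence of the facts it provides.
            for (int w = 0; w < provides.length; w++) {
                if (provides[w].length == 0) {
                    continue; // no facts, keep the previous trust value
                }
                double sum = 0.0;
                for (int f : provides[w]) {
                    sum += confidence[f];
                }
                trust[w] = sum / provides[w].length;
            }
        }
        return new double[][] { trust, confidence };
    }

    public static void main(String[] args) {
        // Three websites state fact 0, one website states fact 1.
        int[][] provides = { {0}, {0}, {0}, {1} };
        double[][] out = run(provides, 2, 0.5, 1);
        System.out.println(Arrays.toString(out[0])); // trust per website
        System.out.println(Arrays.toString(out[1])); // confidence per fact
    }
}
```

After one round with these inputs, fact 0 reaches confidence 0.875 (1 - 0.5^3) while fact 1 stays at 0.5, and the three agreeing websites end up with higher trust than the lone one. Without some damping, repeated rounds push well-supported facts toward confidence 1 very quickly, which is one reason a fuller treatment needs more than these two rules.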

Result Calculation:
For each response to the query, we calculate the performance. Using the calculated count, we find the best link and show it as the output.
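
A minimal sketch of this last step, assuming each candidate link already carries a numeric score from the previous module: sort the links by score and show the first one. The class name, the method name, and the example link names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Orders result links by their score (illustrative sketch). */
class ResultRanking {

    /** Returns the links ordered from highest to lowest score. */
    static List<String> rank(Map<String, Double> scoreByLink) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scoreByLink.entrySet());
        // Stable sort: links with equal scores keep their original order.
        entries.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        List<String> links = new ArrayList<>();
        for (Map.Entry<String, Double> e : entries) {
            links.add(e.getKey());
        }
        return links;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("site-a.example", 0.62); // hypothetical links and scores
        scores.put("site-b.example", 0.91);
        scores.put("site-c.example", 0.40);
        List<String> ranked = rank(scores);
        System.out.println("Best link: " + ranked.get(0)); // prints Best link: site-b.example
    }
}
```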

o An Example

[Figure: websites W1, W2, W3, W4 provide facts f1, f2, f3, f4 about objects o1 and o2]

UML Diagrams

Use Case Diagram 1:

[Diagram elements: User, Login, Search Query, Database, Result, Output]

Use Case Diagram 2:

Sequence Diagram:

Collaboration Diagram :

Class Diagram:

State Diagram:

[Diagram states, in order: Home, Login, Query Process, Database, Conflicting Information, Truthfinder, Result]

Activity Diagram:

[Diagram activities: Home, Validation, Login, Invalid, Query Process, Database, Conflicting Information, Truthfinder, Result]

o Data Flow Diagram:

[Diagram: websites W1, W2, W3, W4; facts f1, f2, f3, f4; objects o1, o2]

System Requirements:

o Hardware Requirements:
PROCESSOR   : PENTIUM IV 2.6 GHz
RAM         : 512 MB DD RAM
MONITOR     : 15" COLOR
HARD DISK   : 20 GB
CD DRIVE    : LG 52X
KEYBOARD    : STANDARD 102 KEYS
MOUSE       : 3 BUTTONS

o Software Requirements:
FRONT END         : Java, J2EE (JSP), Servlets
TOOL USED         : JFrameBuilder
OPERATING SYSTEM  : Windows XP
BACK END          : SQL Server 2000

Conclusions:
o Veracity: An important problem for Web search and analysis
o Resolving conflicting facts from multiple websites
o Our approach: Utilizing the inter-dependency between website trustworthiness and fact confidence to find (1) trustable web sites, and (2) true facts
o TruthFinder: A system based on this philosophy
o Achieves high accuracy on finding both true facts and high quality web sites

ANY QUERIES?

Thank you!
