
Prepared By: Hetal Dodia (8) Asif kureshi (17) Tejas Patel (27) Nidhi Trivedi (37)

Internet Searching

Search Engine
History
Examples
Types of Search Engines
How It Works

Internet
An interconnected network of thousands of networks and millions of computers, linking businesses, educational institutions, government agencies, and individuals.

Searching.
A large amount of information makes a site huge and complex, and makes navigation difficult.
Search is the user's lifeline for mastering complex websites.
A search feature is essential for users who revisit a site looking for specific information.

Types of Searching
A search can be of various types:
Internet search: Search engines such as Yahoo and Infoseek crawl the web, gather web pages or information about them, index them, and retrieve them when a specific term is searched for.
Database search: Databases store their information neatly organized into fields. A search interface is provided for querying them.

SEARCH ENGINE
A tool designed to search for information on the World Wide Web. The information may consist of web pages, images, and other types of files.
A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits.

Every ordinary Internet user should have a good knowledge of search engines and searching in order to explore the Internet more fully.
Search engines help to minimize the time required to find information and the amount of information that must be consulted.
Searching is one of the most common activities on the Internet.
Search engines are special sites on the Web designed to help people find information stored on other sites.
Examples include external engines such as Google, Yahoo, MSN, AOL, and Live.

History of search engine


At first there was only a list of web servers. New servers were announced under the title "What's New".
The first tool for searching was Archie.
The rise of Gopher (1991) led to two new search programs, Veronica and Jughead.
Until 1993, no search engine existed for the Web.
The Web's first primitive search engine was W3Catalog (1993).

History Cont..
WebCrawler (1994) was the first full-text, crawler-based search engine.
In 1998, Google adopted the idea of selling search terms from a small company named goto.com.
Search engines were among the brightest stars of the Internet investing frenzy.
Google rose to prominence around 2000.
Microsoft's first search engine, MSN Search, used search results from Inktomi.

History Cont
Microsoft rebranded its search engine as Bing, which launched on June 1, 2009.
On July 29, 2009, a deal was made between Yahoo and Microsoft under which Bing would power Yahoo search.
In 2012, Google released the beta version of Open Drive, available as a Chrome application.

Market of Search Engines

The Best & Most Popular Search Engine

4th Most Visited Website in the World

Ask Question Search Engine

Microsoft Bing Search Engine

Types Of Search Engines


1. Crawler-Based Search Engines
2. Human-Powered Directories
3. "Hybrid Search Engines" or Mixed Results
4. Meta Search Engine

Crawler-Based Search Engines


Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.
If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

Cont.
Crawler-based search engines are good when you have a specific search topic in mind, and can be very efficient at finding relevant information in this situation.
Examples: Google, AllTheWeb and AltaVista.

Human-Powered Directories
A human-powered directory, such as the Open Directory, depends on humans for its listings.
You submit a short description to the directory for your entire site, or editors write one for sites they review.
A search looks for matches only in the descriptions submitted.
Changing your web pages has no effect on your listing.
Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory.
The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

Cont..
Human-powered directories are good when you are interested in a general topic of search. In this situation, a directory can guide you, help you narrow your search and give refined results.
Therefore, search results found in a human-powered directory are usually more relevant to the search topic and more accurate.
However, this is not an efficient way to find information when you have a specific search topic in mind.
Examples: Yahoo Directory, Open Directory and LookSmart.

Pros of Human-Powered Directories


Fast answers (sometimes).
Answers sent directly to your phone or email. This is especially beneficial if you are on the go and using a service such as ChaCha or KGB, which allows you to ask your question and receive the answer via text message.
Sometimes standard search engines don't know what you're talking about, and that's where dealing with an actual human helps.

Cons of Human-Powered Directories


Lengthy search time: Having to wait, for what may seem like forever, before receiving an answer.
Unanswered questions: While some sites may take days, other sites may not have an answer for your question at all.
Human error: We all know and trust Google to deliver our answers, but we have no idea who is answering our questions on human-powered sites, or what their qualifications are. Would you trust just anyone? Because I certainly don't.
Annoying categorization: Many sites ask you to categorize, subcategorize, and sub-subcategorize your questions, which takes the simplicity out of these human-powered search engines.


"Hybrid Search Engines" or Mixed Results


Hybrid search engines use a combination of both crawler-based results and directory results. More and more search engines these days are moving to a hybrid-based model.
It is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over another.
For example, MSN Search is more likely to present human-powered listings from LookSmart.
Examples: Yahoo, Google.

Meta Search Engine


Meta-search engines transmit user-supplied keywords simultaneously to several individual search engines, which actually carry out the search.
The search results returned from all the search engines can be integrated, duplicates can be eliminated, and additional features, such as clustering by subject within the search results, can be implemented by the meta-search engine.
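
The fan-out and merge step described above can be sketched in Python. This is only an illustration: `engine_a` and `engine_b` are hypothetical stand-ins for real search engines, and interleaving results rank by rank is just one assumed way of combining the lists.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real engines: each maps a query to a ranked list of URLs.
def engine_a(query):
    return ["http://a.example/1", "http://shared.example/x"]

def engine_b(query):
    return ["http://shared.example/x", "http://b.example/2"]

def meta_search(query, engines):
    """Send the query to every engine in parallel, then merge and de-duplicate."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    merged, seen = [], set()
    longest = max((len(results) for results in result_lists), default=0)
    # Interleave rank by rank so each engine's top hits stay near the top.
    for rank in range(longest):
        for results in result_lists:
            if rank < len(results) and results[rank] not in seen:
                seen.add(results[rank])
                merged.append(results[rank])
    return merged
```

Calling `meta_search("search engines", [engine_a, engine_b])` returns three URLs, because the page reported by both engines is listed only once.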


Meta Search Engine Cont..


Meta-search engines are good for saving time by searching only in one place and sparing the need to use and learn several separate search engines.
But since meta-search engines do not allow for input of many search variables, their best use is to find hits on obscure items or to see if something can be found using the Internet.

Pros of meta search engines


1. Searching with many primary search engines often finds results missed by a single primary engine.
2. Requesting results from many primary engines in parallel saves time.
3. Eliminating duplicate results also saves time.
4. Getting results from many different primary engines provides opportunities to explore how best to combine the separate result lists.

Cons of meta search engines


1. Timeouts or long waits may occur if the meta search engine is having difficulty contacting a primary engine.
2. Many meta search engines only get the top 10 to 50 results per primary engine.
3. Some advanced features (e.g. phrase searching) may not be available.
4. Many meta search engines exclude one or more of the major primary search engines (Google, Microsoft, or Yahoo).

How it works
1. Index ahead of time
    Find files or records
    Open each one and read it
    Store each word in a searchable index
2. Provide search forms
    Match the query terms with words in the index
    Sort documents by relevance
3. Display results
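
The three steps above can be sketched end to end in a few lines of Python. This is a minimal illustration, not how any particular engine works: the documents in `DOCS` are made up, and "relevance" is assumed to be simply the number of distinct query terms a document contains.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Step 1: read every document and store each word in a searchable index."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Step 2: match query terms against the index and sort by relevance."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    # Highest score first; ties are broken alphabetically by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

def display(results):
    """Step 3: format the ranked hits for the user."""
    return [f"{rank}. {doc_id}" for rank, doc_id in enumerate(results, start=1)]

# Made-up documents, for illustration only.
DOCS = {
    "page1": "Search engines index the web",
    "page2": "Spiders crawl the web",
    "page3": "Cooking recipes",
}
```

For the query "web search", `page1` matches both terms and `page2` matches one, so `page1` is listed first and `page3` does not appear at all.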


1. Index ahead of time by spiders


To find information on the hundreds of millions of Web pages that exist, a search engine employs special software robots, called spiders, to build lists of the words found on Web sites.
A spider is a program that automatically fetches Web pages. Spiders are used to feed pages to search engines. It is called a spider because it crawls over the Web. Another term for these programs is web crawler.
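
A spider's basic loop can be sketched in Python as follows. To keep the example runnable without network access, the "web" is assumed to be a small in-memory dictionary (`WEB`), and `fetch` is a hypothetical stand-in for a real HTTP request.

```python
import re
from collections import deque

# A tiny in-memory "web": URL -> (page text, outgoing links). Purely illustrative.
WEB = {
    "http://site.example/": (
        "Welcome to search basics",
        ["http://site.example/a", "http://site.example/b"],
    ),
    "http://site.example/a": ("Spiders crawl pages", ["http://site.example/"]),
    "http://site.example/b": (
        "Indexes store words",
        ["http://site.example/a", "http://site.example/missing"],
    ),
}

def fetch(url):
    """Stand-in for an HTTP request; returns None for pages that do not exist."""
    return WEB.get(url)

def crawl(start_url):
    """Follow links breadth-first and record the list of words found on each page."""
    words_by_url = {}
    queue = deque([start_url])
    seen = {start_url}
    while queue:
        url = queue.popleft()
        page = fetch(url)
        if page is None:
            continue  # broken link: nothing to record
        text, links = page
        words_by_url[url] = re.findall(r"[a-z]+", text.lower())
        for link in links:
            if link not in seen:  # never queue the same page twice
                seen.add(link)
                queue.append(link)
    return words_by_url
```

The `seen` set is what stops the spider from looping forever when pages link back to each other, as the first two pages here do.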


Cont
Spiders store the lists in the engine's database.
The engine's indexing software builds an index of words.
Information is matched against the query input and retrieved (processing algorithm).

What the Index Needs


Basic information for the document or record
    File name / URL / record ID
    Title or equivalent
    Size, date, MIME type
Full text of item
More metadata
    Product name, picture ID
    Category, topic, or subject
    Other attributes, for relevance ranking and display

Simple Index Diagram


Cont..
Once the spiders have completed the task of finding information on Web pages, the search engine must store the information in a way that makes it useful.
In the simplest case, a search engine could just store the word and the URL where it was found. In reality, this would make for an engine of limited use, since there would be no way of telling whether the word was used in an important or a trivial way on the page.

Cont
A ranking list tries to present the most useful pages at the top of the list of search results.
The engine might assign a weight to each entry, with increasing values assigned to words as they appear near the top of the document, in sub-headings, in links, in the meta tags or in the title of the page.
An index has a single purpose: it allows information to be found as quickly as possible. There are quite a few ways for an index to be built, but one of the most effective ways is to build a hash table. In hashing, a formula is applied to attach a numerical value to each word.
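
Weighting and hashing can be combined in one small Python sketch. Everything specific here is an assumption made for illustration: the hashing formula in `word_hash`, the bucket count, and the values in `FIELD_WEIGHTS` are invented, not taken from any real engine.

```python
NUM_BUCKETS = 8

def word_hash(word):
    """Toy hashing formula: turn a word into a bucket number."""
    value = 0
    for ch in word:
        value = (value * 31 + ord(ch)) % NUM_BUCKETS
    return value

# Hypothetical weights: a word counts for more in the title than in the body.
FIELD_WEIGHTS = {"title": 3, "heading": 2, "body": 1}

def build_table(pages):
    """Build a hash table whose buckets hold (word, url, weight) entries."""
    buckets = [[] for _ in range(NUM_BUCKETS)]
    for url, fields in pages.items():
        totals = {}
        for field, text in fields.items():
            for word in text.lower().split():
                totals[word] = totals.get(word, 0) + FIELD_WEIGHTS[field]
        for word, weight in totals.items():
            buckets[word_hash(word)].append((word, url, weight))
    return buckets

def lookup(buckets, word):
    """Scan only the bucket the word hashes to; rank hits by weight."""
    word = word.lower()
    hits = [(url, weight)
            for (entry, url, weight) in buckets[word_hash(word)]
            if entry == word]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))

# Made-up pages, for illustration only.
PAGES = {
    "p1": {"title": "Search engines", "body": "how search works"},
    "p2": {"title": "Cooking", "body": "search for recipes"},
}
```

Looking up "search" ranks `p1` above `p2`, because the word appears in the title of `p1` but only in the body of `p2`. The lookup inspects a single bucket rather than every entry, which is the speed benefit of hashing.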

2.Provide search form


Searching through an index involves a user building a query and submitting it through the search engine.
The query can be quite simple, a single word at minimum. Building a more complex query requires the use of Boolean operators that allow you to refine and extend the terms of the search.
Boolean operators: AND, OR, NOT, FOLLOWED BY, NEAR, etc.
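
AND, OR and NOT map naturally onto set operations over the index. The Python sketch below assumes a tiny made-up index (`INDEX`) and hypothetical helper names; FOLLOWED BY and NEAR are left out because they need word positions, which this simple index does not store.

```python
# Made-up index: word -> set of documents containing it.
INDEX = {
    "search": {"doc1", "doc2", "doc3"},
    "engine": {"doc1", "doc3"},
    "spider": {"doc2"},
}
ALL_DOCS = {"doc1", "doc2", "doc3", "doc4"}

def docs(term):
    """Documents containing the term (empty set if the term is unknown)."""
    return INDEX.get(term, set())

def op_and(a, b):
    """AND: documents present in both result sets."""
    return a & b

def op_or(a, b):
    """OR: documents present in either result set."""
    return a | b

def op_not(a):
    """NOT: every known document that is not in the result set."""
    return ALL_DOCS - a
```

For example, the query `search AND NOT engine` becomes `op_and(docs("search"), op_not(docs("engine")))`, which leaves only `doc2`.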


Cont..
Most search engines support caching, which dramatically reduces the time cost of searching for common words like "Amazon". If the site receives a query whose result is stored in the cache, it returns the result from the cache without posting a query request to the main database.
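
A query cache of this kind can be sketched in Python as a dictionary in front of the database. The class name `CachedSearch` and the function `slow_database_search` are hypothetical, and a real cache would also need a size limit and a way to expire stale results.

```python
class CachedSearch:
    """Answer repeated queries from a cache instead of the main database."""

    def __init__(self, backend):
        self.backend = backend    # function that queries the main database
        self.cache = {}           # normalized query -> stored result
        self.backend_calls = 0    # counts how often the database was hit

    def search(self, query):
        key = query.strip().lower()   # "Amazon" and "amazon " share one entry
        if key in self.cache:
            return self.cache[key]    # cache hit: no database request
        self.backend_calls += 1
        result = self.backend(key)
        self.cache[key] = result
        return result

def slow_database_search(query):
    """Hypothetical stand-in for the main database."""
    return [f"result for {query}"]
```

After `engine = CachedSearch(slow_database_search)`, searching for "Amazon" twice leaves `engine.backend_calls` at 1, since the second request is served from the cache.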


3. Display result
After the search engine has received the result from the main database or the cache, the site has to display the result to the user. The listing of results is usually quite simple: just a list of the web pages that were hit, with a description of each site. However, the order of the list is important, yet difficult to judge by pure computation.

Page rank
Once the search engine has found web pages for the given query, in what order should the links be provided?

Google researchers invented PageRank.

Some pages are found to be more important than others, so if two pages match a query, they are ordered so that the more important page's link comes first.
Ordering is based on the page rank, which primarily looks to see if a page is an authoritative page, meaning that a lot of other pages link to it.
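
The idea that a page is important when many pages link to it can be sketched with a simplified PageRank iteration in Python. This is a textbook-style approximation, not Google's production algorithm: the damping factor of 0.85, the iteration count and the link graph `LINKS` are all assumed values, and every link target is assumed to appear as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Repeatedly pass each page's rank along its outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small base share regardless of incoming links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Made-up link graph: page -> pages it links to.
LINKS = {
    "A": ["C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
```

In this graph, C receives links from three pages and ends up with the highest rank. A comes second because its only incoming link is from the highly ranked C, while B and D, which nobody links to, share the lowest rank.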

Cont.
Similarly, a hub is a page which has a lot of outgoing links and may represent a good starting point.
Advertising can also affect the order in which pages are offered.
Advertisers will pay search engine sites to place their links before others, or in special areas of the web page.
If you go to Google and search for "computers", you get links for Dell, Apple, Staples, and others near the top and to the right of the page. Why? They paid to be there! Best Buy didn't pay as much, so it is located lower down.
This is a consequence of commercializing the web: money talks.

Search Will Never Be Perfect


Search engines can't read minds
User queries are short and ambiguous
Some things will help
    Design a usable interface
    Show match words in context
    Keep index current and complete
    Adjust heuristic weighting
    Maintain suggestions and synonyms
    Consider faceted metadata search

THANK YOU
