11/11/2016

Search Engine Internals

Crawlers
A crawler is a program that visits Web sites and reads their pages and other information in order to create entries for a search engine index. The major search engines on the Web all have such a program, which is also known as a "spider" or a "bot."
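
The visit-and-read loop described above can be sketched roughly as follows. This is a minimal illustration, not how any particular search engine implements its spider: the `crawl` function, the `LinkExtractor` class, and the in-memory `FAKE_WEB` dictionary (used in place of real HTTP requests) are all hypothetical names introduced for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl. Returns {url: html} for every page read.

    `fetch` is any callable that maps a URL to its HTML, or None
    if the page cannot be retrieved (an assumption of this sketch).
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # unreachable page: skip it
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            absolute = urljoin(url, href)  # resolve relative links
            if absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages


# Hypothetical in-memory "web" standing in for real HTTP requests.
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/">home</a> <a href="/missing">x</a>',
    "http://example.test/b": "no links here",
}

pages = crawl("http://example.test/", FAKE_WEB.get)
```

The pages collected this way are what an indexer would then process. A real crawler would also need politeness delays, robots.txt handling, and error handling, all of which are left out here.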

Indexers
A database index is a data structure that speeds up data retrieval operations on a database table, at the cost of additional writes and of the extra storage space needed to maintain its copy of the data.
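
The trade-off in that definition can be illustrated with a small sketch. It assumes a table held as a list of dictionaries and an index held as a plain dictionary; the `IndexedTable` class and its method names are hypothetical, chosen only for this example.

```python
from collections import defaultdict


class IndexedTable:
    """A toy table with one indexed column."""

    def __init__(self, indexed_column):
        self.rows = []
        self.indexed_column = indexed_column
        # The index: column value -> positions of matching rows.
        # This is the extra storage the definition refers to.
        self.index = defaultdict(list)

    def insert(self, row):
        self.rows.append(row)
        # The additional write: every insert also updates the index.
        self.index[row[self.indexed_column]].append(len(self.rows) - 1)

    def find_scan(self, value):
        """Without the index: examine every row."""
        return [r for r in self.rows if r[self.indexed_column] == value]

    def find_indexed(self, value):
        """With the index: jump straight to the matching rows."""
        return [self.rows[i] for i in self.index.get(value, [])]


table = IndexedTable("host")
table.insert({"host": "example.test", "path": "/"})
table.insert({"host": "other.test", "path": "/about"})
table.insert({"host": "example.test", "path": "/contact"})

by_scan = table.find_scan("example.test")
by_index = table.find_indexed("example.test")
```

Both lookups return the same rows. The indexed lookup avoids reading the whole table, which is where the retrieval speed-up comes from.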

Semantics
Semantics is the study of meaning. It focuses on the relation between signifiers, such as words, phrases, signs, and symbols, and what they stand for (their denotation). It is used to understand human expression through language.





Search Engine Internals

Crawlers
Indexers
Searching
Semantics
Ranking
How Does a Search Engine Work?













Types of Search Engine

Crawler Powered Indexes
         Guruji.com, Google.com
Human Powered Indexes
         www.dmoz.org
Hybrid Models
         Submitted URLs to a search engine?
Semantic Indexes
         Hakia.com


Search engine


Definition: An internet-based tool that searches an index of documents for a particular term, phrase or text specified by the user.
Common Characteristics:
Spider, Indexer, Database, Algorithm
Find matching documents and display them according to relevance
Frequent updates to the documents searched and to the ranking algorithm
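
The "find matching documents and display them according to relevance" step can be sketched as below. This is a toy under stated assumptions: relevance is taken to be the raw count of query terms in each document, which is far simpler than any real ranking algorithm, and the names `tokenize`, `build_index`, `search`, and `DOCS` are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Indexer: map each term to a count of its occurrences per document."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index


def search(index, query):
    """Return matching document ids, most relevant first.

    Relevance here is the total number of query-term occurrences
    (an assumption of this sketch). Ties are broken by document id.
    """
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical documents, standing in for pages gathered by a spider.
DOCS = {
    "d1": "A crawler visits web pages",
    "d2": "An indexer builds the index from pages; the index speeds up search",
    "d3": "Ranking orders results by relevance",
}

index = build_index(DOCS)
results = search(index, "index pages")
```

Here `results` lists `d2` ahead of `d1`, because `d2` contains more occurrences of the query terms. `d3` is absent because it matches neither term.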
Introduction of SEO:

A web search engine is a software system designed to search for information on the World Wide Web. The search results are generally presented as a list of results, often referred to as search engine results pages.


Search engines look through their own databases of information to find what you are looking for…