A Web Search Engine: A Detailed Analysis
Date Added: March 21, 2010 03:47:17 PM
Author: nactalia716
Category: Internet Search: Search Engines
An Internet search engine is a tool designed to search for data on the Internet. The search results are commonly called hits and are presented as a list. The data may consist of web pages, images and other types of files. Some search engines also gather information available in databases or open directories. Unlike Internet directories, which are maintained by human editors, search engines operate automatically or combine algorithmic and human input.

Internet search engines work by storing data about the many web pages they retrieve from the WWW. These pages are retrieved by a web crawler, also called a spider: an automated Web browser that follows every link it discovers. The content of each page is then analyzed to determine how to index it. Words, for example, are extracted from titles, headings and subheadings, or from special fields called meta tags. Data about web pages is stored in an index database for later use in queries.

Some search tools, such as Google, store all or part of the source page (called a cache) along with information about the page, whereas others, such as AltaVista, store every word of every page they have discovered. The cached page always contains the text that was searched, because it is the copy that was actually indexed. It can therefore be very helpful, since it may hold information that is no longer available elsewhere.

When a user types search words into the search field, the tool looks through its index and returns a list of the best-matching web pages according to its criteria, usually with a short summary containing the document's title and sometimes extracts from the text. Some search tools provide an advanced feature called proximity search, which lets users specify the maximum distance allowed between keywords. The usefulness of a search engine depends on the relevance of the results it returns.
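The indexing and proximity search steps described above can be illustrated with a small sketch. It is only a toy model under stated assumptions, not how any particular engine works: the `pages` dictionary stands in for pages a crawler has already fetched, the `example.test` URLs are made up, and the function names `tokenize`, `build_index` and `proximity_search` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each word to {url: [word positions]} (a positional inverted index)."""
    index = defaultdict(dict)
    for url, text in pages.items():
        for position, word in enumerate(tokenize(text)):
            index[word].setdefault(url, []).append(position)
    return index


def proximity_search(index, first, second, max_distance):
    """Return URLs where both words occur at most max_distance words apart."""
    first_hits = index.get(first.lower(), {})
    second_hits = index.get(second.lower(), {})
    matches = []
    # Only pages that contain both words can match.
    for url in first_hits.keys() & second_hits.keys():
        if any(abs(p - q) <= max_distance
               for p in first_hits[url]
               for q in second_hits[url]):
            matches.append(url)
    return sorted(matches)


# Hypothetical pages standing in for crawler output.
pages = {
    "example.test/a": "A web crawler follows every link it discovers",
    "example.test/b": "The crawler stores pages and later a link is followed",
}
index = build_index(pages)
near = proximity_search(index, "crawler", "link", 3)
far = proximity_search(index, "crawler", "link", 6)
```

Here `near` contains only the first page, where "crawler" and "link" are three words apart, while `far` contains both pages. Storing word positions rather than just page lists is what makes the distance check possible.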
Since there can be millions of web pages that contain a particular word or phrase, some of them will be relevant to the query and others will not. Search engines therefore rank the results so that the "best" ones are displayed first. The way a search tool ranks and displays web pages varies from one engine to another. The techniques also change over time, as Internet usage changes and new methods are developed.
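As a rough illustration of ranking, the sketch below scores each page by how often the query words appear in it and lists the highest scores first. This term-count scoring is a deliberately simple stand-in chosen for the example; real engines use their own, far more elaborate criteria. The `rank` function, the sample pages and the `example.test` URLs are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(pages, query):
    """Return (url, score) pairs for pages matching any query word, best first."""
    terms = tokenize(query)
    scored = []
    for url, text in pages.items():
        counts = Counter(tokenize(text))
        # Score = total occurrences of the query words in the page.
        score = sum(counts[term] for term in terms)
        if score > 0:  # pages with no query word are treated as irrelevant
            scored.append((url, score))
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical pages standing in for an engine's stored copies.
pages = {
    "example.test/engines": "search engines rank search results",
    "example.test/spiders": "a spider crawls pages for a search engine",
    "example.test/cooking": "recipes for dinner",
}
results = rank(pages, "search results")
```

With these sample pages, `results` lists `example.test/engines` first with a score of 3, then `example.test/spiders` with a score of 1, and leaves out the cooking page entirely.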