Searching the Internet Effectively:

Search Engines


Search engines work best for concepts that are represented by clearly defined terminology that appears only in relevant items. Because a search engine's "spider" takes time to traverse the Web, its index will never be entirely up to date.

Search Engines vary in the number of pages that they cover (see Greg Notess' estimates of size and Search Engine Watch's 2004 Update).

No search engine covers the entire Internet. Different search engines cover different parts of it, and differences in ranking mean that different sites appear in the first few pages of results. A 2007 study showed that in a typical search repeated across several search engines, almost 90% of results were returned by only one search engine.
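To make the overlap figure concrete, here is a minimal sketch of how such a percentage could be calculated. The engine names and result lists are invented for illustration and are not taken from the 2007 study.

```python
from collections import Counter


def unique_share(results_by_engine):
    """Return the fraction of distinct URLs returned by exactly one engine."""
    counts = Counter()
    for urls in results_by_engine.values():
        counts.update(set(urls))  # count each URL at most once per engine
    if not counts:
        return 0.0
    only_one = sum(1 for c in counts.values() if c == 1)
    return only_one / len(counts)


# Hypothetical first-page results for the same query on three engines.
sample = {
    "engine_a": ["site1.example", "site2.example", "site3.example"],
    "engine_b": ["site3.example", "site4.example", "site5.example"],
    "engine_c": ["site6.example", "site7.example", "site1.example"],
}

print(unique_share(sample))  # 5 of the 7 distinct sites come from one engine only
```

A high value means that relying on a single search engine would miss most of the sites found by the others.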

The Thumbshots ranking comparison is a good way to visualise how different search engines find different sites and rank them differently.

Search engines do not index all pages on the Web. In particular, search engine robots may not index:

So there is a lot of the Internet that is not covered by the standard search engines: this is sometimes called "the invisible web". (Actually it is not invisible, and much of it is indexed by other search tools, e.g. directories.)

Search Features

Search engines have different methods of searching, and offer different features. A good comparative table of features is provided by Greg Notess. However, search engines change their features frequently.

Exploring a search engine: Google

http://www.google.com/
Google is one of the most popular search engines. It ranks search results according to the number of links to a site, and whether the linking sites are themselves well linked. This is similar to using citation counts to evaluate the importance of print documents, and appears to be one of the keys to its effectiveness.
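The idea of ranking by links, weighted by how well linked the linking sites are, can be sketched with a simplified PageRank-style calculation. This is an illustration only, not Google's actual ranking code: the damping value of 0.85, the iteration count and the three-page link graph are all assumptions.

```python
def link_rank(links, damping=0.85, iterations=50):
    """Score pages by repeatedly passing each page's score along its links.

    links maps a page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical web: pages a and b both link to c, and c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"]}
scores = link_rank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Page c scores highest because two pages link to it. Page a comes second because its single inbound link comes from the well-linked page c, while page b has no inbound links at all.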

Relevancy

Most search engines present results ranked by relevancy. How relevancy is determined varies according to the search engine, but in general relevancy reflects:

Using Relevance effectively

Try a search for "pollution aspects of windmills" on Google: start with "pollution windmills"; then add more terms to see the effect on your search result.

Relevancy ranking generally works well for sites with specific names e.g. name of organisation. But:

Quality of results may be improved by adding the name of a reputable organisation to the search.

Excluding/including terms

If you enter several terms, Google will search for all the terms (implied AND). To specify that you don't want pages with a particular term, put "-" in front of the term.

This is useful where a search term may have different meanings. For example, "cycle" can mean the activity of bicycling or (particularly in the US) a motorcycle. Compare searches for:

cycle

cycle -motorcycle
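The effect of the implied AND and the "-" prefix can be imitated on a handful of made-up page descriptions. This is a toy word matcher written for illustration; the page texts are invented, and a real search engine does much more than split text on spaces.

```python
def matches(query, text):
    """True if text contains every plain term and none of the '-' terms."""
    words = set(text.lower().split())
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            if term[1:] in words:
                return False  # an excluded term is present
        elif term not in words:
            return False  # a required term is missing (implied AND)
    return True


# Hypothetical page descriptions.
pages = [
    "cycle routes for commuters",
    "motorcycle and cycle repair shop",
    "the water cycle explained",
]

for query in ("cycle", "cycle -motorcycle"):
    print(query, "->", [p for p in pages if matches(query, p)])
```

The first query matches all three pages; the second drops the repair shop page because it mentions "motorcycle".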

Phrases and adjacency

In Google, as in most search engines, enclosing a phrase in quotes searches on the phrase. This is particularly useful when a specific phrase contains very common words. For example, compare on Google:

just in time management

"just in time management"
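The difference between the two searches can be shown with a small sketch that compares "all words present" with "words adjacent and in order". The two sample sentences are invented, and the matching is deliberately simple.

```python
def contains_all_words(query, text):
    """True if every query word occurs somewhere in the text."""
    words = text.lower().split()
    return all(term in words for term in query.lower().split())


def contains_phrase(phrase, text):
    """True if the phrase words occur next to each other, in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


# Hypothetical page texts.
doc_a = "an introduction to just in time management for factories"
doc_b = "management arrived just in time for the meeting"

for doc in (doc_a, doc_b):
    print(contains_all_words("just in time management", doc),
          contains_phrase("just in time management", doc))
```

Both texts contain all four words, but only the first contains them as a phrase, so only the first would be expected from the quoted search.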

Alternative terms

Often the same concept may be described by several different words. For instance, "pension" and "superannuation" describe the same concept. We need the OR operator to search for these on Google. For example, to search for widows' pensions in New Zealand:

Widows pension OR pensions OR superannuation zealand

Use the OR operator to include alternative spellings: "images christmas OR xmas decorations"
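If you build searches like this often, the query string can be assembled and turned into a link with a few lines of code. In this sketch the search address and the "q" parameter name are assumptions about how a Google search link is formed, and the helper names are made up for this example.

```python
from urllib.parse import urlencode


def or_group(alternatives):
    """Join alternative terms with the OR operator."""
    return " OR ".join(alternatives)


def search_url(query, base="https://www.google.com/search"):
    """Build a search link; the base address and 'q' parameter are assumed."""
    return base + "?" + urlencode({"q": query})


query = "widows " + or_group(["pension", "pensions", "superannuation"]) + " zealand"
print(query)
print(search_url(query))
```

Note that OR is written in capitals, as in the examples above, and that spaces in the query are encoded as "+" in the link.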

Special operators

It can also be useful to search by:

How can you do these types of search in Google? One way is to use the "advanced search" option, which builds the search for you.

Caching

Google also keeps ("caches") a copy of the site as it was when its robot indexed it; this can be useful for sites that have become inaccessible. You can access the copy by following the link "Cached" in the record display.

Exercises

Try these searches on Google:

Google has a number of specialist features that are accessible through Soople.

Google has even had a special song written about it.

What makes a good query? [Stacey&Stacey 2004]

Other selected search engines

Yahoo!

http://search.yahoo.com/

Yahoo! started life as a directory, then linked to search engines such as Google. However, Yahoo! now has its own search engine. Try a search on both Yahoo! and Google, and compare the results.

Bing

http://www.bing.com/

Bing (previously Live Search) is Microsoft's Web search engine. It suggests related searches, and provides a snapshot of text from each page when you place the cursor next to the item in the search result. The advanced search option allows you to dynamically build a search.

Ask.com

http://www.ask.com
The Ask.com search engine is the result of a merger of two "search engines":

Check Ask.com for information about

Exalead

http://www.exalead.com/
Includes some useful search features, including truncation, phonetic search, and a NEAR operator. Gives suggestions for narrowing your search.

Wolfram|Alpha

http://www.wolframalpha.com/
Wolfram|Alpha indexes sites that contain numeric or other structured information, and attempts to answer questions about this data. So "height Mount Cook" gives the altitude and calculates the likely air temperature and pressure at the summit.

Google Scholar

http://scholar.google.co.nz/
Google Scholar indexes research material (e.g. reports, e-journals, conference papers) on the web, and identifies citations to other research material. The display includes the number of times a document has been cited, and searches also retrieve references to print documents that have been cited in research documents on the web.

Metasearch engines

Metasearch engines, or combined search engines, search across several search engines. Since no one search engine covers the whole Web, using metasearch engines makes sense, if:

A disadvantage is that you can't take advantage of the special features of a particular search engine, and you may only be shown the first few results from each search engine. Also, some search engines don't allow access by metasearch engines.

Selected metasearch engines are:

Clusty

http://clusty.com/
Clusty (formerly Vivisimo) returns results from several search engines, and clusters them by common terms. 

MetaCrawler

http://www.metacrawler.com/
MetaCrawler collates results from the different search engines, and presents a combined list, with duplicates removed.
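Collating results and removing duplicates can be sketched as follows. This is not MetaCrawler's actual method: the round-robin interleaving, the crude address normalisation and the sample addresses are all assumptions made for illustration.

```python
from itertools import zip_longest


def normalise(url):
    """Crude normalisation so trivially different addresses compare equal."""
    url = url.strip().lower()
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            url = url[len(prefix):]
            break
    if url.startswith("www."):
        url = url[4:]
    return url.rstrip("/")


def merge_results(result_lists):
    """Interleave ranked lists (all first results, then all second results,
    and so on), keeping only the first occurrence of each address."""
    seen = set()
    merged = []
    for tier in zip_longest(*result_lists):
        for url in tier:
            if url is None:
                continue  # this engine returned fewer results
            key = normalise(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


# Hypothetical results from two engines for "student allowances".
engine_a = ["http://www.example.org/allowances", "http://example.com/students"]
engine_b = ["http://example.org/allowances/", "http://example.net/loans",
            "http://example.com/faq"]

print(merge_results([engine_a, engine_b]))
```

The two forms of the example.org address are treated as the same page, so the combined list has four entries rather than five.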

Compare a search on Clusty and MetaCrawler for information on "student allowances"

Search Engine Resources:

Some further reading on search engines:
Search Engine Showdown / Greg Notess. http://www.searchengineshowdown.com/
Surveys search engines from a reference librarian's point of view.
Search Engine Watch / Danny Sullivan. http://searchenginewatch.com/
A good overall and up-to-date directory, though oriented to assisting web managers to get their sites "noticed" rather than helping searchers.
Google Guide
An interesting interactive tutorial on Google, with sections for novices, experts and teens.

Last updated 19 June 2009 by Alastair Smith