Search engine

The educational technology and digital learning wiki

According to Wikipedia, “a search engine is a program that help users to find information stored on a computer system. There are several types of search engines, which are designed to retrieve documents stored on the World Wide Web, inside a corporate or proprietary network, or in a personal computer.” (accessed 17:33, 26 January 2007 (MET)) The expression search engine usually refers to a Web search engine, which retrieves information from the public Web.

When users need a piece of information, they ask for web pages containing a given keyword, and the search engine returns a list of items in which the keyword can be found. This list is usually sorted by the relevance of the results.


A short history of search engines

In 1990 the very first search engine, Archie, was created. It indexed files located on public FTP (File Transfer Protocol) sites so that users could find specific files inside a searchable database. It could not search by file contents.

In 1994 the WebCrawler search engine was created. It was the first full-text search engine, i.e. one where users can search for any word in any web page. Full-text search became the standard for all major search engines, and WebCrawler was also the first search engine to be widely known.

In 1994 Lycos came out; soon after, Excite and AltaVista appeared. At the time of writing, the most popular search engines are Google, Yahoo! Search, and MSN Search by Microsoft.

How search engines work

Web search engines use 'robot' applications that automatically search the Internet, visit web pages, and store data about their content, which is then compiled into a huge index. Every search engine uses a proprietary algorithm to create its indexes so that only meaningful results are returned for each query. In order to find relevant web pages, search engines give each document a rank, i.e. a relevance score. Relevance scores reflect the number of times a search term appears, whether it appears in the title, whether it appears at the beginning of the document, and whether all the search terms are near each other. However, a short summary is usually more useful than a ranking.
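The four signals listed above can be combined into a toy relevance score. The following Python sketch is only an illustration: the function name relevance_score, the weights, and the "first 10 words" and "within 5 words" thresholds are arbitrary assumptions, not the algorithm of any real search engine.

```python
def relevance_score(query_terms, title, body):
    """Toy relevance score built from four simple signals (weights are assumptions)."""
    title_words = title.lower().split()
    body_words = body.lower().split()
    terms = [t.lower() for t in query_terms]
    score = 0.0
    first_positions = []
    for term in terms:
        # Signal 1: number of times the term appears in the body.
        score += body_words.count(term)
        # Signal 2: bonus if the term appears in the title.
        if term in title_words:
            score += 5.0
        if term in body_words:
            first = body_words.index(term)
            first_positions.append(first)
            # Signal 3: bonus if the term appears near the beginning.
            if first < 10:
                score += 2.0
    # Signal 4: bonus if every term occurs and the first occurrences are close together.
    if len(terms) > 1 and len(first_positions) == len(terms):
        if max(first_positions) - min(first_positions) <= 5:
            score += 3.0
    return score


page_a = relevance_score(["apple", "trees"], "Apple trees",
                         "apple trees grow in orchards and apple trees need sun")
page_b = relevance_score(["apple", "trees"], "Computers",
                         "the company sells computers and one apple")
print(page_a > page_b)
```

With these made-up weights, the page about apple trees scores higher than the page that mentions "apple" only once, which is the behaviour a relevance score is meant to produce.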

Looking for more words narrows down the search, while looking for fewer words widens it. Many search engines support operators such as AND, OR and NOT to help. Some search engines have sophisticated ways of narrowing down a search to increase the chances of the user finding what they want: they can focus on a particular type of website, group their index into subjects and topics, or offer users suggestions after their initial search.
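One common way to picture AND, OR and NOT is as set operations on an index that maps each word to the pages containing it. The sketch below assumes a tiny in-memory collection; the helper build_index and the sample documents are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical three-page collection.
docs = {
    1: "apple trees in the garden",
    2: "apple computers and laptops",
    3: "pear trees in the orchard",
}
index = build_index(docs)

# AND narrows the search: pages must contain both words.
apple_and_trees = index["apple"] & index["trees"]
# OR widens the search: pages may contain either word.
apple_or_trees = index["apple"] | index["trees"]
# NOT removes pages that contain the unwanted word.
apple_not_computers = index["apple"] - index["computers"]

print(apple_and_trees, apple_or_trees, apple_not_computers)
```

In this sketch AND corresponds to set intersection, OR to union, and NOT to set difference, which is why adding AND terms can only shrink the result list.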

Here are the steps required to get information from a search engine:

  1. Users make a query by typing keywords into a search engine.
  2. The search engine software quickly sorts through millions of pages in its database to find matches to this query.
  3. The results are ranked in order of relevancy.
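The three steps above can be sketched in a few lines of Python. This is a deliberately naive illustration: the function search, the sample pages and their addresses are assumptions, and a real engine would consult a prebuilt index instead of scanning every page.

```python
def search(query, pages):
    """Return page addresses matching every keyword, best match first."""
    # Step 1: the user's query is split into keywords.
    keywords = query.lower().split()

    # Step 2: find the pages that contain all keywords.
    matches = []
    for url, text in pages.items():
        words = text.lower().split()
        if all(k in words for k in keywords):
            # Assumed relevance measure: total number of keyword occurrences.
            score = sum(words.count(k) for k in keywords)
            matches.append((score, url))

    # Step 3: rank by score (highest first), then by address for stable order.
    matches.sort(key=lambda m: (-m[0], m[1]))
    return [url for score, url in matches]


# Hypothetical pages; the addresses are placeholders.
pages = {
    "a.example": "search engines index the web",
    "b.example": "web search and more web search tips",
    "c.example": "cooking recipes",
}
results = search("web search", pages)
print(results)
```

Here the page that mentions both keywords most often is listed first, and the page without any of the keywords is not returned at all.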

How to search databases

Boolean Searching: Users can type AND or NOT, or just put a + (plus) or - (minus) sign before a search term, in order to narrow the search and find exactly what they are looking for.
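As a rough illustration of the plus and minus signs, the sketch below assumes the common convention that a sign is written directly in front of a term: + marks a term that must appear, - marks a term that must not. The functions parse_query and matches are hypothetical helpers, not part of any search engine's interface.

```python
def parse_query(query):
    """Split a query into required (+), excluded (-) and plain terms."""
    required, excluded, plain = [], [], []
    for token in query.lower().split():
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            plain.append(token)
    return required, excluded, plain


def matches(text, required, excluded):
    """True if the text has every required term and no excluded term."""
    words = set(text.lower().split())
    has_required = all(term in words for term in required)
    has_excluded = any(term in words for term in excluded)
    return has_required and not has_excluded


required, excluded, plain = parse_query("+apple -computers trees")
print(matches("apple trees in bloom", required, excluded))
print(matches("apple computers on sale", required, excluded))
```

The first page is accepted because it contains "apple" and not "computers"; the second is rejected because of the excluded term.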

Search by Field: Users can specify where on the page the search terms should appear in order to be given high priority or to be included in the results. This option is usually available on the 'advanced search' page of the site.
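A field search can be pictured as a lookup restricted to one part of each page. In the sketch below, pages are modelled as dictionaries with assumed "url", "title" and "body" fields; both this structure and the function search_field are illustrative assumptions.

```python
def search_field(pages, term, field="title"):
    """Return the addresses of pages whose given field contains the term."""
    term = term.lower()
    return [page["url"] for page in pages
            if term in page[field].lower().split()]


# Hypothetical pages with separate title and body fields.
pages = [
    {"url": "a.example", "title": "Apple trees", "body": "how to grow fruit"},
    {"url": "b.example", "title": "Gardening", "body": "apple trees need sun"},
]

print(search_field(pages, "apple"))          # title-only search
print(search_field(pages, "apple", "body"))  # body-only search
```

The same keyword returns different pages depending on the field that is searched, which is what makes a "title only" search more selective.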

Exact Quotation Searches: Users can type the expression they are looking for between double quotation marks, so that the results contain the terms together and in exactly that order, i.e. as an exact phrase.
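The difference between a keyword search and an exact phrase search can be shown with a small sketch. The function contains_phrase is a hypothetical helper that checks whether the words of the phrase occur next to each other and in the same order; it ignores punctuation handling for simplicity.

```python
def contains_phrase(text, phrase):
    """True if the words of phrase appear in text consecutively and in order."""
    words = text.lower().split()
    target = phrase.lower().split()
    n = len(target)
    if n == 0:
        return False
    # Slide a window of n words over the text and compare it with the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


print(contains_phrase("the quick brown fox", "quick brown"))
print(contains_phrase("the brown and quick fox", "quick brown"))
```

The second text contains both words but not as a phrase, so only the first text would be returned for the quoted query.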

Filters: Filters can be used to narrow searches. The most common types of filters get rid of 'adult' sites.

Mind Map: A search engine, Kartoo, shows its search results as a visual concept map of connections between different sites.

Natural Language Searching: When looking for specific information, users can type a query in the form of an English question.

Google Search: Users type the terms they are looking for and press Enter, and the engine returns all websites that contain the search terms. Retrieved pages are shown in a list ordered by how popular each page is on the web.

How to improve your search results

The first thing users can do to improve their search results is to think carefully about what they are looking for and be as precise as possible about their search terms. It can be more useful to enter only a few keywords and make “title only” searches than to look for many search words at the same time.

Users can also refine their searches using the “advanced search” option. Even though this can really make a great difference in the quantity and quality of pages returned, a BBC article argued that only 10% of people refine the results.

If users still don’t find what they need, they should try to be more technical by using Boolean terms – the AND, NOT, OR, and AND NOT operators.

Moreover, users should keep in mind that search engines are very different and offer different services. They should know which search engine is the right one according to their needs and this will help them to achieve better matches.

However, an article that appeared on the BBC website argues that it is much better “to become as familiar as possible with one search engine and stick to it.” This advice is based on a study which found that “of 600 queries 60% of results returned for a particular set of terms will be the same across all search sites.”
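To make a figure such as "60% of results are the same" concrete, the overlap between two result lists can be computed as the share of results from one engine that also appear in the other. The sketch below uses invented result lists and a hypothetical function overlap_fraction; it does not reproduce the method of the study quoted above.

```python
def overlap_fraction(results_a, results_b):
    """Share of the results in results_a that also appear in results_b."""
    a, b = set(results_a), set(results_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


# Invented result lists for the same query on two engines.
engine_one = ["p1.example", "p2.example", "p3.example", "p4.example", "p5.example"]
engine_two = ["p2.example", "p3.example", "p5.example", "p8.example", "p9.example"]

share = overlap_fraction(engine_one, engine_two)
print(f"{share:.0%} of the first engine's results also appear in the second")
```

In this made-up case three of five results are shared, i.e. an overlap of 60%.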

Most used search engines

According to Darren Waters (2006), technology editor of the BBC News website, Google is “the world's most popular search engine.” It has been argued that its dominance of the search market, which is said to stand in the UK at 75% (followed by MSN and AskJeeves, both on 8.4%, and Yahoo with 8%), is due to the fact that using Google has become a habit. In fact, Mr Elliott said that “it isn't much trouble to go to another but people increasingly have Google on their browser window and even for those that type it in each time it has become a habitual thing.”

It has never been proved that Google has better results than other search engines. On the contrary, webreference.com, for example, tested many search engines following these parameters: “One Item Among Many Related Pages”, “Obscure Item”, “Selectivity: Apple trees NOT computers”. On the basis of these tests, it appeared that “Lycos is the official heavy weight search engine champion of the universe”, even though it might be better to “choose different engines for different tasks.” Mr Elliott supports this idea, arguing that “people are unreasonably attached to Google and the issue is that people are not experimenting with other products.”

Other search engines, such as Ask, WebFetch and its US sister search engine DogPile, have been trying to break the so-called “Google habit”. Even though their position in the search market is still very limited, Jane Wakefield (2007) argued that “the movement to persuade users away from the dominant search engines such as Google and Yahoo may be small but it is gathering momentum from those with more solid radical credentials.”

May 2008: A new functional search engine has been launched: Powerset. It is based on Natural Language Searching and searches Wikipedia articles and the free database Freebase. It makes use of cross-reference tables, topic indexing, and bibliography to ease dynamic navigation in the given information.

The future of search

According to Mr Merrill, Google plans to expand its activities considerably in the future. Its projects include better search for mobile devices, personalised searches, language translation, access to offline information, defeating web spam, instant messaging and online mapping.

Moreover, according to Spencer Kelly (2005), Google “is also hoping to build on its online music radio service, Launchcast, which includes a personalised music selector that learns the types of music you like and streams you different tracks accordingly.” He adds that, as regards videos, Google “is planning a service that allows you to search transcripts and plotlines for videos stored on the web, and watch them.”

As regards Yahoo, “create a Yahoo user ID and you not only get free e-mail, but you also get a chance to create your own homepage, to which you can add your own elements: calendar, e-mail inbox, news stories, weather, change the colour scheme, etc.” This way, your web interface can look just the way you want it to.

The trend seems to be toward ever more personal search engines: not only homepages but also search options will be personalised. According to Jane Wakefield (2005), Jimmy Wales, founder of Wikipedia, wants to offer an alternative "people-powered" search engine. The journalist also said that “his plan is to Wiki-fy the process of internet search, so that human beings decide openly how to rank and organise information, not the huge private servers of Google and Yahoo.” Jimmy Wales called this ambitious project "Search Wikia", saying that it will be “the search engine that changes everything.”

Maybe search engines that visualize search results and their relations (like Nestor or InstaGrok) will also become more popular.

Impact on society

Search engines like Google Search, being run by advertising companies, change the organisation of the economy and, directly and indirectly, society. Graham (2017, abstract) argues that “Google’s dominance over the web allows it to dictate various norms and practices that regulate the state of contemporary capitalism online.”

Links

  • Search engine relationships (Bruce Clay). Shows data traffic between major search engines over time (as of June 2010, the last update was made in 7/2009).
  • Faceted search (Wikipedia). Faceted search is a combination of direct search (à la Google) and navigational search (à la DMOZ).

References

  • Graham, Richard (2017). Google and advertising: digital capitalism in the context of Post-Fordism, the reification of language, and the rise of fake news. Palgrave Communications, volume 3, Article number: 45. doi:10.1057/s41599-017-0021-4