A search engine is a computer program that retrieves information from a database according to criteria defined by the user. Modern search engines query databases holding huge amounts of data collected from the World Wide Web, newsgroups, and directory projects.
The first search engine was created before the World Wide Web existed, but after the Internet had become popular in university circles. At that time, in the late 1980s and early 1990s, one of the main protocols in use on the Internet was the File Transfer Protocol (FTP). FTP servers existed throughout the world, usually at universities, research facilities, or government agencies. Some students at McGill University in Montreal decided that a centralized database of the files available on the various popular FTP servers would save time and offer a great service to others. This was the origin of the Archie search engine.
Archie, whose name was short for archive, was a program that regularly logged in to the FTP servers on its list and built an index of the files on each server. Because processor time and bandwidth were still fairly valuable commodities, Archie checked for updates only every month or so. At first the index was meant to be searched with the Unix command grep, but a better user interface was soon developed to make searching easier. Following Archie, a handful of search engines sprang up to search the similar Gopher protocol, two of the most famous being Jughead and Veronica. Archie became largely obsolete with the advent of the World Wide Web and the search engines that followed, but Archie servers do still exist.
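The idea behind Archie, an index of file names gathered from many servers and then searched by pattern, can be illustrated with a short sketch. This is not Archie's actual implementation: the server names, the paths, and the functions build_index and search are hypothetical, and the directory listings are supplied as plain data rather than fetched over FTP.

```python
import re
from collections import defaultdict


def build_index(listings):
    """Map each file name to the (server, path) locations where it was seen.

    `listings` is a hypothetical dict of server name -> list of file paths,
    standing in for the directory listings a crawler would collect over FTP.
    """
    index = defaultdict(list)
    for server, paths in listings.items():
        for path in paths:
            filename = path.rsplit("/", 1)[-1]  # keep only the last path component
            index[filename].append((server, path))
    return index


def search(index, pattern):
    """Return sorted (server, path) pairs whose file name matches a regex, grep-style."""
    regex = re.compile(pattern)
    hits = []
    for filename, locations in index.items():
        if regex.search(filename):
            hits.extend(locations)
    return sorted(hits)


# Made-up listings for two made-up servers.
listings = {
    "ftp.example.edu": ["/pub/gnu/emacs-18.59.tar.Z", "/pub/docs/README"],
    "ftp.example.org": ["/mirrors/emacs-18.59.tar.Z", "/pub/tools/gzip.tar"],
}
index = build_index(listings)
results = search(index, r"emacs")
```

Here results holds the two locations of the emacs archive, one on each server. Because the index is built ahead of time, a search never has to contact the FTP servers themselves, which is why an index refreshed only monthly was still useful.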
In 1993, not long after the creation of the World Wide Web, Matthew Gray developed the World Wide Web Wanderer, the first web robot. The Wanderer set out to index the websites that existed on the Internet by capturing their URLs, but it did not track any of their actual content. The index associated with the Wanderer, an early sort of search engine, was called Wandex.
A few other small projects followed the Wanderer and began to approach the modern search engine. These included the World Wide Web Worm, the Repository-Based Software Engineering (RBSE) spider, and JumpStation. All three used data collected by web robots to answer users' queries. Still, results were for the most part returned unfiltered, although RBSE did attempt to rank the value of pages.
In 1993, Excite, a company founded by some Stanford students, released what is arguably the first search engine to incorporate analysis of page content. This initial offering was meant for searching within a single site, however, not for searching the web as a whole.
In 1994, though, the search engine had a major breakthrough. WebCrawler went live with a search engine that captured not only the title and header of pages on the Internet, but all of their content as well. WebCrawler was enormously successful, so much so that it was often unavailable because its system resources were fully occupied.
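The difference between indexing only titles and indexing full page content can be sketched with a small inverted index. This is only an illustration under stated assumptions, not WebCrawler's actual design: the page data, the URLs, and the functions tokenize, build_inverted_index, and lookup are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(pages, fields):
    """Map each word to the set of URLs whose chosen fields contain it."""
    index = defaultdict(set)
    for url, page in pages.items():
        for field in fields:
            for word in tokenize(page.get(field, "")):
                index[word].add(url)
    return index


def lookup(index, query):
    """Return the URLs that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Made-up pages: each has a title and a body.
pages = {
    "http://example.com/a": {
        "title": "Home page",
        "body": "Notes on FTP archives and Gopher menus",
    },
    "http://example.com/b": {
        "title": "Gopher guide",
        "body": "How to browse menus",
    },
}
title_only = build_inverted_index(pages, ["title"])
full_text = build_inverted_index(pages, ["title", "body"])
```

With these made-up pages, a query for "gopher" against title_only finds only the second page, while the same query against full_text finds both, because the first page mentions Gopher only in its body. The cost is a much larger index, since every word of every page must be stored.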
A bit later that year Lycos was released, offering many of the same features as WebCrawler and building on them. Lycos ranked its results by relevance and let the user adjust a number of settings to get better-fitting results. Lycos was also huge: within a year it had well over one million websites archived, and within two years it had reached 60 million.