How Search Engines Work
by Jeremy Usher, Co-Founder, President
Search engines are complex, but it takes only a little knowledge and common sense to succeed.
For most companies, the majority of website traffic comes from search engines like Bing and Google. To reach more customers, websites must deliver content that is agreeable to both humans and machines. While current ranking algorithms are closely guarded trade secrets, the basic premises on which the engines operate are well known.
Those of us who remember the infancy of the Web will recall that search engines prior to Google did not always return quality results. In that era, algorithms frequently focused on meta tags - a site's own description of its contents. The meta approach proved untenable chiefly because author-provided meta-information does not necessarily have anything to do with what actually appears on a site. Realizing this, unscrupulous content authors quickly stuffed their sites with whatever tags would attract the most interest (I'll leave you to guess what kind of terms were most prominent), regardless of actual content. Needless to say, today's engines ignore meta tags for purposes of ranking (certain tags can be used for other purposes). To make up for poor automated results during that early period, the dominant engines (e.g. Yahoo!) heavily supplemented their indices with high-quality, human-collated, paid listings. This may have seemed good enough to a significant population of searchers. However, given the rapid growth of the Web, a manual approach was not sustainable.
Since Tim Berners-Lee launched the first website at CERN in 1990, the Web has grown at a blistering pace - doubling in size every 18 months. As of February 2013, there are roughly 187 million websites; meanwhile, Cisco reports that annual Internet traffic will cross the zettabyte threshold (1,000,000,000,000,000 megabytes transferred per year) in 2016. So how do search algorithms solve the riddle of making sense of so much data?
In 1996, while graduate students at Stanford, Larry Page and Sergey Brin proposed a simple answer - citations. Earlier research had gone into quantifying what everyone already knew about academic journals - namely that an important paper is cited many times by subsequent papers. Page and Brin had the key insight that the structure of the Web - with pages and sites connected via hyperlinks - was amenable to the very same kind of analysis as academic journals. Webpages that mattered - with important, authoritative and valuable content - would naturally and organically be linked to by other webpages. It was an elegant and profound criterion, and one that turned web content into a quasi-democracy, with important content (narrowed by matching traditional keywords) rising to the top of search results based on the number of votes (links) from other websites. The approach was also strong because, unlike easily corruptible metadata, a high PageRank (as the link-based score came to be called) depended on contributions from many third parties - a criterion that is nearly impossible to fake.
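To make the "links as votes" idea concrete, here is a minimal sketch of a link-based score in Python. It is a simplified, normalized variant of the idea described in the Brin and Page paper, not the algorithm any engine runs today. The function name, the toy site names and the iteration count are all assumptions made for illustration; the damping value of 0.85 is the figure the original paper mentions as typical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Score pages by inbound links.

    links maps each page to the list of pages it links to.
    All names and values here are illustrative assumptions.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the score spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its vote over everyone.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-site web: three sites link to c.example.
toy_web = {
    "a.example": ["c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(toy_web)
best = max(ranks, key=ranks.get)
```

In this toy web, c.example ends up with the highest score because three sites vote for it. Note that a.example comes second despite having only one inbound link: that single link comes from the best-ranked site, which is why links from authoritative pages count for more.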
Today, a successful search engine optimization campaign incorporates many factors, but the importance of inbound links remains undiminished. All things being equal, a site that is linked to by many other sites - particularly those that are also highly ranked, authoritative resources for their subject matter - will achieve a higher search ranking, and consequently more traffic, than its competitors. Once a site is well ranked, more people will find and link to it, meaning that, over time, optimization becomes self-reinforcing.
How do you get other people to link to your site?
With an understanding that search rankings are largely determined by the collective behavior of the rest of the Web, strategies for a successful campaign become clear.
a) Write Quality Content, and Write it Often
You need links. Users need something to link to. When you provide information of value - well-written, accessible text that offers a unique understanding of your subject matter - users will naturally reference your content with a link. Once you are established as a content leader, users will return to you as a resource. If you offer regular articles, tutorials or other insights, there will always be new reasons for users to return and share your content on their own sites or on relevant social networks. Syndicating content through RSS/Atom feeds makes consuming and sharing your content that much easier.
b) Be Aware of Link Opportunities
Be certain to take advantage of link opportunities when they arise. Any information that is provided in press releases, trade directories, third-party news coverage and the like should, if at all possible, include a clickable hyperlink back to your site.
c) Utilize Social Media
Because of the exponential mathematics of viral transmission, there is perhaps no faster way to spread word about your site than social media. Consider an idealized example where you post a clever article or video that catches the attention of the social web. Further suppose that each person who shares your link on Facebook has 5 friends who repost the link. Then after only 5 generations of sharing, the fifth generation alone accounts for 3,125 links. By the tenth generation, that figure would be 9,765,625 links. In addition to the startling potential for link building, social media is demonstrably where people spend their time. In 2012, comScore revealed that 1 out of every 5 minutes spent online is spent on social media, and that share has been climbing steeply each year.
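The arithmetic behind the idealized sharing example is easy to check. The short Python sketch below assumes, as the example does, that every share is reposted by exactly 5 friends and that nobody reposts twice; the function names are made up for illustration.

```python
def links_in_generation(reposts_per_share, generation):
    """Links created in one generation of sharing alone."""
    return reposts_per_share ** generation


def cumulative_links(reposts_per_share, generations):
    """Links created across generations 1 through `generations`."""
    return sum(reposts_per_share ** g for g in range(1, generations + 1))


fifth = links_in_generation(5, 5)     # 3,125
tenth = links_in_generation(5, 10)    # 9,765,625
total_after_ten = cumulative_links(5, 10)  # 12,207,030
```

Counting every generation rather than just the last one pushes the ten-generation total past 12 million. Real sharing never behaves this neatly, but the sketch shows why even a modest repost rate compounds so quickly.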
Keywords & Other Considerations
Search algorithms incorporate a number of ever-changing criteria into their ranking. While the number of inbound links is critical, the content of each page is also of tremendous importance. Remember always that the purpose of a search engine is to return results that correspond to a user's query. For an engine to return the correct results, it needs to have a clear understanding of a page's subject matter. The lion's share of this work is done by analyzing the text on each page for descriptive terms that yield insight into the primary content. These keywords succinctly describe the nature of your page, and exact matches between what users search for and what appears on your page deliver the best results. The relative frequency with which a term appears on a page naturally influences the algorithm's understanding of the relationship between that term and your page. So if your page includes the phrase "Maine's Coast" 5 times, it is a reasonable bet that your page is relevant for a similar search. But be careful - an overabundance of keywords can be seen as a spam attempt against the engine and actually result in a listing penalty. Use common sense and write natural text that contains a reasonable number of repetitions of terms that you believe people will include in search queries.
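If you want a rough sense of how often a phrase appears in your own copy, a few lines of Python will do. This is only a sketch for checking your writing, not a model of how any engine measures relevance or spam; the function name, the tokenizing rule and the sample text are assumptions made for the example.

```python
import re


def phrase_stats(text, phrase):
    """Return (occurrences, density) of a phrase in a block of text.

    Density is the fraction of the page's words taken up by the phrase.
    """
    # Lowercase and split into simple word tokens (letters, digits, apostrophes).
    words = re.findall(r"[a-z0-9']+", text.lower())
    target = re.findall(r"[a-z0-9']+", phrase.lower())
    if not words or not target:
        return 0, 0.0

    # Slide a window across the page and count exact phrase matches.
    size = len(target)
    count = sum(
        1
        for i in range(len(words) - size + 1)
        if words[i:i + size] == target
    )
    density = count * size / len(words)
    return count, density


sample = "Maine's Coast is rugged. Visit Maine's Coast in summer."
count, density = phrase_stats(sample, "Maine's Coast")
```

Here the phrase appears twice in a nine-word sample, which would read as heavy repetition on a real page. There is no published threshold for what counts as too much, so treat the number as a prompt to reread the text rather than as a target.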
In addition to the raw information provided, the semantics of keyword presentation is known to significantly impact search rankings. Keywords that appear in a domain name are of utmost value - if you own shoes.com and a user searches for "shoes," you're probably in very good shape. The title of a page is similarly of immense importance and is also what a user sees first when results are returned. Likewise, keywords that appear in headlines carry more weight than those that appear in the body of a text. Finally, keyword analysis is not restricted to the primary website. If inbound links from other sites contain keywords, they will be integrated into the search calculations.
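One way to picture this is as a weighted tally, where a keyword counts for more depending on where it appears. The Python sketch below is purely illustrative: the weights are invented for the example, real engines do not publish theirs, and the field names and sample page are hypothetical.

```python
# Invented weights, for illustration only. Real ranking weights are not public.
FIELD_WEIGHTS = {
    "domain": 5.0,
    "title": 4.0,
    "headline": 2.0,
    "anchor_text": 2.0,  # text of inbound links from other sites
    "body": 1.0,
}


def weighted_keyword_score(keyword, fields):
    """Sum keyword occurrences, weighted by the field in which they appear."""
    keyword = keyword.lower()
    score = 0.0
    for field, text in fields.items():
        occurrences = text.lower().count(keyword)
        score += FIELD_WEIGHTS.get(field, 0.0) * occurrences
    return score


# Hypothetical page data.
page = {
    "domain": "shoes.com",
    "title": "Running Shoes for Every Budget",
    "headline": "Find Your Fit",
    "anchor_text": "great shoes here",
    "body": "Our shoes are tested by runners. Shoes ship free.",
}
score = weighted_keyword_score("shoes", page)
```

With these made-up weights, the single appearances in the domain and the title contribute more to the score than both appearances in the body combined, which is the general pattern the paragraph above describes.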
While there are many other factors that ultimately influence rank - from the status of other sites that link to you, to the length of time your domain has been registered - these basics of search engine optimization are enough for most site owners to achieve their goals. It all begins by delivering great content, so get started!
Brin, Sergey and Page, Lawrence. “The Anatomy of a Large-Scale Hypertextual Web Search Engine.” Retrieved February 17, 2013 from the Stanford University Infolab website: http://infolab.stanford.edu/pub/papers/google.pdf
Cisco. (May 30, 2012). “Cisco Visual Networking Index: Forecast and Methodology, 2011-2016.” Retrieved February 18, 2013 from the Cisco website: http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html
Netcraft. (February 1, 2013). “February 2013 Web Server Survey.” Retrieved February 18, 2013 from the Netcraft website: http://news.netcraft.com/archives/2013/02/01/february-2013-web-server-survey.html
Shaw, Mike. (February 2012). “The State of Social Media.” Retrieved February 20, 2013 from the SlideShare website: http://www.slideshare.net/karanbhujbal/the-state-of-social-media2012-comscore-report