How A Google Algorithm Works
Over the course of Google’s long (in internet years) history, there have been many developments and updates that dramatically changed the course of the company’s progress. What began in 1998 as a relatively crude search engine has evolved into a massively far-reaching search giant.
To wit, Google now makes an astonishing $89 billion in yearly revenue. Its main method for generating that revenue is selling ad space on its search results pages. These pages serve as the proving ground for websites, companies, organizations, and everything in between, all of them vying for generous treatment from Google so that they might be seen by more users. That attention economy means that every tweak Google makes – especially ones to its all-important ranking algorithms – has an immense impact on companies that use the internet to build an audience.
After all, the ranking algorithms are what Google uses to make decisions on which sites get placed higher in results and which ones are banished to the bottom. There are many factors that go into these rankings, but the entire SEO industry – one that we are pretty proud to call home – is based on parsing out some order from the madness and helping sites adjust accordingly.
Google updates these algorithms pretty much constantly, with entire teams of engineers working to fix bugs and improve overall performance. Sometimes, though, updates are big enough to deserve individual attention, naming, and responses from the SEO community. In fact, some of them are big enough to transcend the initial rollout and enjoy long lives as standalone, rankings-impacting algorithms.
Two of the most famous of these are Panda and Penguin. Let’s take a look at some of the differences between the two and how you might stay on their respective good sides as a webmaster or business owner.
The Panda algorithm, named after one of its engineers, Navneet Panda, arrived on the scene in early 2011. Its main goal is to reduce the number of low-quality sites appearing at or near the top of Google’s results. At first, many SEOs thought this was aimed directly at “content farms” – sites that exist solely to build credibility by aggregating other sites’ high-quality content. Eventually, it became clear that Panda has a much more holistic idea of what it means to be a high-quality site.
Web forums at the time exploded with ideas about what webmasters and business owners could do to correctly navigate the new, Panda-infested SEO waters. Eventually, Google’s very own Webmaster Central Blog published a guide to understanding Panda’s ins and outs. The post included a checklist of questions to help webmasters decide whether their site was of high enough quality to perform well on the search engine. Some of the most notable and enduring inclusions were:
- Would you trust the information presented in this article?
- Does the article provide original content or information, original reporting, original research, or original analysis?
- Was the article edited well, or does it appear sloppy or hastily produced?
- Would users complain when they see pages from this site?
- Does the article provide substantial value when compared to other pages in search results?
In the years since, Google continued to mold Panda into a quality-seeking algorithm, operating it as a separate entity from Google’s main algorithm. Then, in January of last year, Google gave Panda its very own coming-of-age ritual, rolling it into the main algorithm itself. As a result, Panda-specific penalties are somewhat harder to notice and respond to. Still, the main concepts apply. As time goes on, Google gets better and better at identifying sites that users will actually benefit from using. Accordingly, every decision you make about your website should factor in that ultimate goal.
The Penguin algorithm update came the following year, in 2012. While it shares a central tenet of Panda in that it seeks to show users better, more useful results, it does so by attacking a different strain of unsavory sites. Specifically, it focuses directly on the backlink profiles of the sites it is ranking.
Links have been a major aspect of SEO since the beginning of the industry. As Moz smartly puts it, “A link is like a vote for your site. If a well-respected site links to your site, then this is a recommendation for your site. If a small, unknown site links to you, then this vote is not going to count for as much as a vote from an authoritative site.”
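To make the “vote” idea concrete, here is a deliberately simplified sketch. It is not how Google actually scores links; the authority numbers, the domain names, and the `link_score` function are all made up for illustration. It only shows why one link from a respected site can outweigh several links from unknown ones.

```python
# Toy model only: the authority values and the scoring rule are assumptions,
# not anything Google has published.

def link_score(linking_site_authority):
    """Add up the authority of every site that links to a page."""
    return sum(linking_site_authority.values())


# One link from a well-respected (hypothetical) site.
respected = {"well-respected-news.example": 0.9}

# Five links from small, unknown (hypothetical) sites.
unknown = {f"tiny-blog-{i}.example": 0.05 for i in range(5)}

score_respected = link_score(respected)
score_unknown = link_score(unknown)

# A single authoritative vote beats five weak ones in this toy model.
print(score_respected > score_unknown)
```

The takeaway is the weighting, not the numbers: who links to you matters more than how many sites do.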
In the past, optimizers took advantage of this part of the algorithm in loose, wide-ranging ways that didn’t always line up with showing users the best possible content. The most common misconceptions about links are those that focus on specific SEO constructs as opposed to a more general appreciation for quality content. One such misconception led optimizers to focus solely on the domain type of a given link, as opposed to the quality of the content contained therein. For example, there was a time when .edu links were considered a major asset since they tended to be more legitimate and less spammy than some types of .com links.
While there is reasonable thinking at the core of this, it ultimately proved to be an overhyped idea. Google’s John Mueller stated in an interview that .edu backlinks are not assigned “additional” credibility by the search engine. Instead, .edu (and .gov) domain links are often extremely valuable simply because of the quality of the content, not because Google has a blind adherence to simplistic rules about domain type.
The most recent major update to the Penguin algorithm (known as Penguin 4.0) started to show signs of life around October of last year. The two major components of this rollout were that Penguin would henceforth run in real time and be more granular. In this context, running in real time refers to being folded into the main algorithm. If you remember, this also eventually happened to Panda. It seems likely that Google’s general philosophy about standalone algorithms is that their eventual resting place is within the core algorithm itself; perhaps it just takes a few years of tweaking and growing before Google sees them as ready for the move. “More granular” means that instead of Penguin-related penalties being administered wholesale, as large site-wide penalties, they will be levied at individual links.
Therefore, there is a forgiving element to its current iteration: webmasters can get rid of bad links over time as opposed to scrambling to recoup the massive losses of a site-wide penalty.
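The difference between a site-wide penalty and a per-link one is easier to see with numbers. The sketch below is a toy model under our own assumptions: the `Link` type, the values, and both scoring functions are hypothetical and are not taken from anything Google has published.

```python
from dataclasses import dataclass


@dataclass
class Link:
    source: str    # hypothetical linking domain
    value: int     # made-up "credit" this link passes along
    spammy: bool   # whether the link was judged to be spam


def sitewide_score(links):
    """Old-style behavior in this toy model: any spammy link sinks the whole site."""
    if any(link.spammy for link in links):
        return 0
    return sum(link.value for link in links)


def granular_score(links):
    """Granular behavior in this toy model: only the spammy links lose their credit."""
    return sum(link.value for link in links if not link.spammy)


backlinks = [
    Link("university.example", 10, False),
    Link("industry-blog.example", 5, False),
    Link("link-farm.example", 3, True),
]

print(sitewide_score(backlinks), granular_score(backlinks))
```

In this model a single bad link wipes out the site under the wholesale approach, while the granular approach simply stops counting that one link.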
One thing to always remember when optimizing your website for search engines is to think holistically. This means focusing on the demonstrable quality of your site as opposed to tiny details that you may have heard described as SEO cure-alls in the past. After all, there is no such thing as a quick SEO cure-all, and anyone who’d have you think otherwise is probably involved in some black-hat techniques.
Just remember, Panda focuses on your site’s overall quality for the user. Use Google’s checklist of questions as a general guide to achieving that quality. Penguin focuses on the backlink profile of your site and now works in a granular, real-time fashion. To react accordingly, improve your overall backlink profile and remove spammy links. It’s also important to remember that the Google Disavow tool still has a place in your SEO strategy despite some early prognostications to the contrary. Use it to avoid Penguin-related penalties and manual actions that may fall outside the purview of Penguin but can still affect your rankings.
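If you do end up using the Disavow tool, it expects a plain text file with one entry per line: either a full URL, or a whole domain written as `domain:example.com`, with `#` marking comment lines. Here is a minimal sketch of building such a file. The `build_disavow_file` helper and the example domains are hypothetical, and deciding which links are actually spammy is still up to you.

```python
from urllib.parse import urlparse


def build_disavow_file(spammy_urls, whole_domains=()):
    """Return disavow-file text: domain entries first, then individual URLs."""
    lines = ["# Links reviewed and judged spammy (hypothetical example)"]
    skip = set(whole_domains)
    for domain in sorted(skip):
        lines.append(f"domain:{domain}")
    for url in sorted(set(spammy_urls)):
        # A URL is redundant if its whole domain is already disavowed.
        if urlparse(url).hostname not in skip:
            lines.append(url)
    return "\n".join(lines) + "\n"


disavow_text = build_disavow_file(
    ["http://spam.example/page1", "http://links.example/a"],
    whole_domains=["links.example"],
)
print(disavow_text)
```

Listing a whole domain is the blunter option, so save it for sites where every link is unwanted and list individual URLs everywhere else.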