As a seasoned eCommerce marketing agency, LION Digital strives to stay on top of industry news and follow best practices. Yet decades of combined team expertise in digital marketing and thousands of hours of fieldwork in SEO have taught us that it is unwise to rely solely on supposedly universal approaches in an attempt to cut corners. The recent information explosion around the massive Google leak of May 2024, and the disputes it provoked, has demonstrated how important it is to maintain healthy scepticism and curiosity, and to test claims in order to draw unique insights from that testing.

A lot has been written and said within the global SEO community regarding Google’s search documentation leak. Some went as far as to call it “the greatest leak of the century.” Given the magnitude of the leak and the professional forensics still going over the information, even more will be explored and shared. At LION, our stance is:

SEO has always been and will always be about common sense, centred on creating a quality internet environment with the best user experience. In this article, we want to highlight and support this concept.

Now, let’s start from the beginning. 

Did Google representatives lie?

Direct communication between Google representatives and the SEO community started in 2005. The first initiatives were launched to give webmasters a single place to learn more about crawling and indexing and to find helpful tools. Enthusiasts needed direct communication with Google to understand best practices, and Google needed SEOs’ requests and recommendations to create a friendly environment for efficient collaboration. At the same time, the market filled the niches where Google’s response was too slow or unsatisfactory from search professionals’ point of view.

We are far from advocating in favour of Google. Still, neither the initial nor subsequent Google communication initiatives were meant to disclose, fully or even partially, the proprietary information that constitutes the search engine’s core. Instead, they were meant to gently nudge the community in the right direction and facilitate proper thinking. Having followed numerous Google “Webmaster Hangouts”, “Office Hours” sessions, and other videos and podcasts, we presume the goal was to clarify observations by answering questions like “Why did this happen?”, not to show how to manipulate the search engine, as was common at the dawn of SEO. If we look at the recent events from that angle, a significant part of the “lies” accusations in the leak-revealing statements appears in a different light.

For instance, consider one of the most frequent accusations: that Google denied numerous times using click logs of user behaviour in ranking. Let’s leave the PR disputes aside and imagine that Google representatives had openly confirmed that Google’s algorithms use clicks for ranking. What would have prevented the appearance of an abundance of grey-hat services imitating clicks in the SERP?

Another considerable discussion was around a sitewide authority score similar to Moz’s Domain Authority or Ahrefs’ Domain Rating. One could claim that Google spokespeople played semantics by denying the existence of such a ranking factor, given the SiteAuthority score appearing in the leak. However, a more dispassionate view suggests that these indicators have very little in common despite the similar names: Google’s site authority score seems to be a much more complex concept that doesn’t rely solely on backlinks, as DA and DR do. Again, had Google reps confirmed that domain authority was a ranking factor in the way the general public understood it, the surge in link purchases would have been phenomenal.

These are just two examples to give you the general idea.      

Why is the leak not fully reliable?

The API documentation describes 2,596 modules with 14,014 attributes (features). There is a raft of reasons to question the reliability of findings derived from the leak and, to the credit of the analysts, these caveats have been named as well:

  • The context is limited: the modules and attributes are merely described, with no source code showing how they are applied, what exactly triggers them, or why.
  • What is the purpose of the data? It could equally be used to generate actual search results, for internal data management, for search on the web or in other verticals, or to train a model to predict outcomes. Originating from Google doesn’t necessarily mean originating from Google Search.
  • There is no information regarding the stages in which the modules are activated and the features are used.
  • There are many references to deprecated features or others with specific notes indicating not to use them further. Google’s search engine continuously evolves, and until the market is granted another leak, no one can be sure what was actively used in the Google ranking systems at that time and what was not.
  • Although the signals and their nuances are somewhat evident, the scoring functions are not available; therefore, the weights assigned to the features remain unknown.
  • The documentation leak contains links to Google’s corporate network assets requesting personal credentials to access additional details about the system’s nuances. Hence, the picture is far from complete. 
  • There is a clear risk of falling victim to confirmation bias — searching for, interpreting, and favouring the data through the lens of pre-existing theories in a way that confirms or supports these theories rather than objectively assessing them.

Best Practices for SEO

From Google Search Central: “Google’s automated ranking systems are designed to present helpful, reliable information that’s primarily created to benefit people, not to gain search engine rankings, in the top Search results.”

Good content is, by definition, already aligned with all of Google’s requirements. The outlook is far more challenging for followers of black- and grey-hat SEO techniques: they have to imitate quality by identifying and manipulating individual ranking factors. Given the sheer scale of the modules and features the leak revealed, that is barely possible.

Understanding the algorithm and how it might impact search efforts is important, though not at the price of competing with industry players who prefer an unfair game. It is possible not to focus on the algorithm itself and instead develop the sites, content, and experiences it will reward.

Considering all that has been said, and with all respect to the efforts to retrieve useful data from the Google leak and interpret it, search engine optimisation was and remains about common sense, and the latest findings don’t contradict that:

  1. Understanding the target audience, identifying their pain points, aligning content with those challenges, making it technically accessible, and promoting it to audiences that resonate are the keys to long-term success.
  2. Content that is meant to rank continuously should be of high quality: relevant, unique, and valued. Pages that are indexed and hold high positions in the SERP with many impressions but attract no user clicks send an apparent negative signal (see the first sketch after this list).
  3. Good content is tailored, deep and original and anticipates the user’s demand.
  4. The authors should have expertise and credibility to guarantee quality content.
  5. Cater to the various ways users consume information and to the corresponding SERP presentation formats (e.g. video, images, featured snippets).
  6. Maintain quality, order, and consistency across all website elements, including content design, images, fonts, metadata, markup, alt texts, etc. (the second sketch after this list shows a simple automated check).
  7. Reinforce SEO with UX. SEO is focused on attracting people to the page, while UX is about guiding them to take action on the page. It’s important to carefully consider how components are organised and presented to help users find the content they’re searching for and to encourage them to remain on the site. 
  8. Google has evolved in its ability to understand the real-time state of the world and to evaluate credibility and authority. Establishing a clear identity for all related entities (the brand, its authors, its products) supports that understanding and credibility.
  9. A link-building and digital PR strategy should be centred on relevance, which guarantees interest, and on high-quality publications.
  10. Don’t try to manipulate ranking through questionable techniques. As experience demonstrates, all these actions will be penalised — it is just a matter of time.  
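
To make point 2 concrete, below is a minimal sketch of how such “high impressions, low clicks” pages could be shortlisted. It assumes a CSV export of page-level performance data (for example, a Google Search Console pages report) with page, clicks, and impressions columns; the file name, column names, and thresholds are illustrative assumptions rather than prescriptions.

    import pandas as pd

    # Hypothetical page-level search performance export, e.g. a
    # Google Search Console pages report saved as CSV.
    # Assumed columns: page, clicks, impressions.
    df = pd.read_csv("search_performance.csv")

    # Click-through rate per page.
    df["ctr"] = df["clicks"] / df["impressions"]

    # Illustrative thresholds: pages widely seen but rarely clicked.
    MIN_IMPRESSIONS = 1000  # enough visibility to judge the page
    MAX_CTR = 0.01          # below 1% CTR warrants a closer look

    flagged = df[(df["impressions"] >= MIN_IMPRESSIONS) & (df["ctr"] <= MAX_CTR)]
    print(flagged.sort_values("impressions", ascending=False)[["page", "impressions", "ctr"]])

Such a shortlist doesn’t prove a ranking problem on its own; it simply surfaces pages whose titles, snippets, or intent match deserve a manual review.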
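
For point 6, here is an equally minimal sketch of a single-page consistency audit using the widely available requests and BeautifulSoup libraries. The URL is a placeholder, and the checks shown (title, meta description, image alt texts) are a small illustrative subset of what a full audit would cover.

    import requests
    from bs4 import BeautifulSoup

    # Placeholder URL; substitute a page from your own site.
    url = "https://www.example.com/product-page"
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    # Basic metadata checks.
    title = soup.find("title")
    description = soup.find("meta", attrs={"name": "description"})
    print("Title present:", bool(title and title.get_text(strip=True)))
    print("Meta description present:", bool(description and description.get("content")))

    # Flag images that are missing alt text.
    missing_alt = [img.get("src") for img in soup.find_all("img") if not img.get("alt")]
    print("Images without alt text:", missing_alt)

Extending the same pattern to headings, canonical tags, and structured data markup gives a quick first pass at the consistency this point describes.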

Google spokespeople strive to support and provide value to the community within the constraints they are given. Nevertheless, the SEO community should take Google’s communications as input and continue to experiment to see what works best in specific cases.

The ranking systems are in constant flux. To stay ahead of the competition, test not only the website but also the SERP itself. Once all critical aspects have been assessed, the best practice is to build a personalised SEO experimentation plan, implement it, learn from it, replicate the successes, and test again. The cycle never ends.