Webmaster Hangouts Archives

WebMaster Hangout – March 9, 2023

Based on the English Google SEO office-hours from March 2023

What is the most effective way to update a website’s search results?

Bob asked about the most effective way to update SEO results when a URL has been redirected to a new WordPress site, but the search results display a mixture of the old and new sites.

John answered that, first and foremost, Bob can trust Google to update the search results automatically once the pages have been reprocessed. There are, however, a few additional steps that help. Set up proper redirects from the old URLs to the new ones so that users and search engines can follow the change of URLs easily. If an old URL has no equivalent on the new site, it should return a 404 or 410 status code instead. Pages that urgently need to disappear from search can be submitted for removal in Google Search Console, which expedites the process and ensures the search results stop displaying them. Finally, check that the pages’ titles and descriptions accurately reflect their content; in Bob’s case, adding location-specific details such as city and state names to the titles can help both users and search engines recognise the pages.
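
As a rough illustration of that last point, location details can be worked into a page’s title and meta description; the business, city, and copy below are invented, not from the episode:

```html
<head>
  <!-- Hypothetical example: title and description that name the city and state -->
  <title>Emergency Plumbing Repairs in Austin, TX | Example Plumbing Co.</title>
  <meta name="description"
        content="24/7 emergency plumbing repairs in Austin, Texas. Licensed local plumbers serving Travis County.">
</head>
```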

What is Google’s method for recognising WEBP images?

Sam asked how Google identifies WEBP images and if it’s based on their file extensions or by analysing the image format.

Gary answered that Google typically looks at the Content-Type header in the HTTP response to identify the format of an image, including WEBP images. Unlike file extensions, which can be misleading or inaccurate, the Content-Type header is a reliable source for determining an image’s format. Aside from the Content-Type header, Google sometimes uses additional methods to confirm the image format.
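
For instance, a server delivering a WEBP image would typically include a response header along these lines; the status line and the other headers shown are illustrative:

```http
HTTP/1.1 200 OK
Content-Type: image/webp
Content-Length: 48213
Cache-Control: max-age=86400
```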

How to resolve the ‘could not determine the prominent video’ error?

Jimmy asked about the solution to the error message “Google could not determine the prominent video on the page” in the Search Console.

Lizzi answered that if you encounter the error message “Google could not determine the prominent video on the page” in Search Console, you should examine the affected pages listed in Search Console to determine whether anything actually needs to be fixed. It’s possible that the error is not an issue and everything is functioning as intended. The size and placement of the video on the page are crucial to its visibility: it should be neither too small nor too large. You can refer to the video indexing report documentation for more information.

Would having the same language on two different markets be considered duplicate content by Google?

Mark asked whether Google considers it duplicate content when the same language is used for two different markets.

John stated that there are no penalties for duplicate content when the same language is used for two different markets, but there may be some effects. If the websites have pages with identical content in the same language, one of them may be treated as a duplicate and the other as a canonical page, determined on a per-page basis. This may affect search results, with only one of the pages being shown. Hreflang annotations can be used to swap out URLs, but Search Console primarily reports on the canonical URLs. To avoid confusion, displaying a banner to guide users from other countries to their appropriate version when they visit the website can be helpful.
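
As a sketch of the hreflang annotations mentioned here, assuming the same English content is served to two invented markets, each page version would carry a set of links like this:

```html
<!-- Hypothetical: the same English content targeted at the US and UK markets -->
<link rel="alternate" hreflang="en-us" href="https://example.com/us/pricing/">
<link rel="alternate" hreflang="en-gb" href="https://example.com/uk/pricing/">
<link rel="alternate" hreflang="x-default" href="https://example.com/pricing/">
```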

For the Dutch website, is it better to use .com or .nl?

Jelle asked which domain is better for SEO purposes for their Dutch website: a recently purchased .com domain or their existing .nl domain.

Gary’s answer suggested that, if possible, it is better to stick with the .nl domain version for a Dutch website. This is because moving a site can be risky, and it is advisable to avoid doing so unless it is necessary. Furthermore, the .nl domain is a clear signal to search engines that the website is specific to the Netherlands, whereas the .com domain does not provide this specificity. While the subject is complex, these two reasons suggest that the .nl domain is the better choice for SEO purposes for a Dutch website.

How could a large number of indexed pages be moved?

An anonymous user asked about moving a large number of indexed pages to another site.

Lizzi answered that when moving a large number of indexed pages to another site, it is important to minimise the risk of errors, isolate the changes made, and avoid implementing other significant changes at the same time, such as a site re-architecture or redesign. The focus should be on moving the URLs only and ensuring that they are correctly redirected. It is recommended to keep a record of all URLs before and after the move so the redirects can be monitored. Google’s documentation on site moves provides more information on how to ensure a successful move.

Will Google’s AdsBot crawling be covered by crawl requests in the Search Console?

Ellen asked whether the crawl requests in the Search Console will include Google’s AdsBot crawling, as her team has noticed some examples of the Search Console Crawler requesting their product URLs from the Merchant Center.

John answered that the Search Console crawl stats include AdsBot crawling, and it is listed separately under the Googlebot type breakdown. This is because AdsBot uses the same infrastructure as Search’s Googlebot crawlers and is limited by the same crawl rate mechanisms.

What is Google’s procedure for updating an indexed URL?

Shannon asked about how Google updates an indexed URL that is returned from a search.

Gary explained that indexed URLs in Google’s search results are updated through crawling and reprocessing. Googlebot crawls the URL and then sends it to the indexer for reprocessing. The time it takes to reprocess the URL may vary based on its popularity, but eventually it will be updated in Google’s index.

Can hreflang be interpreted and comprehended through multiple sitemaps?

Frederick asked whether hreflang annotations can be interpreted across multiple sitemaps, or whether all URLs in the same language need to be included in the same sitemap, and whether it’s possible to have separate sitemaps for DE, CH, and AT.

Lizzi suggested that either approach can work, depending on what is most convenient, and recommended referring to the cross-site sitemap submission documentation and the hreflang documentation for the requirements and submission methods of each approach.
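
A minimal sketch of hreflang expressed in a sitemap, assuming separate country domains for DE, AT, and CH (the domains and path are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.de/angebot/</loc>
    <xhtml:link rel="alternate" hreflang="de-de" href="https://www.example.de/angebot/"/>
    <xhtml:link rel="alternate" hreflang="de-at" href="https://www.example.at/angebot/"/>
    <xhtml:link rel="alternate" hreflang="de-ch" href="https://www.example.ch/angebot/"/>
  </url>
</urlset>
```

Each URL in the cluster needs its own url entry carrying the same set of annotations, whether those entries sit in one sitemap or are split across several.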

Is it possible to request re-indexing in bulk?

Martyna asked if it is possible to request the re-indexing of all products in their shop in the Search Console in bulk.

John answered there is no direct option available to request a whole website reprocess in Search Console because it is done automatically over time. Website owners can inform Google about changed pages through a sitemap file or a Merchant Center feed. In addition, individual pages can be requested for re-indexing using the URL inspection tool in Search Console. For urgent page removals, website owners can also use Search Console. All of this should work automatically if a good e-commerce setup is used.

Why does Google indicate that a URL is a duplicate of another URL?

Lori asked about the reason for Google showing a URL to be a duplicate of another URL.

Gary answered that one should check the Search Console data for the affected website to better understand why Google considers a particular URL a duplicate. In general, though, URLs are identified as duplicates when their content was, at some point, identical or nearly identical. Google’s documentation on fixing canonicalisation problems provides useful tips for resolving this kind of issue.
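
For reference, when one of a set of near-identical URLs should be the preferred version, a rel=canonical annotation is one way to signal that; the URLs below are placeholders:

```html
<!-- On https://example.com/shoes?colour=red, pointing to the preferred version -->
<link rel="canonical" href="https://example.com/shoes">
```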

Is the <strong> tag useful for a website?

Varinder asked about the advantages of using the <strong> tag on a website and sought clarification on the differences between the <b> tag and the <strong> tag.

Lizzi responded that while both the <b> tag and the <strong> tag make text stand out, they differ in their intended use. The <strong> tag is appropriate for communicating information that is particularly vital, pressing, or grave, such as a warning, and so carries more weight than a plain <b>, which simply draws visual attention to the text.
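
A small illustration of the distinction, with invented copy:

```html
<!-- <b> only draws the eye; it carries no special importance -->
<p>Our <b>spring collection</b> is now in stock.</p>

<!-- <strong> marks content that is genuinely important, such as a warning -->
<p><strong>Warning:</strong> this product is not suitable for children under three.</p>
```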

Why isn’t a specific website being indexed?

Hamza asked for an assessment of his robots.txt file to know why his site wasn’t indexed.

John examined Hamza’s robots.txt file and website and reported that the robots.txt file is not what is hindering indexing; the issue is the site’s content quality. The website appears to be a free download site, consisting mainly of video and music descriptions with affiliate download links. John suggested that Hamza start over with a new website on a topic that he is passionate about and has existing knowledge to share. Alternatively, Hamza could collaborate with someone who has content expertise but needs technical assistance.

Are changes to a website an indication of a core ranking system update?

Gary gave a straightforward “no” to Jason’s question about whether a significant shift in a site’s search results indicates a core ranking system update.

Can Google crawl text within images for better SEO?

Bartu asked whether Google can crawl and index the text contained within the images on his website, and how to optimise such images for SEO.

Lizzi noted that Google can generally extract text from images using OCR technology. However, it is advisable to provide this information via alt text to give the image better context. Alt text also benefits readers using a screen reader, and it enables website owners to explain how an image relates to other content on the page, thereby enhancing the user experience.
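
A minimal sketch of descriptive alt text for an image that contains text; the file name and wording are placeholders:

```html
<!-- The alt attribute repeats the key information rendered inside the image -->
<img src="/images/spring-sale-banner.webp"
     alt="Spring sale banner: 20% off all running shoes until 31 March">
```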

Does Google’s page speed algorithm consider viewports optimised for speed?

Nick asked whether Google’s page speed algorithm considers web pages that optimise only the viewport content for faster loading rather than the entire page, highlighting that loading the entire page within acceptable time parameters may not always be feasible.

John responded that Google uses Core Web Vitals to measure page experience. Largest Contentful Paint (LCP) is the metric that most closely aligns with the question about page loading time: it measures the rendering time of the largest image or text block visible within the viewport, relative to when the page first began loading. Google has documentation on Core Web Vitals and a web vitals discussion group for further information.

Are there any problems with Googlebot in the Arabic region?

Mirela asked whether there are any Googlebot coverage issues in the Arabic region that could affect indexing, noting that Search Console is able to fetch their sitemap.

Gary said that Google might not index the URLs in a sitemap for various reasons, often related to the quality of the feed and of the pages within it; submitting a sitemap does not guarantee that its URLs will appear in search results. He recommended continuing to produce high-quality content; over time, Google’s algorithms may reassess the website’s content, potentially resolving the issue.

With DE, AT, and CH websites, should the DE site be marked with hreflang ‘de’?

Frederick asked a question regarding the proper use of hreflang for their domain, specifically whether they should mark their DE site with hreflang ‘de’ instead of ‘de-de’ due to it being their most important market.

Lizzi responded by stating that it ultimately depends on the website owner’s goals and whether targeting by country is necessary. If the user’s landing on the .de country site from Switzerland is acceptable, using ‘de’ would be sufficient. However, if country-specific targeting is necessary, you should use ‘de-de’.

How does a password-protected website that hasn’t launched yet get recrawled by Googlebot?

Leah asked about the process of recrawling an unlaunched website with password access after receiving a 401 error code. In November, there was a failed attempt by Googlebot to crawl the site due to password protection. Leah wanted to know how to recrawl the site.

John responded that after reviewing the site in question, it appears that the bots are currently crawling it in an expected manner. It is a common and recommended practice to secure a staging site with a password. In situations where a page disappears, or the server goes down, the systems will periodically attempt to crawl the site, which will be visible as crawl errors in Search Console. The bots will recrawl and index them when the pages become available again. You may encounter occasional crawl errors if the content has been permanently removed, and you shouldn’t worry about that.

How to generate app-store site links?

Olivia asked the Search Console team about sitelinks that direct users to the iOS App Store or Google Play, as seen on media websites such as The Washington Post or Times UK, and sought guidance on how to generate similar sitelinks for her own website.

Despite not being a member of the Search Console team, Gary answered that this type of sitelink is generated through the Knowledge Graph. If a mobile app is linked to the website, either through deep linking or by listing the website in the app stores, it may appear in search results as an app result or as a sitelink.

Is adding one structured data within another type of structured data allowed?

Prabal asked about the permissibility of adding one structured data type inside another. Specifically about the possibility of incorporating carousel structured data within Q&A structured data.

Lizzi answered that structuring your data in a nested manner can aid in comprehending the main focus of a page. For instance, placing “recipe” and “review” at the same level does not convey a clear message. However, nesting “review” under “recipe” informs us that the page’s primary purpose is to provide a recipe with a supplementary review. It is advisable to review the specific feature documentation to obtain additional information on how to combine various types of structured data. Currently, the supported carousel features include courses, movies, recipes, and restaurants.
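
A minimal sketch of that nesting, assuming a recipe page with a supplementary review (names and values are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Classic Apple Pie",
  "review": {
    "@type": "Review",
    "reviewRating": { "@type": "Rating", "ratingValue": "5" },
    "author": { "@type": "Person", "name": "Jane Baker" }
  }
}
</script>
```

Nesting the review inside the recipe makes it clear that the page is primarily a recipe and the review belongs to it, rather than being a separate item of equal weight.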

Why doesn’t Google display local business map results for some searches?

Tyson asked why there are no local business map results on Google for the “wedding invitations” search term.

Gary responded that Google constantly evolves which search features are shown for various queries, taking the user’s location and several other factors into account. Data quality also plays a critical role, and such gaps may resolve over time, so there is no need for concern; this is normal search engine behaviour.

How does Google define a page with soft-404 errors?

Rajeev asked for information on how Google defines a page that contains soft-404 errors despite adhering to Google’s guidelines on soft-404 errors.

Lizzi explained that soft-404 errors often indicate some failure on the page. For instance, the error may be triggered when a resource that is supposed to load is missing or fails to load, making the page resemble a 404 page. You can use the URL inspection tool to examine the page from Googlebot’s perspective and check whether it looks correct. On a job search results page, for example, the page should ideally display a list of jobs rather than an error message like “no jobs found” caused by a widget failing to load. If there are genuinely no jobs found, then the page should not be indexed. At times the soft-404 classification is working as intended, and in that case you can disregard the error in Search Console.

Would having two websites be better for a company offering different services?

Adrian asked whether it is more advantageous, from an SEO perspective, for a company providing diverse services, such as language and IT courses, to maintain two separate websites or a single website.

Gary responded that the decision to maintain two separate websites is primarily a business consideration rather than an SEO one. You should carefully evaluate the maintenance costs of managing two websites, as they may outweigh any potential benefits that two separate domains could bring. However, in certain scenarios, having distinct websites for users in different regions or for different categories of courses may prove advantageous, particularly in the case of localised websites: for instance, one version for English-speaking users and another for German-speaking users can improve engagement and increase the likelihood of users staying on the site.

What does ‘This URL is not allowed for a sitemap at this location’ indicate?

Zain asked for clarification regarding an XML hreflang sitemap error message reading, “This URL is not allowed for a sitemap at this location.” He submitted hreflang for both country sites via sitemap submission but received error messages and is uncertain about the troubleshooting process.

Lizzi recommended that the site owner verify both country sites in Search Console and ensure that the sitemap is stored at a location that is at the same level as, or higher than, the URLs it lists. If the problem persists, he can refer to the sitemaps troubleshooting guidance in the Search Console Help Center, or post in the forums with specific details about the sitemap and the URL causing the error.

Why do organic searches show a different image than the main image?

Martina asked about the observed discrepancy between the image displayed in organic searches and the main image associated with the linked product.

Gary explained that the image shown next to a result in Google Search is not always the page’s main image but the one most relevant to the user’s query. This relevance is determined by the signals Google has for that particular image, such as alt text, file name, and surrounding context. Optimising images by providing relevant, descriptive alt text and file names is therefore important.

What are the consequences of utilising an infinite scroll without numbered pagination?

Adam asked for clarification regarding the potential impact of implementing an infinite scroll feature without numbered pagination links on an e-commerce website.

John answered that in terms of Google Search implications, using an infinite scroll feature on an e-commerce website without numbered pagination links presents certain challenges. The search engine needs to scroll through the content in order to access the next segment, which can be inefficient and may result in some of the infinite content being missed. Therefore, it is strongly recommended to include pagination links in addition to any infinite scrolling features. Pagination allows search engines to directly access specific pages, ensuring that all content is indexed effectively. For further guidance, e-commerce site owners can refer to the Search Central documentation provided by Google.
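
One way to follow that recommendation, assuming a category listing that also loads more products via infinite scroll (the URLs are placeholders), is to keep ordinary, crawlable pagination links on the page:

```html
<!-- Crawlable pagination links rendered alongside the infinite scroll -->
<nav aria-label="Pagination">
  <a href="/category/shoes?page=1">1</a>
  <a href="/category/shoes?page=2">2</a>
  <a href="/category/shoes?page=3">3</a>
</nav>
```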


WebMaster Hangout – Live from January 31, 2023

Introduction

Lizzi: (00:00) It’s January, it’s 2023, it’s the new year. And what do we have for you today?

It’s office hours, and it’s me. And some questions and some answers, and let’s get into it. I’m Lizzy from the Search Relations team.

Do meta keywords matter? 

Lizzi: (00:15)  Do meta keywords still help with SEO?

A: (00:20) Nope. It doesn’t help. If you’re curious about why, there’s a blog post from 2009 that goes into more detail about why Google doesn’t use meta keywords.

Why is my brand name not shown as-is?

Gary: (00:31) Hi, this is Gary from the search team. Kajal is asking; my brand name is Quoality. That is, Quebec Uniform Oscar Alpha Lima India Tango Yankee, and when someone searches for our brand name, Google shows the results for quality. That is the correct spelling. Why is Google doing this?

A: (00:53) Great question. When you search for something that we often see as a misspelling of a common word, our algorithms learn that and will attempt to suggest a correct spelling or even just do a search for the correct spelling altogether. As your brand grows, eventually our algorithms learn your brand name and perhaps stop showing results for what our algorithms initially detected as the correct spelling. It will take time, though.

Which date should I use as lastmod in sitemaps?

John: (01:20) Hi, this is John from the search relations team in Switzerland. Michael asks, the lastmod in a sitemap XML file for a news article. Should that be the date of the last article update or the last comment?

A: (01:36) Well, since the sitemap file is all about finding the right moment to crawl a page based on its changes, the lastmod date should reflect the date when the content has changed significantly enough to merit being re-crawled. If comments are a critical part of your page, then using that date is fine. Ultimately, this is a decision that you can make. For the date of the article itself, I’d recommend looking at our guidelines on using dates on a page. In particular, make sure that you use the dates on a page consistently and that you use structured data, including the time zone, within the markup.
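
As a sketch, a sitemap entry whose lastmod uses the W3C datetime format with an explicit time zone offset might look like this (the URL and date are placeholders):

```xml
<url>
  <loc>https://example.com/news/article-slug</loc>
  <lastmod>2023-01-30T15:45:00+01:00</lastmod>
</url>
```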

Can I have both a news and a general sitemap?

Gary: (02:14) Helen is asking, do you recommend having a news sitemap and a general sitemap on the same website? Any issue if the news sitemap and general sitemap contain the same URL?

A: (02:24) You can have just one sitemap, a traditional web sitemap as defined by sitemaps.org, and then add the news extension to the URLs that need it. Just keep in mind that you’ll need to remove the news extension from URLs that are older than 30 days. For this reason, it’s usually simpler to have a separate sitemap for news and one for the web; just remove the URLs altogether from the news sitemap when they become too old for news. Including the URLs in both sitemaps is not very nice, but it will not cause any issues for you.
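
A minimal sketch of a web sitemap entry carrying the news extension; the publication name, URL, and dates are placeholders:

```xml
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://example.com/news/2023/01/30/example-story</loc>
    <news:news>
      <news:publication>
        <news:name>The Example Times</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2023-01-30</news:publication_date>
      <news:title>Example headline for the story</news:title>
    </news:news>
  </url>
</urlset>
```

Once an article is older than roughly 30 days, the news extension block (or the URL itself, in a dedicated news sitemap) would be removed.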

What can I do about irrelevant search entries?

John: (03:02) Jessica asks. In the suggested search, in Google, at the bottom of the page, there’s one suggestion that is not related to our website. And after looking at the results, our website is not to be found for that topic. Kind of hard for me to determine exactly what you mean, but it sounds like you feel that something showing up in the search isn’t quite the way that you’d expect it, something perhaps in one of the elements of the search results page.

A: (03:29) For these situations, we have a feedback link at the bottom of the whole search results page, as well as for many of the individual features. If you submit your feedback there, it’ll be sent to the appropriate teams. They tend to look for ways to improve these systems for the long run for everyone. This is more about feedback for a feature overall and less about someone explicitly looking at your site and trying to figure out why this one page is not showing up there.

Why is my site’s description not shown?

Lizzi: (03:57) Claire is asking, I have a site description on my Squarespace website, but the Google description is different. I have reindexed it. How do I change it?

A: (04:08) Something to keep in mind here is that it’s not guaranteed that Google will use a particular meta description that you write for a given page. Snippets are actually auto-generated and can vary based on what the user was searching for. Sometimes different parts of the page are more relevant for a particular search query. We’re more likely to use the description that you write if it more accurately describes the page than what Google can pull from the page itself. We have some best practices about how to write meta descriptions in our documentation, so I recommend checking that out.

How can I fix the spam score for a used domain?

John: (04:40) Mohamed asks, I bought this domain, and I found out it got banned or has a spam score, so what do I need to do?

A: (04:49) Well, like many things, if you want to invest in a domain name, it’s critical that you do your due diligence ahead of time or that you get help from an expert. While for many things, a domain name can be reused, sometimes it comes with a little bit of extra ballast that you have to first clean up. This is not something that Google can do for you. That said, I couldn’t load your website at all when I tried it here. So perhaps the primary issue might be a technical one instead.

Are spammy links from porn sites bad for ranking?

Lizzi: (05:20) Anonymous is asking; I’ve seen a lot of spammy backlinks from porn websites linking to our site over the past month using the Google Search Console link tool. We do not want these. Is this bad for ranking, and what can I do about it?

A: (05:35) This is not something that you need to prioritise too much since Google Systems are getting better at figuring out if a link is spammy. But if you’re concerned or you’ve received a manual action, you can use the disavow tool in Search Console. You’ll need to create a list of the spammy links and then upload it to the tool. Do a search for disavow in Search Console for more steps on how to do this.
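
For reference, the disavow tool expects a plain-text file with one URL or domain per line; the entries below are invented:

```text
# Spammy backlinks we do not want associated with our site
https://spammy-links.example/outbound/page-1.html
domain:another-spammy-site.example
```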

Does Google use keyword density?

John: (05:59) The next question I have here is, does Google consider keyword density for the content?

A: (06:05) Well, no, Google does not have a notion of optimal keyword density. Over the years, our systems have gotten quite good at recognising what a page is about, even if the keywords are not mentioned at all. That said, it is definitely best to be explicit. Don’t rely on search engines guessing what your page is about and for which queries it should be shown. If your homepage only mentions that you “add pizazz to places” and shows some beautiful houses, both users and search engines won’t know what you’re trying to offer. If your business paints houses, then just say that. If your business sells paints, then say that. Think about what users might be searching for and use the same terminology. It makes it easier to find your pages, and it makes it easier for users to recognise that they have found what they want. Keyword density does not matter, but being explicit does matter, and contrary to the old SEO myth, story, joke, and commentary, you don’t need to mention all possible variations either.

Why is our title mixed up with the meta description?

Lizzi: (07:12) Michael is asking what we should do if we are seeing that certain pages have meta descriptions in SERPs displaying the exact same text as the title tag, not our custom descriptions or snippets from the page.

A: (07:26) Hey, Michael. Well, first, I’d check that the HTML is valid and that there’s not any issue with how it’s being rendered with the URL inspection tool. It’s hard to give any more advice without seeing more context, so I’d head to the Search Central forums, where you can post some examples of the page and the search results you’re seeing for it. The folks there can take a look and give some more specific advice on how to debug the issue further.

How can I remove my staging sub-domain?

Gary: (07:50) Anonymous is asking, I have a staging site which is on a subdomain, and unfortunately, it got indexed. How can I remove it from search results?

A: (07:59) Well, these things happen, and it’s not a reason to be worried. First, ensure that your staging site is actually returning a 404 or 410 status code, so Googlebot can update our records about that site. And then if it’s a bother that the staging site appears in search, submit a site removal request in Search Console. Just mind that you are going to need to verify the staging site in Search Console first.

Will disavowing links make my site rank better?

John: (08:25) Jimmy asks, will disavowing spammy links linking to my website help recover from an algorithmic penalty?

A: (08:33) So first off, I’d try to evaluate whether your site really created those spammy links. It’s common for sites to have random, weird links, and Google has a lot of practice ignoring those. On the other hand, if you actively built significant spammy links yourself, then yes, cleaning those up would make sense. The disavow tool can help if you can’t remove the links at the source. That said, this will not position your site as it was before, but it can help our algorithms to recognise that they can trust your site again, giving you a chance to work up from there. There’s no low-effort, magic trick that makes a site pop up back afterwards. You really have to put in the work, just as if you did it from the start.

How can I best move my site?

Gary: (09:21) Clara Diepenhorst is asking, I want to implement a new name for my company while the product and site stay mostly the same. This new name changes my URLs. How do I keep my credits of the old name?

A: (09:36) Great question. And this is, again, a site move question. Site moves are always fun and scary. The most important thing you need to do is to ensure that the old URLs are redirecting to the new URLs. This is the most important thing. Once you have your new domain, make sure that you verify it in Search Console. See if you get any red flags in the security section and other reports. And once you are ready with the redirections, you can submit a site move request in Search Console as well. Since it’s a really big undertaking to do a site move, we have very detailed documentation about this topic. Try searching for something like “Google site move” on your favourite search engine and really just have a read, prepare yourself.

Why doesn’t my site show up in Google?

John: (10:22) Rob asks, my site does not show up on Google searches. I can’t get it indexed.

A: (10:28) So Rob mentioned the URL, and I took a quick look, and it turns out that the homepage returns a 404 status code to us. Essentially, for Google, the page does not exist at all. Trying it out a bit more, it looks like it returns a 404 status code to all Googlebot user agents, while users can see it normally. You can test that using a user agent switcher in Chrome’s developer tools. This seems to be more of a misconfiguration of your server, so you might need help from your hosting provider to resolve it. Google will keep retrying the page, and once it’s resolved, it should be visible in the search results again within maybe a week or so.

How can I get my mobile version into Google?

Lizzi: (11:11) Matheus is asking Google Search Console looks at the desktop version of some, but not all articles on my website, even though it has a mobile version. How can I tell Google to look at the mobile version?

A: (11:24) Well, we’ve got a list of things that you can check in our documentation on mobile-first indexing, so I’d recommend going through that checklist and the troubleshooting section. Most of it boils down to this. Make sure that you’re providing the same content on both versions of your site and that both your users and Google can access both versions. If you’re still having issues, we recommend posting in the forum so folks there can take a look at those specific pages that are not showing up as mobile-friendly.

Why does a competitor’s social account with the same name show up?

John: (11:54) Anthony asks, my company’s social media account is no longer appearing in the search results. Only my competitors are appearing now, and we have the same name.

A: (12:06) It looks like there’s more than just two sites using the particular name that you mentioned, and in this kind of situation it will always be hard to find your site, and it won’t be clear to us or to users which one is the so-called right one. I mean, they’re all called the same; they’re all essentially legitimate results. If you want people to find your site by name, then you should make sure that the name is a clear identifier and not a term that many others also use.

What could be a reason for a URL removal not to work?

Gary: (12:36) Lou is asking, why my link is still showing up on Google after I used the content removal tool and it got approved? Please help me understand this phenomenon.

A: (12:47) Using the URL removal tool is very fast. Usually, it removes the specified URL from search results within a few hours. If it didn’t remove a URL that was approved for removal by the tool, that usually means that you specified the wrong URL. Try to click the actual result and see where you land. Is it the same URL that’s shown in the search? If not, submit another removal request for that particular URL.

Which structured data should I use on a service-website?

John: (13:13) The next question I have here is; our website is a service, not a product. The price will vary on the estimate. How do I fix the invalid item for a service like ours when I use product structured data?

A: (13:28) For a local business, like the one that you mentioned, I’d recommend looking at the local business structure data. This also lets you specify a price range for your services. We have more information about this markup in the search developer documentation.
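
A minimal sketch of LocalBusiness markup with a price range; the business details are placeholders:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Home Services",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main Street",
    "addressLocality": "Springfield",
    "addressRegion": "IL",
    "postalCode": "62701",
    "addressCountry": "US"
  },
  "telephone": "+1-555-555-0100",
  "priceRange": "$$"
}
</script>
```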

Why might my content not be indexed?

Gary: (13:43) Anonymous is asking what could be the reason for our relatively healthy and content-rich country site to repeatedly be de-indexed and our old 404 subdomains and subfolders to be reindexed instead?

A: (13:55) Well, without the site URL, it’s fairly impossible to give an exact answer, but it sounds like we just haven’t visited all the URLs on those old subdomains and in subfolders, and that’s why those URLs are still surfacing in search. If you are certain that the country site keeps falling out of Google’s index and not just, for example, not appearing for the keywords you’d like, that could be a sign of both technical and quality issues. I suggest you drop by the Google Search Central Forums and see if the community can identify what’s up with your site.

Can I get old, moved URLs ignored by Search?

John: (14:29) Alex asks, if you move a ton of content with 301 redirects, do you need to request the removal of the old URLs from the index? Because even a decade later, Google still crawls the old URLs. What’s up? Thank you.

A: (14:44) No, you do not need to request re-indexing of moved URLs or request them to be removed. This happens automatically over time. The effect that you’re seeing is that our systems are aware that your content has been on other URLs. So if a user explicitly looks for those old URLs, we’ll try to show them, and this can happen for many years. It’s not a sign of an issue, and there is nothing that you need to fix in a case like this. If you check the URLs in Search Console, you’ll generally see that the canonical URL has shifted when the redirect is being used. In short, don’t worry about these old URLs showing up when you specifically search for them.

Does Search Console verification affect Search?

Gary: (15:27) Avani is asking, changing Search Console ownership or verification code – does it affect website indexing?

A: (15:35) Having your site verified in Search Console, or changing the verification code and method, has no effect on indexing or ranking whatsoever. You can use the data that Search Console gives you to improve your site and thus potentially do better in search, but otherwise it has no effect on search.

Why might my translated content not appear in Google?

John: (15:54) Now, a question from Allan: about two months ago, I added another language to my website. I can’t find the translated version through Google Search. What could be the reason for that?

A: (16:07) When adding another language to a website, there are things that you need to do and things you could additionally do. In particular, you need to have separate URLs for each language version. This can be as little as adding a parameter to the URL, like a ?language= parameter set to German, but you must have separate URLs that specifically lead to that language version. Some systems automatically swap out the content on the same URL. This does not work for search engines. You must have separate URLs. The other important thing is that you should have links to the language versions. Ideally, you’d link from one language version to all versions of that page. This makes it easy for users and search engines to find that language version. Without internal links to those pages, Google might not know that they exist. And finally, using hreflang annotations is a great way to tell us about connections between pages. I’d see this more as an extra; it’s not required. You can find out more about sites that use multiple language versions in our developer documentation.

Does the URL depth of an image affect ranking?

Lizzi: (17:21) Sally is asking, does the URL depth of an image affect image ranking, and will adding the srcset and sizes attributes of an image in the HTML be good for image ranking?

A: (17:33) Whether an image is three levels deep or five levels deep isn’t really going to matter. What’s more important is using a structure that makes sense for your site and makes it easy for you to organise your images in some kind of logical pattern, while still making sure that the file names are descriptive. For example, it might make sense to have a directory structure like photos/dog/havanese/molly.png, but if you don’t have a ton of Havanese photos, then maybe just photos/molly-havanese-dog.png makes sense. As far as srcset and sizes go, add those if it makes sense for your images. We recommend these for responsive images in particular, so we can understand more about the different versions of a given image. Hope that helps.
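
A small sketch of a responsive image using srcset and sizes; the file names and widths are placeholders:

```html
<img src="/photos/molly-havanese-dog-800.png"
     srcset="/photos/molly-havanese-dog-400.png 400w,
             /photos/molly-havanese-dog-800.png 800w,
             /photos/molly-havanese-dog-1600.png 1600w"
     sizes="(max-width: 600px) 100vw, 800px"
     alt="Molly, a cream-coloured Havanese dog, sitting in the garden">
```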

What happens when a part of an hreflang cluster is bad?

Gary: (18:20) Anonymous is asking, is there a difference in how hreflang clusters are treated, depending on if the hreflang tag is broken or includes a noindex or a different canonical in the clusters?

A: (18:34) Complicated topic. Hreflang clusters are formed from the hreflang links that we could validate. “Validate” in this context means confirming the reciprocal links between the hreflang tags. If an hreflang link couldn’t be validated, that link will simply not appear in the cluster; the cluster will be created regardless from the other valid links. If one of the links is noindex, then it won’t be eligible for inclusion in the cluster.

Are sitewide footer links bad?

Lizzi: (19:05) Nazim is asking, are sitewide footer links that refer to the designer companies or the CMS harmful for SEO?

A: (19:54) In general, if the links are boilerplate stuff like “made by Squarespace” that comes with the website theme, this is not something that you need to worry about. If you have control over the link, we recommend that you add nofollow to these types of links. Also, check to make sure that the anchor text is something reasonable. For example, make sure that the link isn’t gratuitously keyword rich, for example, “made by the best Florida SEO.”
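
As a sketch, such a footer credit link might look like this; the CMS name and URL are placeholders:

```html
<footer>
  <a href="https://www.example-cms.com/" rel="nofollow">Website built with Example CMS</a>
</footer>
```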

How can I speed up a site move?

Gary: (19:39) Mohamed is asking; I made a transfer request because I changed the domain name of our website in Search Console. What can I do to speed up this process? This is very, very important for me. 

A: (19:48) This is a good question. The most important thing you need to do is to ensure that your old URLs are redirecting to your new site. This will have the largest positive impact on your site move. The site move request in Search Console is a nice thing to submit, but even without it, site moves should do just fine, if you redirect the old URLs to the new ones and they are working properly. Search for something like “Google Site move” on your favourite search engine and check out our documentation about site moves, if you want to learn more.

How do I link desktop and mobile versions for m-dot sites?

Lizzi: (20:24) Nilton is asking, at the moment, my site is not responsive. It has a desktop version and an m-dot site. In the documentation, it says the treatment we need to do is something in relation to canonical and alternate. My question is, do I need to put the canonical in the desktop version? The documentation doesn’t make it very clear. 

A: (20:42) Thank you for your feedback; I will definitely try to make this clearer in the docs. The desktop URL is always the canonical URL, and the m-dot is the alternate version of that URL. So on the desktop version, you’ll need a rel canonical that points to itself and a rel alternate that points to the m-dot version. And then, on your m-dot page, you’ll have only a rel canonical that points to the desktop version of that page. Hope that helps.
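
A sketch of that setup for a hypothetical www.example.com / m.example.com pair; the media attribute follows Google’s separate-URLs documentation and was not mentioned in the episode:

```html
<!-- On https://www.example.com/page (desktop, canonical) -->
<link rel="canonical" href="https://www.example.com/page">
<link rel="alternate" media="only screen and (max-width: 640px)"
      href="https://m.example.com/page">

<!-- On https://m.example.com/page (mobile alternate) -->
<link rel="canonical" href="https://www.example.com/page">
```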

How important is EXIF data?

Gary: (21:14) Sagar is asking, how important is EXIF data from an SEO perspective for an e-commerce site or sites where images play key roles?

A: (21:25) Well, this is an easy question. I really like easy questions. The answer is that Google doesn’t use EXIF data for anything at the moment. The only image data, or metadata, that we currently use is IPTC.

Conclusion

John: (21:41) And that was it for this episode. I hope you found the questions and the answers useful. If there’s anything you submitted, which didn’t get covered here, I’d recommend posting in the Search Central help community. There are lots of passionate experts active there who can help you to narrow things down. And of course, if there’s more on your mind, please submit those questions with a form linked below. Your questions here are useful to us and to those who catch up on recordings, so please keep them coming. If you have general feedback about these episodes, let us know in the comments or ping us on social media. 

I hope the year has started off well for you. For us, well, it’s been a mixed bag, as you’ve probably seen in the news, things are a bit in flux over here. You can imagine that it’s been challenging for the team, those we interact with internally, and also me. In any case, the questions you submit give us a chance to do something small and useful, hopefully, so please keep them coming. In the meantime, may your site’s traffic go up and your crawl errors go down. Have a great new year and see you soon. Bye.


WebMaster Hangout – Live from December 29, 2022

Introduction

Lizzi: (00:00) Hello, hello, and welcome to the December edition of the Google SEO Office Hours, a monthly audio recording coming to you from the Google Search team, answering questions about search submitted by you. Today, you’ll be hearing from Alan, Gary, John, Duy, and me, Lizzi. All right, let’s get started.

How to reduce my site from 30,000 products to 2,500?

Alan: (00:22) Vertical Web asks: my old site is going from 30,000 products down to 2,500. I will generate 400,000 301 redirects. Is it better to start on a clean URL and redirect what needs to be redirected to the new site, or do it on an old URL?

  • A: (00:44) We generally recommend keeping your existing domain name where possible. We do support redirecting to a new domain name, as Google will recognise the 301 permanent redirects and so understand that your content has moved. However, there’s a greater risk of losing traffic if a mistake is made in the migration project. It is fine to clean up old pages and either have them return a 404 or redirect to new versions, even if this affects lots of pages on your site.

Does Google ignore links to a page that was a 404?

Gary: (01:09) Sina is asking; it’s been formally asserted that Google ignores links to a 404 page. I want to know whether links to that page will still be ignored when it is no longer 404.

  • A: (01:22) Well, as soon as a page comes back online, links to that page will be counted again once the linking pages have been recrawled and the links have been deemed still relevant by our systems.

Do speed metrics other than Core Web Vitals affect my site’s rankings?

John: (01:37) If my website is failing on the Core Web Vitals but performs excellently on the GTMetrix speed test, does that affect my search rankings?

  • A: (01:47) Well, maybe. There are different ways to test speed and different metrics, and there’s testing either on the user side or in a lab. My recommendation is to read up on the different approaches and work out which one is appropriate for you and your website.

Why doesn’t Google remove all spam

Duy: (02:06) Somebody asked, why does Google not remove spam webpages? 

  • A: (02:11) Well, over the years we’ve blogged about several spam-specific algorithms that either demote or remove spam results completely. One such example is SpamBrain, our artificial intelligence system that’s very good at catching spam. Sometimes, for some queries where we don’t have any good results to show, you might still see low-quality results. If you see spam sites that are still ranking, please continue to send them to us using the spam report form. We don’t take immediate manual action on user spam reports, but we do actually use them to monitor and improve our coverage in future spam updates. Thank you so much.

Do too many 301 redirects have a negative effect?

John: (02:55) Lisa asked. I create 301 redirects for every 404 error that gets discovered on my website. Do too many 301 redirects have a negative effect on search ranking for a website? And if so, how many are too many?

  • A: (03:13) You can have as many redirecting pages as you want. Millions are fine if that’s what you need or want. That said, focus on what’s actually a problem so that you don’t create more unnecessary work for yourself. It’s fine to have 404 pages and to let them drop out of the search. You don’t need to redirect. Having 404 errors listed in Search Console is not an issue if you know that those pages should be returning 404.

How does Google determine what a product review is?

Alan: (03:42) John asks, how does Google determine what a product review is for the purposes of product review updates? If it’s affecting non-product pages, how can site owners prevent that?

  • A: (03:54) Check out our Search Central documentation on best practices for product reviews for examples of what we recommend including in product reviews. It is unlikely that a non-product page would be mischaracterised as a product review, and it is unlikely that this would have a significant effect on ranking even if it were; it’s more likely that other ranking factors or algorithm changes have impacted the ranking of your page.

Should I delete my old website when I make a new one?

John: (04:23) I bought a Google domain that came with a free webpage. I now decided to self-host my domain, and I wanted to know if I should delete my free Google page. I don’t want to have two web pages.

  • A: (04:37) If you set up a domain name for your business and have since moved on to a new domain, you should ideally redirect the old one to the new domain, or at least delete the old domain. Keeping an old website online when you know that it’s obsolete is a bad practice and can confuse both search engines and users.

Should paginated pages be included in an XML sitemap?

Alan: (04:59) Should paginated pages such as /category?page=2 be included in an XML sitemap? It makes sense to me, but I almost never see it.

  • A: (05:12) You can include them, but assuming each category page has a link to the next category page, there may not be much benefit; we will discover the subsequent pages automatically. Also, since subsequent pages are for the same category, we may decide to only index the first category page, on the assumption that the subsequent pages are not different enough to return separately in search results.

My site used to be hacked, do I have to do something with the hacked pages?

John: (05:37) Nacho asks, we were hacked early in 2022 and still see Search Console 404 error pages from spammy pages created by the hacker. These pages were deleted from our database. Is there anything else that I should do?

  • A: (05:55) Well, if the hack is removed, if the security issue is resolved, and if the pages are removed, then you’re essentially all set. These things can take a while to disappear completely from all reports, but if they’re returning 404, that’s fine. 

Does Google care about fast sites?

Alan: (06:11) Tarek asks, does Google care about fast sites?

  • A: (06:15) Yes. Google measures core web vitals for most sites, which includes factors such as site speed, and core web vitals is used as a part of the page experience ranking factor. While it’s not something that overrides other factors like relevance, it is something that Google cares about and equally important users care about it too.

Can Google follow links inside a menu that shows on mouseover?

Lizzi: (06:38) Abraham asks, can Google follow links inside a menu that appears after a mouseover on an item?

  • A: (06:45) Hey, Abraham. Great question. And yes, Google can do this. The menu still needs to be visible in the HTML, and the links need to be crawlable, which means they need to be proper A tags with an href attribute. You can use the URL inspection tool in Google Search Console to see how Google sees the HTML on your site, and check to see if the menu links are there. Hope that helps. 
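
A rough sketch of the difference, using an invented menu item: the first link is crawlable, the second is not because it has no href.

```html
<!-- Crawlable: a proper <a> element with an href -->
<li><a href="/products/running-shoes">Running shoes</a></li>

<!-- Not crawlable: no href, only a JavaScript handler -->
<li><span onclick="goTo('running-shoes')">Running shoes</span></li>
```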

Why did the reporting shift between my mobile and desktop URLs?

John: (07:10) Luki asked, we use sub-domains for desktop and mobile users. We found a strange report in Search Console in early August where the desktop performance has changed inversely with the mobile performance. And the result is that our traffic has decreased. 

  • A: (07:30) The technical aspect of the indexing and reporting shifting to the mobile version of a site is normal and expected. This happens with mobile-first indexing and can be visible in reports if you look at the host names individually. However, assuming you have the same content on mobile and desktop, that wouldn’t affect ranking noticeably. If you see ranking or traffic changes, they would be due to other reasons.

Does having many redirects affect crawling or ranking?

Gary: (07:56) Marc is asking, Do many redirects, let’s say twice as many as actual URLs affect crawling or ranking in any way? 

  • A: (08:05) Well, you can have as many redirects as you like on your site overall; there shouldn’t be any problem there. Just make sure that individual URLs don’t have too many hops in the redirect chains if you are chaining redirects, otherwise, you should be fine.

Can I use an organization name instead of an author’s name?

Lizzi: (08:21) Anonymous is asking, when an article has no author, should you just use an organization instead of a person in the author markup? Will this have a lesser impact on results?

  • A: (08:36) It’s perfectly fine to list an organization as the author of an article. We say this in our article, structured data documentation. You can specify an organization or person as an author, both are fine. You can add whichever one is accurate for your content. 

What can we do if someone copies our content?

Duy: (08:53) Somebody asked: a competitor is copying all of our articles with small changes, and over time it ranks higher than us. DMCA doesn’t stop them or seem to lower their ranking. What else can we do if their site has more authority?

  • A: (09:09) If the site simply scrapes content without creating anything of original value, that’s clearly a violation of our spam policies, and you can report them to us using our spam report form so that we can improve our algorithms to catch similar sites. Otherwise, you can start a thread on our Search Central Help community so product experts can advise on possible solutions. They would also be able to escalate to us for further assessment.

Do URL, page title, and H1 tag have to be the same?

Lizzi: (09:35) Anonymous is asking: the URL page title and H1 tag. Do they have to be the same? 

  • A: (09:44) Great question, and no, they don’t need to be exactly the same. There’s probably going to be some overlap in the words you’re using. For example, if you have a page that’s titled “How to Knit a Scarf”, then it probably makes sense to use some of those words in the URL too, like /how-to-knit-a-scarf or /scarf-knitting-pattern, but it doesn’t need to be a word for word match. Use descriptive words that make sense for your readers and for you when you’re maintaining your site structure and organization. And that’ll work out for search engines as well.

Is redirecting through a page blocked by robots.txt a valid way to prevent passing PageRank?

John: (10:17) Sha asks, is redirecting through a page blocked by robots.txt still a valid way of preventing links from passing PageRank?

  • A: (10:28) Yes, if the goal is to prevent signals from passing through a link, it’s fine to use a redirecting page that’s blocked by robots.txt.

Why is my site flagged as having a virus?

Alan: (10:37) Some pages on my website collect customer information, but my site is always reported as being infected via a virus or deceptive by Google. How can I avoid this happening again without removing those pages?

  • A: (10:53) Your site might have been infected by a virus without you knowing it. Check out https://web.dev/request-a-review for instructions on how to register your site in Search Console, check for security alerts, and then request Google to review your site again after removing any malicious files. Some break-ins hide themselves from the site owner, so they can be hard to track down.

Is there any way to get sitelinks on search results?

Lizzi: (11:20) Rajath is asking, is there any way to get sitelinks on SERPs?

  • A: (11:25) Good question. One thing to keep in mind is that there’s not really a guarantee that sitelinks or any search feature will show up. Sitelinks specifically only appear if they’re relevant to what the user was looking for and if it’ll be useful to the user to have those links. There are some things that you can do to make it easier for Google to show sitelinks, however, like making sure you have a logical site structure and that your titles, headings, and link text are descriptive and relevant. There’s more on that in our documentation on sitelinks, so I recommend checking that out.

Does having two hyphens in a domain name have a negative effect?

John: (11:59) My site’s domain name has two hyphens. Does that have any negative effect on its rankings? 

  • A: (12:06) There’s no negative effect from having multiple dashes in a domain.

How important are titles for e-commerce category page pagination?

Alan: (12:12) Bill asks, how important are unique page titles for e-commerce category product listing page pagination? Would it be helpful to include the page number in the title?

  • A: (12:25) There is a good chance that including the page number in your information about a page will have little effect. I would include the page number if you think it’s going to help users understand the context of a page. I would not include it on the assumption that it’ll help with ranking or increase the likelihood of the page being indexed.

Is it better to post one article a day or many a day?

John: (12:44) Is it better for domain ranking to regularly post one article every day or to post many articles every day?

  • A: (12:53) So here’s my chance to give the SEO answer: it depends. You can decide how you want to engage with your users: on the downside, that means there’s no absolute answer for how often you should publish,on the upside, this means that you can decide for yourself.

What is the main reason for de-indexing a site after a spam update?

Gary: (13:12) Faiz Ul Ameen is asking, what is the main reason for de-indexing of sites after the Google spam update?

  • A: (13:19) Well, glad you asked. If you believe you were affected by the Google spam update, you have to take a really, really deep look at your content and considerably improve it. Check out our spam policies, and read more about the Google spam update on Search Central.

Can Google read infographic images?

John: (13:38) Zaid asks, can Google read infographic images? What’s the best recommendation there?

  • A: (13:45) While it’s theoretically possible to scan images for text, I wouldn’t count on it when it comes to a web search. If there’s a text that you want your pages to be recognized for, then place that as text on your pages. For infographics, that can be in the form of captions and all texts, or just generally, well, you know, text on the page.

Is it possible to remove my site completely if it was hacked?

Gary: (14:08) Anonymous is asking whether it’s possible to completely remove a site from Google Search because it has been hacked and leads to thousands of invalid links.

  • A: (14:20) Well, first and foremost, sorry to hear that your site was hacked. Our friends at Web.dev have great documentation about how to prevent this from happening in the future, but they also have documentation about how to clean up after a hack. To answer your specific question, you can remove your site from search by serving a 404 or similar status code, or by adding noindex rules to your pages. We will need to recrawl your site to see the status codes and noindex rules. But that’s really the best way to do it.

Why does my Search Console miss a period of data?

John: (14:54) I’m missing months of data from my domain property on Search Console from April 2022it connects directly to August 2022. What happened?

  • A: (15:07) This can happen if a website loses verification in Search Console for a longer period of time. Unfortunately, there is no way to get this data back. One thing you could try, however, is to verify a different part of your website and see if it shows some of the data there. 

How can I deindex some bogus URLs?

Gary: (15:25) Anonymous is asking, I want to deindex some bogus URLs. 

  • A: (15:30) There’s really only a handful of ways to deindex URLs: removing the page and serving a 404 or 410 or similar status code. Or by adding a noindex rule to the pages and allowing Googlebot to crawl those pages. These you can all do on your own site. You don’t need any specific tool. But Googlebot will need to recrawl those pages to see the new statuses and rules. If we are talking about only a couple of pages, then you can request indexing of these pages in the Search Console.

Why is some structured data detected only in the schema validator?

Lizzi: (16:04) Frank asks why is some structured data markup detected on the schema validator, but not on Google’s rich result test?

  • A: (16:14) Hey, Frank. This is a really common question. These tools are actually measuring different things. I think you're referencing the Schema.org markup validator, which checks whether your syntax, in general, is correct, whereas the Rich Results Test checks whether you have markup that may enable you to get a rich result in Google Search. It doesn't actually check every type that's on Schema.org; it only checks those that are listed in the list of structured data markup that Google supports, which is about 25 to 30 features, so it's not fully comprehensive of everything that you'd see on Schema.org.
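
For example (a hypothetical page, with the types picked purely for illustration), both snippets below are valid Schema.org markup, so the Schema.org validator accepts them, but only the second uses a type from Google's supported list, so only it is relevant to the Rich Results Test (which may still flag missing recommended fields such as an image):

    <!-- Valid Schema.org, but not a Google rich result feature -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Painting",
      "name": "Sunflowers at Dusk"
    }
    </script>

    <!-- Valid Schema.org and also a supported rich result type (Recipe) -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Recipe",
      "name": "Simple pancakes",
      "recipeIngredient": ["flour", "milk", "eggs"]
    }
    </script>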

Do you have people who can make a website for me?

John: (16:47) Do you have people that I can work with to create a functioning site?

  • A: (16:52) Unfortunately, no. We don't have a team that can create a website for you. If you need technical help, my recommendation would be to use a hosted platform that handles all of the technical details for you. There are many fantastic platforms out there now, everything from Google's Blogger to Wix, Squarespace, Shopify, and many more. They can all work very well with search, and usually they can help you get your site off the ground.

Why are some sites crawled and indexed faster?

Gary: (17:21) Ibrahim is asking why are some websites crawled and indexed faster than others?

  • A: (17:25) This is a great question. Much of how fast a site is crawled and indexed depends on how the site is perceived on the internet. For example, if there are many people talking about the site, it’s likely the site’s gonna be crawled and indexed faster. However, the quality of the content also matters a great deal. A site that’s consistently publishing high-quality content is going to be crawled and indexed faster. 

Why do Google crawlers get stuck with a pop-up store selector?

Alan: (17:51) Why do Google crawlers get stuck with a pop-up store selector? 

  • A: (17:56) It can depend on how the store selector is implemented in the HTML. Google follows normal HTML links (a elements with an href) on a page. If the selector is implemented purely in JavaScript, Google might not see that the other stores exist and so won't find the product pages for those stores.
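
A rough illustration of the difference (store names and URLs are made up): the first selector only fires JavaScript on change, so there is no URL for Googlebot to follow, while the second exposes ordinary crawlable links.

    <!-- Hard for Googlebot: no href to follow, just a script handler (loadStore is hypothetical) -->
    <select onchange="loadStore(this.value)">
      <option value="berlin">Berlin</option>
      <option value="munich">Munich</option>
    </select>

    <!-- Crawlable alternative: plain links Googlebot can discover and follow -->
    <ul class="store-picker">
      <li><a href="/stores/berlin/">Berlin store</a></li>
      <li><a href="/stores/munich/">Munich store</a></li>
    </ul>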

How can I verify my staging site in Search Console?

Gary: (18:13) Anonymous is asking if we have a staging site that is allow-listing only specific developers’ IP addresses, if we upload a Search Console HTML file, which I suppose is the verification file, will Search Console be able to verify that site?

  • A: (18:30) Well, the short answer is no. To remove your staging site from Search using the removal tool for site owners, you first need to ensure that Googlebot can actually access the site so that you can verify it in Search Console. We publish our list of IP addresses on Search Central, so you can use that list to allow-list the IPs that belong to Googlebot so it can access the verification file. Then you can use the removal tool to remove the staging site. Just make sure that the staging site, in general, is serving a status code that suggests it cannot be indexed, such as 404 or 410.

How can I get a desktop URL indexed?

John: (19:08) How can we get a desktop URL indexed? The message in Search Console says the page is not indexed because it's a page with a redirect. We have two separate URLs for our brand, desktop and mobile.

  • A: (19:21) With mobile-first indexing, that's normal. Google will focus on the mobile version of a page. There's nothing special that you need to do about that, and there's no specific trick to index just the desktop version…

Is it possible to report sites for stolen content?

Lizzi: (19:36) Christian is asking, is it possible to report sites for stolen content, such as text, original images, that kind of thing?

  • A: (19:46) Yes, you can report a site. Do a search for “DMCA request Google”, and use the “report content on Google” troubleshooter to file a report. 

Is adding Wikipedia links a bad practice?

John: (19:57) Is adding Wikipedia links to justify the content bad practice?

  • A: (20:03) Well, I’d recommend adding links to things that add value to your pages. Blindly adding Wikipedia links to your pages doesn’t add value.

Is there any difference if an internal link is under the word “here”?

Lizzi: (20:14) Gabriel is asking, is there any difference if an internal link is under the word “here” or if it is linked in a keyword?

  • A: (20:23) Hey Gabriel, good question. It doesn’t matter if it’s an internal link to something on your site or if it’s an external link pointing to something else, “here” is still bad link text. It could be pointing to any page, and it doesn’t tell us what the page is about. It’s much better to use words that are related to that topic so that users and search engines know what to expect from that link.
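
A quick illustration (the URL and wording are invented): both links point to the same page, but only the second tells users and search engines what to expect from it.

    <!-- Weak link text: "here" could point to anything -->
    <p>To learn about our returns policy, click <a href="/returns-policy/">here</a>.</p>

    <!-- Descriptive link text: the topic is clear from the anchor itself -->
    <p>Read our <a href="/returns-policy/">returns policy</a> before sending an item back.</p>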

Why does my news site’s traffic go up and down?

Gary: (20:46) Niraj is asking, I follow the same pattern of optimization, but my news website traffic is up and down.

  • A: (20:53) Well, for most sites, it's actually normal to have periodic traffic fluctuations. For example, seasonality affects e-commerce sites quite a bit. For news sites specifically, user interest in the topics you cover can cause fluctuations. But all in all, it is normal and not something that you usually have to worry about.

Is changing the URL often impacting my SEO performance?

John: (21:16) Is changing the URL often impacting my SEO performance? For example, a grocery site might change a URL from /christmas/turkey-meat to /easter/turkey-meat. The page is the same, and the URL is just changed with a redirect. 

  • A: (21:35) I wouldn’t recommend constantly changing URLs. At the same time, if you must change your URLs, then definitely make sure to redirect appropriately. 

How does freshness play a role in ranking seasonal queries like Black Friday deals?

Alan: (21:45) How does freshness play a role in the ranking? For seasonal queries like Black Friday deals, it makes sense to update frequently as news or deals are released, but what about something less seasonal?

  • A: (21:58) You may decide to update a Black Friday deals page frequently to reflect the latest offers as they come out. Remember, however, that Google does not guarantee how frequently a page will be reindexed, so not all of the updates are guaranteed to be indexed. Also, a good quality page that does not change much may still be returned in search results if we think its content is still relevant. I would recommend focusing on creating useful content and not spending too much time thinking about how to make static pages more dynamic.

Is there a way to appeal Safe Search results?

John: (22:33) Adam asks, is there a way to appeal Safe Search results? I work with a client that has been blocked from their own brand term while resellers and affiliates are still appearing. 

  • A: (22:44) So first off, I think it’s important to realize that Safe Search is not just about adult content. There’s a bit of nuance involved there, so it’s good to review the documentation. Should you feel that your website is ultimately incorrectly classified, there’s a review request link in an article called “SafeSearch and your website” in the Search developer documentation. 

How can I update my site’s brand name?

Lizzi: (23:08) Danny is asking: my site name in Search reflects the old domain's brand name, even with structured data and meta tags. What else can I do to update this information?

  • A: (23:22) Hello, Danny. The site name documentation has a troubleshooting section with a list of things to check that’s more detailed than what I can cover here. You want to make sure that your site name is consistent across the entire site, not just in the markup. And also, check any other versions of your site and make sure that those are updated too. For example, HTTP and HTTPS. If you’re still not having any luck, go to the Search Console help forum and make posts there. The folks there can help.

When migrating platforms, do URLs need to remain the same?

John: (23:51) Aamir asks while migrating a website from Blogger to WordPress, do the URLs need to be the same, or can I do a bulk 301 redirect?

  • A: (24:02) You don’t need to keep the URLs the same. With many platform migrations, that’s almost impossible to do. The important part is that all old URLs redirect to whatever specific new URLs are relevant. Don’t completely redirect from one domain to another. Instead, redirect on a per URL basis.

How much do I have to do to update an algorithmic penalty?

Duy: (24:24) Johan asked if a website gets algorithmically penalized for thin content, how much of the website’s content do you have to update before the penalty is lifted? 

  • A: (24:34) Well, it’s generally a good idea to clean up low-quality content or spammy content that you may have created in the past. For algorithmic actions. It can take us several months to reevaluate your site again to determine that it’s no longer spammy. 

How can I fix long indexing lead times for my Google-owned site?

John: (24:49) Vinay asks, we’ve set up Google Search Console for a Google-owned website where the pages are dynamically generated. We’d like to get insights into what we should do to fix long indexing lead times.

  • A: (24:05) Well, it’s interesting to see someone from Google posting here. As you listeners might know, my team is not able to give any Google sites SEO advice internally, so they have to pop in here like anyone else. First off, as with any bigger website, I’d recommend finding an SEO agency to help with this holistically. Within Google, in the marketing organization, there are folks that work with external SEO companies, for example. Offhand, one big issue I noticed was that the website doesn’t use normal HTML links, which basically makes crawling it a matter of chance. For JavaScript sites, I’d recommend checking out the guidance in our documentation and our videos. 

How does the helpful content system determine that visitors are satisfied?

Duy: (25:49) Joshua asked, how exactly does the helpful content system determine whether visitors feel they’ve had a satisfying experience?

  • A: (25:58) We published a pretty comprehensive article called “What creators should know about Google’s August 2022 helpful content update” where we outline the type of questions you can ask yourself to determine whether or not you’re creating helpful content for users. Such as, are you focusing enough on people-first content? Are you creating content to attract search users using lots of automation tools? Did you become an expert on a topic overnight and create many articles seemingly out of nowhere? Personally, I think not just SEOs, but digital marketers, content writers, and site owners should be familiar with these concepts in order to create the best content and experience for users. 

Should we have 404 or noindex pages created by bots on our website?

John: (26:40) Ryan asks, bots have swarmed our website and caused millions of real URLs with code tacked on to be indexed on our website through a vulnerability in our platform. Should we 404 these pages or noindex them?

  • A: (26:56) Either using a 404 HTTP result code or a noindex robots meta tag is fine. Having these on millions of pages doesn't cause problems. Depending on your setup, you could also use robots.txt to disallow crawling of those URLs. The effects will linger in Search Console's reporting for a longer time, but if you're sure that it's fixed, you should be all set.

Will adding a single post in Spanish to my English site affect my search rankings?

Lizzi: (27:20) Bryan asks if my site is all in English and I add a single post in Spanish, will that affect search rankings? 

  • A: (27:29) Hey, Bryan. Sure. That’s totally fine. It’s not going to harm your search rankings. I also recommend checking out our guide to managing multilingual websites, as there’s a lot more to cover when you’re thinking about publishing content in multiple languages.

Do all penalties show up in Search Console?

Duy: (27:44) Stepan asked, in Google Search Console there is a section called Manual Actions. Does Google show all penalties there, and does it always notify domain owners when a domain is hit with a penalty?

  • A: (27:58) We have manual actions, which are issued by human reviewers, and algorithmic actions, which are driven entirely by our spam algorithms, such as SpamBrain. We only communicate manual actions to site owners through Search Console. You can search for the manual actions report; there's a page that lists a lot of information to help you understand our different types of manual actions, as well as how to file a reconsideration request once you've received and addressed a manual action.

Will SEO decline? Should I study something different?

John: (28:33) Caroline asks, will SEO decline in favour of SEA and SMA? I’m starting my internship and need to know if I better redirect my path or continue on my way and specialise myself in accessibility.

  • A: (28:49) I’m not quite sure what SMA is, but regardless, there are many critical parts that lead to a website’s and a business’s success. I definitely wouldn’t say that you shouldn’t focus on SEO, but at the same time, it’s not, well, the answer to everything. My recommendation would be to try things out. Find where your passions and your talents lie, and then try more of that. Over the years things will definitely change, as will your interests. In my opinion, it’s better to try and evolve than towait for the ultimate answer. 

Does the number of outgoing links affect my rankings?

Duy: (29:24) Jemmy asked, does the number of outgoing links, both internal and external, dilute PageRank, or is PageRank distributed differently for each type of link?

  • A: (29:35) I think you might be overthinking several things. First of all, focusing too much on PageRank through building unnatural links, whether it violates a policy or not, takes time and effort away from other, more important factors on your site, such as helpful content and great user experience. Second of all, internal links allow us not only to discover new pages but also to understand your site better. Limiting them explicitly would likely do more harm than good.

Conclusion

John: (30:07) And that was it for this episode. I hope you found the questions and answers useful. If there's anything you submitted which didn't get covered here, I'd recommend posting in the Search Central Help community. There are lots of passionate experts there who can help you to narrow things down. And of course, if there's more on your mind, please submit those questions with the form linked below. Your questions here are useful to us and to those who catch up on these recordings, so please keep them coming. If you have general feedback about these episodes, let us know in the comments or ping us on social media. I hope the year has gone well. For us, things have certainly evolved over the course of the year with, well, ups and downs and a bunch of new launches. I'm looking forward to catching up with you again next year, perhaps in another episode of these office hours. In the meantime, may your site's traffic go up and your crawl errors go down.
Have a great new year and see you soon. Bye!

WebMaster Hangout – Live from SEPTEMBER 07, 2022

A site that connects seekers and providers of household-related services.

LIZZI SASSMAN: (01:03) So the first question that we've got here is from Dimo, a site that connects seekers and providers of household-related services. Matching of listings is based on zip code, but our users come from all over Germany. The best value for users is to get local matches. Is there some kind of schema markup code to tell the Google algorithm to show my site in the Local Pack as well? Please note we do not have local businesses and cannot utilise the local business markup code. The site is… and I'm going to redact that.

  • A: (01:36) Yes, so Dimo, as you noted, local business markup is for businesses with physical locations. And that means that there's typically one physical location in the world for that place, so it should only show up for that city. And for your other question: currently, there's no rich result feature for online-only services that you could use structured data for, but you can claim your business profile with the Google Business Profile manager and specify a service area there. So I think that that could help.

Is there a way to measure page experience or the core web vitals on Safari browsers, and is there a way to improve the experience?

MARTIN SPLITT: (02:12)  Indra asks, Google Search Console shows excellent core web vital scores for our site, but I understand that it only shows details of Chrome users. A majority of our users browse using Safari. Is there a way to measure page experience or the core web vitals on Safari browsers, and is there a way to improve the experience?

  • A: (02:35) Well, you can’t really use Google Search Console for this, but you can definitely measure these things yourself with the browser developer tools in a Safari browser and maybe ask around if you have any data from Safari users through analytics, for instance. There’s nothing here that we can do for the page experience or Search Console’s page experience resource because the data is just not available.

How can I best switch from one domain to a new one?

JOHN MUELLER: (03:01) For the next question, I’ll paraphrase. How can I best switch from one domain to a new one? Should I clone all the content or just use 80% of the content? What is the fastest way to tell Google that they’re both my sites?

  • A: (03:17) We call this process a site migration. It’s fairly well documented, so I would look up the details in our documentation. To simplify and leave out a lot of details, ideally, you’d move the whole website, 1 to 1, to the new domain name and use permanent 301 redirects from the old domain to the new one. This is the easiest for our system to process. We can transfer everything directly. If you do other things like removing content, changing the page URLs, restructuring, or using a different design on the new domain, that all adds complexity and generally makes the process a little bit slower. That said, with a redirect, users will reach your new site, regardless of whether they use the old domain or the new one.

Do you support the use and the full range of schema.org entities when trying to understand the content of a page, outside of use cases such as rich snippets?

LIZZI SASSMAN: (04:04) And our next question is from IndeQuest1. Do you support the use and the full range of schema.org entities when trying to understand the content of a page outside of use cases such as rich snippets? Can you talk about any limitations that might exist that might be relevant for developers looking to make deeper use of the standard?

  • A: (04:26) So, to answer your question, no, Google does not support all of the schema.org entities that are available on schema.org. We have the search gallery which provides a full list of what we do support for rich snippets, like you mentioned, in Google Search results. But not all of those things are visual. We do talk about certain properties that might be more metadata-like, and that aren’t necessarily visible as a rich result. And that still helps Google to understand things, like authors or other metadata information about a page. So we are leveraging that kind of thing.
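
As a small, hypothetical example of that second category, an Article marked up with author and datePublished does not necessarily change how the snippet looks, but it gives Google machine-readable metadata about the page:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "How we match local service providers",
      "datePublished": "2022-09-01",
      "author": {
        "@type": "Person",
        "name": "Jane Doe"
      }
    }
    </script>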

What could be the reason that the sitemap cannot be read by the Googlebot?

GARY ILLYES: (05:07) Anton Littau is asking, in Search Console, I get the message “sitemap could not be read” in the sitemap report. No other information is provided. What could be the reason that the sitemap cannot be read by the Googlebot?

  • A: (05:21) Good question. The “sitemap could not be read” message in Search Console may be caused by a number of issues, some of them technical, some of them related to the content quality of the site itself. Rarely, it may also be related to the hosting service, specifically, if you are hosting on a free domain or subdomain of your hoster, and the hoster is overrun by spam sites, that may also cause issues with fetching sitemaps.

We’ve got guides and tips that are illustrated on our website, and they’re not performing well in the SERP.

LIZZI SASSMAN: (05:53) Our next question is from Nicholas. We would like to know how algorithms treat cartoon illustrations. We’ve got guides and tips that are illustrated on our website, and they’re not performing well in the SERP. We tried to be unique, using some types of illustrations and persona to make our readers happy. Do you think we did not do it right?

  • A: (06:18) I don’t know because I don’t think I’ve ever seen your cartoons, but I can speak to how to improve your cartoon illustrations in SERP. So our recommendation would be to add text to the page to introduce the cartoons, plus alt text for each of the images. Think about what people will be searching for in Google Images to find your content. And use those kinds of descriptive words versus just saying the title of your cartoon. Hope that helps.

Does posting one content daily increase rankings?

GARY ILLYES: (06:46) Chibuzor Lawrence is asking, does posting one content daily increase rankings?

  • A: (06:53) No, posting daily or at any specific frequency, for that matter, doesn’t help with ranking better in Google Search results. However, the more pages you have in the Google index, the more your content may show up in Search results.

Does Google agree with the word count or not?

LIZZI SASSMAN: (07:09) OK, and the next question is from Suresh, about the helpful content update: only 10% write quality content, and the rest, the 90%, don't write quality content, just lengthy content. But how should they write quality content? Does Google agree with word count or not?

  • A: (07:29) Well, nope, content can still be helpful whether it's short or long. It just depends on the context and what that person is looking for. It doesn't matter how many words it is, whether that's 500 or 1,000. If it's answering the user's intent, then it's fine; it can be helpful. Word count and quality are not synonymous things.

When using words from a page title in the URL, should I include stop words too?

JOHN MUELLER: (07:49) I’ll paraphrase the next question, hopefully, correctly. In short, when using words from a page title in the URL, should I include stopper words too? For example, should I call a page whyistheskyblue.HTML or whyskyblue.HTML?

  • A: (08:08) Well, thanks for asking. Words in URLs only play a tiny role in Google Search. I would recommend not overthinking it. Use the URLs that can last over time, avoid changing them too often, and try to make them useful for users. Whether you include stop words in them or not or decide to use numeric IDs, that’s totally up to you.

Do different bot types, image and desktop, share crawl budgets?

GARY ILLYES: (08:31) Sanjay Sanwal is asking: do different bot types, image and desktop, share crawl budget? And what about different hosts?

  • A: (08:40) Fantastic question. The short answer is yes, Googlebot and its friends share a single crawl budget. What this means for your site is that if you have lots of images, for example, Googlebot Images may use up some of the crawl budget that could otherwise have been used by Googlebot. In reality, this is not a concern for the vast majority of sites. So unless you have millions of pages and images or videos, I wouldn't worry about it. It's worth noting that the crawl budget is per host. So, for example, if you have one subdomain of example.com and a second, different subdomain, they have different crawl budgets.

Request to 301 redirect the subdirectory to their new German site. Would you advise against it?

JOHN MUELLER: (09:24) Christopher asks: we've sold the German subdirectory of our website to another company. They've asked us to 301 redirect the subdirectory to their new German site. Would you advise against it? Would it hurt us?

  • A: (09:40) Well, on the one hand, it all feels kind of weird to sell just one language version of a website to someone else. On the other hand, why not? I don’t see any problems redirecting from there to a different website. The only thing I would watch out for, for security reasons, is that you avoid creating so-called open redirects, where any URL from there is redirected to an unknown third party. Otherwise, that sounds fine.

Can I expect to see clicks and impressions from this in the search appearance filter as we can see with some other rich results?

LIZZI SASSMAN: (10:08) Sam Gooch is asking: I’m experimenting with a new learning video, rich result, and can see it’s being picked up in Google Search Console. Can I expect to see clicks and impressions from this in the search appearance filter as we can see with some other rich results?

  • A: (10:23) Well, to answer this question specifically, there’s no guaranteed time that you’ll be able to see a specific rich result in Google Search after adding structured data. But I think what you’re asking about here is for a specific thing to be added to Search Console, and we’ll have to check with the team on the timeline for that. And we don’t pre-announce when certain things will be added to Search Console. But you can check the rich result status report for the learning video and make sure that you’re adding all of the right properties and that it’s valid and ready to go for Google to understand what it needs in order to generate a rich result. Hope that helps.

How big is the risk of penalising action if we use the same HTML structure, same components, layout, and same look and feel between the different brands?

JOHN MUELLER: (11:02) Roberto asks: we’re planning to share the same backend and front end for our two brands. We’re ranking quite well with both of them in Google. How big is the risk of penalising action if we use the same HTML structure, same components, layout, and same look and feel between the different brands? What would be different are the logos, fonts, and colours. Or would you suggest migrating to the same front end but keeping the different experience between the two brands?

  • A: (11:33) Well, this is a great question. Thanks for submitting it. First off, there’s no penalty or web spam manual action for having two almost identical websites. That said, if the URLs and the page content are the same across these two websites, then what can happen for identical pages is that our systems may pick one of the pages as a canonical page. This means we would focus our crawling, indexing, and ranking on that canonical page. For pages that aren’t identical, we generally index both of them. For example, if you have the same document on both websites, we’d pick one and only show that one in Search. In practice, that’s often fine. If you need both pages to be shown in Search, just make sure they’re significantly different, not just with a modified logo or colour scheme.

JavaScript SEO, what to avoid along with JavaScript links?

MARTIN SPLITT: (12:23) Anna Giaquinto asks, JavaScript SEO, what to avoid along with JavaScript links?

  • A: (12:30) Well, the thing with links is that you want to have a proper link, so avoid anything that isn’t a proper link. What is a proper link? Most importantly, it’s an HTML tag that has an href that lists a URL that is resolvable, so not like a JavaScript colon URL. And that’s pretty much it. If you want to learn more about JavaScript-specific things for Search, you can go to the JavaScript beginner’s guide on developers.google.com/search and see all the things that you might want to look out for.
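
To make that concrete (the URLs are invented), these are the kinds of patterns to avoid, compared with the plain link that Googlebot can follow:

    <!-- Not proper links: there is no resolvable URL for Googlebot to follow -->
    <a href="javascript:goToProducts()">Products</a>
    <span onclick="location.href='/products/'">Products</span>

    <!-- A proper link: an a tag with an href pointing to a resolvable URL -->
    <a href="/products/">Products</a>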

I research a keyword that has no volume or keyword density, but we are appearing for those keywords on the first page. Should we target that keyword?

LIZZI SASSMAN: (13:05) Our next question is from Sakshi Singh. Let’s say I research a keyword that has no volume or keyword density, but we are appearing for those keywords on the first page. Should we target that keyword?

  • A: (13:19) Well, Sakshi, you can optimise for whatever keywords you want, and it’s not always about the keywords that have the most volume. I would think about how people should find your page and target those keywords.

Will audio content be given more priority and independent ranking following the helpful content algorithm update?

GARY ILLYES: (13:32) Kim Onasile is asking, hello, you previously advised that there are no SEO benefits to audio versions of text content and that audio-specific content doesn’t rank separately like video content. However, given you also said it might be that there are indirect effects like if users find this page more useful and they recommend it more, that’s something that could have an effect. Will audio content be given more priority and independent ranking following the helpful content algorithm update?

  • A: (14:07) This is an interesting question. And ignoring the helpful content algorithm update part, no, audio content, on its own, doesn’t play a role in the ranking of text results.

Is it OK to fetch meta contents through JavaScript?

MARTIN SPLITT: (14:33) Someone asked, is it OK to fetch meta contents through JavaScript? I think that means: is it OK to update meta tag data with JavaScript?

  • A: (14:44) While that is possible to do, it is best not to do that. It may give Google Search mixed signals, and some features may not pick up the changes. Like, some specific search result types might not work the way you expect them to. Or it might have incorrect information, or it might miss something. So I would suggest not doing that.

Both of my websites have been hit by different updates, around 90% drops, and are suffering from some type of flag that is suppressing our sites until the soft penalty is lifted.

GARY ILLYES: (15:08) Anonymous is asking, both of my websites have been hit by different updates, around 90% drops, and are suffering from some type of flag that is suppressing our sites until the soft penalty is lifted. Or is there even a soft penalty?

  • A: (15:26) Good question. No, the named updates that we publish on the Rankings Updates page on Search Central are not penalties in any shape or form. They are adjustments to our ranking algorithms, so they surface even higher quality and more relevant results to Search users. If your site has dropped in rankings after an update, follow our general guidelines for content, take a look at how you could improve your site as a whole, both from content and user experience perspective, and you may be able to increase your rankings again.

When would be the next possible update for the Search results?

JOHN MUELLER: (16:03) Ayon asks, when would be the next possible update for the Search results?

  • A: (16:09) Well, on our How Search Works site, we mentioned that we did over 4,000 updates in 2021. That’s a lot of updates. Personally, I think it’s critical to keep working on things that a lot of people use. Our users and your users expect to find things that they consider to be useful and relevant. And what that means can change over time. Many of these changes tend to be smaller and are not announced. The bigger ones, and especially the ones which you, as a site owner, can work on, are announced and listed in our documentation. So in short, expect us to keep working on our systems, just like you, hopefully, keep working on yours.

Does having a star aggregated ranking on recipes improve its position?

LIZZI SASSMAN: (16:54) And our next question is from Darius. So Darius is asking, does having a star aggregated ranking on recipes improve its position?

  • A: (17:05) I think what Darius is asking about is the stars that show up for recipes and with structured data and whether or not that has an effect on ranking. So while the stars are more visual and eye-catching, structured data in and of itself is not a ranking signal. And it isn’t guaranteed that these rich results will show up all the time. The Google algorithm looks at many things when it’s creating what it thinks is the best Search experience for someone. And that can depend on a lot of things, like the location, language, and device type.

When I don’t set a rel-canonical, then I can see the internal links in the search console in the links report. Is this normal?

JOHN MUELLER: (17:37) Christian asks: I have set the rel-canonical together with a noindex meta tag. When Google does not accept a canonical at all, all internal links are dropped. When I don’t set a rel-canonical, then I can see the internal links in the search console in the links report. Is this normal?

  • A: (17:55) Well, this is a complex question since it mixes somewhat unrelated things. A noindex says to drop everything and the rel-canonical hints that everything should be forwarded. So what does using both mean? Well, it’s essentially undefined. Our systems will try to do the best they can in a conflicting case like this, but a specific outcome is not guaranteed. If that’s fine with you, for example, if you need to use this setup for other search engines, then that’s fine with us too. If you want something specific to happen, then be as clear as possible for all search engines.
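
For reference, this is the conflicting combination being described (the URL is a placeholder): the noindex asks for the page to be dropped, while the canonical hints that its signals should be consolidated on another URL, so the outcome is undefined.

    <head>
      <!-- "Drop this page from the index" -->
      <meta name="robots" content="noindex">
      <!-- "...but treat this other URL as the preferred version" -->
      <link rel="canonical" href="https://www.example.com/preferred-page/">
    </head>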

If a video is indexed in the video indexing report, is it still worth adding the video structured data on that page and why?

LIZZI SASSMAN: (18:33) And our next question is from Thijs.  If a video is indexed in the video indexing report, is it still worth adding the video structured data on that page and why?

  • A: (18:47) Well, yes. Just because something’s indexed doesn’t mean that there’s not an opportunity to improve how it appears. Structured data helps Google understand more about your video, like what it’s about, the title, interaction statistics, and that kind of stuff. And adding structured data can make your videos eligible for other video features, like key moments. So it’s not just, oh, get your video indexed, and that’s it. There are other things that you can do to improve how your content appears on Google.

Can I cloak a list with lots of products to Googlebot and show users a Load More button?

MARTIN SPLITT: (19:20) Tamás asks, can I cloak a list with lots of products to Googlebot and show users a Load More button?

  • A: (19:26) I think this is not cloaking, as what users see when they click on the Search result roughly matches what Googlebot sees. And if you have a Load More button, users will click that if they don’t see the product they are expecting there. So I don’t think this is cloaking, and that’s a solution that I think works from a crawling point of view.

WEBMASTER HANGOUT – LIVE FROM JULY 01, 2022

Which number is correct, PageSpeed Insights or Search Console?

Q: (00:30) Starting off, I have one topic that has come up repeatedly recently, and I thought I would try to answer it in the form of a question while we're at it here. So, first of all, when I check my PageSpeed Insights score for my website, I see a simple number. Why doesn't this match what I see in Search Console and the Core Web Vitals report? Which one of these numbers is correct?

  • (01:02) I think maybe, first of all, to get the obvious answer out of the door: there is no correct number when it comes to speed, when it comes to understanding how your website is performing for your users. In PageSpeed Insights, by default, I believe we show a single number that is a score from 0 to 100, something like that, which is based on a number of assumptions where we assume that different things are a little bit faster or slower for users. And based on that, we calculate a score. In Search Console, we have the Core Web Vitals information based on three numbers: speed, responsiveness, and interactivity. And these numbers are slightly different because it's three numbers, not just one. But, also, there's a big difference in the way these numbers are determined. Namely, there's a difference between so-called field data and lab data. Field data is what users actually see when they go to your website, and this is what we use in Google Search Console. That's what we use for Search as well. Lab data, on the other hand, is a theoretical view of your website, where our systems have certain assumptions: they think, well, the average user is probably like this, using this kind of device, and with this kind of connection, perhaps. And based on those assumptions, we estimate what those numbers might be for an average user. And you can imagine those estimations will never be 100% correct. And similarly, the data that users have seen will change over time as well, where some users might have a really fast connection or a fast device, and everything goes really fast when they visit your website, and others might not have that. And because of that, this variation can always result in different numbers. Our recommendation is generally to use the field data, the data you would see in Search Console, as a way of understanding what the current situation for your website is, and then to use the lab data, namely the individual tests that you can run yourself directly, to optimise your website and try to improve things. And when you're pretty happy with the lab data you're getting with the new version of your website, then over time you can collect the field data, which happens automatically, and double-check that users see it as being faster or more responsive, as well. So, in short, again, there is no absolutely correct number when it comes to any of these metrics. There is no absolutely correct answer where you'd say this is what it should be. But instead, there are different assumptions and ways of collecting data, and each is subtly different.
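
One way to collect your own field data from real visitors is a snippet along these lines; the open-source web-vitals library, the CDN URL, and the /analytics endpoint are assumptions for illustration, not something mentioned in the episode.

    <script type="module">
      // Measure Core Web Vitals for real visitors (field data) and send the
      // results to your own endpoint; the CDN URL and /analytics path are placeholders.
      import {onCLS, onINP, onLCP} from 'https://unpkg.com/web-vitals@4?module';

      function sendToAnalytics(metric) {
        navigator.sendBeacon('/analytics', JSON.stringify(metric));
      }

      onCLS(sendToAnalytics);
      onINP(sendToAnalytics);
      onLCP(sendToAnalytics);
    </script>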

How can our JavaScript site get indexed better?

Q: (04:20) So, first up, we have a few custom pages using Next.js without a robots.txt or a sitemap file. Simplified, theoretically, Googlebot can reach all of these pages, but why is only the homepage getting indexed? There are no errors or warnings in Search Console. Why doesn’t Googlebot find the other pages?

  • (04:40) So, maybe taking a step back, Next.js is a JavaScript framework, meaning the whole page is generated with JavaScript. But as a general answer, as well, for all of these questions like, why is Google not indexing everything? It’s important first to say that Googlebot will never index everything across a website. I don’t think it happens to any kind of non-trivial-sized website where Google would completely index everything. From a practical point of view, it’s impossible to index everything across the web. So that kind of assumption that the ideal situation is everything is indexed, I would leave that aside and say you want Googlebot to focus on the important pages. The other thing, though, which became a little bit clearer when, I think, the person contacted me on Twitter and gave me a little bit more information about their website, was that the way that the website was generating links to the other pages was in a way that Google was not able to pick up. So, in particular, with JavaScript, you can take any element on an HTML page and say, if someone clicks on this, then execute this piece of JavaScript. And that piece of JavaScript can be used to navigate to a different page, for example. And Googlebot does not click on all elements to see what happens. Instead, we go off and look for normal HTML links, which is the kind of traditional way you would link to individual pages on a website. And, with this framework, it didn’t generate these normal HTML links. So we could not recognise that there’s more to crawl and more pages to look at. And this is something that you can fix in how you implement your JavaScript site. We have a tonne of information on the Search Developer Documentation site around JavaScript and SEO, particularly on the topic of links because that comes up now and then. There are many creative ways to create links, and Googlebot needs to find those HTML links to make them work. Additionally, we have a bunch of videos on our YouTube channel. And if you’re watching this, you must be on the YouTube channel since nobody is here. If you’re watching this on the YouTube channel, go out and check out those JavaScript SEO videos on our channel to get a sense of what else you could watch out for when it comes to JavaScript-based websites. We can process most kinds of JavaScript-based websites normally, but some things you still have to watch out for, like these links.

Does it affect my SEO score negatively if I link to HTTP pages?

Q: (07:35) Next up, does it affect my SEO score negatively if my page is linking to an external insecure website?

  • (07:44) So on HTTP, not HTTPS. First off, we don't have a notion of an SEO score, so you don't have to worry about any kind of SEO score. But, regardless, I understand the question as: is it bad if I link to an HTTP page instead of an HTTPS page? And, from our point of view, it's perfectly fine. If those pages are on HTTP, then that's what you would link to, and that's what users would expect to find. There's nothing against linking to sites like that, and there's no downside for your website; you don't need to avoid linking to HTTP pages just because they're kind of old or crusty and not as cool as HTTPS. I would not worry about that.

Q: (08:39) With semantic and voice search, is it better to use proper grammar or write how people actually speak? For example, it's grammatically correct to write, “more than X years,” but people actually say, “over X years,” or write a list beginning with, “such as X, Y, and Z,” but people actually say, “like X, Y, and Z.”

  • (09:04) Good question. So the simple answer is, you can write however you want. There's nothing holding you back from just writing naturally. And essentially, our systems try to work with the natural content found on your pages. So if we can crawl and index those pages with your content, we'll try to work with that. And there's nothing special that you need to do there. The one thing I would watch out for, with regards to how you write your content, is just to make sure that you're writing for your audience. So, for example, if you have some very technical content but you want to reach people who are non-technical, then write in non-technical language and not in a way that is only understandable to people who are deep into that kind of technical information. So, kind of the traditional marketing approach of writing for your audience. And our systems are usually able to deal with that perfectly fine.

Should I delete my disavow file?

Q: (10:20) Next up, a question about links and disavows. Over the last 15 years, I’ve disavowed over 11,000 links in total. I never bought a link or did anything unallowed, like sharing. The links that I disavowed may have been from hacked sites or from nonsense, auto-generated content. Since Google now claims that they have better tools to not factor these types of hacked or spammy links into their algorithms, should I just delete my disavow file? Is there any risk or upside, or downside to just deleting it?

  • (10:54) So this is a good question. It comes up now and then. And disavowing links is always kind of one of those tricky topics because it feels like Google is probably not telling you the complete information. But, from our point of view, we do work hard to avoid taking this kind of link into account. And we do that because we know that the disavow links tool is a niche tool, and SEOs know about it, but the average person who runs a website doesn’t know about it. And all those links you mentioned are the links that any website gets over the years. And our systems understand that these are not things you’re trying to do to game our algorithms. So, from that point of view, if you’re sure that there’s nothing around a manual action that you had to resolve with regards to these links, I would just delete the disavow file and move on with life and leave all of that aside. I would personally download it and make a copy so that you have a record of what you deleted. But, otherwise, if you’re sure these are just the normal, crusty things from the internet, I would delete it and move on. There’s much more to spend your time on when it comes to websites than just disavowing these random things that happen to any website on the web.

Can I add structured data with Google Tag Manager?

Q: (12:30) Adding schema markup with Google Tag Manager: is that good or bad for SEO? Does it affect ranking?

  • (12:33) So, first of all, you can add structured data with Google Tag Manager. That's an option. Google Tag Manager is a simple piece of JavaScript that you add to your pages, and it can modify your pages slightly using JavaScript. For the most part, we're able to process this normally, and the structured data added that way can be counted just like any other structured data on your web pages. And, from our point of view, structured data, at least the types that we have documented, is primarily used to help generate what we call rich results, which are these fancy search results with a little bit more information, a little bit more colour or detail around your pages. And if you add your structured data with Tag Manager, that's perfectly fine. From a practical point of view, I prefer to have the structured data on the page or on your server so that you know exactly what is happening. It makes it a little bit easier to debug things, and it makes it easier to test things. So trying it out with Tag Manager, from my point of view, is legitimate; it's an easy way to try things out. But, in the long run, I would try to make sure that your structured data is on your site directly, just to make sure that it's easier to process for anyone who comes by to process your structured data, and it's easier for you to track, debug, and maintain over time as well, so that you don't have to check all of these different separate sources.

Is it better to block by robots.txt or with the robots meta tag?

Q: (14:20) Simplifying a question a little bit, which is better, blocking with robots.txt or using the robots meta tag on the page? How do we best prevent crawling? 

  • (14:32) So this also comes up from time to time. We did a podcast episode about this recently as well, so I would check that out. The podcasts are also on the YouTube channel, so you can click around a little bit and you'll probably find them quickly. In practice, there is a subtle difference here, and if you're in SEO and you've worked with search engines, then you probably understand that already. But for people who are new to the area, it's sometimes unclear exactly where these lines are. With robots.txt, which is the first one you mentioned in the question, you can essentially block crawling. So you can prevent Googlebot from even looking at your pages. And with the robots meta tag, you can do things like blocking indexing when Googlebot looks at your pages and sees that robots meta tag. In practice, both of these result in your pages not appearing in the search results, but they're subtly different. If we can't crawl, we don't know what we're missing. And it might be that we say, well, there are many references to this page, maybe it is useful for something, we just don't know. And then that URL could appear in the search results without any of its content because we can't look at it. Whereas with the robots meta tag, if we can look at the page, then we can look at the meta tag and see if there's a noindex there, for example. Then we stop indexing that page and drop it completely from the search results. So if you're trying to block crawling, then definitely robots.txt is the way to go. If you just don't want the page to appear in the search results, I would pick whichever is easier for you to implement. On some sites, it's easier to set a checkbox saying that I don't want this page found in Search, and then it adds a noindex meta tag. For others, maybe editing the robots.txt file is easier. It kind of depends on what you have there.
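
A side-by-side sketch of the two mechanisms (the paths are placeholders): the robots.txt rule stops crawling entirely, so the URL can still show up without its content, while the robots meta tag lets Googlebot fetch the page and then drop it from the index.

    <!-- robots.txt: blocks crawling of a section entirely
    User-agent: *
    Disallow: /internal-search/
    -->

    <!-- Robots meta tag on the page itself: blocks indexing, but the page must stay crawlable -->
    <meta name="robots" content="noindex">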

Q: (16:38) Are there any negative implications to having duplicate URLs with different attributes in your XML sitemaps? For example, one URL in one sitemap with an hreflang annotation and the same URL in another sitemap without that annotation.

  • (16:55) So maybe, first of all, from our point of view, this is perfectly fine. This happens now and then. Some people have hreflang annotations in separate sitemap files, and then they have a normal sitemap file for everything, and there is some overlap there. From our point of view, we process these sitemap files as we can, and we take all of that information into account. There is no downside to having the same URL in multiple sitemap files. The only thing I would watch out for is that you don't have conflicting information in these sitemap files. So, for example, if with the hreflang annotations you're saying, oh, this page is for Germany, and then in the other sitemap file you're saying, well, actually this page is also for France or in French, then our systems might be like, well, what is happening here? We don't know what to do with this kind of mix of annotations, and then we may pick one or the other. Similarly, if you say this page was last changed 20 years ago, which doesn't make much sense, but say you say 20 years, and in the other sitemap file you say, well, actually, it was five minutes ago, then our systems might look at that and say, well, one of you is wrong, we don't know which one. Maybe we'll follow one or the other, maybe we'll just ignore that last modification date completely. So that's kind of the thing to watch out for. But otherwise, if it's just mentioned in multiple sitemap files and the information is either consistent or works together, in that maybe one has the last modification date and the other has the hreflang annotations, that's perfectly fine.

How can I block embedded video pages from getting indexed?

Q: (19:00) I’m in charge of a video replay platform, and simplified, our embeds are sometimes indexed individually. How can we prevent that?

  • (19:10) So by embeds, I looked at the website, and basically, these are iframes that include a simplified HTML page with a video player embedded. And, from a technical point of view, if a page has iframe content, then we see those two HTML pages. And it is possible that our systems indexed both HTML pages because they are separate. One is included in the other, but they could theoretically stand on their own, as well. And there’s one way to prevent that, which is a reasonably new combination with robots meta tags that you can do, which is with the indexifembedded robots meta tag and a noindex robots meta tag. And, on the embedded version, so the HTML file with the video directly in it– you would add the combination of noindex plus indexifembedded robots meta tags. And that would mean that, if we find that page individually, we would see, oh, there’s a noindex. We don’t have to index this. But with the indexifembedded, it essentially tells us that, well, actually, if we find this page with the video embedded within the general website, then we can index that video content, which means that the individual HTML page would not be indexed. But the HTML page embedded with the video information would be indexed normally. So that’s kind of the setup that I would use there. And this is a fairly new robots meta tag, so it’s something that not everyone needs. Because this combination of iframe content or embedded content is kind of rare. But, for some sites, it just makes sense to do it like that.
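
A sketch of that setup (the domain and paths are invented): the standalone player page carries both rules, and the main article embeds it in an iframe.

    <!-- On the standalone embed page, e.g. https://player.example.com/embed/video-123 -->
    <meta name="robots" content="noindex, indexifembedded">

    <!-- On the main article page that embeds the video -->
    <iframe src="https://player.example.com/embed/video-123"
            title="Product demo video"></iframe>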

Q: (21:15) Another question about HTTPS, maybe. I have a question around preloading SSL via HSTS. We are running into an issue with implementing HSTS and the Google Chrome preload list. And the question kind of goes on with a lot of details. But what should we watch out for?

  • (21:40) So maybe take a step back: when you have HTTPS pages and an HTTP version, usually you would redirect from the HTTP version to HTTPS. And the HTTPS version would then be the secure version because that has all of the properties of the secure URLs, while the HTTP version would be the open one, or a little bit vulnerable. And if you have this redirect, theoretically, an attacker could take that into account and kind of mess with that redirect. With HSTS, you're telling the browser that once it has seen this redirect, it should always expect that redirect, and it shouldn't even try the HTTP version of that URL. And, for users, that has the advantage that nobody even goes to the HTTP version of that page anymore, making it a little more secure. The preload list for Google Chrome is a static list that is included, I believe, in Chrome, probably in all of the updates, or I don't know if it's downloaded separately, I'm not completely sure. But, essentially, this is a list of all of the sites where we have confirmed that HSTS is set up properly and that the redirect to the secure page exists, so that no user ever needs to go to the HTTP version of the page, which makes it a little bit more secure. From a practical point of view, this difference is very minimal, and I would expect that most sites on the internet just use HTTPS without worrying about the preload list. Setting up HSTS is always a good practice, but it's something that you can do on your server, and as soon as a user sees that, their Chrome version keeps it in mind automatically anyway. So from a general point of view, I think using the preload list is a good idea if you can do that. But if there are practical reasons why that isn't feasible or possible, then, from my point of view, I would not worry about it when looking only at the SEO side of things. When it comes to SEO, for Google, what matters is essentially the URL that is picked as the canonical. And, for that, it doesn't need HSTS, and it doesn't need the preload list; those do not affect how we pick the canonical at all. Rather, for the canonical, the important part is that we see that redirect from HTTP to HTTPS, and that we can get confirmation within your website, through the sitemap file, the internal linking, all of that, that the HTTPS version is the one that should be used in Search. And if we use the HTTPS version in Search, that automatically gets all of those subtle ranking bonuses from Search. The preload list and HSTS are not necessary there. So that's kind of the part that I would focus on there.

How can I analyse why my site dropped in ranking for its brand name?

Q: (25:05) I don’t really have a great answer, but I think it’s important to at least mention, as well what are the possible steps for investigation if a website owner finds their website is not ranking for their brand term anymore, and they checked all of the things, and it doesn’t seem to be related to any of the usual things?

  • (25:24) So, from my point of view, I would primarily focus on the Search Console or Search Central Help Community and post all of your details there. Because this is where all of those escalations go, and where the product experts in the Help forum can take a look at that. And they can give you a little bit more information. They can also give you their personal opinion on some of these topics, which might not match 100% what Google would say, but maybe they're a little bit more practical. For example, probably not relevant to this site, but you might post something and say, well, my site is technically correct, and post all of your details. And one of the product experts looks at it and says it might be technically correct, but it's still a terrible website; you need to get your act together and create better content. And, from our point of view, we would focus on technical correctness, and you need someone to give you that, I don't know, personal feedback. But anyway, in the Help forums, if you post the details of your website with everything that you've seen, the product experts are often able to take a look and give you some advice on, specifically, your website and the situation that it's in. And if they're not able to figure out what is happening there, they also have the ability to escalate these kinds of topics to the community manager of the Help forums. And the community manager can also bring things back to the Google Search team. So if there are things that are really weird, and now and then something really weird does happen with regards to Search (it's a complex computer system, anything can break), the community managers and the product experts can bring that back to the Search team. And they can look to see if there is something that we need to fix, or something that we need to tell the site owner, or whether this is kind of just the way that it is, which, sometimes, it is. But that's generally the direction I would go for these questions. The other thing subtly mentioned here is that, I think, the site does not rank for its brand name. One of the things to watch out for, especially with regards to brand names, is that it can happen that you say something is your brand name, but it's not a recognised term from users. For example, I don't know, you might call your website bestcomputermouse.com. And, for you, that might be what you call your business or what you call your website: Best Computer Mouse. But when a user goes to Google and enters "best computer mouse," that doesn't necessarily mean they want to go directly to your website. It might be that they're looking for a computer mouse. And, in cases like that, there might be a mismatch between what we show in the search results and what you think you would like to have shown for those queries, if it's something more of a generic term. And these kinds of things also play into search results overall. The product experts see these all the time, as well. And they can recognise that and say, actually, just because you call your website bestcomputermouse.com (I hope that site doesn't exist) doesn't necessarily mean it will always show on top of the search results when someone enters that query. So that's kind of something to watch out for. But, in general, I would go to the Help forums here and include all of the information you know that might play a role here.
So if there was a manual action involved and you're kind of, I don't know, ashamed of that, which is kind of normal, include it anyway. All of this information helps the product experts better understand your situation and give you something actionable that you can take as a next step, or to understand the situation a little bit better. So the more information you can give them from the beginning, the more likely they'll be able to help you with your problem.

WebMaster Hangout – Live from June 03, 2022

Can I use two HTTP result codes on a page?

Q: (01:22) All right, so the first question I have on my list here is: it's theoretically possible to have two different HTTP result codes on a page, but what will Google do with those two codes? Will Google even see them? And if yes, what will Google do? For example, a 503 plus a 302.

  • (01:41) So I wasn’t aware of this. But, of course, with the HTTP result codes, you can include lots of different things. Google will look at the first HTTP result code and essentially process that. And you can theoretically still have two HTTP result codes or more there if they are redirects leading to some final page. So, for example, you could have a redirect from one page to another page. That’s one result code. And then, on that other page, you could serve a different result code. So that could be a 301 redirect to a 404 page is kind of an example that happens every now and then. And from our point of view, in those chain situations where we can follow the redirect to get a final result, we will essentially just focus on that final result. And if that final result has content, then that’s something we might be able to use for canonicalization. If that final result is an error page, then it’s an error page. And that’s fine for us too.

Does using a CDN improve rankings if my site is already fast in my main country?

Q: (02:50) Does putting a website behind a CDN improve ranking? We get the majority of our traffic from a specific country. We hosted our website on a server located in that country. Do you suggest putting our entire website behind a CDN to improve page speed for users globally, or is that not required in our case?

  • (03:12) So obviously, you can do a lot of these things. I don't think it would have a big effect on Google at all with regards to SEO. The only effect where I could imagine that something might happen is what users end up seeing. And kind of what you mentioned, if the majority of your users are already seeing a very fast website because your server is located there, then you're kind of doing the right thing. But of course, if users in other locations are seeing a very slow result because perhaps the connection to your country is not that great, then that's something where you might have some opportunities to improve that. And you could see that as an opportunity in the sense that, of course, if your website is really slow for other users, then it's going to be rare for them to keep coming to your website, because it's really annoying to get there. Whereas, if your website is pretty fast for other users, then at least they have an opportunity to see a reasonably fast website, which could be your website. So from that point of view, if there's something that you can do to improve things globally for your website, I think that's a good idea. I don't think it's critical. It's not something that matters in terms of SEO in that Google has to see it very quickly as well or anything like that. But it is something that you can do to kind of grow your website past just your current country. Maybe one thing I should clarify: if Google's crawling is really, really slow, then, of course, that can affect how much we can crawl and index from the website. So that could be an aspect to look into. In the majority of websites that I've looked at, I haven't really seen this as being a problem for any website that isn't millions and millions of pages large. So from that point of view, you can double-check how fast Google is crawling in the Search Console and the crawl stats. And if that looks reasonable, even if it's not super fast, then I wouldn't really worry about that.

Should I disallow API requests to reduce crawling?

Q: (05:20) Our site is a live stream shopping platform. Our site currently spends about 20% of the crawl budget on the API subdomain and another 20% on image thumbnails of videos. Neither of these subdomains has content which is part of our SEO strategy. Should we disallow these subdomains from crawling, or how are the API endpoints discovered or used?

  • (05:49) So maybe the last question there first. In many cases, API endpoints end up being used by JavaScript on your website, and we will render your pages. And if they access an API that is on your website, then we'll try to load the content from that API and use that for the rendering of the page. And depending on how your API is set up and how your JavaScript is set up, it might be that it's hard for us to cache those API results, which means that maybe we crawl a lot of these API requests to try to get a rendered version of your pages so that we can use those for indexing. So that's usually the place where this is discovered. And that's something you can help with by making sure that the API results can also be cached well, that you don't inject any timestamps into URLs, for example, when you're using JavaScript for the API, all of those things there. If you don't care about the content that's returned with these API endpoints, then, of course, you can block this whole subdomain from being crawled with the robots.txt file. And that will essentially block all of those API requests from happening. So that's something where you first of all need to figure out whether these API results are actually part of the primary content, or important, critical content that you want to have indexed by Google.
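
If the API responses aren't needed for rendering the primary content, blocking the whole subdomain could look roughly like this. The hostname is a placeholder, and the file has to be served on the API host itself:

    # https://api.example.com/robots.txt
    User-agent: *
    Disallow: /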

Q: (08:05) Is it appropriate to use a no-follow attribute on internal links to avoid unnecessary crawler requests to URLs which we don’t wish to be crawled or indexed?

  • (08:18) So obviously, you can do this. It's something where I think, for the most part, it makes very little sense to use nofollow on internal links. But if that's something that you want to do, go for it. In most cases, I would try to do something like using the rel=canonical to point at URLs that you do want to have indexed, or using the robots.txt for things that you really don't want to have crawled. So try to figure out: is it more like a subtle thing, where you have something that you prefer to have indexed, and then use rel=canonical for that? Or is it something where you say, actually, when Googlebot accesses these URLs, it causes problems for my server. It causes a large load. It makes everything really slow. It's expensive, or what have you. And for those cases, I would just disallow the crawling of those URLs. And try to keep it kind of on a basic level there. And with the rel=canonical, obviously, we'll first have to crawl that page to see the rel=canonical. But over time, we will focus on the canonical that you've defined. And we'll use that one primarily for crawling and indexing.
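
For the "subtle preference" case, the rel=canonical is a single link element on the page you would rather not have indexed, pointing at the preferred URL (both URLs here are placeholders):

    <!-- On https://example.com/products?sort=price -->
    <link rel="canonical" href="https://example.com/products" />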

Why don’t site:-query result counts match Search Console counts?

Q: (09:35) Why don’t the search results of a site query, which returns so many giant numbers of results, match what Search Console and the index data have for the same domain?

  • (09:55) Yeah, so this is a question that comes up every now and then. I think we've done a video on it separately as well, so I would check that out. I think we've talked about this for a long time already. Essentially, what happens there is that there are slightly different optimisations that we do for site queries, in the sense that we just want to give you a number as quickly as possible. And that can be a very rough approximation. And that's something where, when you do a site query, that's usually not something that the average user does. So we'll try to give you a result as quickly as possible. And sometimes, that can be off. If you want a more exact number of the URLs that are actually indexed for your website, I would definitely use Search Console. That's really the place where we give you the numbers as directly as possible, as clearly as possible. And those numbers will also fluctuate a little bit over time. They can fluctuate depending on the data centre sometimes. They go up and down a little bit as we crawl new things, and we kind of have to figure out which ones we keep, all of those things. But overall, the number in Search Console, in the indexing report I believe, is really the number of URLs that we have indexed for your website. I would not use the 'about' number in the search results for any diagnostics purposes. It's really meant as a very, very rough approximation.

What’s the difference between JavaScript and HTTP redirects?

Q: (11:25) OK, now a question about redirects again, about the differences between JavaScript redirects versus 301 HTTP status code redirects, and which one I would suggest for short links.

  • (11:43) So, in general, when it comes to redirects, if there’s a server-side redirect where you can give us a result code as quickly as possible, that is strongly preferred. The reason that it is strongly preferred is just that it can be processed immediately. So any request that goes to your server to one of those URLs, we’ll see that redirect URL. We will see the link to the new location. We can follow that right away. Whereas, if you use JavaScript to generate a redirect, then we first have to render the JavaScript and see what the JavaScript does. And then we’ll see, oh, there’s actually a redirect here. And then we’ll go off and follow that. So if at all possible, I would recommend using a server-side redirect for any kind of redirect that you’re doing on your website. If you can’t do a server-side redirect, then sometimes you have to make do. And a JavaScript redirect will also get processed. It just takes a little bit longer. The meta refresh type redirect is another option that you can use. It also takes a little bit longer because we have to figure that out on the page. But server-side redirects are great. And there are different server-side redirect types. So there’s 301 and 302. And I think, what is it, 306? There’s 307 and 308, something along those lines. Essentially, the differences there are whether or not it’s a permanent redirect or a temporary redirect. A permanent redirect tells us that we should focus on the destination page. A temporary redirect tells us we should focus on the current page that is redirecting and kind of keep going back to that one. And the difference between the 301, 302, and the 307, and I forgot what the other one was, is more of a technical difference with regards to the different request types. So if you enter a URL in your browser, then you do what’s called a GET request for that URL, whereas if you send something to a form or use specific types of API requests, then that can be a POST request. And the 301, 302 type redirects would only redirect the normal browser requests and not the forms and the API requests. So if you have an API on your website that uses POST requests, or if you have forms where you suspect someone might be submitting something to a URL that you’re redirecting them, then obviously, you would use the other types. But for the most part, it’s usually 301 or 302.
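
A rough side-by-side of the three options discussed above, with placeholder URLs; only the first one can be processed without fetching and rendering the page body:

    # Server-side redirect (preferred): sent as the HTTP response itself
    HTTP/1.1 301 Moved Permanently
    Location: https://example.com/new-page

    <!-- Meta refresh: the HTML has to be fetched and parsed first -->
    <meta http-equiv="refresh" content="0; url=https://example.com/new-page">

    <!-- JavaScript redirect: the page has to be rendered first -->
    <script>window.location.replace("https://example.com/new-page");</script>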

Should I keep old, obsolete content on my site, or remove it?

Q: (14:25) I have a website for games. After a certain time, a game might shut down. Should we delete non-existing games or keep them in an archive? What's the best option so that we don't get any penalty? We want to keep information about the game, through videos, screenshots, et cetera.

  • (14:42) So essentially, this is totally up to you. It's something where you can remove the content of old things if you want to. You can move them to an archive section. You can make those old pages noindex so that people can still go there when they're visiting your website. There are lots of different variations there. The main thing that you will probably want to do, if you want to keep that content, is moving it into an archive section, as you mentioned. The idea behind an archive section is that it tends to be less directly visible within your website. That means it's easy for users and for us to recognise this is the primary content, like the current games or current content that you have, and over here is an archive section where you can go in, and you can dig for the old things. And the effect there is that it's a lot easier for us to focus on your current live content and to recognise that this archive section, which is kind of separated out, is more something that we can go off and index, but it's not really what you want to be found for. So that's kind of the main thing I would focus on there. And then whether or not you make the archive content noindex after a certain time, or for other reasons, that's totally up to you.

Q: (16:02) Is there any strategy by which desired pages can appear as a site link in Google Search results?

  • (16:08) So site links are the additional results that are sometimes shown below a search result, where it's usually just a one-line link to a different part of the website. And there is no meta tag or structured data that you can use to enforce a site link to be shown. It's a lot more that our systems try to figure out what is actually kind of related or relevant for users when they're looking at this one web page. And for that, our recommendation is essentially to have a good website structure, to have clear internal links so that it's easy for us to recognise which pages are related to those pages, and to have clear titles that we can use and kind of show as a site link. And with that, it's not that there's a guarantee that any of this will be shown like that. But it kind of helps us to figure out what is related. And if we do think it makes sense to show a site link, then it'll be a lot easier for us to actually choose one based on that information.

Our site embeds PDFs with iframes, should we OCR the text?

Q: (17:12) More technical one here. Our website uses iframes and a script to embed PDF files onto our pages and our website. Is there any advantage to taking the OCR text of the PDF and pasting it somewhere into the document’s HTML for SEO purposes, or will Google simply parse the PDF contents with the same weight and relevance to index the content?

  • (17:40) Yeah. So I’m just momentarily thrown off because it sounds like you want to take the text of the PDF and just kind of hide it in the HTML for SEO purposes. And that’s something I would definitely not recommend doing. If you want to have the content indexable, then make it visible on the page. So that’s the first thing there that I would say. With regards to PDFs, we do try to take the text out of the PDFs and index that for the PDFs themselves. From a practical point of view, what happens with a PDF is as one of the first steps, we convert it into an HTML page, and we try to index that like an HTML page. So essentially, what you’re doing is kind of framing an indirect HTML page. And when it comes to iframes, we can take that content into account for indexing within the primary page. But it can also happen that we index the PDF separately anyway. So from that point of view, it’s really hard to say exactly kind of what will happen. I would turn the question around and frame it as what do you want to have to happen? And if you want your normal web pages to be indexed with the content of the PDF file, then make it so that that content is immediately visible on the HTML page. So instead of embedding the PDF as a primary piece of content, make the HTML content the primary piece and link to the PDF file. And then there is a question of do you want those PDFs indexed separately or not? Sometimes you do want to have PDFs indexed separately. And if you do want to have them indexed separately, then linking to them is great. If you don’t want to have them indexed separately, then using robots.txt to block their indexing is also fine. You can also use the no index [? x-robots ?] HTTP header. It’s a little bit more complicated because you have to serve that as a header for the PDF files if you want to have those PDF files available in the iframe but not actually indexed. I don’t know. Timing, we’ll have to figure out how long we make these.

Q: (20:02) We want to mask links to external websites to prevent the passing of our link juice. We think the PRG approach is a possible solution. What do you think? Is the solution overkill, or is there a simpler solution out there?

  • (20:17) So the PRG pattern is a complicated way of essentially making a POST request to the server, which then redirects somewhere else to the external content so Google will never find that link. From my point of view, this is super overkill. There’s absolutely no reason to do this unless there’s really a technical reason that you absolutely need to block the crawling of those URLs. I would either just link to those pages normally or use the rel nofollow to link to those pages. There’s absolutely no reason to go through this weird POST redirect pattern there. It just causes a lot of server overhead. It makes it really hard to cache that request and take users to the right place. So I would just use a nofollow on those links if you don’t want to have them followed. The other thing is, of course, just blocking all of your external links. That rarely makes any sense. Instead, I would make sure that you’re taking part in the web as it is, which means that you link to other sites naturally. They link to you naturally, taking part of the normal part of the web and not trying to keep Googlebot locked into your specific website. Because I don’t think that really makes any sense.
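
The simpler alternative mentioned here is just a qualified link; the URL is a placeholder:

    <a href="https://www.example.org/some-page" rel="nofollow">External resource</a>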

Does it matter which server platform we use, for SEO?

Q: (21:47) For Google, does it matter if our website is powered by WordPress, WooCommerce, Shopify, or any other service? A lot of marketing agencies suggest using specific platforms because it helps with SEO. Is that true?

  • (22:02) That’s absolutely not true. So there is absolutely nothing in our systems, at least as far as I’m aware, that would give any kind of preferential treatment to any specific platform. And with pretty much all of these platforms, you can structure your pages and structure your website however you want. And with that, we will look at the website as we find it there. We will look at the content that you present, the way the content is presented, and the way things are linked internally. And we will process that like any HTML page. As far as I know, our systems don’t even react to the underlying structure of the back end of your website and do anything special with that. So from that point of view, it might be that certain agencies have a lot of experience with one of these platforms, and they can help you to make really good websites with that platform, which is perfectly legitimate and could be a good reason to say I will go with this platform or not. But it’s not the case that any particular platform has an inherent advantage when it comes to SEO. You can, with pretty much all of these platforms, make reasonable websites. They can all appear well in search as well.

Does Google crawl URLs in structured data markup?

Q: (23:24) Does Google crawl URLs located in structured data markup, or does Google just store the data?

  • (23:31) So, for the most part, when we look at HTML pages, if we see something that looks like a link, we might go off and try that URL out as well. That’s something where if we find a URL in JavaScript, we can try to pick that up and try to use it. If we find a link in a text file on a site, we can try to crawl that and use it. But it’s not really a normal link. So it’s something where I would recommend if you want Google to go off and crawl that URL, make sure that there’s a natural HTML link to that URL, with a clear anchor text as well, that you give some information about the destination page. If you don’t want Google to crawl that specific URL, then maybe block it with robots.txt or on that page, use a rel=canonical pointing to your preferred version, anything like that. So those are the directions I would go there. I would not blindly assume that just because it’s in structured data, it will not be found, nor would I blindly assume that just because it’s in structured data, it will be found. It might be found. It might not be found. I would instead focus on what you want to have to happen there. If you want to have it seen as a link, then make it a link. If you don’t want to have it crawled or indexed, then block crawling or indexing. That’s all totally up to you.
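
As a small sketch of that distinction (placeholder URLs): a URL that only appears inside markup may or may not be picked up, whereas a plain HTML link with anchor text is a clear signal:

    <script type="application/ld+json">
    { "@context": "https://schema.org", "@type": "Article",
      "url": "https://example.com/related-article" }
    </script>

    <!-- If you want it treated as a link, make it a normal link -->
    <a href="https://example.com/related-article">Read the related article</a>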

WebMaster Hangout – Live from May 06, 2022

Can I use Web Components for SEO?

Q. (03:04) Is there any problem with using web components for SEO, or is it OK to use them for my website?

  • (03:15) So web components are essentially a way of using predefined components within a website. Usually, they’re integrated with a kind of JavaScript framework. And the general idea there is that as a web designer, as a web developer, you work with these existing components, and you don’t have to reinvent the wheel every time something specific is supposed to happen. For example, if you need a calendar on one of your pages, then you could just use a web component for a calendar, and then you’re done. When it comes to SEO, these web components are implemented using various forms of JavaScript, and we can process pretty much most JavaScript when it comes to Google Search. And while I would like to say just kind of blindly everything will be supported, you can test this, and you should test this. And the best way to test this is in Google Search Console; there’s the Inspect URL tool. And there, you can insert your URL, and you will see what Google will render for that page, the HTML. You can see it on a screenshot essentially, first of all, and then also in the rendered HTML that you can look at as well. And you can double-check what Google is able to pick up from your web components. And if you think that the important information is there, then probably you’re all set. If you think that some of the important information is missing, then you can drill down and try to figure out what is getting stuck there? And we have a lot of documentation on JavaScript websites and web search nowadays. So I would double-check that. Also, Martin Splitt on my team has written a lot of these documents. He’s also findable on Twitter. He sometimes also does office hours like this, where he goes through some of the JavaScript SEO questions. So if you’re a web developer working on web components and you’re trying to make sure that things are being done well, I would also check that.
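
As a minimal, hypothetical example of what that Inspect URL check is verifying: a custom element that writes its markup into the DOM, so the rendered HTML Google sees actually contains the content:

    <script>
      // Hypothetical calendar component, for illustration only
      class SimpleCalendar extends HTMLElement {
        connectedCallback() {
          // This markup should show up in the rendered HTML in Search Console
          this.innerHTML = "<h2>Upcoming dates</h2><ul><li>Example event</li></ul>";
        }
      }
      customElements.define("simple-calendar", SimpleCalendar);
    </script>
    <simple-calendar></simple-calendar>

Whether a particular setup, for example content rendered into shadow DOM, comes through is exactly what the rendered HTML in the Inspect URL tool will show.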

Is it ok to use FAQ schema across different parts of a page?

Q. (05:18) Is it OK to use the FAQ schema to markup questions and answers that appear in different sections of a blog post that aren’t formatted as a traditional FAQ list? For example, a post may have ten headings for different sections. A few of those are questions with answers.

  • (05:35) So I double-checked the official documentation. That’s where I would recommend you go for these kinds of questions as well. And it looks like it’s fine. The important part when it comes to FAQ snippets and structured data, in general, is that the content should be visible on the page. So it should really be the case that both the question and the answer are visible when someone visits that page, not that it’s kind of hidden away in a section of a page. But if the questions and the answers are visible on the page, even if they’re in different places on the page, that’s perfectly fine. The other thing to keep in mind is that, like all structured data, FAQ snippets are not guaranteed to be shown in the search results. Essentially, you make your pages eligible to have these FAQ snippets shown, but it doesn’t guarantee that they will be shown. So you can use the testing tool to make sure that everything is implemented properly. And if the testing tool says that’s OK, then probably you’re on the right track. But you will probably still have to kind of wait and see how Google actually interprets your pages and processes them to see what is actually shown in the search results. And for structured data, I think it’s the case for FAQs, but at least for some of the other types of schema, there are specific reports in Search Console as well that give you information on the structured data that was found and the structured data that was actually shown in the search results, so that you can roughly gauge, is it working the way that you want it to, or is it not working the way that you want it to? And for things like this, I would recommend trying them out and making a test page on your website, kind of seeing how things end up in the search results, double-checking if it’s really what you want to do, and then going off to actually implement it across the rest of your website.
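
A trimmed-down sketch of FAQ markup where the question and answer also appear as visible sections of the post (the text here is shortened just for the example):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "FAQPage",
      "mainEntity": [{
        "@type": "Question",
        "name": "How long does delivery take?",
        "acceptedAnswer": { "@type": "Answer", "text": "Usually 3 to 5 working days." }
      }]
    }
    </script>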

Our site is not user-friendly with JavaScript turned off. Is it a problem?

Q. (10:23) Our website is not very user-friendly if JavaScript is turned off. Most of the images are not loaded, and our flyout menu can't be opened. However, in the Chrome 'Inspect' feature, all menu links are there in the source code. Might our dependence on JavaScript still be a problem for Googlebot?

  • (10:45) From my point of view, like with the first question that we had on web components, I would test it. So probably everything will be OK. And probably, I would assume, if you're using JavaScript reasonably, if you're not doing anything special to block the JavaScript on your pages, it will probably just work. But you're much better off not just believing me but rather using a testing tool to try it out. And the testing tools that we have available are quite well-documented. There are lots of variations on things that we recommend with regards to improving things if you run into problems. So I would double-check our guides on JavaScript and SEO and think about maybe, I don't know, trying things out, making sure that they actually work the way that you want, and then taking that to improve your website overall. And you mentioned user-friendliness with regards to JavaScript. So, from our point of view, the guidance that we have is essentially very technical, in the sense that we need to make sure that Googlebot can see the content from a technical point of view and that it can see the links on your pages from a technical point of view. It doesn't primarily care about user-friendliness. But, of course, your users care about user-friendliness. And that's something where maybe it makes sense to do a little bit more so that your users are really for sure having a good experience on your pages. And this is often something that isn't just a matter of a simple testing tool, but rather something where maybe you have to do a small user study, or kind of interview some users, or at least do a survey on your website to understand where they get stuck and what kind of problems they are facing. Is it because of, I don't know, the fly-out menus you mentioned? Or is it maybe something completely different where they see problems, maybe the text is too small, or they can't click the buttons properly? Those kinds of things don't really align with technical problems, but are more user-side things. If you can improve those, and if you can make your users happier, they'll stick around, they'll come back, and they'll invite more people to visit your website as well.

We use some static HTML pages and some WordPress pages, does that matter?

Q. (13:07) Our static page is built with HTML, and our blog is built with WordPress. The majority of our blog posts are experiencing indexing issues in Google. How do I fix this?

  • (13:21) So I think, first of all, it’s important to know that these are essentially just different platforms. And essentially, with all of these platforms, you’re creating HTML pages. And the background or the backend side of your website that ends up creating these HTML pages that’s something that Googlebot doesn’t really look at. Or at least, that’s something that Googlebot doesn’t really try to evaluate. So if your pages are written in HTML, and you write them in an editor, and you load them on your server, and they serve like that, we can see that they’re HTML pages. If they’re created on the fly on your server based on a database in WordPress or some other kind of platform that you’re using, and then it creates HTML pages, we see those final HTML pages, and we essentially work with those. So if you’re seeing kind of issues with regards to your website overall when it comes to things like crawling, indexing, or ranking, and you can kind of exclude the technical elements there, and  Googlebot is able to actually see the content, then usually what remains is kind of the quality side of things. And that’s something that definitely doesn’t rely on the infrastructure that you use to create these pages, but more it’s about the content that you’re providing there and the overall experience that you’re providing on the website. So if you’re seeing something that, for example, your blog posts are not being picked up by Google or not ranking well at Google, and your static HTML pages are doing fine on Google, then it’s not because they’re static HTML pages that they’re doing well on Google, but rather because Google thinks that these are actually good pieces of content that it should recommend to other users. And on that level, that’s where I would take a look, and not focus so much on the infrastructure, but really focus on the actual content that you’re providing. And when it comes to content, it’s not just the text that’s like the primary part of the page. It’s like everything around the whole website that comes into play. So that’s something where I would really try to take a step back and look at the bigger picture. And if you don’t see kind of from a bigger picture point of view where maybe some quality issues might lie or where you could improve things, I would strongly recommend doing a user study. And for that, maybe invite, I don’t know, a handful of people who aren’t directly associated with your website and have some tasks on your website, and then ask them really tough questions about where they think maybe there are problems on this website, or if they would trust this website or any kind of other question around understanding the quality of the website. And we have a bunch of these questions in some of our blog posts that you can also use for inspiration. It’s not so much that I would say you have to ask the questions that we ask in our blog post. But sometimes, having some inspiration for these kinds of things is useful. In particular, we have a fairly old blog post on one of the early quality updates, and we have a newer blog post; maybe I guess it’s like two years old now. It’s not a super new blog post on core updates. And both of these have a bunch of questions on them that you could ask yourself about your website. But especially if you have a group of users that are willing to give you some input, then that’s something that you could ask them, and really take their answers to heart and think about ways that you can improve your website overall.

I have some pages with rel=canonical tags, some without. Why are they both shown in the search?

Q. (17:10) I have set canonical tags, or canonical URLs, on five pages, but Google is showing other pages as well. Why is it not only showing the URLs which I've set a canonical for?

  • (17:30) So I’m not 100% sure I understand this question correctly. But kind of paraphrasing, it sounds like on five pages of your website, you set a rel=canonical. And there are other pages on your website where you haven’t set a rel=canonical. And Google is showing all of these pages kind of indexed essentially in various ways. And I think the thing to keep in mind is the rel=canonical is a way of specifying which of the pages within a set of duplicate pages you want to have indexed. Or essentially, which address do you want to have used. So, in particular, if you have one page, maybe with the file name in uppercase, and one page with the file name in lowercase, then in some situations, your server might show exactly the same content; technically, they are different addresses. Uppercase and lowercase are slightly different. But from a practical point of view, your server is showing exactly the same thing. And Google, when it looks at that, says, well, it’s not worthwhile to index two addresses with the same content. Instead, I will pick one of these addresses and use it kind of to index that piece of content. And with the rel=canonical, you give Google a signal and tell it, hey, Google, I really want you to use maybe the lowercase version of the address when you’re indexing this content. You might have seen the uppercase version, but I really want you to use the lowercase version. And that’s essentially what the rel=canonical does. It’s not a guarantee that we would use the version that you specify there, but it’s a signal for us. It helps us to figure out all things else being kind of equal; you really prefer this address, so we will try to use that address. And that’s kind of the preference part that comes into play here. And it comes into play when we’ve recognised there are multiple copies of the same piece of content on your website. And for everything else, we will just try to index it to the best of our abilities. And that also means that for the pages where you have a rel=canonical on it, sometimes it will follow that advice that you give us. Sometimes our systems might say, well, actually, I think maybe you have it wrong. You should have used the other address as the primary version. That can happen. It doesn’t mean it will rank differently, or it will be worse off in search. It’s just, well, Google systems are choosing a different one. And for other pages on your website, you might not have a rel=canonical set at all. And for those, we will just try to pick one ourselves. And that’s also perfectly fine. And in all of these cases, the ranking will be fine. The indexing will be fine. It’s really just the address that is shown in the search results that varies. So if you have the canonical set on some pages but not on others, we will still try to index those pages and find the right address to use for those pages when we show them in search. So it’s a good practice to have the rel=canonical on your pages because you’re trying to take control over this vague possibility that maybe a different address will show. But it’s not an absolute necessity to have a rel=canonical tag.

What can we do about spammy backlinks that we don’t like?

Q. (20:56) What can we do if we have thousands of spammy links that are continuously placed as backlinks on malicious domains? They contain spammy keywords and cause 404s on our domain. We see a strong correlation between these spammy links and a penalty that we got after a spam update in 2021. We disavowed all the spammy links, and we reported the domain which is listed as the source of the spam links. What else can we do?

  • (21:25) Yeah. I think this is always super frustrating as a site owner when you look at it and you’re like, someone else is ruining my chances in the search results. But there are two things I think that are important to mention in this particular case. On the one hand, if these links are pointing at pages on your website that are returning 404, so they’re essentially linking to pages that don’t exist, then we don’t take those links into account because there’s nothing to associate them with on your website. Essentially, people are linking to a missing location. And then we’ll say, well, what can we do with this link? We can’t connect it to anything. So we will drop it. So that’s kind of the first part. Like a lot of those are probably already dropped. The second part is you mentioned you disavowed those spammy backlinks. And especially if you mention that these are like from a handful of domains, then you can do that with the domain entry in the disavow backlinks tool. And that essentially takes them out of our system as well. So we will still list them in Search Console, and you might still find them there and kind of be a bit confused about that. But essentially, they don’t have any effect at all. If they’re being disavowed, then we tell our systems that these should be taken into account neither in a positive nor a negative way. So from a practical point of view, both from the 404 sides and from the disavow, probably those links are not doing anything negative to your website. And if you’re seeing kind of significant changes with regards to your website in Search, I would not focus on those links, but rather kind of look further. And that could be within your own website kind of to understand a little bit better what is actually the value that you’re providing there. What can you do to really stand up above all of the other websites with regards to kind of the awesome value that you’re providing users? How can you make that as clear as possible to search engines? That’s kind of the direction I would take there. So not lose too much time on those spammy backlinks. You can just disavow the whole domain that they’re coming from and then move on. There’s absolutely nothing that you need to do there. And especially if they’re already linking to 404 pages, they’re already kind of ignored.

What’s the stand on App Indexing?

Q. (26:51) What’s the latest stand on app indexing? Is the project shut? How to get your app ranked on Google if app indexing is no longer working?

  • (26:58) So app indexing was a very long time ago, I think a part of Search Console and some of the things that we talked about, where Google will be able to crawl and index parts of an app as it would appear, and try to show that in the search results. And I think that migrated a long time ago over to Firebase app indexing. And I double-checked this morning when I saw this question, and Firebase has also migrated to yet another thing. But it has a bunch of links there for kind of follow-up things that you can look at with regards to that. So I would double-check the official documentation there and not kind of listen to me talk about app indexing as much because I don’t really know the details around Android app indexing. The one thing that you can do with regards to any kind of an app, be it a mobile phone, smartphone app like these things, or a desktop app that you install, you can absolutely make a homepage for it. And that’s something that can be shown in Search like anything else. And for a lot of the kinds of smartphone apps, there will also be a page on the Play Store or the App Store. I don’t know what they’re all called. But usually, they’re like landing pages that also exist, which are normal web pages which can also appear in Search. And these things can appear in search when people search for something around your app. Your own website can appear in search when people are searching for something around the app. And especially when it comes to your own website, you can do all of the things that we talk about when it comes to SEO for your own website. So I would not like to say, oh, app indexing is no longer the same as it was 10 years ago. Therefore, I’m losing out. But rather, you have so many opportunities in different ways to be visible in Search. You don’t need to rely on just one particular aspect.

Our CDN blocks Translate. Is that bad for SEO?

Q. (29:08) The bot crawling is causing a real problem on our site. So we have our CDN block unwanted bots. However, this also blocks the Translate This Page feature. So my questions are, one, is it bad for Google SEO if the Translate This Page feature doesn’t work? Does it also mean that Googlebot is blocked? And second, is there a way to get rid of the ‘Translate This Page’ link for all of our users?

  • (29:37) So I think there are different ways or different infrastructures on our side to access your pages. And there’s, on the one hand, Googlebot and the associated infrastructure. And I believe the translate systems are slightly different because they don’t go through robots.txt, but rather they look at the page directly kind of thing. And because of that, it can be that these are blocked in different ways. So, in particular, Googlebot is something you can block on an IP level using a reverse DNS lookup, or you can allow it on an IP level. And the other kinds of elements are slightly different. And if you want to block everything or every bot other than Googlebot, or other than official search engine bots, that’s totally up to you. When it comes to SEO, Google just needs to be able to crawl with Googlebot. And you can test that in Search Console to see does Googlebot have access? And through Search Console, you can get that confirmation that it’s working OK. How it works for the Translate This Page backend systems, I don’t know. But it’s not critical for SEO. And the last question, how can you block that Translate This Page link? There is a “no translate” meta tag that you can use on your pages that essentially tells Chrome and the systems around translation that this page does not need to be translated or shouldn’t be offered as a translation. And with that, I believe you can block the Translate This Page link in the search results as well. And the “no translate” meta tag is documented in our search developer’s documentation. So I would double-check that.
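
The tag referred to here is documented as the notranslate meta tag; on a page it looks like this:

    <meta name="google" content="notranslate">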

WebMaster Hangout – Live from April 29, 2022

Should we make a UK version of our English blog?

Q. (03:15) The company I work at is working on expanding its market in the UK and recently launched a UK subsite. 70% of our US traffic comes from our blog, and we don’t currently have a blog on the UK subsite. Would translating our US blog post into Queen’s English be beneficial for UK exposure?

  • (03:42) I doubt that would make a big difference. So, in particular, when it comes to hreflang, which is the way that you would connect different language or country versions of your pages, we would not be ranking these pages better just because we have a local version. It’s more that if we understand that there is a local version, and we have that local version indexed, and it’s a unique page, then we could swap out that URL at the same position in the search results to the more local version. So it’s not that your site will rank better in the UK if you have a UK version. It’s just that we would potentially show the link to your UK page instead of the current one. For, essentially, informational blog posts, I suspect you don’t really need to do that. And one of the things I would also take into account with internationalisation is that fewer pages are almost always much better than having more pages. So, if you can limit the number of pages that you provide by not doing a UK version of the content that doesn’t necessarily need to have a UK version, then that’s almost always better for your site, and it’s easier for you to maintain.
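
If UK versions do get created, the hreflang annotations used for that URL swapping would look roughly like this, repeated on both versions (URLs are placeholders):

    <link rel="alternate" hreflang="en-us" href="https://example.com/blog/post" />
    <link rel="alternate" hreflang="en-gb" href="https://example.com/uk/blog/post" />
    <link rel="alternate" hreflang="x-default" href="https://example.com/blog/post" />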

How is a language mismatch between the main content & the rest of a page treated?

Q. (05:04) How might Google treat a collection of pages on a site that is in one language per URL structure, for example, example.com/de/blogarticle1, and the titles may be in German, but the descriptions are in English, and the main content is also in English?

  • (05:25) So the question goes on a little bit with more variations of that. In general, we do focus on the primary content on the page, and we try to use that, essentially, for ranking. So, if the primary content is in English, that’s a really strong sign for us that this page is actually in English. That said, if there’s a lot of additional content on the page that is in a different language, then we might either understand this page is in two languages, or we might be a little bit confused if most of the content is in another language and say, well, perhaps this page is actually in German, and there’s just a little bit of English on it as well. So that’s one thing to kind of watch out for. I would really make sure that, if the primary content is in one language, that it’s really a big chunk of primary content, and that it’s kind of useful when people go to that page who are searching in that language. The other thing to watch out for is the titles of the page and things like the primary headings on the page. They should also match the language of the primary content. So if the title of your page is in German, and the primary content of your page is in English, then it’s going to be really, really hard for us to determine what we should show in the search results because we try to match the primary content. That means we would try to figure out what would be a good title for this page. That also means that we would need to completely ignore the titles that you provide. So if you want to have your title shown, make sure that they also match the primary language of the page.

Does an age-interstitial block crawling or indexing?

Q.  (08:20)  If a website requires users to verify their age before showing any content by clicking a button to continue, is it possible that Google would have problems crawling the site? If so, are there any guidelines around how to best handle this?

  • (08:40) So, depending on the way that you configure this, yes, it is possible that there might be issues around crawling the site. In particular, Googlebot does not click any buttons on a page. So it’s not that Google would be able to navigate through an interstitial like that if you have something that is some kind of a legal interstitial. And especially if it’s something that requires verifying an age, then people have to enter something and then click Next. And Googlebot wouldn’t really know what to do with those kinds of form fields. So that means, if this interstitial is blocking the loading of any other content, then probably that would block indexing and crawling as well. A really simple way to test if this is the case is just to try to search for some of the content that’s behind that interstitial. If you can find that content on Google, then that probably means that we were able to actually find that content. From a technical point of view, what you need to watch out for is that Google is able to load the normal content of the page. And, if you want to show an interstitial on top of that, using JavaScript or HTML, that’s perfectly fine. But we need to be able to load the rest of the page as well. So that’s kind of the most important part there. And that also means that if you’re using some kind of a redirect to a temporary URL and then redirecting that to your page, that won’t work. But, if you’re using JavaScript/CSS to kind of display an interstitial on top of your existing content that’s already loaded, then that would work for Google Search. And, from a kind of a policy point of view, that’s fine. That’s not something that we would consider to be cloaking because the content is still being loaded there. And especially if people can get to that content after navigating through that interstitial, that’s perfectly fine.
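
A minimal sketch of the pattern described above, assuming the real content is present in the loaded page and the age gate is only an overlay drawn on top of it:

    <div id="age-gate" style="position:fixed; inset:0; background:#fff; z-index:999;">
      <p>Please confirm that you are over 18.</p>
      <button onclick="document.getElementById('age-gate').remove()">Continue</button>
    </div>

    <main>
      <!-- The normal page content loads here regardless of the overlay,
           so it can be crawled and indexed -->
    </main>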

Is using the Indexing API for a normal website good or bad?

Q. (15:35) Is using the Indexing API good or bad for a normal website?

  • (08:40) So the indexing API is meant for very specific kinds of content, and using it for other kinds of content doesn’t really make sense. That’s similar, I think, I don’t know– using construction vehicles as photos on your website. Sure, you can put it on a medical website, but it doesn’t really make sense. And, if you have a website about building houses, then, sure, put construction vehicles on your website. It’s not that it’s illegal or that it will cause problems if you put construction vehicles on your medical website, but it doesn’t really make sense. It’s not really something that fits there. And that’s similar with the indexing API. It’s really just for very specific use cases. And, for everything else, that’s not what it’s there for.

Does Googlebot read the htaccess file?

Q. (16:31) Does Googlebot read the htaccess file?

  • (16:36) The short answer is no because, usually, a server is configured in a way that we can’t access that file, or nobody can access that file externally. The kind of longer answer is that the htaccess file controls how your server responds to certain requests, assuming you’re using an Apache server, which uses this as a control file. And essentially, if your server is using this file to control how it responds to certain requests, then, of course, Google and anyone will see the effects of that. So it’s not– again, assuming this is a control file on your server, it’s not that Google would read that file and do something with it, but rather Google would see the effects of that file. So, if you need to use this file to control certain behavior on your website, then go for it.
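
For example, with a hypothetical rule like the one below, Google never reads the .htaccess file itself, but it does see the 301 response that the rule produces:

    # .htaccess (Apache), illustrative only
    Redirect 301 /old-page https://www.example.com/new-page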

How does Google Lens affect SEO?

Q. (17:39) How can Multisearch in Google Lens affect SEO?

  • (17:44) So this is something, I think, that is still fairly new. We recently did a blog post about this, and you can do it in Chrome, for example, and on different types of phones. Essentially, what happens is you can take a photo or an image from a website, and you can search using that image. For example, if you find a specific piece of clothing or, I don't know, anything specific, basically, that you would like to find more information on, you can highlight that section of the image and then search for more things that are similar. And, from an SEO point of view, that's not really something that you would do manually to make that work, but rather, if your images are indexed, then we can find your images, and we can highlight them to people when they're searching in this way. So it's not that there's a direct effect on SEO or anything like that. But it's kind of like, if you're doing everything right, if your content is findable in Search, if you have images on your content, and those images are relevant, then we can guide people to those images or to your content using multiple ways.

I’m unsure what to do to make my Blogger or Google Sites pages indexable.

Q. (21:24) I’m unsure what is needed to have my Blogger and Google Sites pages searchable. I assumed Google would crawl its own platforms

  • (21:34) So I think the important part here is that we don’t have any special treatment for anything that is hosted on Google’s systems. And, in that regard, you should treat anything that you host on Blogger or on Google Sites or anywhere else essentially the same as any content that you would host anywhere on the web. And you have to assume that we need to be able to crawl it. We need to be able to, well, first, discover that it actually exists there. We need to be able to crawl it. We need to be able to index it, just like any other piece of content. So just because it’s hosted on Google’s systems doesn’t give it any kind of preferential treatment when it comes to the search results. It’s not that there is a magic way that all of this will suddenly get indexed just because it’s on Google, but rather, we have these platforms. You’re welcome to use them, but they don’t have any kind of special treatment when it comes to Searching. And also, with these platforms, it’s definitely possible to set things up in a way that won’t work as well for Search as they could. So, depending on how you configure things with Blogger and with regards to Google Sites, how you set that up, and which kind of URL patterns that you use, it may be harder than the basic setup. So just because something is on Google’s systems doesn’t mean that there’s a preferential way of that being handled.

Why is a specific URL on my site not crawled?

Q. (25:29) Is there anything like a URL format penalty? I'm facing a weird problem where a particular URL doesn't get crawled. It's the most linked page on the website, and still, not even one internal link is found for this URL, and it looks like Google is simply ignoring this kind of URL. However, if we slightly change the URL, like adding an additional character or word, the URL gets crawled and indexed. The desired URL format is linked within the website, present in the rendered HTML, and submitted in a sitemap as well, but still not crawled or indexed.

  • (26:06) So it’s pretty much impossible to say without looking at some examples. So this is also the kind of thing where I would say, ideally, go to the Help forums and post some of those sample URLs and get some input from other folks there. And the thing to also keep in mind with the Help forums is that the product experts can escalate issues if they find something that looks like something is broken in Google. And sometimes things are broken in Google, so it’s kind of good to have that path to escalate things.

Does adding the location name to the description help ranking?

Q. (26:50) Does adding the location name in the meta description matter to Google in terms of ranking, if the content quality is maintained?

  • (26:59) So the meta description is primarily used as a snippet in the search results page, and that’s not something that we would use for ranking. But, obviously, having a good snippet on a search results page can make it more interesting for people to actually visit your page when they see your page ranking in the search results.
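
A hypothetical example of the kind of snippet text described above; the wording is purely illustrative and does not influence ranking, but it can make the result more appealing to click:

  <meta name="description" content="Emergency plumbing services in Austin, Texas, available 24/7 across the metro area.">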

Which structured data from schema do I add to a medical site?

Q. (27:20) How does schema affect a medical niche’s website? What kind of structured data should be used there?

  • (27:27) So, when it comes to structured data, I would primarily focus on the things that we have documented in our developer documentation and the specific features that are tied to that. So, instead of saying, what kind of structured data should I use for this type of website, I would kind of turn it around and say, what kind of visible attributes do I want to have shown in the search results? And then, from there, look at what the requirements are for those visible attributes, and can I implement the appropriate structured data to fulfill those requirements? So that’s kind of the direction I would head there.
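
As one illustration of this "start from the visible feature" approach: breadcrumbs are a documented search feature, and they can be marked up with structured data as in the hypothetical JSON-LD sketch below (names and URLs are placeholders):

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      { "@type": "ListItem", "position": 1, "name": "Conditions", "item": "https://www.example.com/conditions/" },
      { "@type": "ListItem", "position": 2, "name": "Example Condition", "item": "https://www.example.com/conditions/example-condition/" }
    ]
  }
  </script>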

Does every page need schema or structured data?

Q. (28:06) Does every page need schema or structured data?

  • (28:10) No, definitely not. As I mentioned, use as a guide the question of which visual elements you want to have visible for your page, and then find the right structured data for that. It’s definitely not the case that you need to put structured data on every page.

WebMaster Hangout – Live from April 01, 2022

Why Don’t the Crawl Numbers in Search Console Match My Server Logs?

Q. (00:55) My question is about the crawling of our website. We see different numbers for crawling in Search Console and in our server logs. For instance, our server logs show about three times as many crawls from Google as Search Console does, so Search Console reports roughly a third of that. Could it be that something is wrong, since the numbers are different?

  • (1:29) I think the numbers would always be different, just because of the way that, in Search Console, we report on all crawls that go through the infrastructure of Googlebot. But that also includes other types of requests. So, for example, I think AdsBot also uses the Googlebot infrastructure, those kinds of things. And they have different user agents. So if you look at your server logs and only look at the Googlebot user agent that we use for web search, those numbers will never match what we show in Search Console.
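
One way to compare like with like is to filter the server logs down to the web-search crawler before counting requests. A minimal sketch, assuming an Apache-style access log at a hypothetical path:

  # Count only requests whose user agent contains the web-search Googlebot token;
  # AdsBot and other Google crawlers identify themselves with different user agents.
  grep -c "Googlebot/2.1" /var/log/apache2/access.log

Even after filtering, the totals will not line up exactly with Search Console, for the reasons described above.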

Why Did Discover Traffic Drop?

Q. (03:10) I have a question in mind. We have a website, and for the last three to four months, we have been working on Google Web Stories. It was going very well. Then on the 5th of March, we were getting somewhere around 400 to 500 real-time visitors coming from Google Discover to our Web Stories. But suddenly, we saw an instant drop in our visitors from Google Discover, and Search Console is not even showing any errors. So what could be a possible reason for that?

  • (3:48) I don’t think there needs to be any specific reason for that, because, especially with Discover, it’s something that we would consider to be additional traffic to a website. And it’s something that can change very quickly. And anecdotally, I’ve seen, or I’ve heard, from sites in these Office Hours that sometimes they get a lot of traffic from Discover and then suddenly it goes away. And then it comes back again. So it doesn’t necessarily mean that you’re doing anything technically wrong. It might just be that, in Discover, things have slightly changed, and then suddenly you get more traffic or suddenly you get less traffic. We do have a bunch of policies that apply to content that we show in Google Discover, and I would double-check those, just to make sure that you’re not accidentally touching on any of those areas. That includes things, I don’t know offhand, but I think something along the lines of clickbaity content, for example, those kinds of things. But I would double-check those guidelines to make sure that you’re all in line there. But even if everything is in line with the guidelines, it can be that suddenly you get a lot of traffic from Discover, and then suddenly, you get a lot less traffic.

Can We Still Fix the robots.txt for Redirects?

Q. (14:55) We have had a content publishing website since 2009, and we experienced a bad migration in 2020, where we saw a huge drop in organic traffic. We had a lot of broken links, so we used 301 redirects to point these broken links to the original articles. But in robots.txt, we disallowed these links so that the crawl budget wouldn’t be spent on crawling all of these 404 pages. So the main question is: now that we have fixed all these redirects, and they point to the same articles with the proper names, can we remove these entries from the robots.txt, and how much time does it take for that to actually be considered by Google?

  • (15:53) So if the page is blocked by the robots.txt, we wouldn’t be able to see the redirect. So if you set up a redirect, you would need to remove that block in the robots.txt. With regards to the time that takes, there is no specific time, because we don’t crawl all pages at the same speed. So some pages we may pick up within a few hours, and other pages might take several months to be recrawled. So that’s, I think, kind of tricky. The other thing that I think is worth mentioning here is that, if this is from a migration that is two years back now, then I don’t think you would get much value out of just making those 404 links show content now. I can’t imagine that would be the reason why a website would be getting significantly less traffic. Mostly because, unless these pages are the most important pages of your website, you would have noticed that. But if these are just generic pages on a bigger website, then I can’t imagine that the overall traffic to a website would drop because they were no longer available.
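
To make the mechanics above concrete: while a path is disallowed, Googlebot never requests it, so it can never see the 301. A hypothetical before-and-after robots.txt sketch, with a placeholder path:

  # Before: the redirect on /old-article/ is never seen, because the URL is never crawled
  User-agent: *
  Disallow: /old-article/

  # After: the Disallow line is removed (or narrowed) so the URL can be fetched and the 301 followed
  User-agent: *
  Disallow: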

Some of the Blog Posts Aren’t Indexed, What Can We Do?

Q. (18:54) My question is a crawling question pertaining to ‘discovered, not indexed’. We have run a two-sided marketplace since 2013 that’s fairly well established. We have about 70,000 pages, and about 70% of those are generally indexed. And then there’s kind of this budget that crawls the new pages that get created, and we see movement on those, so that old pages go out, new pages come in. At the same time, we’re also writing blog entries from our editorial team, and to get those to the top of the queue, we always use the request indexing option on those, so they’ll go quicker. We add them to the sitemap as well, but we find that we write them and then we want them to get into Google as quickly as possible. As we’ve kind of been growing over the last year, and we have more content on our site, we’ve seen that this sometimes doesn’t work as well for the new blog entries. And they also sit in this ‘discovered, not indexed’ queue for a longer time. Is there anything we can do with internal links or something? Is it content-based, or do we just have to live with the fact that some of our blogs might not make it into the index?

  • (20:13) Now, I think overall, it’s kind of normal that we don’t index everything on a website. So that can happen to the entries you have on the site and also the blog posts on the site. It’s not tied to a specific kind of content. I think using the Inspect URL tool to submit them to indexing is fine. It definitely doesn’t cause any problems. But I would also try to find ways to make it as clear as possible that you care about those pages. So essentially, internal linking is a good way to do that. To really make sure that, from your home page, you’re saying, here are the five new blog posts, and you link to them directly. So that it’s easy for Googlebot, when we crawl and index your home page, to see, oh, there’s something new, and it’s linked from the home page. So maybe it’s important. Maybe we should go off and look at that.
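
A minimal sketch of the kind of internal linking suggested above: plain, crawlable anchor links from the home page to the newest posts (the URLs are placeholders):

  <!-- On the home page: link new posts directly so Googlebot sees them as soon as the home page is recrawled -->
  <section>
    <h2>Latest from the blog</h2>
    <ul>
      <li><a href="/blog/new-post-one/">New post one</a></li>
      <li><a href="/blog/new-post-two/">New post two</a></li>
    </ul>
  </section>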

Can a Low Page Speed Score Affect the Site’s Ranking?

Q. (27:28) Could low mobile ratings on Google page speed metrics like LCP and FID have affected our website’s ranking after the introduction of the new algorithm last summer? We were around fourth in my city for a keyword like ‘web agency’. Since the introduction of this algorithm, when we go into Google Search Console, we find that these parameters, like LCP and FID for mobile, have a bad rating, like 48, while for desktop it is 90, so that is OK. So could this be the problem?

  • (28:24) Could be. It’s hard to say just based on that. So I think there are maybe two things to watch out for. The number that you gave me sounds like the PageSpeed Insights score that is generated, I think, on desktop and mobile. Kind of that number from 0 to 100, I think. We don’t use that in Search for the rankings. We use the Core Web Vitals, where there is LCP, FID, and CLS, I think. And the metrics that we use are based on what users actually see. So if you go into Search Console, there’s the Core Web Vitals report, and that should show you those numbers – whether they’re within the good or bad ranges.

Can Google Crawl Pagination With “View More” Buttons?

Q. (39:51) I recently redesigned my website and changed the way I list my blog posts and other pages from pages one, two, three, four to a View More button. Can Google still crawl the ones that are not shown on the main blog page? What is the best practice? If not, let’s say those pages are not important when it comes to search and traffic, would the site as a whole be affected when it comes to how relevant it is for the topic for Google?

  • (40:16) So on the one hand, it depends a bit on how you have that implemented. A View More button could be implemented as a button that does something with JavaScript, and those kinds of buttons, we would not be able to crawl through and actually see more content there. On the other hand, you could also implement a View More button essentially as a link to page two of those results, or from page two to page three. And if it’s implemented as a link, we would follow it as a link, even if it doesn’t have a label that says page two on it. So that’s, I think, the first thing to double-check. Is it actually something that can be crawled or not? And if it can’t be crawled, then usually what would happen here is that we would focus primarily on the blog posts that are linked directly from those pages. And it’s something where we probably would keep the old blog posts in our index because we’ve seen them and indexed them at some point. But we will probably focus on the ones that are currently there. One way you can help to mitigate this is if you cross-link your blog posts as well. So sometimes that is done with category pages or these tag pages that people add. Sometimes, blogs have a mechanism for linking to related blog posts, and all of those kinds of mechanisms add more internal linking to a site, and that makes it possible that even if we, initially, just see the first page of the results from your blog, we would still be able to crawl to the rest of your website. And one way you can double-check this is to use a local crawler. There are various third-party crawling tools available. And if you crawl your website and you see that, oh, it only picks up five blog posts, then probably those are the five blog posts that are findable. On the other hand, if it goes through those five blog posts and then finds a bunch more and a bunch more, then you can be pretty sure that Googlebot will be able to crawl the rest of the site as well.
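
The distinction drawn above can be illustrated with hypothetical markup:

  <!-- Not crawlable on its own: the button only fires JavaScript, there is no URL to follow -->
  <button onclick="loadMorePosts()">View more</button>

  <!-- Crawlable: the same control implemented as a real link to the next page of results -->
  <a href="/blog/page/2/">View more</a>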

To What Degree Does Google Follow the robots.txt Directives?

Q. (42:34) To what degree does Google honour the robots.txt? I’m working on a new version of my website that’s currently blocked with a robots.txt file and I intend to use robots.txt to block the indexing of some URLs that are important for usability but not for search engines. So I want to understand if that’s OK.

  • (42:49) That’s perfectly fine. So when we recognise disallow entries in a robots.txt file, we will absolutely follow them. The only kind of situation I’ve seen where that did not work is where we were not able to process the robots.txt file properly. But if we can process the robots.txt file properly, if it’s properly formatted, then we will absolutely stick to that when it comes to crawling. Another caveat here is that, usually, we update the robots.txt files maybe once a day, depending on the website. So if you change your robots.txt file now, it might take a day until it takes effect. With regards to blocking crawling – so you mentioned blocking indexing, but essentially, the robots.txt file blocks crawling. So if you blocked crawling of pages that are important for usability but not for search engines, usually, that’s fine. What would happen, or could happen, is that we would index the URL without the content. So if you do a site query for those specific URLs, you would still see them. But if the content is on your crawlable pages, then for any normal query that people do when they search for a specific term on your pages, we will be able to focus on the pages that are actually indexed and crawled, and show those in the search results. So from that point of view, that’s all fine.
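
A hypothetical robots.txt sketch for that situation, blocking crawling of URLs that matter for usability but not for Search (the paths and parameter are placeholders). As noted above, this blocks crawling, not indexing, so the bare URLs may still appear for site: queries:

  User-agent: *
  # Internal search results and sort variations are useful for visitors but do not need to be crawled
  Disallow: /internal-search/
  Disallow: /*?sort=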

If 40% Of Content Is Affiliate, Will Google Consider Site a Deals Website?

Q. (53:27) So does the portion of content created by a publisher matter? And I mean that in the sense of affiliate, or maybe even sponsored, content. The context is a Digiday newsletter that went out today, which mentioned that publishers were concerned that if you have, let’s say, 40% of your traffic or content as commerce or affiliate, your website will be considered by Google a deals website, and then your authority may be dinged a little bit. Is there such a thing happening in the ranking systems algorithmically?

  • (54:05) I don’t think we would have any threshold like that. Partially, because it’s really hard to determine a threshold like that. You can’t, for example, just take the number of pages and say, this is this type of website because it has 50% pages like that. Because the pages can be visible in very different ways. Sometimes, you have a lot of pages that nobody sees. And it wouldn’t make sense to judge a website based on something that, essentially, doesn’t get shown to users.

WebMaster Hangout – Live from March 25, 2022

Search Console Says I’m Ranking #1, but I Don’t See It.

Q. (00:56) The question is about the average position in Google Search Console. A few days ago, I realised that for over 60 queries it shows that I am in position #1. I still don’t understand how that works, because for some of those keywords, when I search, my site does not appear first in the results. I have seen some of them coming from the Google Business Profile, but some of them are not really appearing. And I don’t really understand why it says that the average position is #1. What does it really mean? Does it sometimes say I am in position #1 and it takes time to appear, or how does it really work?

  • (54:13) The average position is based on what was actually shown to users. It is not a theoretical ranking where you would appear at #1 if you did the right kind of thing. It is really: we showed it to users, and it was ranking at #1 at that time. Sometimes it is tricky to reproduce that, which makes it a little bit hard, because sometimes it was shown to users in a specific location, or to users who searched in a specific way – maybe just on mobile or just on desktop. Sometimes it is something in an image one-box, or a knowledge graph entry, or the local business entry that you mentioned. All of these are also places where there could be a link to your website. And if you see something listed as ranking #1, usually what I recommend doing is trying to narrow that down into what exactly was searched, by using the different filters in Search Console, figuring out which country that was in and what type of query it was – mobile or desktop – and seeing if you can reproduce it that way. Another thing that I sometimes do is look at the graph over time for that specific query. If your keyword shows an average position of #1, but the total impressions and the total clicks don’t seem to make much sense – you think that a lot of people are searching, so you would expect to see a lot of traffic, or at least a lot of impressions – that can also be a sign that you were shown at position #1 in whatever way, but you aren’t always shown that way. That could be something very temporary or something that fluctuates a little bit, anything along those lines. But it is not the case that it is a theoretical position. You are shown in the search results, and when you were shown, you were shown like this.

Why Is My Homepage Ranking Instead of an Article?

Q. (05:55) When I am checking the pages in Search Console, the page that is actually shown is mostly just the home page. But some of the queries the home page is ranking for have their own dedicated pages.

  • (6:21) I guess it is just our algorithms that see your home page as the more important page overall. Sometimes, if you continue to work on your website and make it better and better, it becomes easier for our systems to recognise that there is actually a specific page that is better suited for a particular query. But it is a matter of working on your site – and working on your site shouldn’t just mean generating more and more content; rather, you should actually be creating something better. That could also mean deleting a bunch of content and combining things together. So improving your website isn’t the same as adding more content.

Does Locating a Sitemap in a Folder Affect Indexing?

Q. (07:11) We have our sitemaps in a subfolder of our website. I have noticed recently that a lot more pages are reported as ‘indexed, not submitted in sitemap’. Do you think that might be due to moving the sitemaps into the subfolder? We used to have them in our root, but due to a technology change, we had to move them.

  • (7:30) The location of the sitemaps shouldn’t really matter. It is something you can put in a subdirectory or on a subdomain. It depends on how you submit the sitemap file – for example, if you list it in your robots.txt file, you can put it anywhere; it doesn’t matter.
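
For example, a sitemap that lives in a subfolder can simply be referenced from robots.txt; the URL below is a placeholder:

  # In robots.txt at the root of the site
  Sitemap: https://www.example.com/sitemaps/sitemap-index.xml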

I Have a New Blog, No Links. Should I Submit in Search Console?

Q. (07:59) I am new to blogging and starting a new blog. It has been an amazing experience to start from scratch – a brand new blog with no links to it. Would you recommend submitting URLs as you publish them using Google Search Console and then requesting indexing for new blogs that have no links to them, or is there no point and it is not really helpful?

  • (8:30) It is not that there is any disadvantage to doing that. If it is a new site that we really have absolutely no signals for, no information about it at all, then at least telling us about the URLs is a way of getting the initial foot in the door, but it is not a guarantee that we will pick that up. You probably know someone else who is blogging, and you can work together and maybe get a link to your site. Something along those lines would probably do a lot more than just going to Search Console and saying, ‘I want this URL indexed immediately’.

How Fast Does the Links Report Update?

Q. (09:22) How long does it typically take for Google to add the links for a brand new blog to the Google Search Console ‘Links’ report?

  • (9:53) A lot of the reports in Search Console are recalculated every 3-4 days. So within about a week, you should probably see some data there. The tricky part is that we show a sample of the links to your website, and it doesn’t mean that we immediately include every link that we have found to your site. It is not a matter of months, and it is not a matter of hours as it is in the performance report. Anything from a day up to a week is a reasonable time.

Should the Data in Search Console Align With Analytics?

Q. (10:56) Google and various websites have stated that the data we are getting from Google Analytics and Google Search Console will not match exactly, but that they should make sense directionally. Does this mean that, for organic search, your clicks should always be below the sessions that you get? Is that understanding correct?

  • (11:17) I guess it depends on what you have set up and what you are looking at specifically. If you are looking at it at the site level, that should be roughly correct. If you are looking at it at a per-URL level on a very large site, it could happen that those individual URLs are not tracked in Search Console, and you would see slightly different numbers, or you would see changes over time – some days a URL is tracked, and some days it is not – particularly if you have a very large website. But if you look at the website overall, that should be pretty close. Search Console measures what is shown in the search results – the clicks and impressions from there – and Analytics uses JavaScript to track what is happening on the website side. Those tracking methods are slightly different and probably have slightly different ways of deduplicating things, so I would never expect these two to line up completely. However, overall, they should be pretty close.

Why Are Disallowed Pages Getting Traffic?

Q. (16:58) And my next question is about my robots.txt file. What I’ve done is disallow certain pages. But it is quite possible that Google had indexed those pages in the past, and even though I have blocked them and disallowed crawling, to this date I still see them getting organic sessions. Why is that happening, and how can I fix that? I read there is something called a ‘noindex’ directive – is that the right way to go about it, or how should I pursue this?

  • (17:30) If these are pages that you don’t want to have indexed, then using ‘noindex’ would be better than using the disallow in robots.txt. The ‘noindex’ would be a meta tag on the page, though – a robots meta tag with ‘noindex’. And you would need to allow crawling for that to take effect.
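
A minimal sketch of that setup: the page stays crawlable (no Disallow for it in robots.txt) and carries a robots meta tag in its head:

  <!-- In the <head> of each page that should be dropped from the index -->
  <meta name="robots" content="noindex">

Googlebot has to be able to fetch the page to see the tag, which is why the robots.txt block needs to be lifted first.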

Does Google Use Different Algorithms per Niche?

Q. (23:47) Is it true that Google has different algorithms for the indexing and ranking of different niches? We have two websites of the same type, and we’ve built them with the same process. The only difference is that the two sites are different niches. And currently, one is working while the other one has lost all ranking.

  • (24:07) So I don’t think we have anything specific with regards to different niches. But obviously, different kinds of content are differently critical to our search results. And if you look at something like our quality raters’ guidelines, we talk about things like ‘your money or your life’ sites, where we do kind of work to have a little bit more critical algorithms involved in the crawling, indexing, and ranking of those sites. But it’s not the case that you would say a bicycle shop has completely different algorithms than, I don’t know, a shoe store, for example. They’re essentially both e-commerce type stores. But the thing that you also mentioned in the question is that these are content aggregator sites, and they’re built with the same process. And some of them do work, and some of them don’t. That, to me – and I don’t know your sites – feels a bit like low-effort affiliate sites, where you’re just taking content feeds and publishing them. And that’s the kind of thing where our algorithms tend not to be so invested in making sure that we can crawl and index all of that content. Because essentially, it’s the same content that we’ve already seen elsewhere. So from that point of view, if you think that might apply to your site, I would recommend focusing on making fewer sites and making them significantly better. So that it’s not just aggregating content from other sources, but actually that you’re providing something unique and valuable, in the sense that if we were to not index your website properly, then the people on the internet would really miss a resource that provides them with value. Whereas, if it’s really the case that if we didn’t index your website, people would just go to one of the other affiliate aggregator sites, then there is no real reason for us to focus and invest in crawling and indexing your site. So that’s something where, again, I don’t know your websites. But that’s something that I would look into a little bit more, rather than just, oh, “Google doesn’t like bicycle stores; they like shoe stores instead”.

What Counts as a Click in FAQ Rich Snippets?

Q.  (26:23) Two related questions. What counts as a click for an FAQ rich snippet? Does Google ever add links to FAQ answers, even if there isn’t one included in the text?

  • (26:29) You link to the help centre article on that, which I think is pretty much the definitive source on the clicks and impressions and position counting in Search Console. In general, we count it as a click if it’s really a link to your website and someone clicked on it. And with regards to the rich results, I can’t say for sure that we would never add a link to a rich result that we show in the search results. Sometimes I could imagine that we do. But it’s not the case that we would say, Oh, there’s a rich result on this page. Therefore we’ll count it as a click, even though nobody clicked on it. It’s really, if there’s a link there and people click on it and go to your website, that’s what we would count. And similarly, for impressions, we would count it as an impression if one of those links to your sites were visible in the search results. And it doesn’t matter where it was visible on the page if it’s on the top or the bottom of the search results page. If it’s theoretically visible to someone on that search results page, we’ll count it as an impression.

Why Do Parameter URLs Get Indexed?

Q. (30:31) Why do parameter URLs end up in Google’s index even though we’ve excluded them from crawling with the robots.txt file and with the parameter settings in Search Console. How do we get parameter URLs out of the index again without endangering the canonical URLs?

  • (30:49) So, I think there’s a general assumption here that parameter URLs are bad for a website. And that’s not the case. So it’s definitely not the case that you need to fix the indexed URLs of your website to get rid of all parameter URLs. So from that point of view, I would see this as something where you’re polishing the website a little bit to make it a little bit better. But it’s not something that I would consider to be critical. With regards to the robots.txt file and the parameter handling tool, usually, the parameter handling tool is the place where you could do these things. My feeling is the parameter handling tool is a little bit hard to find and hard for people to use. So personally, I would try to avoid that and instead use the more scalable approach in the robots.txt file. But you’re welcome to use it in Search Console. With the robots.txt file, you essentially prevent the crawling of those URLs. You don’t prevent indexing of those URLs. And that means that if you do something like a site query for those specific URLs, it’s very likely that you’ll still find those URLs in the index, even without the content itself being indexed. And I took a look at the forum thread that you started there, which is great. But there, you also do this fancy site query, where you pull out these specific parameter URLs. And that’s something where, if you’re looking at URLs that you’re blocking by robots.txt, I feel that is a little bit misleading. Because even though you can find them if you look for them, it doesn’t mean that they cause any problems, and it doesn’t mean that there is any kind of issue that a normal user would see in the search results. So just to elaborate a little bit. If there is some kind of term on those pages that you want to be found for, and you have one version of those pages that is indexable and crawlable and another version of the page that is not crawlable, where we just have that URL indexed by itself, then if someone searches for that term, we would pretty much always show the page that we actually have crawled and indexed. And the page that we theoretically also have indexed, because it’s blocked by robots.txt and theoretically it could also have that term in there, that’s something where it wouldn’t really make sense to show that in the search results, because we don’t have as much confirmation that it matches that specific query. So from that point of view, for normal queries, people are not going to see those robots.txt-blocked URLs. And it’s more that if someone searches for that exact URL, or does a specific site query for those parameters, then they could see those pages. If it’s a problem that these pages are findable in the search results, then I would use the URL removal tool for that, if you can. Or you would need to allow crawling and then use a ‘noindex’ robots meta tag to tell us that you don’t want these pages indexed. But again, for the most part, I wouldn’t see that as a problem. It’s not something where you need to fix that with regards to indexing. It’s not that we have a cap on the number of pages that we index for a website. It’s essentially that we’ve seen a link to these, we don’t know what is there, but we’ve indexed that URL in case someone searches specifically for that URL.

Does Out-Of-Stock Affect the Ranking of a Product Page?

Q. (37:41) Let’s say my product page is ranking for a transactional keyword. Would it affect its ranking if the product is out of stock?

  • (37:50) Out of stock – it’s possible, to kind of simplify it like that. I think there are multiple things that come into play when it comes to products themselves, in that they can be shown as a normal search result. They can also be shown as an organic shopping result as well. And if something is out of stock, I believe the organic shopping result might not be shown. Not 100% sure. And when it comes to the normal search results, it can happen that when we see that something is out of stock, we will assume it’s more like a soft 404 error, where we will drop that URL from the search results as well. So theoretically, it could essentially affect the visibility in Search if something goes out of stock. It doesn’t have to be the case. In particular, if you have a lot of information about that product anyway on those pages, then that page can still be quite relevant for people who are searching for a specific product. So it’s not necessarily that something goes out of stock and that page disappears from search. The other thing that is also important to note here is that even if one product goes out of stock, the rest of the site’s rankings are not affected by that. So even if we were to drop that one specific product, because we think it’s more like a soft 404 page, then for people searching for other products on the site, we would still show those normally. It’s not that there would be any kind of negative effect that spills over into the other parts of the site.
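
The answer above is about how Google interprets the page; one optional way to make the stock status explicit, rather than leaving it to be inferred from the page text, is product structured data with an availability value. A hypothetical JSON-LD sketch (the name and price are placeholders):

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example running shoe",
    "offers": {
      "@type": "Offer",
      "price": "89.00",
      "priceCurrency": "EUR",
      "availability": "https://schema.org/OutOfStock"
    }
  }
  </script>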

Could a Banner on My Page Affect Rankings?

Q. (39:30) Could my rankings be influenced by a banner popping up on my page?

  • (39:35) And yes, they could be as well. There are multiple things that come into play with regards to banners. On the one hand, within the Page Experience report, we have that aspect of intrusive interstitials. And if this banner comes across as an intrusive interstitial, then that could negatively affect the site there. The other thing is that often with banners, you have side effects on the cumulative layout shift, how the page renders when it’s loaded, or with regards to the – I forget what the metric is when we show a page, the LCP I think – also from the Core Web Vitals side with regards to that page. So those are different elements that could come into play here. It doesn’t mean it has to be that way. But depending on the type of banner that you’re popping up, it can happen.
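
One common way to keep a banner from hurting cumulative layout shift is to reserve its space before it loads. A minimal CSS sketch, where the class name and height are hypothetical:

  /* Reserve the banner's slot up front so the content below doesn't jump when the banner appears */
  .promo-banner {
    min-height: 80px; /* roughly the rendered height of the banner */
  }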

Do Links on Unindexed Pages Count?

Q. (49:03) What about when you have been linked from some pages, but those pages have not been indexed, even though the mentions or links are already present on those pages? Are those links not counted just because the page is not indexed, or can they still be counted even if the page is not indexed, as long as the link is there?

  • (49:39) Usually, that wouldn’t count. Because for a link, in our systems at least, we always need a source and a destination. And both of those sides need to be canonical indexed URLs. And if we don’t have any source at all in our systems, then essentially, that link disappears because we don’t know what to do with it. So that means if the source page is completely dropped out of our search results, then we don’t really have any link that we can use there. Obviously, of course, if another page were to copy that original source and also show that link, and then we go off and index that other page, then that would be like a link from that other page to your site. But that original link, if that original page is no longer indexed, then that would not count as a normal link.

What Can We Do About Changed Image URLs?

Q. (50:50) My question is on the harvesting of images for the recipe gallery. Because we have finally identified something which I think has affected some other bloggers and it’s really badly affected us, which is that if you have lots and lots of recipes indexed in the recipe gallery, and you change the format of your images, as the metadata is refreshed, you might have 50,000 of the recipes get new metadata. But there is a deferral process for actually getting the new images. And it could be months before those new images have been picked up. And while they’re being harvested, you don’t see anything. But when you do a test on Google Search Console, it does it in real-time and says, yeah, everything’s right because the image is there. So there’s no warning about that. But what it means is you better not make any changes or tweaks to slightly improve the formatting of your image URL. Because if you do, you disappear.

  • (51:39) Probably what is happening there is the general crawling and indexing of images, which is a lot slower than normal web pages. And if you remove one image URL and you add a new one on a page, then it does take a lot of time to be picked up again. And probably that’s what you see there. What we would recommend in a case like that is to redirect your old URLs to new ones, also for the images. So if you do something, like you have an image URL which has the file size attached to the URL, for example, then that URL should redirect to a new one. And in that case, it’s like we can keep the old one in our systems, and we just follow the redirect to the new one.
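
A hypothetical .htaccess sketch of the image redirect described above, for old image URLs that carry a size suffix in the file name (the pattern is a placeholder):

  RewriteEngine On
  # Old image URLs with a size suffix redirect permanently to the new, size-less URLs
  RewriteRule ^images/(.+)-1200x800\.jpg$ /images/$1.jpg [R=301,L]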

Does Crawl Budget Affect Image Re-Indexing?

Q. (54:25) Does crawl budget affect image re-indexing?

  • (54:31) Yeah, I mean, what you can do is make sure that your site is really fast in that regard. And that’s something in the crawl stats report; you should be able to see some of that, where you see the average response time. And I’ve seen sites that have around 50 milliseconds. And other sites that have 600 and 700 milliseconds. And obviously, if it’s faster, it’s easier for us to request a lot of URLs. Because otherwise, we just get bogged down because we send, I don’t know, 50 Google bots to your site at one time. And we’re waiting for responses before we can move forward.
