7 SEO Crawling Tool Warnings & Errors You Can Safely Ignore

SEO crawlers are essential tools, but it takes an SEO professional's judgment to determine which warnings to act on and which to disregard.

In many instances, what an SEO spider flags as a critical error really does need immediate attention. But sometimes it isn't an issue at all.

This can happen even with the most popular SEO crawling tools, such as Semrush Site Audit, Ahrefs Site Audit, Sitebulb, and Screaming Frog.

How do you tell the difference between an issue you should prioritize and one that doesn't need to be addressed?

Here are some real-life examples of such warnings and errors, along with explanations of why they may appear on your site.

1. Indexability Issues (Noindex Pages on the Site)

Every SEO crawler will detect and alert you to non-indexable pages on your site. Depending on the crawler, noindexed pages may be flagged as errors, warnings, or even insights.

This is how Ahrefs Site Audit identifies the issue:

The Google Search Console Coverage report may also flag non-indexable pages as errors (if they appear in the submitted sitemap) or as excluded, even when they aren't actually problems.

It simply tells you why a given page cannot be indexed.

Here’s how it appears in GSC:

The presence of a "noindex" tag does not mean there is a mistake. It simply means the page will not be indexed by Google or other search engines.

The "noindex" tag is just one of two possible indexing directives; the other is to index the page.

Every website is likely to contain URLs that should not be indexed by Google.

These could include, for instance, tag pages (and sometimes category pages too), login pages, password reset pages, or thank-you pages.

Your role as an SEO professional is to review the non-indexable pages and determine whether they are intentionally excluded from indexing, or whether the "noindex" tag was added by accident.
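
If you want to spot-check a flagged URL yourself, here is a minimal sketch, assuming Python with the requests and BeautifulSoup libraries (the URL is just a placeholder), that looks for a "noindex" directive in the meta robots tag or the X-Robots-Tag header:

```python
# A minimal sketch for spot-checking whether a URL is non-indexable,
# either via the <meta name="robots"> tag or the X-Robots-Tag header.
# This is an illustration, not how any particular crawler works.
import requests
from bs4 import BeautifulSoup

def is_noindex(url: str) -> bool:
    response = requests.get(url, timeout=10)

    # Check the X-Robots-Tag HTTP header, e.g. "noindex, nofollow".
    if "noindex" in response.headers.get("X-Robots-Tag", "").lower():
        return True

    # Check the <meta name="robots" content="noindex"> tag in the HTML.
    soup = BeautifulSoup(response.text, "html.parser")
    meta = soup.find("meta", attrs={"name": "robots"})
    return bool(meta and "noindex" in (meta.get("content") or "").lower())

# Placeholder URL - replace with the page flagged by your crawler.
print(is_noindex("https://example.com/thank-you"))
```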

2. Meta Description Too Short or Empty

SEO crawlers also examine a website's meta elements, including meta descriptions. If a page has no meta description, or the description is too short (usually under 110 characters), the crawler will flag it.

Here’s what it looks like in Ahrefs:

Here's how Screaming Frog shows it:

Depending on the size of your site, it isn't always possible or feasible to write unique meta descriptions for every page. They may not even be needed.

A good example of a site where this might not make sense is a massive e-commerce website with millions of URLs.

The bigger the site, the less significant this element becomes.

The meta description tag, unlike the content of the title tag, is not taken into account by Google and is not a ranking factor.

Google may use the meta description in search snippets, but it often rewrites them.

Here's what Google says about this in its Advanced SEO documentation:

“Snippets are generated automatically from content on the page. They are designed to draw attention to and show the page content most relevant to the user’s query. This means that pages may display various snippets of information for different types of searches.”

The most important thing to remember as an SEO is that every site is different. Use your common sense to decide whether meta descriptions matter for that particular site, or whether you can disregard the warning.
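
If you'd like to replicate the check on a handful of URLs before deciding, here is a minimal sketch, again assuming Python with requests and BeautifulSoup, using the ~110-character threshold mentioned above (the URL is a placeholder):

```python
# A minimal sketch of the meta description length check, using the
# ~110-character threshold mentioned above. Not any tool's exact rule.
import requests
from bs4 import BeautifulSoup

MIN_LENGTH = 110

def check_meta_description(url: str) -> str:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    meta = soup.find("meta", attrs={"name": "description"})
    content = (meta.get("content") or "").strip() if meta else ""

    if not content:
        return "missing"
    if len(content) < MIN_LENGTH:
        return f"too short ({len(content)} characters)"
    return "ok"

# Placeholder URL.
print(check_meta_description("https://example.com/"))
```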

3. Meta Keywords Not Found

Meta keywords were used more than 20 years ago to tell search engines like AltaVista which key phrases a given URL wanted to rank for.

The tag was heavily abused, and meta keywords became a kind of "spam magnet," so most search engines stopped supporting the feature.

Screaming Frog automatically checks whether meta keywords are present on the site.

Since this is an obsolete SEO element, 99% of websites no longer use meta keywords.

Here's how it appears in Screaming Frog:

Professionals or clients new to SEO might assume that when a crawler flags an element as missing, that element should be added to the site. That is not the case here!

If there are no meta keywords on the site you're reviewing, don't recommend adding them.

4. Images Over 100 KB

Images on a website should be optimized and compressed so that, for example, a huge 10 MB PNG logo doesn't have to load on every page.

However, not every image can be compressed to under 100 KB.

Screaming Frog will list and warn you about images larger than 100 KB. Here's what it looks like in the tool:

The fact that a website contains images larger than 100 KB doesn't mean it has image optimization problems or that it is very slow.

If you see this error, check the website's overall performance and speed using Google's PageSpeed Insights and the Core Web Vitals report in Google Search Console.

If the site performs well and generally passes the Core Web Vitals assessment, there is usually no need to compress the images further.

Tip: You can sort the images in the Screaming Frog report by size, from heaviest to lightest, to spot unusually large images on specific pages.
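
To get a rough picture of heavy images on a single page without a full crawl, here is a minimal sketch, assuming Python with requests and BeautifulSoup; it relies on the Content-Length header, which not every server returns, and the URL is a placeholder:

```python
# A minimal sketch that lists images over 100 KB on a single page,
# based on the Content-Length header (not every server returns it).
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

SIZE_LIMIT = 100 * 1024  # 100 KB, the threshold discussed above

def heavy_images(page_url: str) -> list:
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for img in soup.find_all("img", src=True):
        img_url = urljoin(page_url, img["src"])
        head = requests.head(img_url, timeout=10, allow_redirects=True)
        size = int(head.headers.get("Content-Length", 0))
        if size > SIZE_LIMIT:
            results.append((img_url, size))
    # Heaviest first, like the sorting tip above.
    return sorted(results, key=lambda item: item[1], reverse=True)

# Placeholder URL.
for image_url, size in heavy_images("https://example.com/"):
    print(f"{size / 1024:.0f} KB  {image_url}")
```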

5. Low Content or Low Word Count Pages

Depending on its settings, most SEO auditing software will flag pages with fewer than 50-100 words as low content pages.

This is how the issue appears in Ahrefs:

Screaming Frog, on the other hand, considers pages with fewer than 200 words to be low content by default (you can change this setting when configuring your crawl).

Here's how Screaming Frog reports it:

The fact that a page contains only a few words does not mean there is a problem or an error.

Many types of pages are meant to have a low word count, such as login pages, password reset pages, tag pages, or even a contact page.

The crawler will mark these pages as low content, but that alone will not prevent the site from ranking well on Google.

What the tool is really trying to communicate is that if you want a page to rank high on Google and generate a lot of organic traffic, that page may need to be thorough and detailed.

That often means a higher word count. However, there are many types of search intent, and in-depth content isn't always what people are looking for.

When reviewing low word count pages flagged by the crawler, always consider whether those pages are actually meant to be rich in content. In many cases, they don't need to be.
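
Here is a minimal sketch of a word count check, assuming Python with requests and BeautifulSoup and using the thresholds mentioned above; it is an approximation, not any tool's exact logic:

```python
# A minimal sketch of a low word count check. The thresholds come from
# the article (50-100 words in some tools, 200 in Screaming Frog by
# default); the counting itself is an approximation of visible text.
import requests
from bs4 import BeautifulSoup

LOW_CONTENT_THRESHOLD = 200

def word_count(url: str) -> int:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    # Drop script/style blocks so only visible text is counted.
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    return len(soup.get_text(separator=" ").split())

count = word_count("https://example.com/contact")  # placeholder URL
status = "flagged as low content" if count < LOW_CONTENT_THRESHOLD else "ok"
print(f"{count} words - {status}")
```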

6. Low Text-to-HTML Ratio

Semrush Site Audit will also notify you about pages with a low text-to-HTML ratio.

This is how Semrush describes it:

This warning is meant to alert you to:

  • Pages with low word count.
  • Pages that may be built in a complex way and have a large HTML source file.

This warning is often misinterpreted by people who aren't experienced SEO specialists. It may take a skilled technical SEO professional to decide whether it's something to worry about.

Several factors affect the text-to-HTML ratio, and a low ratio isn't always a problem. There is no such thing as an ideal text-to-HTML ratio.

As an SEO professional, focus instead on making sure the website's speed and performance are in the best possible shape.
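
For reference, here is a minimal sketch of how a text-to-HTML ratio can be estimated, as visible text length divided by the size of the HTML source; this is a rough approximation assuming Python with requests and BeautifulSoup, not Semrush's exact formula:

```python
# A minimal sketch estimating a text-to-HTML ratio as visible text
# length divided by HTML source length. A rough approximation, not
# Semrush's exact formula.
import requests
from bs4 import BeautifulSoup

def text_to_html_ratio(url: str) -> float:
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup(["script", "style", "noscript"]):
        tag.decompose()
    visible_text = soup.get_text(separator=" ", strip=True)
    return len(visible_text) / len(html) if html else 0.0

ratio = text_to_html_ratio("https://example.com/")  # placeholder URL
print(f"Text-to-HTML ratio: {ratio:.1%}")
```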

7. XML Sitemap Not Indicated in robots.txt

Aside from holding crawler directives, robots.txt is also the file where you can point to your XML sitemap's URL so that Google can find, crawl, and index your pages quickly.

SEO crawlers like Semrush Site Audit can inform you when the XML sitemap isn’t listed in robots.txt.

This is how Semrush describes it:

On the surface, it looks like a serious issue, but in most cases it isn't, because:

  • Google generally has no issues crawling and indexing smaller sites (under 10,000 pages).
  • Google will have no issues crawling and indexing large websites if they have a solid internal linking structure.
  • An XML sitemap doesn't have to be indicated in robots.txt if it has been submitted correctly in Google Search Console.
  • An XML sitemap doesn't have to be indicated in robots.txt if it sits at the default location, i.e., /sitemap.xml (in most cases).

Before marking this as a high-priority issue in an SEO audit, check whether any of the points above apply to the website you're reviewing.
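
A quick way to verify this yourself is sketched below, assuming Python with the requests library; it looks for a Sitemap: directive in robots.txt and probes the default /sitemap.xml location (the domain is a placeholder):

```python
# A minimal sketch that checks whether robots.txt contains a Sitemap:
# directive and whether a sitemap exists at the default /sitemap.xml
# location. The domain is a placeholder.
import requests

def check_sitemap_hints(domain: str) -> None:
    robots_txt = requests.get(f"https://{domain}/robots.txt", timeout=10).text

    sitemap_lines = [
        line.strip()
        for line in robots_txt.splitlines()
        if line.strip().lower().startswith("sitemap:")
    ]
    if sitemap_lines:
        print("Sitemap indicated in robots.txt:")
        for line in sitemap_lines:
            print(" ", line)
    else:
        print("No Sitemap: directive found in robots.txt")

    # Probe the default location mentioned above.
    response = requests.head(f"https://{domain}/sitemap.xml",
                             timeout=10, allow_redirects=True)
    print("Default /sitemap.xml status:", response.status_code)

check_sitemap_hints("example.com")
```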

Bonus: The Tool Reports a Critical Error Relating to a Handful of Unimportant URLs

Even when a tool reports a real problem, such as a page returning a 404 status, it may not be a big deal if only one of the site's millions of pages returns 404, or if nothing links to that 404 page.

When assessing the issues detected by the crawler, always check how many URLs they relate to, and which ones.

It is vital to put the error in context.

Sitebulb, for example, displays the proportion of URLs a given error relates to.

Here's an example of an internal URL redirecting to a broken URL that returns 4XX or 5XX, as reported by Sitebulb:

It looks like a severe problem, but it affects only one unimportant URL, so it's not a priority concern.