What Are Soft 404 Errors & How To Fix Them
Every so often, when browsing around the internet, you bump into an “Oops... Page not Found” message.
In SEO terms, that’s a 404 status code: the server is telling you that no page exists at the URL you clicked on, usually because the page that used to be published there has been removed.
However, there are various 404-like errors that are marked as soft 404s.
So, what’s the difference?
Soft 404 Vs. Hard 404
A 404, or ‘hard 404 error’ as it is sometimes called, is a numerical status code notifying users and search engines that a piece of content is no longer available.
In many cases, a piece of content is outdated or no longer valid, such as a very old article or a company service that’s no longer offered.
With the vast amount of information on the web, it's pretty standard for content to get unpublished if it's no longer relevant.
However, SEO professionals must also deal with soft 404 errors. This term applies to URLs that give an OK status but are regarded as missing.
To better understand, let’s explore more details about what a soft 404 error is, why it occurs, and how to fix it.
What Are Soft 404 Errors?
A soft 404 error (or ‘crypto 404’), in contrast with the typical ‘hard 404’, is not an official status code.
It is rather used to describe instances where the server responds with a 200 OK status, in other words, a successful HTTP request, when the actual page or content is missing.
A soft 404 is flagged only by crawlers, and site owners see it through webmaster tools or SEO audit tools.
Unlike hard 404s, where users see a clear “Not Found” message, visitors who hit a soft 404 are simply sent to the homepage, an error page, or another web page that reports success.
So, why worry about it? Well, a soft 404 error can hurt both your indexing and the user experience on your website.
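To make the definition concrete, here is a minimal sketch of how an audit script might flag a likely soft 404: the status is 200, but the body reads like an error page or is nearly empty. The phrase list, the 50-word threshold, and the function names are assumptions for illustration, not rules that any search engine publishes.

```python
import re

# Hypothetical phrases that often show up on error pages; tune for your own site.
ERROR_PHRASES = ("page not found", "no longer available")


def strip_tags(html):
    """Crude tag stripper, good enough for a heuristic check."""
    text = re.sub(r"<(script|style)\b.*?</\1>", " ", html, flags=re.I | re.S)
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_soft_404(status, html, min_words=50):
    """Return True when a 200 response carries error-page or near-empty content."""
    if status != 200:
        return False  # a real 404 or 410 is a hard error, not a soft one
    text = strip_tags(html).lower()
    if any(phrase in text for phrase in ERROR_PHRASES):
        return True
    # The word threshold is an arbitrary assumption standing in for "thin content".
    return len(text.split()) < min_words
```

A check like this will produce false positives (for example, on an article that discusses "page not found" messages), so treat its output as a list of URLs to review rather than a verdict.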
Disadvantages Of Soft 404s
1. Reduce Your Crawl Budget
Soft 404 pages take away the crawl budget from your important pages. Bots spend their allocated resources to crawl and index non-existent or duplicate content. As a result, your “good” URLs may take longer to discover and index.
2. Confuse Visitors
Visitors who expect to reach the URL they clicked on, but land on your home page instead, are left confused. They will likely bounce away from your site and come away with a negative impression of your brand.
3. Confuse Crawlers
Crawlers are also confused. While your page returns a 200 OK response, they actually see 404 elements on it. So, they categorize it as they see fit or simply ignore it.
4. Disappoint Users
Soft 404 pages with poor or thin content can be indexed and appear in front of your future customers.
5. Create Negative Signals
Too many soft 404s and dead links suggest poor website maintenance, which search engines such as Google may read as a negative quality signal.
So, it’s worth taking the time to monitor your website for soft 404 errors and investigate what caused them in the first place.
4 Common Causes Of Soft 404 Errors
1. Poor Server Configuration
A webserver responds with a 200 OK code for a page that doesn’t exist. This often happens because the server is configured to return a 200 response code with the homepage or an error page.
2. Content Is Missing Or Is Perceived As Thin
For pages with no content or very little content and a 200 status code, crawlers perceive them as soft 404s.
Live pages that get soft 404s are generally deemed thin content.
In other cases, thin content is due to poor page rendering, such as when some of your page resources are too big or are blocked from crawlers. As a result, crawlers just perceive a blank page and mark it as a soft 404.
3. Redirects To The Homepage
In the early days of SEO, too many 404 responses were considered bad for rankings. As a result, many publishers still try to avoid 404s by redirecting deleted content to the homepage.
However, today, Google handles 404s as outdated content that no longer needs to be online.
In fact, search engines understand 404s and 410s and treat them accordingly, so returning them for removed content is not bad for SEO.
As information becomes old and irrelevant, it is perfectly OK to delete some pages that no longer make sense.
4. Temporary Crawling Issues
When Google tries to crawl the page, some page resources (CSS, JS) might not load properly. This is perceived as no content and categorized as a soft 404.
There may be other similar reasons why soft 404s occur. No matter what, once spotted, you need to fix them fast so they don’t hurt your SEO.
How To Fix Soft 404s
Check Soft 404s In Google Search Console
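In your Google Search Console account, open the ‘Page indexing’ report (under ‘Indexing’ > ‘Pages’) and look for the ‘Soft 404’ reason. Older versions of Search Console listed these URLs in the ‘Crawl errors’ section under a ‘Soft 404s’ tab.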
Here, you can determine whether a URL returns a 200 response for no apparent reason, whether it should be 301 redirected to another URL or whether it does not exist and should return a 404 or 410.
Change 200 To 404 For Deleted Content
Instead of a misleading 200, have your non-existent pages return a normal 404 response, which clearly explains to users and search engines that the file in question is no longer available.
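As a rough illustration, the sketch below shows a tiny Python WSGI application that sends a real 404 (or a 410 for content that was deliberately removed) while still showing a friendly error page. The `PAGES` and `GONE` collections and their paths are hypothetical stand-ins for however your site stores content; the same idea applies in any framework or server.

```python
# Hypothetical content store: path -> HTML body.
PAGES = {"/": "<h1>Home</h1>", "/services": "<h1>Services</h1>"}

# Hypothetical set of paths that were removed on purpose.
GONE = {"/old-service"}

HTML = [("Content-Type", "text/html; charset=utf-8")]


def app(environ, start_response):
    """Serve known pages with 200 and everything else with an honest error status."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        start_response("200 OK", HTML)
        return [PAGES[path].encode("utf-8")]
    if path in GONE:
        # 410 says the page existed once and was removed for good.
        start_response("410 Gone", HTML)
        return [b"<h1>This page has been removed</h1>"]
    # Friendly page for people, truthful status code for crawlers.
    start_response("404 Not Found", HTML)
    return [b"<h1>Oops... Page not found</h1>"]
```

The key design point is that the error message and the status code travel together: the visitor still sees a helpful page, but the response no longer claims success.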
Enhance Thin Content
For your 200 pages that are still valid but showing a soft 404 error, enhance their content with solid, relevant information, links, and images to signal their value.
Make 301 Redirects
When it makes sense, create permanent 301 redirects to similar pages, ideally as soon as your old page gets unpublished. Just make sure you do not create redirect loops, and retire your redirects once the old URLs are no longer linked or visited.
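One way to catch redirect loops before they go live is to walk your redirect map and see where each old URL ends up. The sketch below assumes the redirects are available as a simple Python dictionary of source path to target path; the function name and the five-hop limit are illustrative choices, not a standard.

```python
def resolve_redirect(redirects, start, max_hops=5):
    """Follow a redirect map from `start` and return the final destination.

    Raises ValueError if the chain loops back on itself or gets too long.
    """
    seen = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            raise ValueError("redirect loop: " + " -> ".join(seen + [current]))
        seen.append(current)
        if len(seen) - 1 > max_hops:
            raise ValueError(f"redirect chain from {start} exceeds {max_hops} hops")
    return current
```

For example, with `{"/old": "/newer", "/newer": "/newest"}` the call `resolve_redirect(redirects, "/old")` ends at `/newest`, which also tells you that `/old` could point straight at `/newest` and skip a hop.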
Configure Your Server Correctly
Configure your server to serve the correct status for every URL.
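A quick way to sanity-check the configuration is to request a URL that cannot possibly exist and see what comes back. The sketch below uses only Python's standard library; the function names are made up for this example, and the random path is just a convenient way to build a URL that should be missing.

```python
import uuid
from urllib.error import HTTPError
from urllib.request import urlopen


def fetch_status(url):
    """Return the final HTTP status for `url` (urlopen follows redirects by default)."""
    try:
        with urlopen(url, timeout=10) as response:
            return response.status
    except HTTPError as err:
        return err.code  # 4xx and 5xx responses arrive as exceptions


def missing_url_status(base_url, fetch=fetch_status):
    """Request a random path that should not exist and return its status code."""
    probe = f"{base_url.rstrip('/')}/{uuid.uuid4().hex}"
    return fetch(probe)


def server_is_honest(base_url, fetch=fetch_status):
    """True when a made-up URL yields 404 or 410 instead of a misleading 200."""
    return missing_url_status(base_url, fetch) in (404, 410)
```

Because redirects are followed, a server that sends missing URLs to the homepage will show up here as a 200, which is exactly the soft 404 pattern you want to catch. The `fetch` parameter lets you swap in a fake for testing without touching the network.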
Merge Duplicates
Analyze your website to find pages with thin or (nearly) duplicate content. Once discovered, merge those almost identical pages with the appropriate 301 redirects.
Check Canonicals
Duplicate content is often caused by technical issues. For example, you may be facing problems related to different versions of the same URL.
So, check if your duplicate URLs are differentiated by:
- trailing or non-trailing slash
- www or non-www
- https or http
- with or without “.html” at the end
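The four checks above can be sketched as a small script that reduces each URL to a comparison key and groups the ones that collapse to the same key. This is an illustrative assumption about how you might audit a URL list, not a description of how any search engine picks canonicals; it also ignores ports and treats every host as if it served HTTPS.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def canonical_key(url):
    """Reduce a URL to a key that ignores scheme, www, trailing slash and .html."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()  # note: any port number is dropped
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    if path.endswith(".html"):
        path = path[: -len(".html")]
    return urlunsplit(("https", host, path or "/", parts.query, ""))


def group_duplicates(urls):
    """Return {key: [urls]} for every key shared by more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[canonical_key(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Each group that comes back is a candidate for a single canonical URL, with the other variants 301 redirected to it.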
Check Crawling & Rendering Drawbacks
When pages have enough content but still get a soft 404, investigate whether your page’s resources are too big or inaccessible to crawlers.
In Google Search Console, you can view a rendered screenshot and the rendered HTML of a page. If the screenshot appears (almost) blank, you are facing a rendering problem.
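One common cause of inaccessible resources is a robots.txt rule that blocks CSS or JavaScript folders. As a rough first check, the sketch below uses Python's standard `urllib.robotparser` to list which resource URLs a given robots.txt would block for a crawler. The function name and the sample rules are assumptions for illustration, and Python's parser does simple prefix matching, so it may not agree with Google on rules that use wildcards.

```python
from urllib.robotparser import RobotFileParser


def blocked_resources(robots_txt, resource_urls, user_agent="Googlebot"):
    """Return the resource URLs that the given robots.txt disallows for `user_agent`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in resource_urls if not parser.can_fetch(user_agent, url)]


# Example with made-up rules and URLs:
sample_robots = "User-agent: *\nDisallow: /assets/js/\n"
sample_resources = [
    "https://example.com/assets/js/app.js",
    "https://example.com/assets/css/site.css",
]
blocked = blocked_resources(sample_robots, sample_resources)
```

If a script or stylesheet that the page needs for its main content lands in the blocked list, that is a plausible explanation for a blank render.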
Remove Auto-Generated Pages
Your CMS may automatically generate some worthless pages. In WordPress, for example, every tag, author, etc., automatically gets its own page, but such pages are often empty.
So, get to know how your CMS works and delete any pages that do not provide any value as soon as they are created.
Mark Pages As Noindex
If you want to keep a soft 404 page, you can mark it as “noindex”. Crawlers will then leave it out of the index, while users can still reach it.
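A noindex instruction can be given either with a robots meta tag in the page's HTML (`<meta name="robots" content="noindex">`) or with an `X-Robots-Tag` response header. As a minimal sketch of the header approach, the WSGI wrapper below adds the header for a chosen set of paths; the wrapper name, the demo application, and the example paths are all hypothetical.

```python
# Hypothetical paths you want to keep live but out of the index.
NOINDEX_PATHS = {"/thank-you", "/tag/misc"}


def with_noindex(app, noindex_paths):
    """Wrap a WSGI app so the listed paths are served with X-Robots-Tag: noindex."""
    def wrapped(environ, start_response):
        def patched_start(status, headers, exc_info=None):
            if environ.get("PATH_INFO") in noindex_paths:
                headers = list(headers) + [("X-Robots-Tag", "noindex")]
            return start_response(status, headers, exc_info)
        return app(environ, patched_start)
    return wrapped


def demo_app(environ, start_response):
    """Stand-in application that answers every request with a short page."""
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [b"<h1>Thanks for signing up</h1>"]


application = with_noindex(demo_app, NOINDEX_PATHS)
```

Keep in mind that crawlers can only obey a noindex instruction they are allowed to fetch, so the page must not also be blocked in robots.txt.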
Conclusion
Monitor your soft 404s and find the actual reason behind each error to prevent them from happening in the future. Plus, always fix soft 404s as soon as you spot them!
At Atropos Digital, we conduct detailed website analyses to find and fix any crawling and indexing errors. Our integrated SEO services help your pages rank high and reach your target audience. Ready to go big? Give us a call.