404 and Soft 404 Errors: If you work on on-page SEO and are trying to fix page status code discrepancies, you need to know the difference between 404 errors and soft 404 errors. Only then can you work out how to fix the SEO issues that each can cause. Every response a server sends to a browser carries a status code in the HTTP response headers. The code is usually invisible to the visitor, although some error pages also print it on the page itself.
Servers use many such response codes to tell the client how the request went. The 404 error is one of the most common: your browser shows it when you try to open a page that the server cannot find. The condition may be temporary or permanent; the code itself does not say which.
Status Codes Starting With 4
In the HTTP status code listings, every code between 400 and 499 (usually referred to as the 4xx response codes) signals a client-side error: the server could not or would not fulfil the request as it was sent. For example,
- Error code 400 implies a ‘Bad Request’
- Error code 401 means ‘Unauthorized’
- Error code 402 means ‘Payment Required’
- Error code 403 means ‘Forbidden’
- Error code 499 means ‘Client Closed Request’ (a non-standard code used by nginx)
Among all these 4xx codes, 404 is the one that means ‘Not Found’: the server has no page at the requested URL. It does not say whether the page is gone for good. A server that knows the removal is permanent can return 410 (‘Gone’) instead.
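As a quick way to look up what a given code means, here is a minimal Python sketch built on the standard library's `http.HTTPStatus` registry. The helper names `describe_status` and `is_client_error` are made up for this illustration.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return the standard reason phrase for a status code, if Python knows it."""
    try:
        return HTTPStatus(code).phrase
    except ValueError:
        # Codes such as 499 are server-specific and not in the standard registry.
        return "Non-standard"


def is_client_error(code: int) -> bool:
    """True for any code in the 4xx range."""
    return 400 <= code <= 499


for code in (400, 401, 402, 403, 404, 499):
    print(code, describe_status(code), is_client_error(code))
```

Running it prints the official phrase for each standard code and flags 499 as non-standard.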
The Soft 404 Error
Strictly speaking, a soft 404 is not an official response code sent to the browser. It is a label that Google attaches to a page while crawling and indexing it. Google’s crawlers have limited resources for each site, so Google tries not to waste crawl time on missing pages that do not need to be indexed.
Some servers are poorly configured, however, and return a 200 response code for missing pages instead of a 404. If the status code, which users never see, says 200 even though the page was not found, the page may get indexed, which wastes Google’s resources.
To tackle this, Google looks at the characteristics of pages that return 200 to judge whether they are really missing pages. The reverse also happens: a page that is not missing can be tagged as a soft 404 because it has very little content or closely resembles other pages on the site. Google’s Panda algorithm also treats duplicate and thin content as negative ranking factors. Fixing these issues therefore helps you avoid soft 404s and improves your search visibility.
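Google does not publish how it classifies soft 404s, so the following Python sketch is only a rough heuristic in the same spirit: flag pages that return 200 but read like an error page or contain very little text. The phrase list, the `MIN_WORDS` threshold, and the function names are all assumptions chosen for illustration.

```python
import re

# Hypothetical phrases that often appear on error pages; tune for your own site.
NOT_FOUND_PHRASES = ("page not found", "no longer available", "nothing found")
MIN_WORDS = 50  # assumed threshold for "thin" content, not a Google figure


def visible_text(html: str) -> str:
    """Crudely strip scripts, styles, and tags so only readable text remains."""
    html = re.sub(r"(?is)<(script|style).*?</\1>", " ", html)
    return re.sub(r"(?s)<[^>]+>", " ", html)


def looks_like_soft_404(status: int, html: str) -> bool:
    """Flag pages that answer 200 but read like an error or a near-empty page."""
    if status != 200:
        return False  # a real error code is not a *soft* 404
    text = visible_text(html).lower()
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        return True
    return len(text.split()) < MIN_WORDS
```

A check like this will produce false positives, so treat its output as a list of pages to review by hand rather than a verdict.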
The Two Major Causes of 404 Errors are:
- A link error that directs users to a non-existent page
- A link to a page that used to exist but has since been removed
If it is a linking error, fix the link to resolve the 404. In a site audit, the hardest part is finding all the broken links, which can be extremely difficult on large, complex websites with thousands of pages. The good news is that crawling tools such as Xenu, Screaming Frog, DeepCrawl, and Botify make the task easier.
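To illustrate what such crawlers do for a single page, here is a small Python sketch that collects the links from a page's HTML and reports the ones that return 404. It is not how any of the named tools work internally. The `get_status` argument is a hypothetical callable that maps a URL to its status code, which keeps the sketch independent of any particular HTTP client.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against the page URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip in-page anchors and non-HTTP schemes.
        if href and not href.startswith(("#", "mailto:", "javascript:")):
            self.links.append(urljoin(self.base_url, href))


def broken_links(page_url, html, get_status):
    """Return the links on a page whose status code is 404."""
    collector = LinkCollector(page_url)
    collector.feed(html)
    return [link for link in collector.links if get_status(link) == 404]
```

A full crawler would repeat this for every page it discovers; the sketch covers only the per-page step.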
When a page that used to exist is gone, your options depend on the situation:
- Restore the page immediately if it was removed by accident
- Set up a 301 redirect to a related page if it was removed intentionally
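The redirect option can be pictured as a simple lookup. In practice you would configure this in your web server or CMS; the Python sketch below only models the decision, and the paths in `REDIRECTS` and the `respond` function are invented examples.

```python
# Hypothetical mapping from removed paths to their closest replacements.
REDIRECTS = {
    "/old-pricing": "/pricing",
    "/blog/2019-guide": "/blog/seo-guide",
}


def respond(path: str, existing_paths: set) -> tuple:
    """Return (status, location) for a requested path."""
    if path in existing_paths:
        return 200, None
    if path in REDIRECTS:
        return 301, REDIRECTS[path]  # permanent redirect to the related page
    return 404, None  # nothing relevant to send the visitor to


pages = {"/pricing", "/blog/seo-guide"}
print(respond("/old-pricing", pages))
print(respond("/missing", pages))
```

Note that a removed page with no related replacement still gets an honest 404 rather than a redirect to an unrelated page.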
Either way, you first need to identify the linking errors on your website. As discussed above, any of the crawling tools can do this. Keep in mind that crawlers may not find orphaned pages, i.e., pages that have no internal links pointing to them. An old page can end up in this state when a redesign drops the internal links to it while external links to it still exist. Double-check for these pages during the audit.
You can also use Google Search Console to find 404 pages. Its reports include links from other websites that point to pages which used to exist on your site but no longer do. A missing page may not stand out in a standard Google Analytics report, however.
Spot Orphan Pages
The most practical way to find these pages in Analytics is to create a report and segment out the pages whose titles mention the 404 Page Not Found error. Another easy way is to create a custom content grouping and assign all the 404 pages to it.
You can also search Google for “site:yourwebsite.com” to see the pages that Google has indexed. Scroll through the results and check whether any of the listed pages return a 404. To do the same at scale, use a tool like WebCEO, which can report indexed pages not only for Google but also for other search engines such as Bing, Yahoo, Naver, Baidu, Yandex, and Seznam.
Each search engine will show you only a subset of your pages. Running the index listing across all of them helps you collect as many pages as possible and check whether any indexed pages return 404. With these tools you can also export the listed pages and run a bulk 404 check. You may have to load the exported URLs into the checking tool as a file, for example an HTML file, to run the check.
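If you prefer to run the bulk check yourself, a minimal Python sketch using only the standard library could look like this. It assumes the exported list is one URL per line; `fetch_status` and `find_404s` are illustrative names, not part of any tool mentioned above.

```python
import urllib.error
import urllib.request


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status code for a URL using a HEAD request."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects, so this is the status of the final URL.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions


def find_404s(urls, get_status=fetch_status):
    """Return the de-duplicated URLs from an exported list that answer 404."""
    unique = sorted({u.strip() for u in urls if u.strip()})
    return [u for u in unique if get_status(u) == 404]
```

Network failures such as DNS errors are deliberately not caught here and will raise, and some servers reject HEAD requests, so a production script would need retries and a GET fallback.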
Fixing Soft 404
Crawling tools that spot 404 error pages will not detect soft 404s, because a soft 404 is not a 404 error. You can still use the crawlers for other checks that hint at a soft 404, such as:
- Thin Content: A good crawling tool can show you the word count of each page and so point out thin content pages. Based on this report, you can identify the URLs and evaluate those pages.
- Duplicate Content: Advanced crawling tools can spot pages with template content. If the tool finds that a page’s main content is the same as that of other pages, review those pages and remove the duplication.
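Both checks can be sketched in a few lines of Python. The example assumes you already have the main text of each page as a string keyed by URL; the 200-word threshold and the names `THIN_WORD_LIMIT` and `audit_pages` are arbitrary choices for illustration, and the hash comparison only catches exact duplicates after whitespace and case normalization, not near-duplicates.

```python
import hashlib
import re
from collections import defaultdict

THIN_WORD_LIMIT = 200  # assumed cut-off for "thin"; pick a value that suits your site


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def audit_pages(pages: dict) -> tuple:
    """Take {url: main_text}; return (thin_urls, duplicate_groups)."""
    thin = [url for url, text in pages.items()
            if len(text.split()) < THIN_WORD_LIMIT]
    by_hash = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    duplicates = [urls for urls in by_hash.values() if len(urls) > 1]
    return thin, duplicates
```

Each group in `duplicate_groups` is a set of URLs sharing the same main text, which makes them candidates for consolidation.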
You can also use Google Search Console to check for crawl errors and to see which pages are listed as soft 404s. Crawling the site yourself first lets you find and correct likely soft 404s before Google detects them. In most cases, resolving a soft 404 is a simple application of common sense. The solution may be as simple as expanding the content of thin pages or replacing duplicate content with fresh, unique content.
Here are a couple of things to consider while fixing soft 404s:
Consolidating the pages: Thin content often arises when the author picks a very narrow page topic and has little to say about it. The solution is to merge several thin content pages into one content-rich page. This resolves the thin content issue and helps avoid duplicate content at the same time.
For example, an e-commerce site selling an item that varies only in size and color may have a different URL for each variant. You can pull all of these into one page that lists the variants. This also makes things easier for your users.
Sort out technical issues causing content duplication: A crawling tool makes it easy to find duplicate content by comparing URLs. The cause may be as simple as www vs. non-www, http vs. https, or URLs with and without index.html. Correcting these technical issues can get rid of the majority of soft 404 errors.
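To see which URLs are merely technical variants of one another, you can reduce each one to a canonical form and group by the result. The Python sketch below assumes the site prefers https and the non-www host; `canonical_url` is a hypothetical helper, and your own site may need the opposite choices.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Map common technical variants of a URL to one canonical form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]  # keep the trailing slash
    if not path:
        path = "/"
    # Assumption: the site prefers https and the non-www host.
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "http://www.example.com/index.html",
    "https://example.com/",
    "http://example.com",
]
print({canonical_url(v) for v in variants})
```

All three variants collapse to a single URL, which is the one the others should redirect to or declare as canonical.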
Final Words:
The last thing to understand is that a soft 404 is never an actual 404 error. That is no reason to sit idle, though, because Google may deindex such pages if you do not fix soft 404s soon after you identify them. To stay on top of this, keep crawling your website regularly with a reliable tool to see both actual and soft 404 errors on your pages. As discussed above, there are many crawling tools, both free and premium, that belong in your SEO arsenal and are worth using from time to time.