Hidden 404 Errors with WordPress Plugin Pages

After a couple of hours of digging, I’ve tracked down and fixed a bug I was having with some of our WordPress plugins. I believe a few other people out there are having the same problem. There may be another solution online, but this is one of those issues that is difficult to pare down to a good search query.

Anyway, here is a solution for “404 issues with plugin pages”, or “lynx shows a 404, but the page still loads”, or “Google Webtools says there is a 404, but I can get to the page”, or “setting status to 200:OK still results in 404”, or “I get a 404 in IE, but refreshing the page brings it up”, or “I get random 404 errors in IE”, or “I’m getting an HTTP/1.1 404 Not Found error but the page still loads”.

You may skip ahead to the Final Solution code, but it is probably a good idea to read everything below to make sure that you are indeed having the same issue I had… and that this will actually fix your problem.

The Context

I have some WordPress plugins (Stranger Products, Stranger Events) that generate pages outside of the core WordPress system (i.e. they are not “WordPress pages” in the WP DB; they are web pages generated by our plugin script). To serve these pages, I add a bunch of rules to the .htaccess file to redirect stuff like /products/1/ to a product info page.

Some gallery plugins or other plugins that generate new pages may have a similar setup/issue.

The Problem

While mod_rewrite works fine and the page loads fine, WordPress doesn’t find a WP page or post for the query string and so sends an “HTTP/1.1 404 Not Found” status in the header. Most web browsers will ignore this and show the content that comes after the header. IE, however, will sometimes choke on this status and other times show the page. Funny IE!

Google’s crawler, however, will not crawl that page and will report the 404 in Webmaster Tools. Also, I noticed that the lynx command line browser for Linux would show the 404 error and then load the page.

The big issue here is that Google is not going to crawl our page.
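
To confirm that you are seeing the same thing, it helps to look at the raw status line the server sends rather than at what the browser renders. Here is a minimal sketch, assuming PHP’s built-in get_headers() is allowed to make outbound requests on your server. The function names ssp_parse_status_code and ssp_fetch_status_code are hypothetical helpers, not part of WordPress or of my plugins.

```php
<?php
// Hypothetical helper: pull the numeric status code out of a raw status line
// such as "HTTP/1.1 404 Not Found". Returns null if the line does not look
// like an HTTP status line.
function ssp_parse_status_code($status_line)
{
	if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})\b#', $status_line, $m))
		return (int) $m[1];
	return null;
}

// Hypothetical helper: request a URL and return the status code of the first
// response. get_headers() returns the status line as element 0 of its array.
function ssp_fetch_status_code($url)
{
	$headers = @get_headers($url);
	if ($headers === false || !isset($headers[0]))
		return null;
	return ssp_parse_status_code($headers[0]);
}
```

Calling ssp_fetch_status_code() with the URL of one of your plugin pages (something like /products/1/ on your own domain) should report 404 if you have the problem described here, even though the page body loads in a browser.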

The Fix

I spent a lot of time tracking down where in the WordPress code the 404 status is set. Ideally, there would be a plugin “hook” near this that we could use to prevent the 404 status from reaching the browser.

The function that makes the 404 decision is handle_404(), which can be found in the /wp-includes/classes.php file. Here is the code (for WordPress 2.8.4, similar for previous versions I looked at too):

function handle_404() {
	global $wp_query;
	if ( (0 == count($wp_query->posts)) && !is_404() && !is_search() && ( $this->did_permalink || (!empty($_SERVER['QUERY_STRING']) && (false === strpos($_SERVER['REQUEST_URI'], '?'))) ) ) {
		// Don't 404 for these queries if they matched an object.
		if ( ( is_tag() || is_category() || is_author() ) && $wp_query->get_queried_object() ) {
			if ( !is_404() )
				status_header( 200 );
			return;
		}
		$wp_query->set_404();
		status_header( 404 );
		nocache_headers();
	} elseif ( !is_404() ) {
		status_header( 200 );
	}
}

This would be the ideal place to say, “Hey, don’t 404 this page”, but there is no hook in here. I tried setting the $wp->did_permalink flag to FALSE, which worked sometimes, but at other times WordPress would set it back to TRUE after I reset it. I’m also not sure what else that flag controls, so playing with it might cause bugs elsewhere.

The next place to check is the status_header() function called by handle_404. This function is found in the /wp-includes/functions.php file. Here is the code:

function status_header( $header ) {
	$text = get_status_header_desc( $header );
 
	if ( empty( $text ) )
		return false;
 
	$protocol = $_SERVER["SERVER_PROTOCOL"];
	if ( 'HTTP/1.1' != $protocol && 'HTTP/1.0' != $protocol )
		$protocol = 'HTTP/1.0';
	$status_header = "$protocol $header $text";
	if ( function_exists( 'apply_filters' ) )
		$status_header = apply_filters( 'status_header', $status_header, $header, $text, $protocol );
 
	return @header( $status_header, true, $header );
}

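Tada! This function runs the “status_header” hook/filter before updating the header. So we can create a function in our plugin that checks the status for a 404 and returns false/NULL if we know that there really is a page to load. As far as I can tell, header() then receives an empty string and sends nothing, so the response keeps PHP’s default 200 status. Here’s how I did it.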

The Final Solution

In the PHP code for my plugin pages, I created a global variable called $isapage and set it to true. So at the very top of any page that is giving the 404 errors, add this code:

global $isapage;
$isapage = true;

Now I add the following function and filter to my plugin code:

//this function checks if we have set the $isapage variable, and if so prevents WP from sending a 404
function ssp_status_filter($s)
{
	global $isapage;
	if(!empty($isapage) && strpos($s, "404") !== false)
		return false;	//don't send the 404
	else
		return $s;
}
add_filter('status_header', 'ssp_status_filter');
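
The apply_filters() call in status_header() shown earlier also passes the numeric status code as the second argument, so a variant of the filter can compare that code directly instead of searching the header string. This is only a minimal sketch of that idea. The name ssp_status_filter_by_code is hypothetical, and the function_exists() guard is there just so the snippet can be loaded outside WordPress.

```php
<?php
// Variant of the filter that checks the numeric status code (the second
// argument WordPress passes to 'status_header' filters) instead of the text.
// The name ssp_status_filter_by_code is hypothetical.
function ssp_status_filter_by_code($status_header, $code)
{
	global $isapage;
	if (!empty($isapage) && 404 == $code)
		return false;	// our plugin page exists, so don't send the 404
	return $status_header;
}

// Register only when running inside WordPress: priority 10, 2 accepted arguments.
if (function_exists('add_filter'))
	add_filter('status_header', 'ssp_status_filter_by_code', 10, 2);
```

Use one filter or the other, not both. The last argument to add_filter() tells WordPress to pass two arguments to the callback; without it the function would only receive the header string.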

I hope this helps some people out there.

Feel free to critique this solution. Let me know if I missed something or if there are better ways to do this.

Feel free to post related issues. I may have found solutions to those along the way… or maybe a commenter can help you out.