After implementing a filter system into a post search with post meta fields, I stumbled upon a major performance bottleneck. The more filters were set, the slower the query was ā€“Ā exponentially.

The problem

Originally, Iā€™ve added the active filters via the pre_get_posts filter. For each active filter, Iā€™ve added an additional meta query to the WP_Query object. So far, so common. And while it worked in a reasonable amount of time for one filter, adding a second one already took multiple seconds to run. Adding a third one results in multiple minutes and in a local test adding a fourth filter even didnā€™t end in a result after multiple hours. There was something in the query that took exponentially longer for each meta field. If you donā€™t know, WordPress performs a separate INNER JOIN for each meta field, which looks like this:

SELECT SQL_CALC_FOUND_ROWS  wp_posts.ID
FROM wp_posts
INNER JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
INNER JOIN wp_postmeta AS mt1 ON ( wp_posts.ID = mt1.post_id )
INNER JOIN wp_postmeta AS mt2 ON ( wp_posts.ID = mt2.post_id )
INNER JOIN wp_postmeta AS mt3 ON ( wp_posts.ID = mt3.post_id )
INNER JOIN wp_postmeta AS mt4 ON ( wp_posts.ID = mt4.post_id )
WHERE 1=1
AND (
	(
		(
			( wp_postmeta.meta_key = 'filter_1' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
			OR ( wp_postmeta.meta_key LIKE 'filter_1_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
		)
	)
	OR (
		(
			( mt1.meta_key = 'filter_2' AND CAST(mt1.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
			OR ( mt1.meta_key LIKE 'filter_2_%_name' AND CAST(mt1.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
		)
	)
	OR (
		(
			( mt2.meta_key = 'filter_3' AND CAST(mt2.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
			OR ( mt2.meta_key LIKE 'filter_3_%_name' AND CAST(mt2.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
		)
	)
	OR (
		(
			( mt3.meta_key = 'filter_4' AND CAST(mt3.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
			OR ( mt3.meta_key LIKE 'filter_4_%_name' AND CAST(mt3.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
		)
	)
	OR (
		(
			( mt4.meta_key = 'mast_hubhoehe_range' AND CAST(mt4.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
			OR ( mt4.meta_key LIKE 'mast_hubhoehe_range_%_name' AND CAST(mt4.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
		)
	)
)
AND ((wp_posts.post_type = 'product' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'acf-disabled' OR wp_posts.post_status = 'product_homepage' OR wp_posts.post_status = 'private')))
GROUP BY wp_posts.ID
ORDER BY wp_posts.post_date DESC
LIMIT 0, 10;Code language: SQL (Structured Query Language) (sql)

The first approach was checking an EXPLAIN SELECT, which didnā€™t really seem to show any problem with the query:

+----+-------------+-------------+------------+-------+------------------------------------------------------------+------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
| id | select_type | table       | partitions | type  | possible_keys                                              | key              | key_len | ref                   | rows | filtered | Extra                                                     |
+----+-------------+-------------+------------+-------+------------------------------------------------------------+------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
|  1 | SIMPLE      | wp_posts    | NULL       | range | PRIMARY,post_name,type_status_date,post_parent,post_author | type_status_date | 164     | NULL                  |    6 |   100.00 | Using where; Using index; Using temporary; Using filesort |
|  1 | SIMPLE      | wp_postmeta | NULL       | ref   | post_id,meta_key                                           | post_id          | 8       | wp_wd50_5.wp_posts.ID |   16 |   100.00 | NULL                                                      |
|  1 | SIMPLE      | mt1         | NULL       | ref   | post_id,meta_key                                           | post_id          | 8       | wp_wd50_5.wp_posts.ID |   16 |   100.00 | NULL                                                      |
|  1 | SIMPLE      | mt4         | NULL       | ref   | post_id,meta_key                                           | post_id          | 8       | wp_wd50_5.wp_posts.ID |   16 |   100.00 | NULL                                                      |
|  1 | SIMPLE      | mt2         | NULL       | ref   | post_id,meta_key                                           | post_id          | 8       | wp_wd50_5.wp_posts.ID |   16 |   100.00 | NULL                                                      |
|  1 | SIMPLE      | mt3         | NULL       | ref   | post_id,meta_key                                           | post_id          | 8       | wp_wd50_5.wp_posts.ID |   16 |   100.00 | Using where                                               |
+----+-------------+-------------+------------+-------+------------------------------------------------------------+------------------+---------+-----------------------+------+----------+-----------------------------------------------------------+
6 rows in set, 2 warnings (0.01 sec)
Code language: PHP (php)

Rewriting to use a single JOIN

My first idea was to use a single INNER JOIN for all meta fields. This seemed to work for this case where all post meta fields are handled via OR. For AND, there must be a single JOIN for every post meta field. Luckily (in this case), the query execution time was only so long when the post meta fields were selected via OR.

So, I used the posts_clauses filter to adjust the query accordingly. This filter allows you to manipulate the actual SQL query before its execution. The filter content can look like this:

array(7) {
  ["where"]=>
  string(1993) " AND (
  (
    (
      ( wp_postmeta.meta_key = 'filter_1' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
      OR ( wp_postmeta.meta_key LIKE 'filter_1_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
    )
  )
  OR (
    (
      ( mt1.meta_key = 'filter_2' AND CAST(mt1.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
      OR ( mt1.meta_key LIKE 'filter_2_%_name' AND CAST(mt1.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
    )
  )
  OR (
    (
      ( mt2.meta_key = 'filter_3' AND CAST(mt2.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
      OR ( mt2.meta_key LIKE 'filter_3_%_name' AND CAST(mt2.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
    )
  )
  OR (
    (
      ( mt3.meta_key = 'filter_4' AND CAST(mt3.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
      OR ( mt3.meta_key LIKE 'filter_4_%_name' AND CAST(mt3.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
    )
  )
  OR (
    (
      ( mt4.meta_key = 'mast_hubhoehe_range' AND CAST(mt4.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
      OR ( mt4.meta_key LIKE 'mast_hubhoehe_range_%_name' AND CAST(mt4.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
    )
  )
) AND ((wp_posts.post_type = 'product' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'acf-disabled' OR wp_posts.post_status = 'internal_area' OR wp_posts.post_status = 'product_homepage' 
OR wp_posts.post_status = 'private')))"
  ["groupby"]=>
  string(13) "wp_posts.ID"
  ["join"]=>
  string(206) " INNER JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )  INNER JOIN wp_postmeta AS mt1 ON ( wp_posts.ID = mt1.post_id )  INNER JOIN wp_postmeta AS mt2 ON ( wp_posts.ID = mt2.post_id )"
  ["orderby"]=>
  string(25) "wp_posts.post_date DESC"
  ["distinct"]=>
  string(0) ""
  ["fields"]=>
  string(12) "wp_posts.*"
  ["limits"]=>
  string(11) "LIMIT 0, 10"
}
Code language: SQL (Structured Query Language) (sql)

What I needed to do here, was removing all subsequent joins. First, replace all mtX identifiers for the table names with the actual post meta table in the where key:

$clauses['where'] = (string) \preg_replace( '/mt\d+\.meta_/', $query->meta_query->meta_table . '.meta_', $clauses['where'] );Code language: PHP (php)

Here, mt1 becomes wp_postmeta, as well as mt2, mt3 and mt4.

Then, check where the first mt1 appears in the join array key. As we donā€™t need these additional joins, use only the substring ahead of them.

$second_inner_join = ' INNER JOIN ' . $query->meta_query->meta_table . ' AS mt1';
$second_inner_join_position = \strpos( $clauses['join'], $second_inner_join );

if ( $second_inner_join_position !== false ) {
	$clauses['join'] = \substr( $clauses['join'], 0, $second_inner_join_position );
}
Code language: PHP (php)

The returned value now looks like this:

array(7) {
  ["where"]=>
  string(1993) " AND (
  (
    (
      ( wp_postmeta.meta_key = 'filter_1' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
      OR ( wp_postmeta.meta_key LIKE 'filter_1_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '5000' AND '10000' )
    )
  )
  OR (
    (
      ( wp_postmeta.meta_key = 'filter_2' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
      OR ( wp_postmeta.meta_key LIKE 'filter_2_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '1980' AND '2016' )
    )
  )
  OR (
    (
      ( wp_postmeta.meta_key = 'filter_3' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
      OR ( wp_postmeta.meta_key LIKE 'filter_3_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '49' AND '50' )
    )
  )
  OR (
    (
      ( wp_postmeta.meta_key = 'filter_4' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
      OR ( wp_postmeta.meta_key LIKE 'filter_4_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
    )
  )
  OR (
    (
      ( wp_postmeta.meta_key = 'mast_hubhoehe_range' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
      OR ( wp_postmeta.meta_key LIKE 'mast_hubhoehe_range_%_name' AND CAST(wp_postmeta.meta_value AS DECIMAL) BETWEEN '500' AND '1000' )
    )
  )
) AND ((wp_posts.post_type = 'product' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'acf-disabled' OR wp_posts.post_status = 'internal_area' OR wp_posts.post_status = 'product_homepage' 
OR wp_posts.post_status = 'private')))"
  ["groupby"]=>
  string(13) "wp_posts.ID"
  ["join"]=>
  string(206) " INNER JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )"
  ["orderby"]=>
  string(25) "wp_posts.post_date DESC"
  ["distinct"]=>
  string(0) ""
  ["fields"]=>
  string(12) "wp_posts.*"
  ["limits"]=>
  string(11) "LIMIT 0, 10"
}
Code language: SQL (Structured Query Language) (sql)

As you might already noticed, I use this filter for a custom post type product. Thus, I first check in my hook callback whether Iā€™m in the main query and whether this query does belong to my post type:

if (
	(
		! $query->is_main_query()
		|| $query->query_vars['post_type'] !== 'product'
		|| ! $query->is_archive()
	)
	&& ! \has_action( 'my_plugin_enforce_filter' )
) {
	return $clauses;
}
Code language: PHP (php)

Additionally, I also check whether a specific action has been called with add_action to be able to short-circuit this function.

AND relation

As mentioned above, if the AND relation of meta fields ist used, the multiple joins are necessary. Thus, if enabled (via a global option in my case), I add them again:

if ( \get_option( 'my_plugin_filter_relation', 'OR' ) === 'AND' ) {
	/** @var \wpdb $wpdb */
	global $wpdb;
	$join_count = 0;
	
	foreach ( $query->meta_query as $key_query ) {
		if ( ! \is_array( $key_query ) ) {
			continue;
		}
		
		foreach ( $key_query as $meta_query ) {
			if ( ! \is_array( $meta_query ) ) {
				continue;
			}
			
			++$join_count;
			$clauses['join'] .= \sprintf(
				' INNER JOIN %1$s AS mt%2$d ON (%3$s.ID = mt%2$d.post_id AND (mt%2$d.meta_key = \'%4$s\' OR mt%2$d.meta_key LIKE \'%5$s\') AND %6$s %7$s %8$s) ',
				$wpdb->postmeta,
				$join_count,
				$wpdb->posts,
				$meta_query[0]['key'],
				$meta_query[0]['key'] . '_%_name',
				'mt' . $join_count . '.meta_value',
				$meta_query[0]['compare'],
				"'" . \implode( '\' AND \'', $meta_query[0]['value'] ) . "'"
			) . \PHP_EOL;
		}
	}
}
Code language: PHP (php)

Why removing it first and then explicitly adding it afterwards? This is for an easier maintenance, since I can make sure that the actual base is identical, means the query is built the same way for both types.

Conclusion

To this day, I donā€™t know what actually happens to make such queries exponentially slower. I consider the solution above as workaround, but as working. If you have a clue, why this happens, feel free to comment. šŸ™‚

Leave a Reply

Your email address will not be published. Required fields are marked *