是否优化多分类术语MySQL查询?

时间:2011-02-04 作者:goatlady

(Moderator\'s note: 标题原来是:“查询/数据库优化”)

我为自定义“过滤器”搜索面板编写了一个函数,允许用户从最多四个自定义分类中选择术语。我直接对数据库运行查询,查询平均执行时间为半秒(包括所有四种分类法的术语和返回的一个结果)。

这对我来说似乎很慢。我想知道我是否可以做些什么来优化查询,甚至数据库,使其更高效/更快。甚至可能写一个观点?我有使用MS-SQL的经验,但对MySQL的经验不多,我不知道事情有什么不同。

这是我的功能代码:

    function filter_resources($phase,$wa,$aus,$topics){
    global $wpdb;
    $querystr="
    SELECT * 
        FROM $wpdb->posts A
            LEFT JOIN $wpdb->term_relationships B ON(A.ID = B.object_id)
            LEFT JOIN $wpdb->term_taxonomy C ON(B.term_taxonomy_id = C.term_taxonomy_id)
            LEFT JOIN $wpdb->terms D ON(C.term_id = D.term_id)

        LEFT JOIN $wpdb->term_relationships BB ON(A.ID = BB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CC ON(BB.term_taxonomy_id = CC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DD ON(CC.term_id = DD.term_id)

        LEFT JOIN $wpdb->term_relationships BBB ON(A.ID = BBB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CCC ON(BBB.term_taxonomy_id = CCC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DDD ON(CCC.term_id = DDD.term_id)

        LEFT JOIN $wpdb->term_relationships BBBB ON(A.ID = BBBB.object_id)
            LEFT JOIN $wpdb->term_taxonomy CCCC ON(BBBB.term_taxonomy_id = CCCC.term_taxonomy_id)
            LEFT JOIN $wpdb->terms DDDD ON(CCCC.term_id = DDDD.term_id)

        WHERE A.post_type = \'resources\' 
            AND A.post_status = \'publish\'
            AND C.taxonomy = \'phase-of-learning\'
            AND D.term_id = \'$phase\'
            AND CC.taxonomy = \'wa-curriculum\'
            AND DD.term_id = \'$wa\'
            AND CCC.taxonomy = \'australian-curriculum\'
            AND DDD.term_id = \'$aus\'
            AND CCCC.taxonomy = \'topics\'
            AND DDDD.term_id = \'$topics\'
        ORDER BY A.post_date DESC";
    return $wpdb->get_results($querystr,OBJECT);
}
谢谢!

3 个回复
最合适的回答,由SO网友:MikeSchinkel 整理而成

虽然这确实是一个MySQL问题,但它确实有助于理解WordPress SQL模式,而且我喜欢尝试优化SQL查询,而不是将您发送到StackOverflow 我会尽量在这里回答你。你可能仍然想把它贴在那里,以获得一些其他的意见。

虽然我不完全理解你的要求,但我想我理解你的要求,我想我是这样做的,我想提出以下内容,看看它是否能更好地满足你的需求。我没有您的数据,因此我很难确定它是否确实有效,但正如我所说,我认为它满足您的需要:

function filter_resources($phase,$wa,$aus,$topics){
  global $wpdb;
  $sql =<<<SQL
SELECT
  t.slug,p.*
FROM
  wp_posts p
  INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
  INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
  INNER JOIN wp_terms t ON tt.term_id = t.term_id
WHERE 1=1
  AND p.post_type = \'resources\'
  AND p.post_status = \'publish\'
  AND t.term_id IN (%d,%d,%d,%d)
  AND CONCAT(tt.taxonomy,\'/\',t.term_id) IN (
    \'phase-of-learning/%s\',
    \'wa-curriculum/%s\',
    \'australian-curriculum/%s\',
    \'topics/%s\'
  )
GROUP BY
  p.ID
HAVING
  COUNT(*)=4
ORDER BY
  p.post_date DESC
SQL;
  $sql = $wpdb->prepare($sql,
    $phase,$wa,$aus,$topics,  // For the %d replacements
    $phase,$wa,$aus,$topics   // For the %s replacements
  );
  $results = $wpdb->get_results($sql,OBJECT);
  return $results;
}
基本上,这将为您提供应用了所有分类法术语的所有帖子,它通过执行自由形式查询来匹配应用了分类法/术语的所有帖子,但仅限于应用了所有术语的帖子分组wp_post.ID 并查找加入该职位4次的所有记录。运行MySQL时EXPLAIN 与您所拥有的相比,优化看起来相当不错;加入的表要少得多。希望这就是你需要的逻辑。

使用瞬态API进行缓存,如果您试图提高性能,也可以考虑在有限的时间内(1小时、4小时、12小时或更长时间)将结果缓存在“瞬态”中Transients API:

  • Overview of WordPress’ Transients API

    define(\'NUM_HOURS\',4); // Time to cache set for your use case
    $data = get_transient( \'your_transient_key\' );
    if( !$data ) {
      $data = // Do something to get your data here
      set_transient( \'your_transient_key\', $data, 60 * 60 * NUM_HOURS );
    }  
    
    在您的filter_resources() 函数它可能看起来像这样:

    define(\'RESOURCE_CACHE_HOURS\',4);
    function filter_resources($phase,$wa,$aus,$topics){
      $resources = get_transient( \'yoursite_filtered_resources\' );
      if(!$resources) {
        global $wpdb;
        $sql =<<<SQL
    SELECT
      t.slug,p.*
    FROM
      wp_posts p
      INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
      INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
      INNER JOIN wp_terms t ON tt.term_id = t.term_id
    WHERE 1=1
      AND p.post_type = \'resources\'
      AND p.post_status = \'publish\'
      AND t.term_id IN (%d,%d,%d,%d)
      AND CONCAT(tt.taxonomy,\'/\',t.term_id) IN (
        \'phase-of-learning/%s\',
        \'wa-curriculum/%s\',
        \'australian-curriculum/%s\',
        \'topics/%s\'
      )
    GROUP BY
      p.ID
    HAVING
      COUNT(*)=4
    ORDER BY
      p.post_date DESC
    SQL;
        $sql = $wpdb->prepare($sql,
          $phase,$wa,$aus,$topics,  // For the %d replacements
          $phase,$wa,$aus,$topics   // For the %s replacements
        );
        $resources = $wpdb->get_results($sql,OBJECT);
        $hours = RESOURCE_CACHE_HOURS * 60 * 60;
        set_transient( \'yoursite_filtered_resources\', $resources, $hours);
      }  
      return $resources;
    }
    
    更新下面是另一个代码,它试图处理用户选择的标准少于四个的情况:

    define(\'RESOURCE_CACHE_HOURS\',4);
    function filter_resources($phase,$wa,$aus,$topics){
      $resources = get_transient( \'yoursite_filtered_resources\' );
      if(!$resources) {
        $terms = $taxterms = array();
        if (!empty($phase))
          $taxterms[$phase] = \'phase-of-learning/%s\';
        if (!empty($wa)) 
          $taxterms[$wa] = \'wa-curriculum/%s\';
        if (!empty($aus))
          $taxterms[$aus] = \'axustralian-curriculum/%s\';
        if (!empty($topics))
          $taxterms[$topics] = \'topics/%s\';
        $count = count($taxterms);
        $having = ($count==0 ? \'\' : "HAVING COUNT(*)={$count}");
        $values = array_keys(array_flip($tax_terms));
        $values = array_merge($values,$values);  // For %d and $s
        $taxterms =  implode("\',\'",$taxterms);
        $terms = implode(\',\',array_fill(0,$count,\'d%\'));
        global $wpdb;
        $sql =<<<SQL
    SELECT
      t.slug,p.*
    FROM
      wp_posts p
      INNER JOIN wp_term_relationships tr ON p.ID=tr.object_id
      INNER JOIN wp_term_taxonomy tt ON tr.term_taxonomy_id = tt.term_taxonomy_id
      INNER JOIN wp_terms t ON tt.term_id = t.term_id
    WHERE 1=1
      AND p.post_type = \'resources\'
      AND p.post_status = \'publish\'
      AND t.term_id IN ({$terms})
      AND CONCAT(tt.taxonomy,\'/\',t.term_id) IN (\'{$taxterms}\')
    GROUP BY
      p.ID
    {$having}
    ORDER BY
      p.post_date DESC
    SQL;
        $sql = $wpdb->prepare($sql,$values);
        $resources = $wpdb->get_results($sql,OBJECT);
        $hours = RESOURCE_CACHE_HOURS * 60 * 60;
        set_transient( \'yoursite_filtered_resources\', $resources, $hours);
      }  
      return $resources;
    }
    

SO网友:scribu

除非您需要与WP 3.0保持向后兼容,否则您可以利用advanced taxonomy queries WP 3.1中的支持。

生成SQL的代码可以在wp includes/taxonomy中找到。php

SO网友:Denis de Bernardy

首先,使用内部联接,而不是左联接。左连接将强制执行一个查询计划,该计划将扫描整个posts表,直到它通过术语过滤器找到匹配的post。

其次,可以通过使用get\\u term()预取术语来减少所需的联接数量。

将这两者结合在一起,您的查询将类似于:

SELECT * 
FROM $wpdb->posts posts
JOIN $wpdb->term_relationships termA
ON posts.ID = termA.object_id
AND termA.term_taxonomy_id = $termA_taxid
JOIN $wpdb->term_relationships termB
ON posts.ID = termB.object_id
AND termB.term_taxonomy_id = $termB_taxid
JOIN $wpdb->term_relationships termC
ON posts.ID = termC.object_id
AND termC.term_taxonomy_id = $termC_taxid
JOIN $wpdb->term_relationships termD
ON posts.ID = termD.object_id
AND termD.term_taxonomy_id = $termD_taxid
WHERE ...
对于5个连接的表(而不是13个)和一个查询计划(从查找与term\\u关系中出现频率最低的术语相关联的帖子开始),应该会产生相同的结果。

结束