Full-text search with PHP + MongoDB + Sphinx (part 2)

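This article walks through implementing full-text search with PHP, MongoDB, and Sphinx. It presents an implementation of the IndexDriver interface, a utility class for migrating MongoDB data, and a unified scheduling script, covering the whole path from data conversion to index creation. It also notes the role of the config.ini file, which stores settings such as the Sphinx directory, and distinguishes the logic for rebuilding an index manually from the logic for rebuilding it automatically via crontab.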

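The previous article described the overall approach, but IndexDriver was given only as an interface, and neither the transfer of data into MySQL nor the generation of the Sphinx index configuration file was shown.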

In the spirit of "talk is cheap", this post fills in that code.


IndexDriver

Only one concrete implementation class is listed here, as an example.
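The interface itself comes from the previous article and is not repeated in this post. Going only by the method names in the @see annotations of the listings here, it presumably looks something like the sketch below. The parameter lists, the return conventions, and the entire body of IndexField (including getName() and isInt()) are assumptions made for illustration, not the original code.

```php
<?php
// Hypothetical reconstruction: only the method names are taken from the
// @see tags in this post; everything else is an assumption.
interface IndexDriver {
    /** Unique name of the Sphinx index this driver feeds. */
    public function getIndexName();

    /** Array of IndexField objects describing the indexed columns. */
    public function getIndexFields();

    /** One page of rows as associative arrays (assumed: null when exhausted). */
    public function getValues($offset, $limit);

    /** Whether the index is due for a rebuild, given the last rebuild time. */
    public function shouldRefreshIndex($last_refresh_time);

    /** Left as a stub in the example driver; its purpose is not shown. */
    public function generateDocument();
}

// Hypothetical value object; only the two factory names appear in the post.
class IndexField {
    private $name;
    private $is_int;

    private function __construct($name, $is_int) {
        $this->name = $name;
        $this->is_int = $is_int;
    }

    /** Full-text field. */
    public static function createField($name) {
        return new self($name, false);
    }

    /** Integer attribute. */
    public static function createIntField($name) {
        return new self($name, true);
    }

    public function getName() {
        return $this->name;
    }

    public function isInt() {
        return $this->is_int;
    }
}
```

With a shape like this, a driver only has to describe its fields and hand back pages of rows; everything Sphinx-specific can stay in the tooling that consumes the drivers.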

First, an abstract class derived from IndexDriver that connects to MongoDB. Every index built from MongoDB data extends this class:

<?php
abstract class MongoIndexDriver implements IndexDriver {

    private $mongo_host;
    private $index_name;
    private $mongo_conn;

    public function __construct($mongo_host) {
        $this->mongo_host = $mongo_host;
    }

    /**
     * {@inheritDoc}
     * @see IndexDriver::getIndexName()
     */
    public function getIndexName() {
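        // Use the driver class name, minus its "Driver" suffix, as the index
        // name, so every driver gets a unique index name.
        if (!$this->index_name) {
            $class_name = get_class($this);
            // chop() strips a character set, not a suffix, so match the suffix.
            $this->index_name = preg_replace('/Driver$/', '', $class_name);
        }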
        return $this->index_name;
    }

    protected function setMongoHost($host) {
        $this->mongo_host = $host;
    }

    protected function table($table_name, $database_name = "kaiba") {
        if (!$this->mongo_conn) {
            $this->mongo_conn = new MongoClient($this->mongo_host);
        }
        return $this->mongo_conn->$database_name->$table_name;
    }
}
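One detail in getIndexName() deserves a closer look. PHP's chop() is an alias of rtrim(), and its second argument is a set of characters to strip, not a suffix. The sketch below shows the difference for class names shaped like the ones in this post; indexNameFromClass() is a hypothetical helper introduced only for this illustration.

```php
<?php
// Hypothetical helper, not part of the original classes: removes a trailing
// "Driver" suffix and leaves every other class name untouched.
function indexNameFromClass($class_name) {
    return preg_replace('/Driver$/', '', $class_name);
}

// rtrim()/chop() keep stripping while the last character is one of
// D, r, i, v, e, so they eat into the name itself.
echo rtrim("ExampleDriver", "Driver"), "\n";     // prints "Exampl"
echo rtrim("Reader", "Driver"), "\n";            // prints "Read"

// Matching the suffix as a whole gives the intended result.
echo indexNameFromClass("ExampleDriver"), "\n";  // prints "Example"
echo indexNameFromClass("Reader"), "\n";         // prints "Reader"
```

Because the index name ends up in the generated Sphinx configuration, a silently shortened name would be easy to miss until two drivers collide or a query targets the wrong index.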

Next, a concrete implementation class:

<?php
require_once SCRIPT_PATH.'/base/IndexDriver.php';
require_once SCRIPT_PATH.'/base/MongoIndexDriver.php';
require_once SCRIPT_PATH.'/base/IndexField.php';

class ExampleDriver extends MongoIndexDriver {

    // Refresh interval in minutes (7 days).
    const REFRESH_INTERVAL = 7 * 24 * 60;

    private $fields;
    private $mongo_table;

    public function __construct() {
        $ini = parse_ini_file(CONFIG_PATH, true);
        parent::__construct($ini['mongo']['host'].":".$ini['mongo']['port']);

        $this->fields = array(
            IndexField::createIntField("_id1"),
            IndexField::createIntField("_id2"),
            IndexField::createIntField("_id3"),
            IndexField::createIntField("_id4"),
            IndexField::createIntField("code"),
            IndexField::createIntField("type"),
            IndexField::createField("name"),
            IndexField::createField("content"),
            IndexField::createField("message"),
        );
    }

    /**
     * {@inheritDoc}
     * @see IndexDriver::getIndexFields()
     */
    public function getIndexFields() {
        return $this->fields;
    }

    /**
     * {@inheritDoc}
     * @see IndexDriver::getValues()
     */
    public function getValues($offset, $limit) {
        $mongo_cursor = $this->table("example_table")
            ->find(array(), array(
                "_id",
                "code",
                "type",
                "name",
                "content",
                "message",
            ))
            ->skip($offset)
            ->limit($limit);
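        // count(true) applies skip()/limit(); a plain count() returns the
        // total for the query and never drops to 0 past the last page.
        $result_count = $mongo_cursor->count(true);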
        if ($result_count <= 0) {
            return null;
        }

        $result = array();
        foreach ($mongo_cursor as $k => $v) {
            $value = array();
            $_id1 = $_id2 = $_id3 = $_id4 = "";

            // MongoId (24 hex chars) -> four ints, 6 hex chars (24 bits) each
            $id_string = $v['_id']."";
            for ($j = 0; $j < 4; $j ++) {
                $id_sub = substr($id_string, $j * 6, 6);
                $segment = hexdec($id_sub);
                ${"_id".($j + 1)} = $segment;
            }

            $value['_id1'] = $_id1;
            $value['_id2'] = $_id2;
            $value['_id3'] = $_id3;
            $value['_id4'] = $_id4;
            $value['code'] = $v['code'];
            $value['type'] = $v['type'];
            $value['name'] = $v['name'];
            $value['content'] = $v['content'];
            $value['message'] = $v['message'];

            $result[] = $value;
        }

        return $result;
    }

    /**
     * {@inheritDoc}
     * @see IndexDriver::shouldRefreshIndex()
     */
    public function shouldRefreshIndex($last_refresh_time) {
        $hour = (int) date('H');
        // only refresh the index in the small hours (00:00 to 04:59)
        if ($hour > 4) {
            return false;
        }
        $minutes = (time() - $last_refresh_time) / 60;
        return $minutes + 5 > self::REFRESH_INTERVAL;
    }

    /**
     * {@inheritDoc}
     * @see IndexDriver::generateDocument()
     */
    public function generateDocument() {
        // TODO Auto-generated method stub
    }

}
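The ID handling in getValues() is worth spelling out. A MongoDB ObjectId prints as 24 hexadecimal characters, which is too wide for a single integer, so the driver cuts it into four 6-character segments. Each segment is 24 bits and fits easily in a 32-bit integer attribute. The sketch below isolates that conversion and adds the reverse direction, which the post does not show; both function names are hypothetical.

```php
<?php
// Hypothetical helpers; the names are illustrative and not from the article.

/** Split a 24-character hex ObjectId string into four 24-bit integers. */
function objectIdToInts($id_string) {
    $segments = array();
    for ($j = 0; $j < 4; $j++) {
        $segments[] = hexdec(substr($id_string, $j * 6, 6));
    }
    return $segments;
}

/** Rebuild the hex string; "%06x" restores the leading zeros of a segment. */
function intsToObjectId(array $segments) {
    $id_string = "";
    foreach ($segments as $segment) {
        $id_string .= sprintf("%06x", $segment);
    }
    return $id_string;
}

$id = "507f1f77bcf86cd799439011";
$parts = objectIdToInts($id);
var_dump(count($parts));                  // int(4)
var_dump(intsToObjectId($parts) === $id); // bool(true)
```

The reverse step is what makes the scheme usable at query time: a search hit carries the four integer attributes, and joining them back into the hex string gives the key needed to load the full document from MongoDB.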

The index-building utility class:

<?php
require_once SCRIPT_PATH.'/base/IndexDriver.php';
require_once SCRIPT_PATH.'/base/IndexField.php';
require_once SCRIPT_PATH.'/utils/Logger.php';

/**
 * @author lx
 * date: 2016-11-25
 * Generates the Sphinx configuration file used by searchd and indexer.
 */
class SphinxConfCreator {

    private $mysql_host;
    private $mysql_port;
    private $mysql_user;
    private $mysql_password;
    private $mysql_database;

    private $conf_file;
    private $index_data_path;
    private $charset_dictpath;
    private $pid_file;
    private $log_file;
    private $query_log_file;

    private $drivers;
    private $logger;

    public function __construct(array $drivers) {
        $this->logger = new Logger("SphinxConfGenerator");
        if (empty($drivers)) {
            $msg = "IndexDriver array empty";
            $this->logger->e($msg);
            throw new Exception($msg);
        }
        foreach ($drivers as $name => $driver) {
            if (! $driver instanceof IndexDriver) {
                $msg = "need a valid IndexDriver";
                $this->logger->e($msg);
                throw new Exception($msg);
            }
        }
        $this->drivers = $drivers;
    }

    public function setMysqlHost($host, $port = null) {
        $this->mysql_host = $host;
        $this->mysql_port = $port;
        return $this;
    }

    public function setMysqlUser($user, $password) {
        $this->mysql_user = $user;
        $this->mysql_password = $passw