Mozilla Location Service-10

Last time I covered the problem of syncing the buffer database to mysql. It turns out no extra program or command needs to be written to trigger the sync: the trick is that once celery is configured and the corresponding services are started, it does this work automatically.
A half-Chinese translation of the celery docs (reading the all-English version is hard going; there is a lot of jargon):
http://docs.jinkan.org/docs/celery/
Here is the introduction:
Celery - Distributed Task Queue

Celery is a simple, flexible and reliable distributed system to process vast amounts of messages, while providing operations with the tools required to maintain such a system.

It is a task queue with a focus on real-time processing, while also supporting task scheduling.

Celery has a large and diverse community of users and contributors; you can join us on IRC or the mailing list.

Celery is open source and licensed under the BSD License.
The relevant file structure in this project:

ichnaea.async (note: not under the egg directory; I have no idea why the files under the egg are not the ones being run)
- app.py
- config.py
- settings.py
- task.py

The contents look a lot like those of the webapp folder.
After a closer read of the docs, many parts are indeed cut from the same cloth.
Start a worker with the following command:

ICHNAEA_CFG=location.ini bin/celery -A ichnaea.async.app:celery_app worker \
    -Ofair --no-execv --without-mingle --without-gossip

The meaning of each flag is explained in detail in the docs.
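For quick reference, here is the same command with the flag meanings annotated (my reading of the Celery 3.1 docs; ICHNAEA_CFG points the app at its ini config):

```shell
# -A ichnaea.async.app:celery_app -> the Celery app instance to load
# -Ofair            -> hand tasks to prefork children only when they are free
# --no-execv        -> do not re-exec the child worker processes after forking
# --without-mingle  -> skip the startup synchronization with other workers
# --without-gossip  -> do not subscribe to other workers' events
ICHNAEA_CFG=location.ini bin/celery -A ichnaea.async.app:celery_app worker \
    -Ofair --no-execv --without-mingle --without-gossip
```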

start...
redis_uri is: redis://localhost:6379/0

 -------------- celery@sa-VirtualBox v3.1.23 (Cipater)
---- **** ----- 
--- * ***  * -- Linux-4.4.0-31-generic-x86_64-with-Ubuntu-16.04-xenial
-- * - **** --- 
- ** ---------- [config]
- ** ---------- .> app:         ichnaea.async.app:0x7fdce337aa10
- ** ---------- .> transport:   redis://localhost:6379/0
- ** ---------- .> results:     redis://localhost:6379/0
- *** --- * --- .> concurrency: 1 (prefork)
-- ******* ---- 
--- ***** ----- [queues]
 -------------- .> celery_blue      exchange=celery(direct) key=celery_blue
                .> celery_cell      exchange=celery(direct) key=celery_cell
                .> celery_content   exchange=celery(direct) key=celery_content
                .> celery_default   exchange=celery(direct) key=celery_default
                .> celery_export    exchange=celery(direct) key=celery_export
                .> celery_incoming  exchange=celery(direct) key=celery_incoming
                .> celery_monitor   exchange=celery(direct) key=celery_monitor
                .> celery_ocid      exchange=celery(direct) key=celery_ocid
                .> celery_reports   exchange=celery(direct) key=celery_reports
                .> celery_wifi      exchange=celery(direct) key=celery_wifi

[2016-08-19 11:53:23,870: WARNING/MainProcess] celery@sa-VirtualBox ready.

Seeing this output means the worker started successfully.
However, during the run there was no sign of the fabled 'periodic action' being configured or executed.
The database still showed no movement at all.

Then I finally discovered celery's magical beat:
http://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html
celery beat is a scheduler. It kicks off tasks at regular intervals, which are then executed by the worker nodes available in the cluster.

By default the entries are taken from the CELERYBEAT_SCHEDULE setting, but custom stores can also be used, like storing the entries in an SQL database.

You have to ensure only a single scheduler is running for a schedule at a time, otherwise you would end up with duplicate tasks. Using a centralized approach means the schedule does not have to be synchronized, and the service can operate without using locks.

For example, to run the add task every 30 seconds:

from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    # Executes every 30 seconds
    'add-every-30-seconds': {
        'task': 'tasks.add',
        'schedule': timedelta(seconds=30),
        'args': (16, 16),
    },
}
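For a timedelta-based entry like this one, beat's scheduler loop boils down to a check along the following lines (a simplified sketch of my own; the real PersistentScheduler also persists last-run times to the celerybeat-schedule file):

```python
from datetime import datetime, timedelta

def is_due(last_run, schedule, now=None):
    """Decide whether a timedelta-scheduled entry should fire.

    Returns (due, seconds_until_next_check): due once `schedule`
    has elapsed since `last_run`, otherwise the remaining wait.
    """
    now = now or datetime.utcnow()
    elapsed = now - last_run
    if elapsed >= schedule:
        return True, schedule.total_seconds()
    return False, (schedule - elapsed).total_seconds()
```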

So where are the tasks in our project, and where is the period configured?
/ProgFile/ichnaea-for-liuqiao/ichnaea/ichnaea/async/task.py:

    if enabled and cls._schedule:
        app.conf.CELERYBEAT_SCHEDULE.update(cls.beat_config())

Following the same pattern, I searched and found these lines of code.
beat_config looks suspicious:

    @classmethod
    def beat_config(cls):
        """
        Returns the beat schedule for this task, taking into account
        the optional shard_model to create multiple schedule entries.
        """
        if cls._shard_model is None:
            return {cls.shortname(): {
                'task': cls.name,
                'schedule': cls._schedule,
            }}

        result = {}
        for shard_id in cls._shard_model.shards().keys():
            result[cls.shortname() + '_' + shard_id] = {
                'task': cls.name,
                'schedule': cls._schedule,
                'kwargs': {'shard_id': shard_id},
            }
        return result
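To see what beat_config actually produces, here is a standalone re-creation of the same logic as a plain function (the shard ids in the test are made up for illustration; the real ones come from the shard model):

```python
from datetime import timedelta

def beat_config(task_name, shortname, schedule, shard_ids=None):
    # Mirrors the class method above: one schedule entry when there is
    # no shard model, otherwise one entry per shard id, each carrying
    # its shard_id as a keyword argument.
    if not shard_ids:
        return {shortname: {'task': task_name, 'schedule': schedule}}
    return {
        '{}_{}'.format(shortname, sid): {
            'task': task_name,
            'schedule': schedule,
            'kwargs': {'shard_id': sid},
        }
        for sid in shard_ids
    }
```

So a sharded task fans out into one schedule entry per shard, which would explain the bursts of identical log lines that show up later.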

Now to trace the task and schedule further.
The truth: the real tasks live under data/tasks. Every task inherits this base class via the decorator, and the decorator also specifies the period, for example:

@celery_app.task(base=BaseTask, bind=True, queue='celery_reports',
                 _countdown=2, expires=20, _schedule=timedelta(seconds=32))
def update_incoming(self):
    print 'update_incoming'
    export.IncomingQueue(self)(export_reports)
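The registration pattern can be imitated in a few lines: a decorator files the task's schedule into the beat config as a side effect of defining the function (a toy version of my own; the real BaseTask also handles countdown, expiry, and sharding):

```python
from datetime import timedelta

CELERYBEAT_SCHEDULE = {}

def periodic_task(name, schedule):
    # Toy stand-in for @celery_app.task(base=BaseTask, _schedule=...):
    # merely defining the function registers a beat schedule entry.
    def wrap(func):
        CELERYBEAT_SCHEDULE[name] = {'task': name, 'schedule': schedule}
        return func
    return wrap

@periodic_task('data.tasks.update_incoming', timedelta(seconds=32))
def update_incoming():
    return 'update_incoming'
```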

It is not yet clear what this task actually accomplishes.

Never mind; let's see how to start beat.

Starting the Scheduler

To start the celery beat service:

$ celery -A proj beat
Here proj is ichnaea.async.app:celery_app; without this app nothing can run.

You can also embed beat inside the worker by enabling the worker's -B option; this is convenient if you will never run more than one worker node, but it's not commonly used and for that reason is not recommended for production use:

$ celery -A proj worker -B

(I did not use this method, since it is not recommended.)

Beat needs to store the last run times of the tasks in a local database file (named celerybeat-schedule by default), so it needs access to write in the current directory, or alternatively you can specify a custom location for this file:

$ celery -A proj beat -s /home/celery/var/run/celerybeat-schedule

(This variant raised an error for me.)

Starting beat with the first method produced:

a@sa-VirtualBox:/ProgFile/ichnaea-for-liuqiao/ichnaea$ ICHNAEA_CFG=location.ini bin/celery -A ichnaea.async.app:celery_app beat
start...
redis_uri is: redis://localhost:6379/0
celery beat v3.1.23 (Cipater) is starting.
__    -    ... __   -        _
Configuration ->
    . broker -> redis://localhost:6379/0
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> celery.beat.PersistentScheduler
    . db -> celerybeat-schedule
    . logfile -> [stderr]@%WARNING
    . maxinterval -> now (0s)

With that, beat started successfully. Once beat is running, every task that carries a schedule starts executing as if on command.
In the terminal running the worker, rows of yellow text now appear automatically, like:
[2016-08-19 14:03:31,521: WARNING/Worker-1] **
I added a print at the start of each task to show its name:

[2016-08-19 14:05:37,375: WARNING/Worker-1] update_incoming
[2016-08-19 14:05:37,377: WARNING/Worker-1] query in session.py entities:
[2016-08-19 14:05:37,378: WARNING/Worker-1] (<class 'ichnaea.models.config.ExportConfig'>,)
[2016-08-19 14:05:37,380: WARNING/Worker-1] sqlalchemy.orm.query
[2016-08-19 14:05:40,971: WARNING/Worker-1] update_cellarea
[2016-08-19 14:05:49,969: WARNING/Worker-1] update_datamap
[2016-08-19 14:05:50,091: WARNING/Worker-1] update_datamap
[2016-08-19 14:05:50,142: WARNING/Worker-1] update_datamap
[2016-08-19 14:05:50,168: WARNING/Worker-1] update_datamap
[2016-08-19 14:05:52,997: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,021: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,058: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,118: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,162: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,205: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,233: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,248: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,252: WARNING/Worker-1] update_blue
[2016-08-19 14:05:53,275: WARNING/Worker-1] update_blue
[2016-08-19 14:06:09,111: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,136: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,175: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,202: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,239: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,256: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,294: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,307: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,330: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,336: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,355: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,369: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,373: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,392: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,413: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,436: WARNING/Worker-1] update_wifi
[2016-08-19 14:06:09,573: WARNING/Worker-1] update_incoming
[2016-08-19 14:06:09,575: WARNING/Worker-1] query in session.py entities:

I assumed that at this point the data in redis would finally land in mysql. No such luck.
All I found was:
1. The stat table gained some rows:

mysql> select * from stat;
+-----+------------+-------+
| key | time       | value |
+-----+------------+-------+
|   1 | 2016-08-18 |     0 |
|   1 | 2016-08-19 |     0 |
|   2 | 2016-08-18 |     0 |
|   2 | 2016-08-19 |     0 |
|   3 | 2016-08-18 |     0 |
|   3 | 2016-08-19 |     0 |
|   4 | 2016-08-18 |     0 |
|   4 | 2016-08-19 |     0 |
|   7 | 2016-08-18 |     0 |
|   7 | 2016-08-19 |     0 |
|   8 | 2016-08-18 |     0 |
|   8 | 2016-08-19 |     0 |
|   9 | 2016-08-18 |     0 |
|   9 | 2016-08-19 |     0 |
+-----+------------+-------+
14 rows in set (0.00 sec)

The second discovery: damn, the keys in redis are completely different from before!

127.0.0.1:6379> keys *
 1) "statcounter_unique_wifi_20160819"
 2) "statcounter_unique_wifi_20160818"
 3) "statcounter_unique_blue_20160819"
 4) "statcounter_blue_20160818"
 5) "statcounter_unique_cell_20160818"
 6) "statcounter_unique_cell_ocid_20160818"
 7) "statcounter_unique_cell_20160819"
 8) "statcounter_wifi_20160818"
 9) "statcounter_unique_blue_20160818"
10) "_kombu.binding.celeryev"
11) "_kombu.binding.celery.pidbox"
12) "statcounter_blue_20160819"
13) "statcounter_unique_cell_ocid_20160819"
14) "_kombu.binding.celery"
15) "statcounter_cell_20160818"
16) "statcounter_wifi_20160819"
17) "statcounter_cell_20160819"
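The key names follow an obvious pattern, statcounter_<kind>_<YYYYMMDD>; building them looks like this (a guess inferred purely from the listing above, not confirmed against the source):

```python
from datetime import date

def statcounter_key(kind, day):
    # Compose a per-day counter key, e.g. statcounter_wifi_20160819.
    return 'statcounter_{}_{}'.format(kind, day.strftime('%Y%m%d'))
```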

Also, every so often the terminal running the worker reports errors whose content I cannot make sense of.
