gearman failover

From gearman's main page, they mention running with multiple job servers so if a job server dies, the clients can pick up a new job server. Given the statement and diagram below, it seems that the job servers do not communicate with each other.

Our question is what happens to those jobs that are queued in the job server that died? What is the best practice to have high-availability for these servers to make sure jobs aren't interrupted in a failure?

You are able to run multiple job servers and have the clients and workers connect to the first available job server they are configured with. This way if one job server dies, clients and workers automatically fail over to another job server. You probably don't want to run too many job servers, but having two or three is a good idea for redundancy.

enter image description here


share|improve this question
add comment

As far as I know there is no proper way to handle this at the moment, but as long as you run both job servers with permanent queues (using MySQL or another datastore - just don't use the same actual queue for both servers), you can simply restart the job server and it'll load its queue from the database. This will allow all the queued tasks to be submitted to available workers, even after the server has died.

There is however no automagical way of doing this when a job server goes down, so if both the job server and the datastore goes down (a server running both locally goes down) will leave the tasks in limbo until it gets back online.

The permanent queue is only read on startup (and inserted / deleted from as tasks are submitted and completed).

I'm not sure about the complexity required to add such functionality to gearmand and whether it's actually wanted, but simple "task added, task handed out, task completed"-notifications between servers shouldn't been too complicated to handle.

想对作者说点什么? 我来说一句


2011年05月23日 585KB 下载


2013年07月10日 109KB 下载