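Sidekiq Error Handling

Best Practices

  1. Use an error service such as Honeybadger, Airbrake, Rollbar, BugSnag, Sentry, Exceptiontrap, or Raygun. They are all similar in feature set, so pick one. These services send you an email whenever a job raises an exception. Note that Sidekiq 3.0 removed built-in support for Airbrake, Honeybadger, Exceptional and ExceptionNotifier, so make sure your error service supports Sidekiq.
  2. Sidekiq's built-in retry mechanism catches these exceptions and retries the job periodically. The error service notifies you of the exception, you fix the bug that caused it, and Sidekiq keeps retrying until the job succeeds.
  3. If you have not fixed the bug after 25 retries (about 21 days), Sidekiq stops retrying and moves the job to the Dead Job Queue. For the next six months you can fix the bug and retry the job manually through the Web UI.
  4. After six months, Sidekiq discards the job.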

Error Handlers

Gems can attach to Sidekiq's global error handlers, so they are informed whenever an error occurs inside Sidekiq. Error services are integrated automatically once you add their gem to your application's Gemfile.

You can create your own error handler by providing anything that responds to call(exception, context_hash):

Sidekiq.configure_server do |config|
  config.error_handlers << Proc.new {|ex,ctx_hash| MyErrorService.notify(ex, ctx_hash) }
end

Note that error handlers only apply to the Sidekiq server process; they are not active in the Rails console.
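
A handler does not have to be a Proc. As a minimal sketch, the handler below is a plain object with a call(exception, context_hash) method, matching the two-argument form described above. LoggingErrorHandler is a hypothetical name, and the shape of the context hash used in the simulated call is an assumption made for illustration only. The sketch runs without Sidekiq loaded; the registration line is shown as a comment.

```ruby
require "logger"

# Hypothetical handler: any object responding to call(exception, context_hash)
# can be appended to config.error_handlers.
class LoggingErrorHandler
  def initialize(logger)
    @logger = logger
  end

  # Log the exception together with whatever context was passed in.
  def call(exception, context_hash)
    @logger.error(
      "#{exception.class}: #{exception.message} (context: #{context_hash.inspect})"
    )
  end
end

handler = LoggingErrorHandler.new(Logger.new($stdout))

# In a real application the handler would be registered in the server block:
#   Sidekiq.configure_server { |config| config.error_handlers << handler }

# Simulate a failing job. The context hash here is made up for the demo.
handler.call(RuntimeError.new("boom"), { job: "HardWorker" })
```

Using an object rather than an inline Proc makes the handler easy to unit test, since it can be given a logger that writes to an in-memory buffer.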

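Backtrace Logging

Enabling backtrace logging for a job causes the backtrace to be persisted for the lifetime of the job. If a large number of jobs keep failing and being requeued, Redis memory usage can keep growing even when no new jobs are added. Be cautious when enabling backtraces: either limit them to a few lines, or use an error service to keep track of failures instead.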

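Automatic Job Retry

Sidekiq retries failed jobs with an exponential backoff computed by the formula (retry_count ** 4) + 15 + (rand(30) * (retry_count + 1)), in seconds, which results in 25 retries over approximately 21 days. If you deploy a bug fix within that time, the job is retried and completes successfully. Once the 25 retries are used up, Sidekiq moves the job to the Dead Job Queue, and from then on it has to be run manually.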
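
To check the "25 retries in about 21 days" figure, the sketch below evaluates the backoff formula for every retry and adds up the delays. The helper names retry_delay and total_retry_window are illustrative and not part of Sidekiq; the random term is passed in as a jitter argument so that the lower and upper bounds are reproducible.

```ruby
# Delay in seconds before retry number retry_count (starting at 0).
# jitter stands in for rand(30), which yields an integer from 0 to 29.
def retry_delay(retry_count, jitter = rand(30))
  (retry_count ** 4) + 15 + (jitter * (retry_count + 1))
end

MAX_RETRIES = 25
SECONDS_PER_DAY = 86_400.0

# Sum of all delays for a fixed jitter value.
def total_retry_window(jitter)
  (0...MAX_RETRIES).sum { |count| retry_delay(count, jitter) }
end

min_days = total_retry_window(0) / SECONDS_PER_DAY
max_days = total_retry_window(29) / SECONDS_PER_DAY

puts format("first retry after at least %d seconds", retry_delay(0, 0))
puts format("25 retries span between %.2f and %.2f days", min_days, max_days)
```

The fourth-power term dominates: the early retries happen within seconds or minutes, while the last few are spaced days apart.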

Web UI

The Sidekiq Web UI has "Retries" and "Dead" tabs that list failed jobs and let you run, inspect, or delete them.

Dead Job Queue

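The Dead Job Queue (DJQ) was introduced in Sidekiq 3.0 and holds jobs that have used up all of their retries. Sidekiq does not retry jobs in this queue; you have to run them manually through the Web UI. The queue does not grow without bound: it holds at most 10,000 jobs, and jobs are kept for at most six months. Only jobs configured with 0 or more retries can end up in the Dead Job Queue. If a particular type of job is transient and you want it neither retried nor moved to the Dead Job Queue, set :retry => false.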
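
The two limits can be pictured with a small in-memory model. This is only a sketch of the behaviour described above, not Sidekiq's implementation (which stores dead jobs in Redis). DeadQueueModel and its methods are hypothetical names, six months is approximated as 180 days, and jobs are assumed to be added in chronological order.

```ruby
# Simplified model of the Dead Job Queue limits: a size cap and an age cap.
class DeadQueueModel
  MAX_JOBS = 10_000
  TIMEOUT_SECONDS = 180 * 24 * 60 * 60 # roughly six months

  def initialize
    @entries = [] # [died_at, job] pairs, oldest first
  end

  # Add a dead job, then enforce both limits.
  def add(job, now: Time.now)
    @entries << [now, job]
    prune(now)
  end

  def size
    @entries.size
  end

  def jobs
    @entries.map(&:last)
  end

  private

  def prune(now)
    # Drop jobs that have been dead for longer than the timeout.
    @entries.reject! { |died_at, _job| now - died_at > TIMEOUT_SECONDS }
    # Drop the oldest jobs once the size cap is exceeded.
    @entries.shift(@entries.size - MAX_JOBS) if @entries.size > MAX_JOBS
  end
end

queue = DeadQueueModel.new
start = Time.at(0)
10_005.times { |i| queue.add("job-#{i}", now: start + i) }
puts "jobs kept: #{queue.size}, oldest: #{queue.jobs.first}"
```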

Configuration

You can specify the number of retries for a particular worker if 25 is too many:

class LessRetryableWorker
  include Sidekiq::Worker
  sidekiq_options :retry => 5 # Only five retries and then to the Dead Job Queue

  def perform(...)
  end
end

You can disable retry support for a particular worker. Note with retry disabled, Sidekiq will not track or save any error data for the worker’s jobs.

class NonRetryableWorker
  include Sidekiq::Worker
  sidekiq_options :retry => false # job will be discarded immediately if failed

  def perform(...)
  end
end

You can disable a job going to the DJQ:

class NoDeadQueueWorker
  include Sidekiq::Worker
  sidekiq_options :retry => 5, :dead => false

  def perform(...)
  end
end

The retry delay can be customized using sidekiq_retry_in, if needed.

class WorkerWithCustomRetry
  include Sidekiq::Worker
  sidekiq_options :retry => 5

  # The current retry count is yielded. The return value of the block must be 
  # an integer. It is used as the delay, in seconds. 
  sidekiq_retry_in do |count|
    10 * (count + 1) # (i.e. 10, 20, 30, 40, 50)
  end

  def perform(...)
  end
end
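
To see what schedule such a block produces, the sketch below evaluates the same expression outside of Sidekiq. It assumes, as the comment in the worker suggests, that the yielded count starts at 0 for the first retry; the retry_in lambda and max_retries variable are stand-ins for the sidekiq_retry_in block and the :retry option.

```ruby
# Same body as the sidekiq_retry_in block, wrapped in a lambda so it runs standalone.
retry_in = ->(count) { 10 * (count + 1) }

max_retries = 5 # mirrors :retry => 5

# One delay per retry attempt, in seconds.
delays = (0...max_retries).map { |count| retry_in.call(count) }

puts "delays (s): #{delays.inspect}"
puts "total wait before retries are exhausted: #{delays.sum}s"
```

With a linear schedule like this one the job exhausts its retries within minutes, so it suits failures that are expected to clear quickly rather than bugs that need a deploy to fix.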

After retrying so many times, Sidekiq will call the sidekiq_retries_exhausted hook on your Worker if you’ve defined it. The hook receives the queued message as an argument. This hook is called right before Sidekiq moves the job to the DJQ.

class FailingWorker
  include Sidekiq::Worker

  sidekiq_retries_exhausted do |msg|
    Sidekiq.logger.warn "Failed #{msg['class']} with #{msg['args']}: #{msg['error_message']}"
  end

  def perform(*args)
    raise "or I don't work"
  end
end

As of Sidekiq 3.3.2, you can change the maximum number of jobs in the DJQ or the maximum time spent in the DJQ by setting dead_max_jobs and dead_timeout_in_seconds in your Sidekiq options hash.

Process Crashes

If the Sidekiq process segfaults or crashes the Ruby VM, any jobs that were being processed are lost. Sidekiq Pro offers a reliable queueing feature which does not lose those jobs.

No More Bike Shedding

Sidekiq’s retry mechanism is a set of best practices but many people have suggested various knobs and options to tweak in order to handle their own edge case. This way lies madness. Design your code to work well with Sidekiq’s retry mechanism as it exists today or fork the RetryJobs middleware and add your own logic. I’m no longer accepting any functional changes to the retry mechanism unless you make an extremely compelling case for why Sidekiq’s thousands of users would want that change.
