Identify and Alert for Long-Running Agent Jobs

Being a DBA is like being a train conductor. One of the biggest responsibilities is making sure all jobs are running as expected, or making sure “all the trains are running on time,” so to speak. As my partner-in-crime Devin Knight posted earlier, we have come up with a solution to identify and alert when SQL Agent jobs are running longer than expected.

The need for this solution came from the fact that, despite my having alerts for failed agent jobs, we had a process pull a Palin and go rogue on us. The job was supposed to process a cube, but since it never failed, we (admins) weren’t notified. The only way we found out was when a user finally asked, “the cube hasn’t been updated in a couple of days, what’s up?” Sad trombone.

As Devin mentioned in his post, the code/solution below is very much a version 1 product, so if you have any modifications/suggestions, have at it. We’ve documented in-line so you can figure out what the code is doing. Some caveats here:

  • This solution has been tested/validated on SQL Server 2005 (SP4) and 2008 R2 (SP1).
  • The code requires a table to be created in a database. I’ve set up a DBAdmin database on all servers here for custom DBA scripts such as this one, Brent Ozar’s Blitz script, Ola Hallengren’s maintenance solution, Adam Machanic’s sp_whoisactive, etc. You can keep your scripts in any database you’d like, but be aware of the USE statement at the top of this particular code.
  • This solution requires that you have Database Mail set up and configured.
  • To set up this solution, create an Agent job that runs every few minutes (we’re using 5) to call this stored procedure.
  • FYI, I set the mail profile name to be the same as the server name. One – makes it easy for me to standardize naming conventions across servers. Two – Lets me be lazy and code stuff like I did in the line setting the mail profile name. If your mail profile is set differently, make sure you correct it there.
  • Thresholds – This is documented in code but I’m calling it out anyway. We’ve set it up so that for any job whose average runtime is 5 minutes or less, the threshold is the average runtime + 10 minutes (e.g. a job that averages 2 minutes would have an alert threshold of 12 minutes). Anything beyond a 5 minute average runtime is controlled by a variable, with a default value of 150% of average runtime. For example, a job that averages a 10 minute runtime would have an alert threshold of 15 minutes.
  • If a job triggers an alert, that information is inserted into a table. Subsequent runs of the stored procedure then check the table to see if the alert has already been reported. We did this to avoid having admins emailed every subsequent run of the stored procedure.
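
To make the threshold rule concrete, here is a small standalone query that applies the same CASE logic the procedure uses to a few made-up average runtimes. The sample values are purely illustrative and nothing here touches msdb. With the default 150 percent setting, the 2-minute and 10-minute rows come out as 12 and 15, matching the examples above.

```sql
-- Illustrative only: sample averages are invented, not read from job history
DECLARE @JobLimitPercentage FLOAT;
SET @JobLimitPercentage = 150; -- same default as the stored procedure

SELECT t.AvgDurationMin
	,CASE 
		-- Short jobs (average of 5 minutes or less): average + 10 minutes
		WHEN t.AvgDurationMin <= 5
			THEN t.AvgDurationMin + 10
		-- Longer jobs: average scaled by the percentage limit
		ELSE t.AvgDurationMin * (@JobLimitPercentage / 100)
		END AS DurationLimit
FROM (
	SELECT 2 AS AvgDurationMin
	UNION ALL SELECT 10
	UNION ALL SELECT 45
	) AS t;
```

The derived table uses UNION ALL rather than a VALUES list so the query also runs on SQL Server 2005.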

CODE (WARNING: This code is currently beta and subject to change as we improve it)

Last script update: 7/12/2012

Change log: 7/12/2012 – Updated code to deal with “phantom” jobs that weren’t really running, with improved logic to handle them. Beware: this uses the undocumented stored procedure xp_sqlagent_enum_jobs.

--Create Long Running Jobs table
USE [DBAdmin]
GO

IF OBJECT_ID('dbo.LongRunningJobs') IS NOT NULL
	DROP TABLE dbo.LongRunningJobs

CREATE TABLE [dbo].[LongRunningJobs](
	[ID] [int] IDENTITY(1,1) NOT NULL,
	[JobName] [sysname] NOT NULL,
	[JobID] [uniqueidentifier] NOT NULL,
	[StartExecutionDate] [datetime] NULL,
	[AvgDurationMin] [int] NULL,
	[DurationLimit] [int] NULL,
	[CurrentDuration] [int] NULL,
	[RowInsertDate] [datetime] NOT NULL
) ON [PRIMARY]

GO

ALTER TABLE [dbo].[LongRunningJobs] ADD  CONSTRAINT [DF_LongRunningJobs_Date]  DEFAULT (getdate()) FOR [RowInsertDate]
GO


--Create Stored Procedure usp_LongRunningJobs
USE [DBAdmin]
GO

/****** Object:  StoredProcedure [dbo].[usp_LongRunningJobs]    Script Date: 07/12/2012 08:16:01 ******/
IF  EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_LongRunningJobs]') AND type in (N'P', N'PC'))
DROP PROCEDURE [dbo].[usp_LongRunningJobs]
GO

USE [DBAdmin]
GO

/****** Object:  StoredProcedure [dbo].[usp_LongRunningJobs]    Script Date: 07/12/2012 08:16:01 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:        Devin Knight and Jorge Segarra
-- Create date: 7/6/2012
-- Description:    Monitors currently running SQL Agent jobs and 
-- alerts admins if runtime passes set threshold
-- Updates: 7/11/2012	Changed Method for capturing currently running jobs to use master.dbo.xp_sqlagent_enum_jobs 1, ''
--			
-- =============================================
CREATE PROCEDURE [dbo].[usp_LongRunningJobs]
AS
--Set Mail Profile
DECLARE @MailProfile VARCHAR(50)

SET @MailProfile = (
		SELECT @@SERVERNAME
		) --Replace with your mail profile name

--Set Email Recipients
DECLARE @MailRecipients VARCHAR(50)

SET @MailRecipients = 'DBAGroup@adventureworks.com'

--Set limit as a percentage of average runtime
--NOTE: Percentage limit is applied to all jobs where average runtime is greater than 5 minutes
--else the time limit is simply average + 10 minutes
DECLARE @JobLimitPercentage FLOAT

SET @JobLimitPercentage = 150 --Use whole percentages greater than 100

-- Create intermediate work table for currently running jobs

DECLARE @currently_running_jobs TABLE (
	job_id UNIQUEIDENTIFIER NOT NULL
	,last_run_date INT NOT NULL
	,last_run_time INT NOT NULL
	,next_run_date INT NOT NULL
	,next_run_time INT NOT NULL
	,next_run_schedule_id INT NOT NULL
	,requested_to_run INT NOT NULL
	,-- BOOL
	request_source INT NOT NULL
	,request_source_id SYSNAME COLLATE database_default NULL
	,running INT NOT NULL
	,-- BOOL
	current_step INT NOT NULL
	,current_retry_attempt INT NOT NULL
	,job_state INT NOT NULL
	) -- 0 = Not idle or suspended, 1 = Executing, 2 = Waiting For Thread, 3 = Between Retries, 4 = Idle, 5 = Suspended, [6 = WaitingForStepToFinish], 7 = PerformingCompletionActions

--Capture Jobs currently working
INSERT INTO @currently_running_jobs
EXECUTE master.dbo.xp_sqlagent_enum_jobs 1,''

--Temp table exists check
IF OBJECT_ID('tempdb..##RunningJobs') IS NOT NULL
	DROP TABLE ##RunningJobs

CREATE TABLE ##RunningJobs (
	[JobID] [UNIQUEIDENTIFIER] NOT NULL
	,[JobName] [sysname] NOT NULL
	,[StartExecutionDate] [DATETIME] NOT NULL
	,[AvgDurationMin] [INT] NULL
	,[DurationLimit] [INT] NULL
	,[CurrentDuration] [INT] NULL
	)

INSERT INTO ##RunningJobs (
	JobID
	,JobName
	,StartExecutionDate
	,AvgDurationMin
	,DurationLimit
	,CurrentDuration
	)
SELECT jobs.Job_ID AS JobID
	,jobs.NAME AS JobName
	,act.start_execution_date AS StartExecutionDate
	--run_duration is an INT in HHMMSS format, so convert hours and minutes to whole minutes
	,AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) AS AvgDurationMin
	,CASE 
		--If job average is 5 minutes or less then limit is avg+10 minutes
		WHEN AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) <= 5
			THEN AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) + 10
		--If job average greater than 5 minutes then limit is avg*limit percentage
		ELSE AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) * (@JobLimitPercentage / 100)
		END AS DurationLimit
	,DATEDIFF(MI, act.start_execution_date, GETDATE()) AS [CurrentDuration]
FROM @currently_running_jobs crj
INNER JOIN msdb..sysjobs AS jobs ON crj.job_id = jobs.job_id
INNER JOIN msdb..sysjobactivity AS act ON act.job_id = crj.job_id
	AND act.stop_execution_date IS NULL
	AND act.start_execution_date IS NOT NULL
--step_id = 0 is the job outcome row, so only whole-job durations are averaged
INNER JOIN msdb..sysjobhistory AS hist ON hist.job_id = crj.job_id
	AND hist.step_id = 0
WHERE crj.job_state = 1
GROUP BY jobs.job_ID
	,jobs.NAME
	,act.start_execution_date
	,DATEDIFF(MI, act.start_execution_date, GETDATE())
--Keep only jobs whose current runtime exceeds their limit
HAVING CASE 
		WHEN AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) <= 5
			THEN AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) + 10
		ELSE AVG((hist.run_duration / 10000) * 60 + (hist.run_duration / 100) % 100) * (@JobLimitPercentage / 100)
		END < DATEDIFF(MI, act.start_execution_date, GETDATE())


--Checks to see if a long running job has already been identified so you are not alerted multiple times
IF EXISTS (
		SELECT RJ.*
		FROM ##RunningJobs RJ
		WHERE CHECKSUM(RJ.JobID, RJ.StartExecutionDate) NOT IN (
				SELECT CHECKSUM(JobID, StartExecutionDate)
				FROM dbo.LongRunningJobs
				)
		)
	--Send email with results of long-running jobs
	EXEC msdb.dbo.sp_send_dbmail @profile_name = @MailProfile
		,@recipients = @MailRecipients
		,@query = 'USE DBAdmin; Select RJ.*
From ##RunningJobs RJ
WHERE CHECKSUM(RJ.JobID,RJ.StartExecutionDate) NOT IN (Select CHECKSUM(JobID,StartExecutionDate) From dbo.LongRunningJobs) '
		,@body = 'View attachment to view long running jobs'
		,@subject = 'Long Running SQL Agent Job Alert'
		,@attach_query_result_as_file = 1;

--Populate LongRunningJobs table with jobs exceeding established limits
INSERT INTO [DBAdmin].[dbo].[LongRunningJobs] (
	[JobID]
	,[JobName]
	,[StartExecutionDate]
	,[AvgDurationMin]
	,[DurationLimit]
	,[CurrentDuration]
	) (
	SELECT RJ.* FROM ##RunningJobs RJ WHERE CHECKSUM(RJ.JobID, RJ.StartExecutionDate) NOT IN (
		SELECT CHECKSUM(JobID, StartExecutionDate)
		FROM dbo.LongRunningJobs
		)
	)
GO
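
The caveats call for an Agent job that runs the stored procedure every few minutes. Here is a rough sketch of creating one with the msdb job procedures. The job, step and schedule names are placeholders made up for illustration, and it assumes the procedure lives in DBAdmin as in the script above, so adjust to taste.

```sql
USE [msdb]
GO

-- Job name is a hypothetical placeholder; pick whatever fits your naming convention
EXEC dbo.sp_add_job @job_name = N'DBA - Long Running Jobs Check';

-- Single T-SQL step that calls the monitoring procedure in DBAdmin
EXEC dbo.sp_add_jobstep @job_name = N'DBA - Long Running Jobs Check'
	,@step_name = N'Run usp_LongRunningJobs'
	,@subsystem = N'TSQL'
	,@database_name = N'DBAdmin'
	,@command = N'EXEC dbo.usp_LongRunningJobs;';

-- Schedule: every day, repeating every 5 minutes starting at midnight
EXEC dbo.sp_add_jobschedule @job_name = N'DBA - Long Running Jobs Check'
	,@name = N'Every 5 minutes'
	,@freq_type = 4            -- daily
	,@freq_interval = 1        -- every 1 day
	,@freq_subday_type = 4     -- interval is measured in minutes
	,@freq_subday_interval = 5 -- run every 5 minutes
	,@active_start_time = 0;   -- 00:00:00

-- Target the local server so the Agent actually picks the job up
EXEC dbo.sp_add_jobserver @job_name = N'DBA - Long Running Jobs Check'
	,@server_name = N'(local)';
GO
```

Since the procedure only emails about jobs it has not already logged in dbo.LongRunningJobs, a 5 minute schedule should not flood your inbox for the same run.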

 

Got any feedback/comments/criticisms? Let me hear them in the comments!
