[译者按]Andy Li这篇文章,是我看过的最好的,最透彻的关于SharePoint Workflow架构的文章。通过阅读他的文章,我才清楚的了解了SharePoint Workflow的运作机制,并且在遇到问题的时候,知道如何下手查找问题。因此翻译过来,希望对Workflow的开发人员有帮助。本部分主要讲trouble shooting。
这篇博客是由Andy Li贡献的,他是SharePoint开发人员支持组的处理疑难问题的工程师。原文地址。这个关于Workflow的系列,是他贡献给社区的,帮助大家更好的理解Workflow运行时的内部机制,以及如何和SharePoint交互。
这篇文章由SPFarmer翻译。
上一篇地址:
查找workflow问题最常用的log是Workflow Runtime trace和ULS logs。
Workflow Runtime Trace
想要开启workflow logging,在调试问题的时候,你可以使用配置文件,来产生Windows Workflow Foundation logging的输出。 Windows Workflow Foundation 使用的配置文件格式,在Microsoft .NET Framework 2.0时期就引入了。下面的例子显示了如果开启Windows Workflow Foundation 的log。
要想使用配置文件来控制workflow runtime engine 的log输出,首先需要找出哪些进程可能运行你的workflow。W3wp.exe 可能是第一个映入你脑海里的,但是不要忘记了timer service 也运行workflow. 如果你没有一个OWSTIMER.exe.CONFIG 文件,你可以在14\bin文件夹下手动创建一个。
下面是一个例子:在initializeData 属性上定了一个绝对路径。同时,在traceOutputOptions属性里也定了一些其他的选项 : ProcessId 和 DateTime.
如果这个属性被一个web application使用,并且有多个w3wp.exe 实例在运行,每一个属性会创建他自己的trace文件。在这种情况下,除了 WFTrace.log 文件,你会看到一些以<GUID>WFTrace.log命名的文件。
<?xml version="1.0" encoding="utf-8"?>
<configuration>
<system.diagnostics>
<sources>
<source name="System.Workflow.Runtime" >
<listeners>
<add name = "System.Workflow"/>
</listeners>
</source>
<source name="System.Workflow.Runtime.Hosting">
<listeners>
<add name="System.Workflow"/>
</listeners>
</source>
<source name="System.Workflow.Activities">
<listeners>
<add name="System.Workflow"/>
</listeners>
</source>
</sources>
<sharedListeners>
<add name="System.Workflow"
type="System.Diagnostics.TextWriterTraceListener"
initializeData="d:\_temp\WFTrace.log"
traceOutputOptions="DateTime,ProcessId"/>
</sharedListeners>
<switches>
<add name="System.Workflow.LogToTraceListeners" value="1"/>
<add name="System.Workflow.Runtime" value="All" />
<add name="System.Workflow.Runtime.Hosting" value="All" />
<add name="System.Workflow.Runtime.Tracking" value="All" />
<add name="System.Workflow.Activities" value="All" />
<add name="System.Workflow.Activities.Rules" value="All" />
</switches>
</system.diagnostics>
</configuration>
Trace 监听器使用traceOutputOptions 来检查哪个选项或者元素应该被加到trace输出中。绝大都数情况下, DateTime 和 ProcessId 就很好了. 你还可以添加“Callstack” 这样会得到log的更多的上下文新消息。
ULS Logs
想要打出workflow的verbose log ,打开Central Administration 站点,然后在左侧导航里点击Monitoring. 点击Configure diagnostics logging,在SharePoint Foundation category下选上General, Timer 和Workflow Infrastructure ,并且在least critical event 里面选择Verbose ,这样就能把最详细的log打出来。我们开启“General” 是因为绝大多数event receiver相关的logs都在 “General” category 里面,并且在查找事件传输管道时候,非常有用。
分析 ULS Logs
绝大多数的workflow错误是因为除了一些异常,你应该可以在ULS logs里找到这些异常。有时候workflow 在页面上看起来没问题,比如, workflow在页面上状态显示的是“In Progress”, 但是有一些task被锁住了。这个时候,就需要搜索ULS logs,看看有没有关于特定workflow实例的异常信息。调用堆栈应该能显示出来哪里出了问题。
每一个ULS log都有一个单独的tag和它关联例。例如 “936m” 就是所有“Events Successfully delivered toInstance: <InstanceId>”的tag。
07/25/2011 12:31:01.68
OWSTIMER.EXE (0x1DCC)
0x1E34
SharePoint Foundation
Workflow Infrastructure
936m
Verbose
RunWorkflow: Events successfully delivered to Instance: e63dcf89-5cd5-4121-8302-58063851e903, Association: dde85862-a347-4ab2-9978-f8fbaf7d965f, BaseTemplate: 62506157-6fad-4127-8b0f-6f37fa023000 2f8a3470-d977-4b09-b51b-5770bf6e86b4
下面的表格列举了ULS log里最可能出现的的workflow错误和异常:
ULS tag | Location and Log |
tag_72fq | Location: SPWorkflowManager.StartWorkflow() Message: "Start Workflow: <exception message>” |
tag_72fr | Location: SPWorkflowManager.SaveInstanceData() Message: "Workflow Save Instance: <exception message>” |
tag_98d6 | Location: SPWorkflowManager.SaveInstanceData() |
tag_72fs | Location: SPWorkflowManager.RunWorkflow() Message: "RunWorkflow: <exception message>" |
tag_98d7 | Location: SPWorkflowManager.RunWorkflow() Message: An entry is logged into HistoryList |
tag_72fu | Location: SPWorkflowManager.LoadPluggableClass Message: “Load Workflow Class: <exception message>" |
tag_72fr | Location: SPWorkflowManager.AutoStartWorkflow Message: “Declarative workflow may fail to start because it was trying to impersonate the workflow author, but that user does not exist on this site." |
tag_72fv | Location: SPWorkflowManager.AutoStartWorkflow Message: “AutoStart Workflow: <exception message>" |
tag_7fbw | Location: SPWorkflowManager.DelaySTartworkflow Message: "DelayedStart Workflow: <exception message>” |
tag_72et, | Location: SPWinOeHostServices.CreateSubscription |
tag_72eu, | Location: SPWinOeHostServices.DeleteSubscription |
tag_72ev | Location: SPWinOETaskService.CreateTask |
tag_72ew, | Location: SPWinOETaskService.CreateTaskWithContnetType |
tag_72ez, | Location: SPWinOETaskService.DeleteTask |
tag_72f0, | Location: SPWinOETaskService.CompleteTask |
tag_72f1, | Location: SPWinOETaskService.RollBackTask |
tag_72f2 | Location: SPWinOETaskService.SendEmail |
tag_75yn | Location: SPWinOEHostService.LoadWorkflowAssembly “Load Workflow Assembly: %s"” |
tag_88xr, | Location: SPWinOEHostService.TerminateWorkflow “WinWF Internal Error, terminating workflow Id# <InstanceID>" |
tag_98d4, | Location: SPWinOEHostService.TerminateWorkflow |
tag_98d5 | Location: SPWinOEHostService .ExceptionNotHandled |
tag_72eo | Location: SPWinOEHostService .DehydrateInstance Message: “DehydrateInstance: <Exception Message>” |
tag_72ep | Location: SPWinOEHostService .WorkItemDequeue Message: “Workflow WorkItem Dequeue: <Exception Message>" |
tag_936l | Location: SPWinOeEngine.RunWorkflow Message: "Engine RunWorkflow: <Exception Message>", |
tag_936m | Location: SPWinOeEngine.RunWorkflow Message: "RunWorkflow: Events successfully delivered to Instance: <InstanceID> |
0000 | "In workflow failover job, unlocking workflow instance: {0}, processing id: {1}" |
0000 | In workflow failover job, failover in progress work items for workflow instance: {1}, processing id: {1} |
在2011年8月份和2012年2月份的的CU(SharePoint 2010),我们添加了更多的log 来诊断workflow 问题。他们在tag里都显示“0000” 。这些log对于查找一些复杂的问题非常有用。
ULS tag | Location and Log |
0000 | "Running timer event for workflow instance <InstanceID>” |
0000 | “Running non-timer event for instance <InstanceID>” |
0000 | "After RunWorkflow() the instance is still running. Calling TryUnload() on workflow instance: < InstanceID>” |
0000 | “Processing {0} events for workflow instance: < InstanceID>, processing id: <ProcessingID>” |
0000 | "Stopped processing events for workflow instance: <InstanceID> because the instance is unlocked." |
0000 | "Processing event <eventName> with id <EventId> for workflow instance: < InstanceID>, processing id: <ProcessingID>” |
0000 | "In RaiseEvent(), calling RunWorkflowElev for instance < InstanceID>, event type: <EventType>" |
0000 | "Calling RouteWorkflowEvent() on list <ListId>, item <ItemID> for event type: <EventType>” |
0000 | "In Workflow event receiver, ItemUpdating()"); |
0000 | "In ItemUpdated() call on workflow event receiver for list id: <ListId>, item id: <ItemID>” |
0000 | "Setting up workflow event receiver for instance: <InstanceID> on list <ListID>" |
0000 | "Loaded instance data for instance: <InstanceID>, data size: <Size>, processing id: <ProcessingID>" |
0000 | "Calling SaveInstanceData for workflow instance:<InstanceID>; instance data size: <Size>, internal state: <InternalState>, processing id: <ProcessingID>" |
0000 | "Failed to persist workflow instance: <InstanceID> with error code: <InstanceID>,instance data size: <Size>, internal state: <InternalState>, processing id: <ProcessingID>" |
0000 | "Successfully persisted workflow instance: <InstanceID> with error code: <ErrorCode>, instance data size: <Size>, internal state: <InternalState> processing id: <ProcessingID>" |
0000 | "Unlocking instance: <InstanceID> with processing id: <ProcessingID> because its internal state <InternalState> is completed or terminated" |
0000 | "In RunWorkflowElev(), Enqueue pending event: <Event> for instance: <InstanceID>" |
0000 | "In RunWorkflowElev(), begin processing events for instance: <InstanceID>” |
0000 | "Updating workflow task with id <TaskID> on list <ListID> in AlterTask()" |
0000 | "In fireItemEvent(), for list <ListID>, and item: <ItemID>", |
0000 | "In fireSyncItemEvent(), calling ExecuteItemEventReceivers()") |
希望这些信息能够对你查找Workflow问题有帮助。