How to Correctly and Uniformly Use Progress Monitors

Summary
Handling a progress monitor instance is deceptively simple. It seems straightforward, but it is easy to make a mistake when using one. And, depending on numerous factors such as the underlying implementation, how it is displayed, whether it is set to use a fixed number of work items or ‘unknown’, whether it is used through a SubProgressMonitor wrapper and so on, the result can range from completely ok, through mildly confusing, to outright silly.

In this article I hope I can lay down a few ground rules that will help anyone use progress monitors in a way that will work with the explicit and implicit contract of IProgressMonitor. Also, understanding the usage side makes it easier to understand how to implement a monitor.

By Kenneth Ölwing, BEA JRPG
January 18, 2006


Using a progress monitor - what's up with that?

It all really comes down to a few, not too complex, rules. A common theme is 'know what you know - but only that'. This means that you shouldn't assume you know things you really don't know, and this includes the common mistake of only considering the progress monitors you have seen, typically the graphical ones in the IDE. Another thing to watch out for: you commonly design a number of tasks that call each other using sub progress monitors, and while doing so you make assumptions based on your knowledge that they will be called in this manner. Never forget that some day your separate subtasks may be called from not-yet-written routines. It is then vitally important that your subtasks act in an exactly 'neutral' manner, i.e. with no implicit assumptions about what happened before or what will happen after.

One of the motivations for this article came when I tried my hand at implementing a progress monitor intended for headless/console use. I realised that code using it could make it look really wacky when the monitor was wrongly used, and these were issues that were not as readily apparent with a graphical monitor. Also, code (including my own) frequently abuses the explicit and implicit contract that the IProgressMonitor interface states (the implicit part admittedly being my interpretation of reasonable behavior), and this makes for dicey decisions for a monitor implementor: should it complain (and how) when it gets conflicting orders? If not, how should it then behave to make for a reasonable and intuitive user experience?

The protocol of IProgressMonitor

Generally, all interaction with a progress monitor is through the interface IProgressMonitor and this interface defines the protocol behavior expected. It does leave some things up in the air though; for example, the description states some things that should be true, but the methods have no throws clause that helps enforce some invariants. I have chosen to interpret the descriptions ‘hard’, even to the point of saying it’s valid to throw an (unchecked) exception if a described rule is violated (this is somewhat controversial of course - if you implement a monitor doing this you should probably provide a way to turn off 'strictness'). Hopefully we could eventually see a new interface that deprecates the old methods and provides new ones that better reflect the contract. The discussion below is based on the assumption that the reader is familiar with the general API; review it in the Eclipse help.

The first important consideration is the realization that a monitor (contract wise) can be in basically four states. Any given implementation may or may not track those state changes and may or may not do anything about them, which is part of the reason that misbehaving users of a monitor sometimes get away with it. Only one of these states is readily testable through the interface, however (whether the monitor is canceled); the other states simply follow from correct use of the interface.

Essentially, the state changes are governed by the methods beginTask(), done() and setCanceled(), plus the implicit initial state of a new instance. Note that for the purposes discussed here the perceived ‘changes in state’ occurring as a result of calling worked() are not relevant. A separate discussion below details how to deal with worked() calls.

NB: The states described here are not any ‘officialese’ that can be found as constants or anything like that; they’re only here to serve so they can be used for discussion.

  • PRISTINE
    This is the initial state of a newly created instance of an IProgressMonitor implementation, i.e. before beginTask() has been called. In principle a given implementation may handle a single instance such that it is reusable and reverts to the PRISTINE state after a done() call, but that is opaque from the point of view of the contract. In this state it should be essentially correct and possible to go to any of the other states, but the typical and expected transition is from PRISTINE to IN_USE as a result of a successful beginTask() call. The transition to FINISHED should happen only in a very particular situation, see more below.
  • IN_USE
    This is the state of the monitor after the first and only call to beginTask(). This is one of those things that are very easy to get wrong; contract wise, beginTask() can and should be called at most once for a given instance. A more detailed discussion on the code pattern required to deal with this obligation can be found below.
  • FINISHED
    The transition to this state is achieved by calling done(). As with beginTask(), done() should only be called once and should always be called on a monitor when beginTask() has been called (i.e. it is ok to not call done() only if the monitor is still in the PRISTINE state). Again, the discussion below is more detailed on how to ensure proper protocol.
  • CANCELED
    Actually, this state is slightly murky; it’s possible that canceled/not canceled should be tracked separately from the others. But, contract wise it should be adequate if this state is either achieved directly from PRISTINE and just left that way, or if done() is called (likely as a result of detecting the canceled status), it is cleared and the monitor then transitions to FINISHED.
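
The states above can be made concrete with a small sketch. This is not Eclipse code: SimpleMonitor is a hypothetical stand-in for the handful of IProgressMonitor methods discussed here, and MonitorState and StrictMonitor are names invented purely for illustration. The sketch follows the 'hard' interpretation used in this article, so a second beginTask() or done() is rejected with an unchecked exception, and cancellation is tracked as a separate flag because that state is admittedly murky.

```java
// Hypothetical stand-in for the IProgressMonitor methods discussed in this article.
interface SimpleMonitor {
    void beginTask(String name, int totalWork);
    void worked(int work);
    void done();
    void setCanceled(boolean value);
    boolean isCanceled();
}

// Discussion-only state names; they are not constants of any Eclipse API.
enum MonitorState { PRISTINE, IN_USE, FINISHED, CANCELED }

// A 'strict' monitor: it tracks the protocol and rejects repeated beginTask()/done().
class StrictMonitor implements SimpleMonitor {
    private boolean begun;
    private boolean finished;
    private boolean canceled;

    public void beginTask(String name, int totalWork) {
        if (begun || finished) {
            throw new IllegalStateException("beginTask() may be called at most once");
        }
        begun = true;
    }

    public void worked(int work) {
        // Progress reporting does not change the protocol state.
    }

    public void done() {
        if (finished) {
            throw new IllegalStateException("done() may be called at most once");
        }
        finished = true;
    }

    public void setCanceled(boolean value) { canceled = value; }

    public boolean isCanceled() { return canceled; }

    // Derives the discussion state; FINISHED wins over CANCELED once done() was called.
    MonitorState state() {
        if (finished) return MonitorState.FINISHED;
        if (canceled) return MonitorState.CANCELED;
        return begun ? MonitorState.IN_USE : MonitorState.PRISTINE;
    }
}
```

A real implementation that is this strict should probably offer a way to switch the checks off, since plenty of existing callers break these rules.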

Now, one contract pattern described above is that if beginTask() is ever called, done() MUST be called. This is achieved by always following this code pattern (all code is simplified):
monitor = … // somehow get a new progress monitor which is in a pristine state
// figure some things out such as number of items to process etc…
try
{
    monitor.beginTask(…);
    // do stuff and call worked() for each item worked on, and check for cancellation
}
finally
{
    monitor.done();
}
The important thing here is to ensure that done() is always called (by virtue of being in the finally clause), but (normally) only if beginTask() has been successfully called (by virtue of being the first thing called in the try clause). There is a small loophole that could cause done() to be called without the monitor actually having transitioned from PRISTINE to IN_USE. With this pattern, the loophole can only occur if a particular beginTask() implementation throws an unchecked exception (the interface itself declares no throws clause) before it internally makes a note of the state change (that is, if the specific implementation even tracks state in this manner rather than being loose in its treatment of the interface contract).
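
To see the pattern hold up when the work fails halfway, here is a minimal runnable sketch. SimpleMonitor, RecordingMonitor and PatternDemo are hypothetical names introduced only for this illustration; SimpleMonitor mirrors just the three IProgressMonitor methods the pattern touches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the three IProgressMonitor methods used by the pattern.
interface SimpleMonitor {
    void beginTask(String name, int totalWork);
    void worked(int work);
    void done();
}

// Records every call so the sequence can be inspected afterwards.
class RecordingMonitor implements SimpleMonitor {
    final List<String> calls = new ArrayList<>();

    public void beginTask(String name, int totalWork) { calls.add("beginTask(" + totalWork + ")"); }
    public void worked(int work) { calls.add("worked(" + work + ")"); }
    public void done() { calls.add("done()"); }
}

class PatternDemo {
    // Processes 'items' items and fails at index 'failAt' (pass -1 for no failure).
    static void process(int items, int failAt, SimpleMonitor monitor) {
        try {
            monitor.beginTask("Processing", items);
            for (int i = 0; i < items; i++) {
                if (i == failAt) {
                    throw new IllegalStateException("item " + i + " failed");
                }
                monitor.worked(1);
            }
        } finally {
            // Reached on success and on failure alike.
            monitor.done();
        }
    }

    public static void main(String[] args) {
        RecordingMonitor monitor = new RecordingMonitor();
        try {
            process(3, 1, monitor);
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }
        System.out.println(monitor.calls);
    }
}
```

Even though the second item throws, the recorded sequence still ends with done(), which is exactly what a parent task relies on.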

Tip: Arguably, you should always strive to call beginTask()/done(). The reason is that, in principle, you never know when you are being called as a subtask. If you don't 'complete' the monitor, the parent can end up with an incorrect count for its own task. The full rationale is covered below, in the section "Ensure to always complete your monitor!".

Delegating use of a progress monitor to subtasks

Above, for the IN_USE state, I mentioned that it’s very easy to get things wrong; beginTask() should never be called more than once. This frequently happens in code that doesn’t correctly understand the implications of the contract. Specifically, such code passes on the same instance it has been given to subtasks, and those subtasks, not aware that the caller has already begun following the contract, also try to follow the contract in the expected manner, i.e. they start by calling beginTask().

Thus, passing on a monitor instance is almost always wrong unless the code knows exactly what the implications are. So the rule becomes: in the general case, a piece of code that has received a progress monitor from a caller should always assume that the instance it is given is its own, and thus completely follow the beginTask()/done() protocol. If it has subtasks that also need a progress monitor, they should be given their own monitor instances through further use of the SubProgressMonitor implementation, which wraps the ‘top-level’ monitor and correctly passes on worked() calls etc. (more on this below).

Sample code to illustrate this:
monitor = … // somehow get a new progress monitor which is in a pristine state
// figure some things out such as number of items to process etc…
try
{
    monitor.beginTask(…);
    // do stuff and call worked() for each item processed, and check for cancellation

    // farm out a piece of the work that is logically done by ‘me’ to something else
    someThing.doWork(new SubProgressMonitor(monitor,…));
    // farm out another piece of the work that is logically done by ‘me’ to something else
    anotherThing.doWork(new SubProgressMonitor(monitor,…));
}
finally
{
    monitor.done();
}
Note that each doWork() call gets a new instance of a SubProgressMonitor; such instances cannot and should not be reused, for all the protocol reasons already discussed.

The only time a single instance of a monitor passed to, or retrieved by, a certain piece of code can be reused in multiple places (typically methods called by the original receiver) is when the code in such methods is so intimately coupled that they in effect constitute a single try/finally block. Also, for this to work each method must know exactly who makes the beginTask()/done() calls, and also (don’t forget this) how many work items each represents of the total reported to beginTask(), so that they can make the correct reports. Personally, I believe this is generally more trouble than it’s worth. Always follow the regular pattern of one receiver, one unique monitor instead, and the code as a whole is more maintainable.

Managing the item count

This section is about how to do the initial beginTask() call and report the amount of total work expected, and then ideally report exactly that many items to the monitor. It is ok to end up not reporting all items in one particular case: when the job is aborted (due to cancellation by user, an exception thrown and so on) – this is normal and expected behavior and we will wind up in the finally clause where done() is called.

It is however sloppy technique to ‘just pick a number’ for the total and then call worked(), reporting a number and hope that the total is never exceeded. Either way this can cause very erratic behavior of the absolute top level and user visible progress bar (it is for a human we’re doing this after all) – if the total is too big compared to the actual items reported, a progress bar will move slowly, perhaps not at all due to scaling and then suddenly (at the done() call) jump directly to completed. If the total is too small, the bar will quickly reach ’100%’ or very close to it and then stay there ‘forever’.
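
The two failure modes can be put into numbers with a tiny sketch. It assumes a naive progress bar that simply shows worked/total as a percentage and clamps at 100; GuessedTotalDemo and percentShown are hypothetical names, and a real monitor implementation may of course display things differently.

```java
class GuessedTotalDemo {
    // Percentage a naive bar would show; clamped at 100 once the declared total is exceeded.
    static int percentShown(int workedSoFar, int declaredTotal) {
        return (int) Math.min(100L, (long) workedSoFar * 100 / declaredTotal);
    }

    public static void main(String[] args) {
        int actualItems = 50;

        // Total guessed far too big: the bar has barely moved when done() arrives.
        System.out.println("guessed 1000: " + percentShown(actualItems, 1000) + "% before done()");

        // Total guessed too small: the bar is full after 10 items, with 40 still to go.
        System.out.println("guessed 10, after 10 items: " + percentShown(10, 10) + "%");
        System.out.println("guessed 10, after 50 items: " + percentShown(actualItems, 10) + "%");
    }
}
```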

So, first and foremost: do not guess the number of work items. It’s a simple binary answer: either you know exactly how many things will be processed, or you don’t. It IS ok to not know! If you don't know, just report IProgressMonitor.UNKNOWN as the total number, call worked() to your heart's content, and a clever progress monitor implementation will still do something useful with it. Note that each (sub)task can and should make its own decision on what it knows or not. If all are following the protocol, it will ensure proper behavior at the outer, human visible end. A heads up though: never call the SubProgressMonitor(parentMonitor, subticks) constructor using IProgressMonitor.UNKNOWN for subticks - this is wrong! More on this later.

How to call beginTask() and worked()

There are typically two basic patterns where you know how many items you want to process: either you are going to call several different methods to achieve the full result, or you are going to call one method for each instance in a collection of some sort. Either way you know the total item count to process (the number of methods or the size of the collection). Variations of this are obviously combinations of these basic patterns so just multiply and sum it all up.

There is sometimes a benefit in scaling your total a bit. So, instead of reporting ‘3’ as the total (and calling worked(1) for each item) you may consider scaling by, say, 1000 and reporting ‘3000’ instead (and calling worked(1000) for each item). The benefit comes up when you are farming out work to subtasks through a SubProgressMonitor; since they may internally have a very different total, especially one that is much bigger than your total, you give them (and the monitor instance) some ‘room’ to more smoothly consume and display the allotment you’ve given them (more explanations below on how to mix worked() and SubProgressMonitor work). Consider that you say ‘my total is 3’ and you then give a subtask ‘1’ of these to consume. If the subtask now reports several thousand worked() calls, and assuming that the actual human visible progress bar also has the room, the internal protocol between a SubProgressMonitor and its wrapped monitor will scale better and give smoother movement if you had instead given it ‘1000’ out of ‘3000’. Or not - the point is really that you don't know what monitor implementation will be active; all you can do is give some information. How it's then displayed in reality is a matter of how nifty the progress monitor implementation is.
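
A rough way to see the ‘room’ argument is to count how many whole parent ticks a subtask's progress maps to. The sketch below assumes, purely for illustration, a wrapper that can only forward integer ticks to its parent; an actual wrapper may well forward fractional work, in which case this only shows the worst case. ScalingDemo and parentTicksForwarded are hypothetical names.

```java
class ScalingDemo {
    // Whole parent ticks corresponding to 'subDone' out of 'subTotal' sub-items,
    // when the subtask was allotted 'allotment' parent ticks (integer-only assumption).
    static int parentTicksForwarded(int subDone, int subTotal, int allotment) {
        return (int) ((long) subDone * allotment / subTotal);
    }

    public static void main(String[] args) {
        int subTotal = 5000;

        // Allotment of 1 out of a total of 3: nothing visible until the very end.
        System.out.println(parentTicksForwarded(2500, subTotal, 1));    // halfway
        System.out.println(parentTicksForwarded(5000, subTotal, 1));    // finished

        // Allotment of 1000 out of a total of 3000: progress shows up every 5 sub-items.
        System.out.println(parentTicksForwarded(5, subTotal, 1000));
        System.out.println(parentTicksForwarded(2500, subTotal, 1000)); // halfway
    }
}
```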

A sample of simple calls:
monitor = … // somehow get a new progress monitor which is in a pristine state
int total = 3; // hardcoded and known
try
{
    // beginTask() takes a task name and the total amount of work
    monitor.beginTask("Doing three parts", total);

    // item 1
    this.doPart1();
    monitor.worked(1);

    // item 2
    this.doPart2();
    monitor.worked(1);

    // item 3
    this.doPart3();
    monitor.worked(1);
}
finally
{
    monitor.done();
}
No reason to scale and no collection to dynamically compute.

A more elaborate sample:
monitor = … // somehow get a new progress monitor which is in a pristine state
// three steps per thingy, plus one step before and one step after
int total = thingyList.size() * 3 + 2;
try
{
    monitor.beginTask("Processing thingies", total);

    // item 1
    this.doBeforeAllThingies();
    monitor.worked(1);

    // items 2 to total-1
    for (Thingy t : thingyList)
    {
        t.doThisFirst();
        monitor.worked(1);
        t.thenDoThat();
        monitor.worked(1);
        t.lastlyDoThis();
        monitor.worked(1);
    }

    // final item
    this.doAfterAllThingies();
    monitor.worked(1);
}
finally
{
    monitor.done();
}

Mixing straightforward calls with subtasks

I was initially confused by how to report progress when I farmed out work to subtasks. I experienced ‘reporting too much work’ since I didn’t understand when to call and when to not call worked(). Once I caught on, the rule is very simple however: calling a subtask with a SubProgressMonitor is basically an implicit call to worked() with the amount allotted to the subtask. So instead of this:
monitor = … // somehow get a new progress monitor which is in a pristine state
int scale = 1000;
int total = 3; // hardcoded and known
try
{
    monitor.beginTask("Doing three parts", total * scale);

    // item 1
    this.doPart1();
    monitor.worked(1 * scale);

    // item 2
    this.doPart2(new SubProgressMonitor(monitor, 1 * scale)); // allot 1 item
    monitor.worked(1 * scale); // WRONG! Not needed, already managed by the SubProgressMonitor

    // item 3
    this.doPart3();
    monitor.worked(1 * scale);
}
finally
{
    monitor.done();
}
You should just leave out the second call to worked().
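
The bookkeeping can be checked with a self-contained sketch. Everything in it is a stand-in invented for illustration: SimpleMonitor mirrors three IProgressMonitor methods, CountingMonitor just sums the ticks it receives, and SubSketch is a deliberately simplified wrapper (not the real SubProgressMonitor) that maps the subtask's own total onto the ticks allotted by the parent.

```java
// Hypothetical stand-in for the IProgressMonitor methods used below.
interface SimpleMonitor {
    void beginTask(String name, int totalWork);
    void worked(int work);
    void done();
}

// Top-level monitor that only adds up the ticks reported to it.
class CountingMonitor implements SimpleMonitor {
    int total;
    int reported;

    public void beginTask(String name, int totalWork) { total = totalWork; }
    public void worked(int work) { reported += work; }
    public void done() { }
}

// Simplified wrapper sketch: maps the subtask's total onto 'allotment' parent ticks.
class SubSketch implements SimpleMonitor {
    private final SimpleMonitor parent;
    private final int allotment;
    private int subTotal;
    private int subDone;
    private int forwarded;

    SubSketch(SimpleMonitor parent, int allotment) {
        this.parent = parent;
        this.allotment = allotment;
    }

    public void beginTask(String name, int totalWork) { subTotal = totalWork; }

    public void worked(int work) {
        if (subTotal <= 0) return; // nothing to scale against
        subDone = Math.min(subTotal, subDone + work);
        forward((int) ((long) subDone * allotment / subTotal) - forwarded);
    }

    // Completing the subtask hands any remaining allotted ticks to the parent.
    public void done() { forward(allotment - forwarded); }

    private void forward(int ticks) {
        if (ticks > 0) {
            parent.worked(ticks);
            forwarded += ticks;
        }
    }
}

class MixingDemo {
    // A subtask with its own, much larger, item count.
    static void doPart2(SimpleMonitor monitor) {
        try {
            monitor.beginTask("Part 2", 5000);
            for (int i = 0; i < 5000; i++) {
                monitor.worked(1);
            }
        } finally {
            monitor.done();
        }
    }

    // Runs three logical items, delegating item 2; returns the ticks the parent received.
    static int run(boolean extraWorkedCall) {
        int scale = 1000;
        CountingMonitor monitor = new CountingMonitor();
        try {
            monitor.beginTask("Three parts", 3 * scale);
            monitor.worked(1 * scale);                  // item 1
            doPart2(new SubSketch(monitor, 1 * scale)); // item 2, reported via the wrapper
            if (extraWorkedCall) {
                monitor.worked(1 * scale);              // the mistake described in the text
            }
            monitor.worked(1 * scale);                  // item 3
        } finally {
            monitor.done();
        }
        return monitor.reported;
    }

    public static void main(String[] args) {
        System.out.println("correct: " + run(false) + " of 3000");
        System.out.println("with extra worked(): " + run(true) + " of 3000");
    }
}
```

With the extra call the parent receives more ticks than it declared, which is the ‘reporting too much work’ symptom.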

Tip: Never pass IProgressMonitor.UNKNOWN (or any other negative value) when creating a SubProgressMonitor wrapper!

A situation I ran into just the other day arose when doing an IProgressMonitor.UNKNOWN number of things. I needed to call a subtask, and hence set up to call it using a SubProgressMonitor(parent, subticks), but I realized that I had never considered how the sub monitor should be created (how many subticks it should be given) in the unknown case. I figured it should be ok to pass IProgressMonitor.UNKNOWN there also. However, when later trying my code I saw to my horror that my progress bar went backwards! Not the effect I had figured on...

As it turns out, this is because the implementation (as of Eclipse 3.2M3) blindly uses the incoming ticks as a scaling factor. However, it goes haywire when it receives a negative value (and IProgressMonitor.UNKNOWN happens to have a value of -1). It does computations with it and ends up calling worked() with negative values, which my monitor tried to process... My monitor code is now fixed to be more resilient in such cases. I've filed bug #119018 to request that SubProgressMonitor handle this better and/or document that negative values are a bad idea for the constructor call.

In any case, passing IProgressMonitor.UNKNOWN to the constructor is incorrect. If you have called beginTask() using IProgressMonitor.UNKNOWN, you can happily pass any reasonable tick value to a SubProgressMonitor; it will give the correct result.
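
The backwards-moving bar is easy to reproduce on paper. The sketch below is not the Eclipse source; it only assumes a wrapper that multiplies reported work by parentTicks/subTotal without any sanity check, and contrasts it with a defensive variant. NegativeTicksDemo and both method names are hypothetical.

```java
class NegativeTicksDemo {
    static final int UNKNOWN = -1; // same value as IProgressMonitor.UNKNOWN

    // Naive scaling: parent ticks forwarded for 'work' sub-items, with no sanity checks.
    static double naiveForward(int work, int subTotal, int parentTicks) {
        double scale = (double) parentTicks / subTotal;
        return work * scale;
    }

    // Defensive variant: a non-positive allotment or total forwards nothing.
    static double guardedForward(int work, int subTotal, int parentTicks) {
        if (parentTicks <= 0 || subTotal <= 0) {
            return 0.0;
        }
        return naiveForward(work, subTotal, parentTicks);
    }

    public static void main(String[] args) {
        // UNKNOWN used as the allotment turns every report into negative progress.
        System.out.println(naiveForward(5, 10, UNKNOWN));
        System.out.println(guardedForward(5, 10, UNKNOWN));
        System.out.println(guardedForward(5, 10, 100));
    }
}
```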

Ensure to always complete your monitor!

Consider the concept described in the previous section: the important thing here is that basically, you say that you have three distinct and logical things to do, and then you tick them off - but one of the ticks is actually farmed out to a subtask through a SubProgressMonitor. You don't really know how many distinct and logical things the subtask has to do, nor should you care. The mechanics of using a SubProgressMonitor make the advancement of one of your ticks happen in the correct way. So, the end expectation is that once you reach the end of your three things, the monitor you have has actually fulfilled the count you intended - the internal state of it should reflect this: "the user said three things should happen and my work count is now indeed '3'".

But, as I recently found out, this can fail. Specifically, I blindly invoked IProject.build() on a project which had no builders configured. To this method I sent in a SubProgressMonitor and allotted it one 'tick' of mine. But, as it turned out, internally it never used the monitor it got, presumably because there was no work to perform - not very unreasonable in a sense. However, this did have the effect that one of my ticks never got, well, 'tocked' :-). I could solve this specific problem by simply checking if there were any builders configured, and if there were none, I simply advanced the tick by worked(1) instead. But, it requires me, the caller, to make assumptions about the internal workings of the subtask, which is never good.

This is not a huge problem of course. But, I think it would make sense to always act the same. The code resulting from IProject.build() could just call beginTask("", countOfBuilders) regardless of whether countOfBuilders was 0, iterate over the empty array or whatever, and then call done(). This would correctly advance my tick.
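
The difference between a subtask that ignores its monitor and one that always completes it can be shown with a small sketch. All names are hypothetical stand-ins: SimpleMonitor mirrors three IProgressMonitor methods, and SubSketch is a simplified wrapper that, by assumption, hands any unused part of its allotment to the parent when done() is called.

```java
// Hypothetical stand-in for the IProgressMonitor methods used below.
interface SimpleMonitor {
    void beginTask(String name, int totalWork);
    void worked(int work);
    void done();
}

// Parent monitor that only adds up the ticks reported to it.
class CountingMonitor implements SimpleMonitor {
    int reported;

    public void beginTask(String name, int totalWork) { }
    public void worked(int work) { reported += work; }
    public void done() { }
}

// Simplified wrapper sketch; done() forwards whatever is left of the allotment.
class SubSketch implements SimpleMonitor {
    private final SimpleMonitor parent;
    private final int allotment;
    private int subTotal;
    private int subDone;
    private int forwarded;

    SubSketch(SimpleMonitor parent, int allotment) {
        this.parent = parent;
        this.allotment = allotment;
    }

    public void beginTask(String name, int totalWork) { subTotal = totalWork; }

    public void worked(int work) {
        if (subTotal <= 0) return; // nothing to scale against
        subDone = Math.min(subTotal, subDone + work);
        forward((int) ((long) subDone * allotment / subTotal) - forwarded);
    }

    public void done() { forward(allotment - forwarded); }

    private void forward(int ticks) {
        if (ticks > 0) {
            parent.worked(ticks);
            forwarded += ticks;
        }
    }
}

class CompletionDemo {
    // Subtask that skips the monitor entirely when there is nothing to do.
    static void buildIgnoringMonitor(int builders, SimpleMonitor monitor) {
        if (builders == 0) return;
        buildCompletingMonitor(builders, monitor);
    }

    // Subtask that follows the protocol even for zero builders.
    static void buildCompletingMonitor(int builders, SimpleMonitor monitor) {
        try {
            monitor.beginTask("", builders);
            for (int i = 0; i < builders; i++) {
                monitor.worked(1);
            }
        } finally {
            monitor.done();
        }
    }

    // The caller has three things to do; the second is delegated with one tick allotted.
    static int parentTicksAfter(boolean subtaskCompletes) {
        CountingMonitor parent = new CountingMonitor();
        try {
            parent.beginTask("Three things", 3);
            parent.worked(1);
            SimpleMonitor sub = new SubSketch(parent, 1);
            if (subtaskCompletes) {
                buildCompletingMonitor(0, sub);
            } else {
                buildIgnoringMonitor(0, sub);
            }
            parent.worked(1);
        } finally {
            parent.done();
        }
        return parent.reported;
    }

    public static void main(String[] args) {
        System.out.println("ignoring subtask:   " + parentTicksAfter(false) + " of 3");
        System.out.println("completing subtask: " + parentTicksAfter(true) + " of 3");
    }
}
```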

Cancellation

The sample code above does not show cancellation checks. However, it is obviously recommended that users of a progress monitor actively check for cancellation so that they can break out of the operation in a timely manner. The more (potentially) long-running the operation, the more important this is, of course. And remember: you don't know if the operation is running in a context that allows it to be canceled or not - so you just have to code defensively. A sample of how it could look:
monitor = … // somehow get a new progress monitor which is in a pristine state
try
{
    monitor.beginTask("Processing thingies", thingyList.size());

    for (Thingy t : thingyList)
    {
        // check before each item so a cancel request is honored promptly
        if (monitor.isCanceled())
            throw new OperationCanceledException();
        t.doSomething();
        monitor.worked(1);
    }
}
finally
{
    monitor.done();
}
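
The following runnable sketch plays the same loop against a monitor that becomes canceled after a fixed number of items. The names are hypothetical: SimpleMonitor mirrors four IProgressMonitor methods, CanceledException stands in for OperationCanceledException, and CancelAfterMonitor simulates a user pressing cancel.

```java
// Hypothetical stand-in for the IProgressMonitor methods used below.
interface SimpleMonitor {
    void beginTask(String name, int totalWork);
    void worked(int work);
    void done();
    boolean isCanceled();
}

// Stand-in for OperationCanceledException, so the sketch has no Eclipse dependency.
class CanceledException extends RuntimeException { }

// Simulates a user who cancels once 'cancelAfter' items have been reported.
class CancelAfterMonitor implements SimpleMonitor {
    private final int cancelAfter;
    int workedSoFar;
    boolean doneCalled;

    CancelAfterMonitor(int cancelAfter) { this.cancelAfter = cancelAfter; }

    public void beginTask(String name, int totalWork) { }
    public void worked(int work) { workedSoFar += work; }
    public void done() { doneCalled = true; }
    public boolean isCanceled() { return workedSoFar >= cancelAfter; }
}

class CancellationDemo {
    static void process(int items, SimpleMonitor monitor) {
        try {
            monitor.beginTask("Processing", items);
            for (int i = 0; i < items; i++) {
                // Check before each item so a cancel request is honored promptly.
                if (monitor.isCanceled()) {
                    throw new CanceledException();
                }
                monitor.worked(1);
            }
        } finally {
            monitor.done();
        }
    }

    public static void main(String[] args) {
        CancelAfterMonitor monitor = new CancelAfterMonitor(2);
        try {
            process(5, monitor);
        } catch (CanceledException e) {
            System.out.println("canceled after " + monitor.workedSoFar + " items");
        }
        System.out.println("done() called: " + monitor.doneCalled);
    }
}
```

The loop stops after two of the five items, and done() is still called on the way out.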

The NullProgressMonitor

A common pattern is to allow callers to skip sending a monitor, i.e. sending ‘null’. A simple and convenient way to deal with such calls is this:
public void doIt(IProgressMonitor monitor)
{
    // ensure there is a monitor of some sort
    if (monitor == null)
        monitor = new NullProgressMonitor();

    try
    {
        monitor.beginTask("Processing thingies", thingyList.size());

        for (Thingy t : thingyList)
        {
            if (monitor.isCanceled())
                throw new OperationCanceledException();
            t.doSomething();
            monitor.worked(1);
        }
    }
    finally
    {
        monitor.done();
    }
}

Conclusion

I believe that by diligently following these rules and patterns, you will never have a problem in using the progress monitor mechanism. Obviously, it requires implementations to follow the contract as well. But remember, if you mistreat the protocol you will sooner or later end up talking to a progress monitor implementation that is stern and will simply throw an exception or give strange visual effects if you call its beginTask() one time too many. That is essentially valid behavior if the IProgressMonitor interface description is to be believed, and you will get blamed by your customer…