该篇为英文,自己google翻译。
原文地址:https://blog.codinghorror.com/the-problem-with-logging/
The Problem With Logging
日志记录的问题
A recent Stack Overflow post described one programmer's logging style. Here's what he logs:
最近的一篇堆栈溢出帖子描述了一位程序员的日志样式。以下是他记录的:
INFO Level
INFO级别
The start and end of the method
方法的开始和结束
The start and end of any major loops
任何主要循环的开始和结束
The start of any major case/switch statements
任何主要案例/切换语句的开头
DEBUG Level
DEBUG级别
Any parameters passed into the method
传递给方法的任何参数
Any row counts from result sets I retrieve
我检索的结果集中的任何行计数
Any datarows that may contain suspicious data when being passed down to the method
当传递给方法时,任何可能包含可疑数据的数据行
Any "generated" file paths, connection strings, or other values that could get mungled up when being "pieced together" by the environment.
任何“生成的”文件路径,连接字符串或其他值可能被环境“拼凑在一起”时会杂乱无章。
ERROR Level
ERROR级别
Handled exceptions
处理的异常
Invalid login attempts (if security is an issue)
无效的登录尝试(如果存在安全问题)
Bad data that I have intercepted forreporting
我截取报告的不良数据
FATAL Level
FATAL级别
Unhandled exceptions.
未处理的异常。
I don't mean to single out the author here, but this strikes me as a bit .. excessive.
我不是要在这里挑出作者,但这让我有些吃惊。
Although I've never been a particularly big logger, myself, one of my teammates on Stack Overflow is. So when building Stack Overflow, we included log4net, and logged a bunch of information at the various levels. I wasn't necessarily a big fan of the approach, but I figured what's the harm.
尽管我从来都不是一个特别大的记录员,但是我自己,我在Stack Overflow上的一个队友是。因此,在构建Stack Overflow时,我们包括了log4net,并记录了各个级别的大量信息。我不一定是这种方法的忠实拥护者,但我知道有什么害处。
Logging does have a certain seductive charm. Why not log as much as you can whenever you can? Even if you're not planning to use it today, who knows, it might be useful for troubleshooting tomorrow. Heck, just log everything! What could it possibly hurt?
记录日志确实具有一定的诱人魅力。为什么不尽可能多地记录日志?知道的,即使您今天不打算使用它,也可能对明天进行故障排除很有用。哎呀,只需记录所有内容!它可能会伤害什么?
Oh, sure, logging seems harmless enough, but let me tell you, it can deal some serious hurt. We ran into a particularly nasty recursive logging bug:
哦,可以肯定,记录日志似乎无害,但是让我告诉您,它可能造成一些严重伤害。我们遇到了一个特别讨厌的递归日志记录错误:
- On thread #1, our code was doing Log (lock) / DB stuff (lock)
- 在线程#1上,我们的代码正在执行Log(锁定)/ DB事务(锁定)
- On thread #2, our code was doing DB stuff (lock) / log stuff (lock)
- 在线程2上,我们的代码正在执行DB操作(锁定)/日志操作(锁定)
If these things happened close together enough under heavy load, this resulted in -- you guessed it -- a classic out-of-order deadlock scenario. I'm not sure you'd ever see it on a lightly loaded app, but on our website it happened about once a day on average.
如果这些事情在重负载下发生得足够紧密,那么就会导致-您猜到了-经典的乱序死锁场景。我不确定您是否会在轻负载的应用程序上看到它,但是在我们的网站上,它平均每天大约发生一次。
I don't blame log4net for this, I blame our crappy code. We spent days troubleshooting these deadlocks by .. wait for it .. adding more logging! Which naturally made the problem worse and even harder to figure out. We eventually were forced to take memory dumps and use dump analysis tools. With the generous assistance of Greg Varveris, we were finally able to identify the culprit: our logging strategy. How ironic. And I mean real irony, not the fake Alanis Morrissette kind.
我不怪log4net,我怪我们糟糕的代码。我们花了几天的时间来解决这些死锁问题通过,..等待它(wait for it 可以忽略)..添加更多日志记录!这自然使问题变得更糟,甚至更难解决。我们最终被迫进行内存转储并使用转储分析工具。在Greg Varveris的慷慨帮助下,我们终于能够找到罪魁祸首:我们的日志记录策略。多讽刺。我的意思是真正的讽刺,而不是假的Alanis Morrissette(人名)那种。
Although I am a strong believer in logging exceptions, I've never been a particularly big fan of logging in the general "let's log everything we possibly can" sense:
虽然我非常相信日志记录异常,但我从来不是一个特别喜欢日志记录的人,一般来说“让我们记录所有可能发生的事情”的感觉是:
- Logging means more code. If you're using a traditional logging framework like log4net, every logged event is at least one additional line of code. The more you log, the larger your code grows. This is a serious problem, because code is the enemy. Visible logging code is clutter -- like excessive comments, it actively obscures the code that's doing the real work in the application.记录日志意味着更多代码。如果您使用的是传统的日志记录框架(例如log4net),则每个记录的事件至少要包含一行代码。您记录的越多,代码的增长就越大。这是一个严重的问题,因为代码是敌人。可见的日志记录代码很混乱——就像过多的注释一样,它会使在应用程序中执行实际工作的代码变得模糊。
- Logging isn't free. Most logging frameworks are fairly efficient, but they aren't infinitely fast. Every log row you write to disk has an overall performance cost on your application. This can also be tricky if you're dissecting complex objects to place them in the log; that takes additional time.记录不是免费的。大多数日志记录框架都相当高效,但并不是无限快。写入磁盘的每个日志行都会对应用程序造成整体性能损失。如果要剖析复杂的对象以将它们放入日志中,这也会很棘手;这需要额外的时间。
- If it's worth saving to a logfile, it's worth showing in the user interface. This is the paradox: if the information you're logging is at all valuable, it deserves to be surfaced in the application itself, not buried in an anonymous logfile somewhere. Even if it's just for administrators. Logfiles are all too often where useful data goes to die, alone, unloved and ignored.如果值得保存到日志文件,则值得在用户界面中显示。这是自相矛盾的:如果您要记录的信息非常有价值,则应将其显示在应用程序本身中,而不是将其隐藏在某个地方的匿名日志文件中。即使仅适用于管理员。日志文件经常是有用数据消失的地方,孤独,不受欢迎和忽视。
- The more you log, the less you can find. Log enough things and eventually your logs are so noisy nobody can find anything. It's all too easy to bury yourself in an avalanche of log data. Heck, that's the default: any given computer is perfectly capable of generating more log data than any of us could possibly deal with in our lifetime. The hidden expense here isn't the logging, it's the brainpower needed to make sense of these giant logs. I don't care how awesome your log parsing tools are, nobody looks forward to mining a gigabyte of log files for useful diagnostic information.你记录的越多,你能找到的就越少。记录足够多的东西,最终您的日志太多干扰(噪音),没人能找到任何东西。将自己埋没在大量的日志数据中太容易了。哎呀,这是默认设置:任何给定的计算机都能完美地生成比我们一生中可能处理的日志数据更多的日志数据。这里隐藏的费用不是记录日志,而是理解这些巨大日志所需的智力。我不在乎您的日志解析工具有多棒,没有人期望挖掘出千兆字节的日志文件以获取有用的诊断信息。
- The logfile that cried Wolf. Good luck getting everyone on your team to agree on the exact definitions of FATAL, ERROR, DEBUG, INFO, and whatever other logging levels you have defined. If you decide to log only the most heinous serial-killer mass-murderer type problems, evil has a lot less room to lurk in your logfiles -- and it'll be a heck of a lot less boring when you do look.哭狼(幽默)的日志文件。祝您好运,让团队中的每个人都能就FATAL、ERROR、DEBUG、INFO以及您定义的其他日志级别的确切定义达成一致。如果你决定只记录那些最令人发指的连环杀手、大规模杀人犯类型的问题,那么邪恶在你的日志文件中潜伏的空间就小得多了——当你仔细看的时候,它就不会那么无聊了。
So is logging a giant waste of time? I'm sure some people will read about this far and draw that conclusion, no matter what else I write. I am not anti-logging. I am anti-abusive-logging. Like any other tool in your toolkit, when used properly and appropriately, it can help you create better programs. The problem with logging isn't the logging, per se -- it's the seductive OCD "just one more bit of data in the log" trap that programmers fall into when implementing logging. Logging gets a bad name because it's so often abused. It's a shame to end up with all this extra code generating volumes and volumes of logs that aren't helping anyone.
那么,记录日志是否浪费时间呢?我敢肯定,无论我写什么东西,有些人都会读到这一点并得出结论。我不是反对日志记录。我是反对滥用日志记录。与工具包中的任何其他工具一样,如果使用得当,它可以帮助您创建更好的程序。日志记录的问题本身并不是日志记录——而是程序员在实现日志记录时会陷入的诱人的OCD“日志中只有一点数据”陷阱。日志记录因为经常被滥用而名声不好。遗憾的是,这些额外的代码生成的大量日志对任何人都没有帮助。
We've since removed all logging from Stack Overflow, relying exclusively on exception logging. Honestly, I don't miss it at all. I can't even think of a single time since then that I'd wished I'd had a giant verbose logfile to help me diagnose a problem.
此后,我们已完全依赖异常日志记录,从Stack Overflow中删除了所有日志记录。我一点也不错过。从那以后,我甚至想不出有没有一次我希望我有一个巨大的详细日志文件来帮助我诊断问题。
When it comes to logging, the right answer is not "yes, always, and as much as possible." Resist the tendency to log everything. Start small and simple, logging only the most obvious and critical of errors. Add (or ideally, inject) more logging only as demonstrated by specific, verifiable needs.
在进行日志记录时,正确的答案不是“是的,总是,而且尽可能多”。抵制记录所有内容的倾向。从小而简单开始,仅记录最明显和最关键的错误。仅根据特定的,可验证的需求需要,才添加(或理想情况下是注入)更多日志记录。
If you aren't careful, those individual log entries, as wafer thin as they might be, have a disturbing tendency to make your logs end up like the unfortunate Mr. Creosote.
如果你不小心,那些单独的日志条目,尽管它们可能很薄(少),却有一种令人不安的倾向,使你的日志最终像不幸的Creosote先生一样结束。
Continue Discussion
继续讨论——挑选了一下网友评论
——
In your example, were you logging to a database? If you were logging to a database, were you using the same connection for the logging and the transactional work?
在您的示例中,您是否登录到数据库?如果要登录到数据库,是否使用相同的连接进行日志记录和事务性工作?
Using the same connection for both could lead to potential problems - the least of which would be deadlocks. What happens to any log messages when the transactional work get rolled back?
两者使用相同的连接可能会导致潜在的问题-最少会出现死锁。当事务工作回滚时,任何日志消息会如何处理?
——
I still haven’t seen a clear root cause of the deadlock error.
我仍然没有发现死锁错误的明确根源。
Please tell me how your log4net can deadlock if it is only acquiring one lock? There is no chance for deadlock in this case.
请告诉我,如果log4net仅获得一个锁,它将如何死锁?在这种情况下,没有死锁的机会。
In the case where multiple locks are used, the simple solution might be just to disable the logger lock if you can. The logger lock should only be to serialize the output to a non-threadsafe resource.
在使用多个锁的情况下,简单的解决方案可能是仅禁用记录器锁。记录器锁应仅用于将输出序列化为非线程安全资源。
The real problem here is likely poor documentation in log4net on how to setup logging to a device that implements its own thread-safe locking and/or a discussion on this exact problem which should have been realized before log4net was ever released.
真正的问题可能是log4net中关于如何设置记录到实现其自己的线程安全锁定的设备的日志的文档不多,以及/或有关此确切问题的讨论,而该问题应该在log4net发布之前就已经实现。
——
Sometimes logging can be useful as a tool to prove to the customer that he’s doing something wrong.
有时,日志记录可用作向客户证明他做错了什么的工具。
Our programs interface to large PBX’s and a host of other telephone-related systems. Looking into the log, peering over the different binary exchanges between the systems, and being able to show to the customer that it is, in fact, his PBX that is doing something wrong, has saved us hours and days of unbillable support time.
我们的程序与大型PBX和其他电话相关系统的接口。查看日志,查看系统之间不同的二进制交换,并能够向客户显示,实际上,是他的PBX出了问题,这为我们节省了数小时和数天的可靠支持时间。
I’d say that in any situation where you’re communication between different systems, log what’s happening. If not, you’ll have no clue which system that actually malfunctioned.
我想说的是,在任何不同系统之间进行通信的情况下,都要记录发生了什么。如果没有,你就不知道是哪个系统出了故障。
——
The code footprint of logging can be greatly reduced by using AOP techniques. It’s pretty much the canonical example of aspect oriented programming. A simple method boundary aspect (straight from the postsharp site) looks something like:
使用AOP技术可以大大减少日志记录的代码占用量。这几乎是面向方面编程的典型示例。一个简单的方法边界方面(直接来自postharhar网站)看起来像:
public class SimplestTraceAttribute : OnMethodBoundaryAspect
{
public override void OnEntry( MethodExecutionEventArgs eventArgs)
{
Trace.TraceInformation(Entering {0}., eventArgs.Method);
Trace.Indent();
}
public override void OnExit( MethodExecutionEventArgs eventArgs)
{
Trace.Unindent();
Trace.TraceInformation(Leaving {0}., eventArgs.Method);
}
}
In postsharp these aspects can be projected onto your types conditionaly at compilation time using either method/class level attributes or assembly level multicast.
在后期处理中,可以使用方法/类级属性或程序集级多播在编译时有条件地将这些方面投影到您的类型上。
[assembly:SimplestTraceAttribute (AttributeTargetTypes=TypesThatIWantToLog;AndSomeOtherTypes)]
Having these as aspects also allows you to produce versions on the fly with higher levels of instrumentation.
将这些方面作为方面还可以使您以更高的检测水平即时生成版本。
There are also dynamic proxy based techniques that work well in combination with dependency injection to achieve the same result.
还有一些基于动态代理的技术可以很好地与依赖项注入结合使用以达到相同的结果。
I agree with your overall point though,knowing what to log is a fine art; one that I am more than a little amateurish at myself.
但是,我同意您的总体观点,知道记录什么是一门精美的艺术品。我对自己不只是一个业余爱好者。
——
总结一下:
- 不要滥用日志,会影响到效率,并且是代码可读性变差。
- 抵制记录所有内容的倾向。从小而简单开始,仅记录最明显和最关键的错误。
- 仅根据特定的,可验证的需求需要,才添加(或理想情况下是注入)更多日志记录。
- 团队中约定日志记录的规范,包括内容规范,级别使用规范,使用位置规范等等。。。
- 异步记录日志要比同步更好。
- 使用aop技术可以减少日志记录的代码占用量