Maintainability: Logs and Their Use

Maintenance 102

In the first chapter, we saw how to handle exceptions and how to log them so as to:

  • not leak sensitive data to the user.
  • display user-friendly messages to your users.
  • have all the details for easy troubleshooting.

In this second article, we will see what needs to be logged besides exceptions, how to log it, and why we are logging it.

The examples in this article are in PHP, using the PSR-3 Logger Interface, but the logging concepts we will discuss are common to all languages.

What to Log & How to Log

To decide what needs to be logged, we must first decide why we are logging the information.

  • Most often we need to log information to make diagnostics easier. For example, if an API returns an error, we will log the request and response.

  • We also log for debugging purposes. For example, we might log all API requests and responses in order to debug the API endpoint.

  • We can also log for statistics; these can be used for various purposes.

Logging for diagnostics

The most important logs for diagnostics are the exception logs, which we have already discussed. But we can also log other “errors”, because not all errors are exceptions.

One of the best examples of this is API calls. There are other cases as well: a missing file, an empty database query result, and so on. The cases depend on your application.

Let’s see an example with an API call:

<?php
$response = $this->httpClient->get($endpoint, $query);
if ($response->getStatusCode() != 200) {
    // Log everything needed to replay the call locally, plus the response body.
    $this->logger->warning(
        "Failed to fetch test data from test api",
        [
            "endpoint" => $endpoint,
            "query" => $query,
            "status_code" => $response->getStatusCode(),
            "body" => $response->getBody(),
        ]
    );

    throw new MyApiException("Failed to fetch data from test api");
}

In this example, we logged everything we need to make the same API call locally and check what is going on. We also used a unique log message, so that if we find the message in our logs we can find the code that logged it.

So the point here is to:

  • Log what you need to reproduce the error locally and can’t get by any other means.

  • Log what you believe will allow a quick diagnosis (the body).

  • Log a unique message that has meaning.

  • Do not log every step of the code!

Keep in mind that you must not log sensitive data such as passwords.
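One way to avoid leaking sensitive values is to clean the context array before handing it to the logger. The following is a minimal sketch in plain PHP, not part of PSR-3: the function name `redactContext` and the list of sensitive keys are hypothetical and should be adapted to your application.

```php
<?php

/**
 * Replace the values of sensitive keys with a placeholder, recursively.
 * The default key list is a hypothetical example, not an exhaustive one.
 */
function redactContext(array $context, array $sensitiveKeys = ['password', 'token', 'card_number']): array
{
    $redacted = [];
    foreach ($context as $key => $value) {
        if (is_string($key) && in_array(strtolower($key), $sensitiveKeys, true)) {
            // Sensitive key: never write the real value to the logs.
            $redacted[$key] = '[REDACTED]';
        } elseif (is_array($value)) {
            // Nested array: apply the same rule to its entries.
            $redacted[$key] = redactContext($value, $sensitiveKeys);
        } else {
            $redacted[$key] = $value;
        }
    }

    return $redacted;
}

// Hypothetical context, shaped like the one used in the example above.
$context = redactContext([
    'endpoint' => 'https://api.example.test/login',
    'query' => ['user' => 'alice', 'password' => 's3cret'],
]);
```

The cleaned array can then be passed as the second argument of `$this->logger->warning(...)` in place of the raw context.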

Logging for Debug

Logging for debugging is very similar to logging for diagnostics. If you have APIs that are very sensitive and that you can’t easily call locally (payments, for example), or if you do not trust the API, it can be a good idea to log everything.

Let’s add debugging data to our previous example:

<?php
// Debug log before the call: what we are about to do.
$this->logger->debug(
    "Fetching test data from test api",
    [
        "endpoint" => $endpoint,
        "query" => $query,
    ]
);

$response = $this->httpClient->get($endpoint, $query);
if ($response->getStatusCode() != 200) {
    $this->logger->warning(
        "Failed to fetch test data from test api",
        [
            "endpoint" => $endpoint,
            "query" => $query,
            "status_code" => $response->getStatusCode(),
            "body" => $response->getBody(),
        ]
    );

    throw new MyApiException("Failed to fetch data from test api");
}

// Debug log after the call: the result we received.
$this->logger->debug(
    "Fetched test data from test api",
    [
        "endpoint" => $endpoint,
        "query" => $query,
        "body" => $response->getBody(),
    ]
);

Here we have added two additional logs:

  • The first log is written before making the call to the API. We log all the information we have at that point, so that we can execute the same call locally if possible, or compare it with calls that work.
  • The second log contains the response. If we are working with payments, this might be the only way to see a complete response. (Some payment methods return additional information in production that they don’t return in the sandbox.)

So the point here is to:

  • Log what we do.

  • Log the result.

As in the previous case, keep in mind that you must not log sensitive data such as passwords.

We also have another trick up our sleeves here. In our previous example we logged using “warning”; here we used “debug”. With most frameworks (Magento 😥) you can configure the minimum level that should be logged. Constantly logging debug information is useless, so you need to enable and disable these logs as needed.

To be able to use this in Symfony, for example, I would recommend setting up multiple channels and having our custom channels use custom handlers. That way we can easily enable or disable the debug logs for a given part of the application.
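The idea of a per-channel minimum level can be sketched in plain PHP. This is not Symfony’s or Monolog’s actual configuration, only an illustration of the decision a handler makes; the function `shouldLog`, the channel names, and the default level are assumptions.

```php
<?php

// Severity order of the eight PSR-3 levels, lowest first.
const LEVEL_ORDER = ['debug', 'info', 'notice', 'warning', 'error', 'critical', 'alert', 'emergency'];

/**
 * Decide whether a record should be written, given a per-channel minimum level.
 * Channels without an entry in $minimumLevels fall back to $default.
 */
function shouldLog(string $channel, string $level, array $minimumLevels, string $default = 'warning'): bool
{
    $minimum = $minimumLevels[$channel] ?? $default;

    return array_search($level, LEVEL_ORDER, true) >= array_search($minimum, LEVEL_ORDER, true);
}

// Hypothetical setup: debug logs are enabled for the "payment" channel only.
$minimumLevels = ['payment' => 'debug'];

$paymentDebug = shouldLog('payment', 'debug', $minimumLevels);
$catalogDebug = shouldLog('catalog', 'debug', $minimumLevels);
$catalogWarning = shouldLog('catalog', 'warning', $minimumLevels);
```

With this setup, debug records from the payment code are kept while the rest of the application only writes warnings and above. Removing the `payment` entry turns the debug logs off again without touching the calling code.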

Logging for Statistics

Finally, we can also log in order to produce statistics. This is possible if your logs are parsed and sent to tools such as Elasticsearch/Kibana, Elasticsearch/Grafana, Loki/Grafana, or Datadog. In the third maintenance article, we will discuss these tools in more detail and do a quick comparison. For now, let us just say that it is possible to use logs for statistics.

Our current logs can already be used to produce statistics. For example, we can use the “Fetching test data from test api” debug log to count the number of calls made to the API. This can be important information.

The only thing we would need to change is to raise the log level from debug to info.

There is also another use for statistics; let’s use our previous example again. We can use the “Fetching test data from test api” logs and the “Failed to fetch test data from test api” logs to get an idea of the failure rate of the API. If you are using Grafana, you can also use this for alerting purposes.
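To make the failure-rate idea concrete, here is a minimal sketch that counts the two messages in raw log lines. In practice a log aggregation tool would do this for you; the function name `apiFailureRate` and the layout of the sample log lines (timestamp, channel, level) are assumptions made for illustration.

```php
<?php

/**
 * Compute the share of API calls that failed, from raw log lines.
 * Returns null when no call was logged, to avoid dividing by zero.
 */
function apiFailureRate(array $logLines): ?float
{
    $calls = 0;
    $failures = 0;
    foreach ($logLines as $line) {
        if (strpos($line, 'Fetching test data from test api') !== false) {
            $calls++;
        } elseif (strpos($line, 'Failed to fetch test data from test api') !== false) {
            $failures++;
        }
    }

    return $calls === 0 ? null : $failures / $calls;
}

// Hypothetical log lines: four calls, one of which failed.
$logLines = [
    '[2024-01-01T10:00:00] app.INFO: Fetching test data from test api',
    '[2024-01-01T10:00:05] app.INFO: Fetching test data from test api',
    '[2024-01-01T10:00:06] app.WARNING: Failed to fetch test data from test api',
    '[2024-01-01T10:00:10] app.INFO: Fetching test data from test api',
    '[2024-01-01T10:00:15] app.INFO: Fetching test data from test api',
];
$rate = apiFailureRate($logLines);
```

This only works because each log message is unique and stable, which is one more reason to choose the message text carefully.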

I recently had such a case in production: without touching the code, we were able to add an alert for a specific case that was not documented in the API we were using. This allowed our client to communicate with their customers about the issue. Thanks to Grafana, they were able to see all the affected customers and didn’t need to wait for negative customer feedback. Meanwhile, we discussed with the API maintainers the best approach to work around the issue.

This means that once you identify one isolated issue, you can quickly find out whether there are other similar issues, and communicating on the subject makes the issue less critical.

Using logs for statistics is cheap in development time. In my opinion, every project should have some log parsing anyway, so using that parsing to produce some statistics is very useful.

There are of course limits: you are probably logging far more information than you need for the statistics you are extracting. This can be a good thing, as it can be used to add an alert you hadn’t thought about. But it also means you can’t just keep all the logs indefinitely; it would cost too much to store them all.

Why are logs so important?

All applications have bugs: bugs they inherit from the software they are based on (cf. Magento), bugs coming from the infrastructure, bugs that you develop, bugs from the applications you connect to… There are bugs somewhere.

So at some point, you are going to have a problem to resolve.

Proper logging will hopefully give you a clearer view of the problem as fast as possible.

If you do have proper logs for your issue, they will allow you to take the following steps:

Be retroactive

Proper logs will allow you to communicate on the issue before customers notice it or get too frustrated by it.

For example, suppose some orders are not shipping because API calls to the carrier endpoints are failing.

If we can list the affected orders, we can communicate and allow our customers to plan better.

Understand the impact

You have identified the issue and started to communicate on the subject. Thanks to the logs, you know which orders are affected and you can understand the impact.

For example, your website has 1000 orders per day and you discover that 1 order per day has an issue. If it’s Friday at 6 pm, you probably wouldn’t want to deploy a fix: it would put the other 999 orders at risk, and if something went wrong with your fix, no one would notice it during the weekend.

Understand the issue

With the proper logs, you get closer to understanding the origin of the issue. Is our API call to the carrier failing:

  • because of a mistake in the code?
  • because of the proxy or another issue with the infrastructure?
  • because of the endpoint being unstable?

The logs will most often not give you a straight answer, but they will make debugging easier than having none.

How to fix the issue

We must now find a solution. The solution can be implemented in multiple steps.

In our example case, we can at first have a person ship the orders manually. This might be necessary if the issue takes a long time to fix. We know there is a single order per day, so doing this manually would be simple.

Step 2 would, of course, be to fix the issue.

Conclusion

Will you be able to put logs everywhere they are needed to debug future issues? You won’t. That is why you must not overthink it either.

I have rarely come across an application that logs too much. I have often complained about not having enough logs, particularly in payment modules.

Does that mean that you should log everything? No. I came across such a case on one project I worked on: opening a single page created over 500 lines of logs. There were logs in conditions to trace what was happening, logs in loops… Removing all these logs divided page load times by 3.

Debug logs can be used for particularly complex and critical bits of code, but they should be disabled most of the time. Writing a log has a cost: you are accessing the file system, after all.

You must think about log rotation and about deleting your logs. Depending on the situation, you might need to keep some logs longer than others. Logs are not meant to be kept forever, and you need a strategy to get rid of the old ones.
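A retention strategy can be as simple as deleting files older than a given number of days. The sketch below only selects the files to delete, so it can be checked without touching the disk. The function name `selectExpiredLogs`, the file names, and the 30-day retention period are hypothetical, and in practice a dedicated tool such as logrotate usually handles this.

```php
<?php

/**
 * Return the names of the log files older than the retention period.
 *
 * @param array<string, int> $files file name => last modification time (Unix timestamp)
 */
function selectExpiredLogs(array $files, int $now, int $retentionDays): array
{
    // Anything last modified before this timestamp is considered expired.
    $cutoff = $now - $retentionDays * 86400;
    $expired = [];
    foreach ($files as $name => $modifiedAt) {
        if ($modifiedAt < $cutoff) {
            $expired[] = $name;
        }
    }

    return $expired;
}

// Hypothetical files: two are older than 30 days, one is recent.
$now = 1700000000;
$files = [
    'app-old.log' => $now - 40 * 86400,
    'app-recent.log' => $now - 2 * 86400,
    'payment-old.log' => $now - 31 * 86400,
];
$expired = selectExpiredLogs($files, $now, 30);
```

To keep some logs longer than others, the same function can be called with a different retention period per group of files, for example a longer one for payment logs.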

In this article, we have seen what to log, how to log it, and why.

The issue is that using grep to search log files has its limits, particularly if we want to search within the context of the logs and look for correlations.

Grep’s main limit, however, concerns sites that are hosted on multiple servers with load balancing. Logs are then distributed across the servers, which makes finding and understanding them harder.

That is why, in the next chapter on maintainability, we will discuss the various tools available to parse and aggregate logs to make them more accessible.

Thank you for reading.

Translated from: https://medium.com/swlh/maintainability-logs-their-use-6ac3bb235e31
