How to solve coding problems

Every single line of code ever written was ultimately made with one purpose in mind: to solve problems. No matter what you do, you are solving problems on several scales at once.

A small one-liner solves a problem which makes a function work. The function is needed for a data processing pipeline. The pipeline is integrated into a platform which enables a machine learning driven solution for its users.

Problems are everywhere. Their magnitude and impact might be different, but the general problem solving strategies are the same.

As an engineer, developer or data scientist, being effective at problem solving can really supercharge your results and put you ahead of your peers. Some can do this instinctively after years of practice, some have to put in conscious effort to learn it. However, no matter who you are, you can and you must improve your problem solving skills.

Having a background in research-level mathematics, I had the opportunity to practice problem solving and observe the process. Surprisingly, this is not something which you have to improvise each time. Rather, a successful problem solver has several standard tools and a general plan under their belt, adapting as they go.

In this post, my aim is to give an overview of these tools and use them to create a process which can be followed any time. To make the situation realistic, let’s place ourselves in the following scenario: we are deep learning engineers, working on an object detection model. Our data is limited, so we need to provide a solution for image augmentation.

[Image from the albumentations README]

Augmentation is the process of generating new data from the available images by applying random transformations like crops, blurs, brightness changes, etc. See the image above, which is from the README of the awesome albumentations library.

You need to deliver the feature by next week, so you need to get working on it right away. How do you approach the problem?

(As a mathematician myself, my thinking process is heavily influenced by the book How to Solve It by George Pólya. Although a mathematical problem is different from real life coding problems, this is a must read for anyone who wishes to get good at problem solving.)

[Image: photo by Agence Olloweb on Unsplash]

Step 0: Understanding the problem

Before attempting to solve whatever problem you have in mind, there are some questions which need to be answered. Not understanding some details properly can lead to wasted time. You definitely don’t want to do that. For instance, it is good to be clear about the following.

  • What is the scale of the problem? In our image augmentation example, will you need to process thousands of images per second in production, or is it just for you to experiment with some methods? If a production grade solution is needed, you should be aware of this in advance.

  • Will other people use your solution? If people are going to work with your code extensively, significant effort must be put into code quality and documentation. On the other end of the spectrum, if this is for your use only, there is no need to work as much on this. (I already see people disagreeing with me :) However, I firmly believe in minimizing the amount of work. So, if you only need to quickly try out an idea and experiment, feel free to not consider code quality.)

  • Do you need a general or a special solution? A lot of time can be wasted on implementing features no one will ever use, including you. In our example, do you need a wide range of image augmentation methods, or just vertical and horizontal flips? In the latter case, flipping the images in advance and adding them to your training set can also work, which requires minimal work.
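
To make the last point concrete: if only flips are needed, the special-case solution fits in a few lines. The following is a minimal sketch rather than a production implementation. It assumes images are stored as nested lists of pixel values (a list of rows) to stay dependency-free, and the function names are made up for illustration. With NumPy arrays, `np.fliplr` and `np.flipud` would be the more natural choice.

```python
def flip_horizontal(image):
    """Mirror the image left to right by reversing every row."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image top to bottom by reversing the row order."""
    return [row[:] for row in image[::-1]]


def expand_with_flips(images):
    """Return the originals followed by their horizontal and vertical flips."""
    expanded = list(images)
    for image in images:
        expanded.append(flip_horizontal(image))
        expanded.append(flip_vertical(image))
    return expanded


tiny_image = [[1, 2, 3],
              [4, 5, 6]]
training_set = expand_with_flips([tiny_image])
print(len(training_set))  # 3: the original plus two flipped copies
```

For an object detection model, the bounding boxes would have to be flipped together with the pixels, which this sketch leaves out.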

A good gauge of your degree of understanding is your ability to explain and discuss the problem with others. Discussion is also a great way to discover unexpected approaches and edge cases.

When you understand your constraints and have a somewhat precise problem specification, it is time to get to work.

Step 1. Is there an existing solution?

The first thing you must always do is to look for existing solutions. Unless you are pushing the very boundaries of human knowledge, someone else has probably already encountered this issue, created a thread on Stack Overflow and possibly written an open source library around it.

Take advantage of this. There are several benefits of using well established tools instead of creating your own.

  • You save a tremendous amount of time and work. This is essential when operating under tight deadlines. (One of my teachers used to say ironically that “you can save an hour of Google search with two months of work”. Spot on.)

  • Established tools are more likely to be correct. Open source tools are constantly validated and checked by the community. Thus, they are less likely to contain bugs. (Of course, this is not a guarantee.)

  • Less code for you to maintain. Again, we should always strive for reducing complexity, and preferably the amount of code. If you use an external tool, you don’t have to worry about its maintenance, which is a big deal. Every line of code has a hidden cost of maintenance, to be paid later. (Often when it is the most inconvenient.)

Junior developers and data scientists often overlook these benefits and prefer to write everything from scratch. (I certainly did, but quickly learned better.) The most extreme case I have seen was a developer who wrote his own deep learning framework. You should never do that, unless you are a deep learning researcher and you have an idea of how to do significantly better than the existing frameworks.

Of course, not all problems require an entire framework; maybe you are just looking for a one-liner. Looking for existing solutions can certainly be beneficial, though you need to be careful in this case. Finding and using code snippets from Stack Overflow is only fine if you take the time to understand how and why they work. Not doing so may result in unpleasant debugging sessions later, or even serious security vulnerabilities in the worst case.

For these smaller problems, looking for an existing solution consists of browsing tutorials and best practices. In general, there is a balance between ruthless pragmatism and outside-the-box thinking. When you implement something in the way it is usually done, you are doing a favor for the developers who are going to use and maintain that piece of code. (Often including you.)

There is an existing solution. What next?

Suppose that on your path towards delivering image augmentation for your data preprocessing pipeline, you have followed my advice, looked for existing solutions and found the awesome albumentations library. Great! What next?

As always, there is a wide range of things to consider. Unfortunately, just because you have identified an external tool which can be a potential solution, it doesn’t mean that it will be suitable for your purposes.

  • Is it working well and supported properly? There is one thing worse than not using external code: using buggy and unmaintained external code. If a project is not well documented and not maintained, you should avoid it. For smaller problems, where answers can generally be found on Stack Overflow, the “working well” part is essential. (See the post I have linked above.)

  • Is it adaptable directly? For example, if you use an image processing library which is not compatible with albumentations, then you have to do additional work. Sometimes, this can be too much and you have to look for another solution.

  • Does it perform adequately? If you need to process thousands of images per second, performance is a factor. A library might be totally convenient to use, but if it fails to perform, it has to go. This might not be a problem in all cases (for instance, if you are just looking for a quick solution to do experiments), but if it is, it should be discovered early, before putting much work into it.

  • Do you understand how it works and what its underlying assumptions are? This is especially true for using Stack Overflow code snippets, for the reasons I have mentioned above. For more complex issues like the image augmentation problem, you don’t need to understand every piece of external code line by line. However, you need to be aware of the requirements of the library, for instance the format of the input images.
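
The performance question in particular is cheap to answer before committing to a tool. Below is a rough sketch of a throughput check using only the standard library. The names `measure_throughput` and `candidate_transform` are hypothetical, and the stand-in transform should be replaced by the actual library call under evaluation. The image size and count are arbitrary assumptions, so the printed number only means something relative to your own requirements.

```python
import time


def measure_throughput(transform, images, repeats=5):
    """Return the best observed images-per-second rate for `transform`."""
    best_elapsed = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for image in images:
            transform(image)
        elapsed = time.perf_counter() - start
        best_elapsed = min(best_elapsed, elapsed)
    # Guard against a zero reading on very coarse timers.
    return len(images) / max(best_elapsed, 1e-9)


def candidate_transform(image):
    """Stand-in for the library call being evaluated (hypothetical)."""
    return [row[::-1] for row in image]


# 200 synthetic 64x64 images stored as nested lists.
sample_images = [[[0] * 64 for _ in range(64)] for _ in range(200)]
rate = measure_throughput(candidate_transform, sample_images)
print(f"{rate:.0f} images per second")
```

Taking the best of several runs reduces noise from other processes. For a production decision you would measure on real images and on the target hardware.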

This, of course, is applicable only if you can actually find an external solution. Read on to see what to do when this is not the case.

[Image: photo by UX Indonesia on Unsplash]

What if there are no existing solutions?

Sometimes you have to develop a solution on your own. The smaller the problem is, the more frequently it happens. These are great opportunities for learning and building. In fact, this is the actual problem solving part, the one which makes many of us most excited.

There are several strategies to employ, and all of them should be in your toolkit. If you read carefully, you’ll notice that there is a common pattern.

  • Can you simplify? Sometimes, it is enough to solve only a special case. For instance, if you know for a fact that the inputs for your image augmentation pipeline will always have the same format, there is no need to spend time on processing the input for several cases.

  • Isolate the components of the problem. Solving one problem can be difficult, let alone two at the same time. You should always make things easy for yourself. When I was younger, I used to think that solving hard problems was the thing to do in order to get dev points. Soon, I realized that the people who solve hard problems always do it by solving many small ones.

  • Can you solve for special cases? Before you go and implement an abstract interface for image augmentation, you should work on a single method to add into your pipeline. Once you discover the finer details and map out the exact requirements, a more general solution can be devised.
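
As an illustration of starting with a special case, here is a sketch of one concrete augmentation, followed by the small generalization it suggests once the requirements are clearer. It assumes images are nested lists of pixel values, and the names `random_horizontal_flip` and `apply_pipeline` are invented for this example, not taken from any library.

```python
import random


def random_horizontal_flip(image, probability=0.5, rng=None):
    """Flip a nested-list image left to right with the given probability."""
    if rng is None:
        rng = random.Random()
    if rng.random() < probability:
        return [row[::-1] for row in image]
    return image


def apply_pipeline(image, transforms):
    """Generalization added later: run the image through each transform in order."""
    for transform in transforms:
        image = transform(image)
    return image


image = [[1, 2], [3, 4]]
always_flipped = random_horizontal_flip(image, probability=1.0)
never_flipped = random_horizontal_flip(image, probability=0.0)
print(always_flipped)  # [[2, 1], [4, 3]]
print(never_flipped)   # [[1, 2], [3, 4]]
```

Writing the single method first surfaces the details a general interface has to handle, such as how randomness is controlled and what an image looks like as input.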

In essence, problem solving is an iterative process, where you pick the problem apart step by step, eventually reducing it to easily solvable pieces.

[Image: photo by Moritz Mentges on Unsplash]

Step 2. Break the solution (Optional)

There is a common trait which I have noticed in many excellent mathematicians and developers: they enjoy picking apart a solution and analyzing what makes it work. This is how you learn, and how you build robust yet simple code.

Breaking things can be part of the problem solving process. Going from a special case to the general one, you usually discover solutions by breaking what you have.
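
One way to practice this is to throw awkward inputs at a small function and watch what happens. The sketch below does that for a hypothetical `flip_horizontal` helper operating on nested-list images. The edge cases are illustrative assumptions, not an exhaustive list.

```python
def flip_horizontal(image):
    """Mirror a nested-list image left to right."""
    return [row[::-1] for row in image]


edge_cases = {
    "empty image": [],
    "single pixel": [[7]],
    "single row": [[1, 2, 3]],
    "ragged rows": [[1, 2], [3]],
}

for name, image in edge_cases.items():
    flipped = flip_horizontal(image)
    # Flipping twice should always give back the original image.
    assert flip_horizontal(flipped) == image, name
    print(f"{name}: {flipped}")
```

The ragged input passes without complaint, which shows that the function never checks that it received a rectangular image. Whether that needs fixing depends on the requirements, but it is better to find out this way than in production.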

When it is done

Depending on the magnitude of the problem itself, you should consider open sourcing it, if you are allowed. Solving problems for other developers is a great way to contribute to the community.

For instance, this is how I built modAL, one of the most popular active learning libraries for Python. I started from a very specific problem: building active learning pipelines for bioinformatics. Since building complex methods always requires experimentation, I needed a tool which enabled rapid experimentation. This was difficult to achieve with the available frameworks at the time, so I slowly transformed my code into a tool which can be easily adopted by others.

What used to be “just” a solution became a library, with thousands of users.

Conclusion

Contrary to popular belief, effective problem solving is not the same as coming up with brilliant ideas all the time. Rather, it is a thinking process with some well-defined and easy to use tools, which can be learned by anyone. Smart developers use these instinctively, making them look like magic.

Problem solving skills can be improved with deliberate practice and awareness of thinking habits. There are several platforms where you can find problems to work on, like Project Euler or HackerRank. However, even if you just start applying these methods to issues you encounter during your work, you’ll see your skills improve rapidly.

Translated from: https://towardsdatascience.com/how-to-solve-coding-problems-e86944c5bfdf
