Static Code Analysis for Python

Static code analysis looks at the code without executing it. It is usually extremely fast to execute, requires little effort to add to your workflow, and can uncover common mistakes. The only downside is that it is not tailored towards your code.

In this article, you will learn how to perform various types of static code analysis in Python. While the article focuses on Python, the types of analysis can be done in any programming language.

Code Complexity

One way to measure code complexity is the cyclomatic complexity, also called McCabe complexity as defined in A Complexity Measure:

CC = E - N + 2*P

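where N is the number of nodes in the control flow graph, E is the number of edges, and P is the number of connected components (1 for a single function). For a single function, this equals the number of decision points (if-statements, while/for loops) plus one.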
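
To make the "decision points plus one" reading concrete, here is a minimal sketch that approximates the cyclomatic complexity of a piece of source code with the standard library's ast module. The function name approx_complexity and the set of node types counted are assumptions made for illustration; real tools such as radon cover more constructs.

```python
import ast

# Node types counted as decision points in this sketch (an assumption;
# real tools handle more cases, e.g. comprehensions).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def approx_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra branches
            decisions += len(node.values) - 1
    return decisions + 1


EXAMPLE = '''
def sign(x):
    if x > 0:
        return 1
    elif x < 0:
        return -1
    return 0
'''

# The if and the elif are two decision points, so the result is 3.
print(approx_complexity(EXAMPLE))
```

A function without any branching gets a complexity of 1, which matches the lowest values in typical radon output.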

You can calculate it in Python with radon:

$ pip install radon
$ radon cc mpu/aws.py -s
mpu/aws.py
F 85:0 s3_download - B (6)
F 16:0 list_files - A (3)
F 165:0 _s3_path_split - A (2)
F 46:0 s3_read - A (1)
F 141:0 s3_upload - A (1)
C 77:0 ExistsStrategy - A (1)

The first letter shows the type of block (F for function, C for class). Then radon gives the line number, the name of the class/function, a grade (A, B, C, D, E, or F), and the actual complexity as a number. Typically, a complexity below 10 is ok. The most complex part of scipy has a complexity of 61.
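
The mapping from a complexity number to a letter grade can be sketched as follows. The thresholds are how I understand radon's documented ranking (1-5 is A, 6-10 is B, and so on); treat them as an assumption and check the radon documentation before relying on them. The function name radon_style_grade is made up for this example.

```python
def radon_style_grade(complexity: int) -> str:
    """Map a cyclomatic complexity to a letter grade (assumed thresholds)."""
    # Upper bounds per grade; anything above 40 is graded F.
    thresholds = [(5, "A"), (10, "B"), (20, "C"), (30, "D"), (40, "E")]
    for upper, grade in thresholds:
        if complexity <= upper:
            return grade
    return "F"


# Values taken from the radon output shown earlier
print(radon_style_grade(6))  # s3_download
print(radon_style_grade(3))  # list_files
```

With these thresholds, the two calls agree with the grades B and A that radon printed for s3_download and list_files.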

Besides radon, there are various other packages and Flake8 plugins that measure complexity.

Style Guides

You might have heard the words “pythonic code”. It means not only writing correct Python code but also using the language’s features the way they are intended to be used (source). It is certainly an opinionated term, but there are a lot of plugins that show you what a large part of the community considers to be pythonic.

Writing code in a similar style to other Python projects is valuable as people will have an easier time reading the code. This is important as we read software more often than we write it (source).

So, what is pythonic code?

Let’s start with PEP-8: It’s a style guide written and accepted by the Python community in 2001. So it’s been around for a while and most people want to follow most of it. The part I’ve seen most people disagree with is the maximum line length of 79 characters. I always recommend following this advice in 95% of your codebase. I gave reasons for that.
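
As a small illustration of the line-length rule, the following sketch lists all lines that exceed a limit. It is only meant to show the idea; in practice pycodestyle reports such lines as E501. The function name long_lines and the sample text are assumptions for this example.

```python
def long_lines(source: str, limit: int = 79):
    """Yield (line_number, length) for every line longer than `limit`."""
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > limit:
            yield number, len(line)


# Two lines: a short one and a string assignment that is 96 characters long
SAMPLE = "short = 1\n" + "x = '" + "a" * 90 + "'\n"

print(list(long_lines(SAMPLE)))  # [(2, 96)]
```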

For pure code formatting, you should use an auto formatter. I have grown to like Black because it does NOT allow customization. Code formatted by Black always looks the same. As you cannot customize it, you don’t need to discuss it. It simply settles conflicting styles and the arguments around them. Black is maintained by the Python Software Foundation and is likely the most commonly adopted auto formatter for Python.

yapf by Google is another auto formatter.

Docstrings

Reading the manual can be fun if it’s written well. Lasagne and Scipy have pretty good documentation.

For docstrings, there is PEP-257. All of those rules are widely accepted in the community, but they still allow a wide variety of docstrings. There are three commonly used styles:

  • NumpyDoc-style docstrings: Used by Numpy and Scipy. It’s reStructuredText with some specified sections such as Parameters and Returns in a fixed order.

  • Google-style docstrings: A super-slim format which has Args: and Returns: sections.

  • Sphinx-style docstrings: A very flexible format that uses reStructuredText.

I love the NumpyDoc format as it is super easy to read even when you just have it inside a text editor. Numpydoc is also well-supported by editors.

Here you can see the three in comparison:

def get_meta_numpydoc(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Parameters
    ----------
    filepath : str
        Get metadata from this file
    a_number : int
        Some more details
    a_dict : dict
        Configuration

    Returns
    -------
    meta : dict
        Extracted meta information

    Raises
    ------
    IOError
        File could not be read
    """


def get_meta_google_doc(filepath, a_number, a_dict):
    """Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    Args:
        filepath: Get metadata from this file.
        a_number: Some more details.
        a_dict: Configuration.

    Returns:
        Extracted meta information:

    Raises:
        IOError: File could not be read.
    """


def get_meta_sphinx_doc(filepath, a_number, a_dict):
    """
    Get meta-information of an image.

    Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo
    ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis
    parturient montes, nascetur ridiculus mus.

    :param filepath: Get metadata from this file
    :type filepath: str
    :param a_number: Some more details
    :type a_number: int
    :param a_dict: Configuration
    :type a_dict: dict

    :returns: dict -- Extracted meta information

    :raises: IOError
    """
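
Whichever style you pick, the first step is to have docstrings at all. The following sketch uses the standard library's ast module to list functions and classes that lack one. It only illustrates the kind of check that dedicated tools such as pydocstyle perform far more thoroughly; the function name functions_without_docstring and the sample code are assumptions for this example.

```python
import ast


def functions_without_docstring(source: str) -> list:
    """Return the names of functions and classes that lack a docstring."""
    checked = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    missing = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, checked) and ast.get_docstring(node) is None:
            missing.append(node.name)
    return missing


SAMPLE = '''
def documented():
    """Return nothing."""

def undocumented():
    pass
'''

print(functions_without_docstring(SAMPLE))  # ['undocumented']
```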

Flake8

You should always use a linter, as Alberto Gimeno pointed out. They can check your style, but more importantly, show potential errors.

Flake8 is a wrapper around PyFlakes, pycodestyle, and a McCabe script. It is the most commonly used tool for linting in Python. Flake8 is awesome because there are so many plugins for it. I found 223 packages with the string “flake8” in the name and looked at many of them. I also looked at packages with the trove classifier Framework :: Flake8 and found 143 packages, of which 122 started with flake8-. Only 21 packages had the Flake8 framework trove classifier without starting with flake8-, and only two of them looked interesting.

Side note: Typosquatting is an issue every open package repository has to fight (Bachelor’s Thesis: Typosquatting in Programming Language Package Managers, which has a blog post and an interesting follow-up, Bachelor’s Thesis: Attacks on Package Managers). There are examples in Python of it causing harm (2017, 2017, 2017, 2019, 2019, 2019). There is pypi-scan for finding examples and pypi-parker to prevent common typos from being used. William Bengtsson also did something similar to harden the Python community against this threat. See his article below for more information about his project. Package parking inflates the number of packages on PyPI, and I filtered them by looking for the summary “A package to prevent exploit”.

Here are some of the interesting flake8 plugins:

An alternative to parts of Flake8 is prospector. It bundles several tools, but it is far less commonly used and thus not as flexible as Flake8.

Flake8: Security and Bugs


  • flake8-bandit: Security Testing

  • flake8-bugbear: finding likely bugs and design problems in your program — usually it’s silent, but when it’s not you should have a look 🐻

  • flake8-requests: checks usage of the requests library

Flake8: Remove Debugging Artifacts

This has happened to me quite a few times: I added some code while developing a new feature or debugging an old one and forgot to remove it afterward. The reviewer usually caught it, but there is no need to distract the reviewer with this.

flake8-breakpoint checks for forgotten breakpoints and flake8-print will complain about every print statement. flake8-debugger, flake8-fixme, flake8-todo go in the same direction.
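
To give an idea of how such checks can work, here is a minimal sketch that walks the syntax tree and reports calls to print and breakpoint. This is not the implementation of any of the plugins above, just an illustration of the technique; the names DEBUG_CALLS and find_debug_calls are assumptions for this example.

```python
import ast

# Call names treated as debugging leftovers in this sketch (an assumption)
DEBUG_CALLS = {"print", "breakpoint"}


def find_debug_calls(source: str) -> list:
    """Return (line, name) for each call to a name in DEBUG_CALLS."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in DEBUG_CALLS
        ):
            found.append((node.lineno, node.func.id))
    return sorted(found)


SAMPLE = '''
def total(values):
    print("debug:", values)
    breakpoint()
    return sum(values)
'''

print(find_debug_calls(SAMPLE))  # [(3, 'print'), (4, 'breakpoint')]
```

Because the check works on the syntax tree instead of the raw text, the word "print" inside a string or a comment is not reported.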

Pylint

pylint is one of the most widespread linters in Python. The features of pylint certainly overlap with those of Flake8, but there is one feature I love: Checking for code duplication ❤

$ pylint --disable=all --enable=duplicate-code .
************* Module mpu.datastructures.trie.base
mpu/datastructures/trie/base.py:1:0: R0801: Similar lines in 2 files
==mpu.datastructures.trie.char_trie:85
==mpu.datastructures.trie.string_trie:138
            string += child.print(_indent=_indent + 1)
        return string

    def __str__(self):
        return f"TrieNode(value='{self._value}', nb_children='{len(self.children)}')"

    __repr__ = __str__


EMPTY_NODE = TrieNode(value="", is_word=False, count=0, freeze=True)


class Trie(AbstractTrie):
    def __init__(self, container=None):
        if container is None:
            container = [] (duplicate-code)
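
The idea behind such a duplication check can be sketched in a few lines: collect every run of a few consecutive non-blank lines and look for runs that occur in both files. This is a simplified illustration and not pylint's actual algorithm; the function name shared_blocks and the default window of 4 lines are assumptions for this example.

```python
def shared_blocks(a: str, b: str, window: int = 4) -> set:
    """Return blocks of `window` consecutive non-blank lines found in both texts."""

    def blocks(text: str) -> set:
        # Ignore indentation and blank lines so that formatting does not matter
        lines = [line.strip() for line in text.splitlines() if line.strip()]
        return {
            tuple(lines[i : i + window])
            for i in range(len(lines) - window + 1)
        }

    return blocks(a) & blocks(b)


FILE_A = "x = 1\ny = 2\nz = 3\nw = 4\nprint(x)\n"
FILE_B = "q = 0\nx = 1\ny = 2\nz = 3\nw = 4\n"

for block in shared_blocks(FILE_A, FILE_B):
    print("\n".join(block))
```

The sample files share exactly one block of four lines, the assignments to x, y, z, and w.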

Let Dead Code Die

Who hasn’t done it: You removed some functionality, but the code could still come in handy. So you comment it out. Or you wrap it in an if False block. Sometimes it gets more sophisticated, and you add a configuration option you don’t need.

The clean solution is to have a single, clear commit that removes that feature. Maybe add a git tag so that you can find it later if you want to add it again.

And then there is code which is dead, but you forgot about it. Luckily, you can automatically detect it:

  • flake8-eradicate: Find commented out (or so-called “dead”) code.

  • vulture: Finds unused code in Python programs
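
A rough sketch of how unused functions can be found statically: collect all function names that are defined and subtract every name that is referenced anywhere. This is a simplified illustration and not how vulture is implemented; the function name unused_functions and the sample code are assumptions for this example.

```python
import ast


def unused_functions(source: str) -> list:
    """Return names of functions that are defined but never referenced."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    used = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            used.add(node.id)  # plain references such as helper()
        elif isinstance(node, ast.Attribute):
            used.add(node.attr)  # attribute access such as obj.helper()
    return sorted(defined - used)


SAMPLE = '''
def helper():
    return 1

def forgotten():
    return 2

def main():
    return helper()
'''

print(unused_functions(SAMPLE))  # ['forgotten', 'main']
```

Note that main is reported as well because nothing in the file calls it. Entry points and code that is only used dynamically look unused to a static check, so the results of such tools need a human review.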

Flake8: Nudging Yourself to Use Good Style

Having an experienced developer review your code is awesome. In the best case, you will learn something new that you can apply in all further projects. And some plugins act like that.

Some plugins helped me to learn something about Python and to get rid of small bugs and inconsistencies.

Other style-nudging plugins aim to push you towards modern Python style.

This is one of the most valuable categories for me. If you know more plugins which help to use new styles, let me know 😃

Flake8 Meta Plugins

Flake8 has some plugins which don’t add more linting functionality, but improve Flake8 in another way.

There are also plugins people might need for legal reasons, such as flake8-author, flake8-copyright, and flake8-license.

To Flake8 plugin authors: Please make sure that you list the error codes your plugin introduces and that you give at least some examples of what your plugin considers bad / good.

Type Annotations and Type Checking

Figure: The mypy plugin for VS Code showing an issue with the types. Screenshot by Martin Thoma.

Type checking is possible in Python, but you have to set it up yourself; it is not done automatically. I’ve written a longer article about how type annotations work in Python. There are multiple tools you can use, but I recommend mypy. You can run it via pytest by using pytest-mypy or via flake8 by using flake8-mypy, but I prefer to run it separately. The main reason is that the output given by CI pipelines is cleaner.

You can integrate type checking (e.g. via mypy) into your editor, but the type annotations alone already go a long way as they document what is expected.
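
A short example of what annotations look like and what they do at runtime. The function mean is a made-up example; the point is that the annotations document the expected types but are not enforced by Python itself.

```python
from typing import List, get_type_hints


def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# The annotations are available at runtime and act as documentation ...
print(get_type_hints(mean))
print(mean([1.0, 2.0, 3.0]))  # 2.0

# ... but Python does not check them. A call like mean("abc") only fails
# at runtime with a TypeError raised inside sum(). A type checker such as
# mypy is expected to report the wrong argument type without running the code.
```

Running the type checker separately, for example with "mypy your_module.py" (the file name is a placeholder), keeps these reports apart from test and lint output.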

Package Structure

pyroma rates how well a Python project complies with the best practices of the Python packaging ecosystem.

Here are some examples of my projects:

$ pyroma mpu 
------------------------------
Checking mpu
Found mpu
------------------------------
Final rating: 10/10
Your cheese is so fresh most pe

$ pyroma nox
------------------------------
Checking nox
Found nox
------------------------------
Your long_description is not valid ReST:
<string>:2: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
<string>:3: (WARNING/2) Field list ends without a blank line; unexpected unindent.
<string>:4: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
------------------------------
Final rating: 9/10
Cottage Cheese
------------------------------

Want to Know More About Unit Testing?

In this series, we already had:

Let me know if you’re interested in other topics around testing with Python or professional software development with Python: info@martin-thoma.de

Translated from: https://towardsdatascience.com/static-code-analysis-for-python-bdce10b8d287
