Python Source Code Analysis (0)

In this post, we will dissect the source code of Python.

To keep things simple while still covering reasonably recent features, I chose Python 3.6.9 for the analysis.

The first step is to build it!

Build Python

Like many other projects, Python uses the autoconf toolset to configure itself and make to build itself. If that doesn’t make sense to you, don’t worry. All you really need is a few simple commands:

  1. Run ./configure. It detects your environment: the architecture, the dependencies, and the features supported by your compiler.
  2. If there is no error, a Makefile is generated in the same directory.
  3. Run make, and you will get your own Python build!

We build with the default configuration here, because our goal is to read the source, not to produce a tuned Python binary. If you’d like to play with Python builds, you can find more information at https://docs.python.org/3.6/using/unix.html#building-python.

Project Structure

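After configuring and building, the top-level directory looks like this (it already contains generated files such as config.log, Makefile, pyconfig.h and the python binary). The structure of the project is clear and simple: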

aclocal.m4     configure     Lib              Misc     Programs       python-config.py
build          configure.ac  LICENSE          Modules  pyconfig.h     python-gdb.py
config.guess   Doc           Mac              Objects  pyconfig.h.in  README.rst
config.log     Grammar       Makefile         Parser   python         setup.py
config.status  Include       Makefile.pre     PC       Python         Tools
config.sub     install-sh    Makefile.pre.in  PCbuild  python-config

aclocal.m4, config.guess, config.sub, configure, configure.ac, install-sh, Makefile.pre.in and pyconfig.h.in are files that concern configuration and compilation. LICENSE is the license file.

Doc/

This folder contains the documentation of Python.

Grammar/

The Grammar folder contains a single file, which describes the grammar of Python.

Include/

All the header files used when compiling Python. Some of them are installed on your system for further development.

Lib/

All libraries written in Python.

Mac/

Build tools for macOS.

Misc/

Miscellaneous files, not so important.

Modules/

Modules written in C.

Objects/

Declarations and implementations of the various Python objects.

Parser/

Python code parser.

PC/

Windows-specific code.

PCbuild/

The Windows build files, such as Visual Studio project files.

Programs/

The main functions of Python.

Python/

The main implementation code of Python.

Tools/

Auxiliary tools and demos.

Program Entry, Programs/python.c

We start the analysis at the program's main entry point.

At the beginning, "Python.h" and <locale.h> are included.

Then the main function comes in two variants, selected at compile time:

Windows

On Windows, the function wmain is used as the entry point instead of main.

int
wmain(int argc, wchar_t **argv)
{
    return Py_Main(argc, argv);
}

wmain is the wide-character (Unicode) version of main. See the Microsoft documentation for further information:
https://docs.microsoft.com/en-us/cpp/c-language/using-wmain?view=vs-2019.

All it does is call Python's real main function, Py_Main, with the command-line arguments.

Other Unix-Like Systems

The version for Unix-like systems is not as simple as the one for Windows.

wchar_t **argv_copy;
/* We need a second copy, as Python might modify the first one. */
wchar_t **argv_copy2;
int i, res;
char *oldloc;

This code declares the necessary variables at the top of the function, as C89 requires. Then, before copying the arguments, Python forces the raw memory allocator to be malloc by invoking (void)_PyMem_SetupAllocators("malloc");.

Then two arrays of pointers are allocated, each with room for argc entries plus a terminating NULL:

argv_copy = (wchar_t **)PyMem_RawMalloc(sizeof(wchar_t*) * (argc+1));
argv_copy2 = (wchar_t **)PyMem_RawMalloc(sizeof(wchar_t*) * (argc+1));
if (!argv_copy || !argv_copy2) {
    fprintf(stderr, "out of memory\n");
    return 1;
}

Next, Python saves the current locale:

oldloc = _PyMem_RawStrdup(setlocale(LC_ALL, NULL));

Calling setlocale with NULL queries the current locale without changing it. At program startup this is the C locale, and a copy of its name is stored in oldloc.

Then Python switches to the user's preferred locale with setlocale(LC_ALL, "") and decodes all the command-line arguments using it.

setlocale(LC_ALL, "");
for (i = 0; i < argc; i++) {
    argv_copy[i] = Py_DecodeLocale(argv[i], NULL);
    if (!argv_copy[i]) {
        PyMem_RawFree(oldloc);
        fprintf(stderr, "Fatal Python error: "
                        "unable to decode the command line argument #%i\n",
                        i + 1);
        return 1;
    }
    argv_copy2[i] = argv_copy[i];
}
argv_copy2[argc] = argv_copy[argc] = NULL;
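
To get a feeling for what the decoding step does, here is a minimal sketch built on the standard mbstowcs function. It is not how Py_DecodeLocale is implemented: the real function does more, for example it handles undecodable bytes instead of failing on them. The name decode_with_locale is hypothetical, and the sketch assumes the POSIX behaviour where mbstowcs accepts a NULL destination to count the characters.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not part of CPython: decode a multibyte string
   using the current LC_CTYPE locale. Returns a malloc()ed wide string,
   or NULL on a decoding error or when memory runs out. */
wchar_t *
decode_with_locale(const char *arg)
{
    size_t len;
    wchar_t *wide;

    assert(arg != NULL);

    /* With a NULL destination, mbstowcs() only counts the wide
       characters needed, excluding the terminator (POSIX behaviour). */
    len = mbstowcs(NULL, arg, 0);
    if (len == (size_t)-1) {
        return NULL;    /* invalid byte sequence for this locale */
    }

    wide = malloc((len + 1) * sizeof(wchar_t));
    if (wide == NULL) {
        return NULL;
    }

    /* Second pass: do the conversion, terminator included. */
    if (mbstowcs(wide, arg, len + 1) == (size_t)-1) {
        free(wide);
        return NULL;
    }
    return wide;
}
```

The caller owns the returned string and releases it with free, in the same way that main later releases each decoded argument with PyMem_RawFree.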

After decoding the command-line arguments with the user's preferred locale, Python restores the original C locale and frees the saved copy of its name.

setlocale(LC_ALL, oldloc);
PyMem_RawFree(oldloc);
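
The save, switch and restore dance around the locale can be tried outside of Python too. The sketch below uses only the standard C library; save_locale and restore_locale are hypothetical names chosen for this post, not CPython functions. It also shows why the name is duplicated: the string returned by setlocale may be overwritten by a later call to setlocale.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc()ed copy of the current locale
   name, or NULL on failure. The copy matters because the buffer
   returned by setlocale() may be reused by the next setlocale() call. */
char *
save_locale(void)
{
    const char *current = setlocale(LC_ALL, NULL);  /* query only */
    char *copy;

    if (current == NULL) {
        return NULL;
    }
    copy = malloc(strlen(current) + 1);
    if (copy != NULL) {
        strcpy(copy, current);
    }
    return copy;
}

/* Hypothetical helper: reinstall a locale saved by save_locale()
   and release the saved name. */
void
restore_locale(char *saved)
{
    assert(saved != NULL);
    setlocale(LC_ALL, saved);
    free(saved);
}
```

A typical use would be to call save_locale, then setlocale(LC_ALL, "") to do some locale-dependent work, and finally restore_locale with the saved name.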

Once the arguments are decoded and copied, Python runs its main routine:

res = Py_Main(argc, argv_copy);

When Py_Main returns, all the memory allocated earlier is released with the following code. Note that the strings are freed through argv_copy2, because Py_Main may have modified argv_copy:

/* Force again malloc() allocator to release memory blocks allocated
   before Py_Main() */
(void)_PyMem_SetupAllocators("malloc");

for (i = 0; i < argc; i++) {
    PyMem_RawFree(argv_copy2[i]);
}
PyMem_RawFree(argv_copy);
PyMem_RawFree(argv_copy2);
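
Why a second array of pointers is useful can be shown with a small experiment. In the sketch below, drop_first is a hypothetical stand-in for a callee that rewrites its argument vector (it is not what Py_Main actually does), and dup_wide and run_with_shadow_copy are likewise made-up names. After the callee has shifted the pointers, the working array no longer reaches the first string, but the untouched shadow array still reaches every allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical callee that rewrites its argument vector: it drops the
   first entry by shifting the remaining pointers to the left. */
void
drop_first(int argc, wchar_t **argv)
{
    int i;

    assert(argc >= 1);
    for (i = 0; i + 1 < argc; i++) {
        argv[i] = argv[i + 1];
    }
    argv[argc - 1] = NULL;
}

/* Hypothetical helper: duplicate a wide string with malloc(). */
wchar_t *
dup_wide(const wchar_t *s)
{
    wchar_t *copy = malloc((wcslen(s) + 1) * sizeof(wchar_t));

    if (copy != NULL) {
        wcscpy(copy, s);
    }
    return copy;
}

/* Build two pointer arrays over the same strings, let the callee
   rearrange the first one, then free everything through the second.
   Returns the number of strings released, or -1 if the arrays
   themselves cannot be allocated. */
int
run_with_shadow_copy(int argc, const wchar_t **args)
{
    wchar_t **work = malloc(sizeof(wchar_t *) * (argc + 1));
    wchar_t **shadow = malloc(sizeof(wchar_t *) * (argc + 1));
    int i, freed = 0;

    if (work == NULL || shadow == NULL) {
        free(work);
        free(shadow);
        return -1;
    }
    for (i = 0; i < argc; i++) {
        work[i] = shadow[i] = dup_wide(args[i]);
    }
    work[argc] = shadow[argc] = NULL;

    if (argc > 0) {
        drop_first(argc, work);     /* work[0] now points at the second string */
    }

    for (i = 0; i < argc; i++) {
        if (shadow[i] != NULL) {
            free(shadow[i]);        /* shadow still reaches every string */
            freed++;
        }
    }
    free(work);
    free(shadow);
    return freed;
}
```

Freeing through work instead would leak the first string and skip the last slot, which is the kind of problem the argv_copy2 array avoids.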

Finally, the result is returned with return res;.

Before the end

In this post, we compiled Python with the default configuration and walked through the main entry function.

In the next post, we’ll dig into the Py_Main function, as well as some Python objects or types if possible.

See you then!
