In this post, we will dissect the source code of Python.
To keep things simple while still covering recent features, I chose Python 3.6.9 for the analysis.
In the last post we found the entry point of Python, so it’s time to look at it!
Py_Main - the Python Main function
According to the C89 standard, all variables in a C function must be declared at the beginning of a block, before any statement. So we can start by taking a look at those variables.
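To make the C89 rule concrete, here is a minimal sketch. The function `sketch_count_positive` is a hypothetical example that has nothing to do with CPython; it only shows declarations grouped at the top of a block, followed by statements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function, unrelated to CPython. */
static int sketch_count_positive(const int *values, size_t count)
{
    /* C89: every declaration comes first in the block... */
    size_t i;
    int positives = 0;

    /* ...and statements only start afterwards. */
    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (values[i] > 0)
            positives++;
    }
    return positives;
}
```

Py_Main follows the same layout, only with many more declarations.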
Variables
The variables are listed below, though we’ll not explain them for now.
int c;
int sts;
wchar_t *command = NULL;
wchar_t *filename = NULL;
wchar_t *module = NULL;
FILE *fp = stdin;
char *p;
#ifdef MS_WINDOWS
wchar_t *wp;
#endif
int skipfirstline = 0;
int stdin_is_interactive = 0;
int help = 0;
int version = 0;
int saw_unbuffered_flag = 0;
char *opt;
PyCompilerFlags cf;
PyObject *main_importer_path = NULL;
PyObject *warning_option = NULL;
PyObject *warning_options = NULL;
Arguments
After declaring and initializing its variables, Python scans the arguments for flags.
A first pass looks only for the options that the early initialization steps depend on.
/* Hash randomization needed early for all string operations
   (including -W and -X options). */
while ((c = _PyOS_GetOpt(argc, argv, PROGRAM_OPTS)) != EOF) {
    if (c == 'm' || c == 'c') {
        /* -c / -m is the last option: following arguments are
           not interpreter options. */
        break;
    }
    if (c == 'E') {
        Py_IgnoreEnvironmentFlag++;
        break;
    }
}
PROGRAM_OPTS is defined as BASE_OPTS, and #define BASE_OPTS L"bBc:dEhiIJm:OqRsStuvVW:xX:?" appears near the top of main.c.
_PyOS_GetOpt is implemented in Python/getopt.c; it validates the next argument and returns its option character. If an option is not in PROGRAM_OPTS, a _ is returned. It does not accept arbitrary long options of the form --argument: only --help and --version are recognized, and a bare -- ends option parsing with a return value of -1.
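The scanning behaviour described above can be sketched as follows. This is a simplified illustration and not the real _PyOS_GetOpt: the names `sketch_getopt`, `sketch_getopt_state` and `sketch_getopt_init` are hypothetical, the state lives in a struct instead of globals, and the special cases for --help, --version and error messages are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical, simplified option scanner: NOT the real _PyOS_GetOpt. */
typedef struct {
    int optind;             /* index of the next argv entry to examine */
    const wchar_t *optarg;  /* argument of the last option, if any */
    const wchar_t *next;    /* unread characters of the current "-abc" entry */
} sketch_getopt_state;

static void sketch_getopt_init(sketch_getopt_state *st)
{
    st->optind = 1;
    st->optarg = NULL;
    st->next = L"";
}

/* Returns the option character, '_' for an invalid option,
   or -1 when there are no more options. */
static int sketch_getopt(sketch_getopt_state *st, int argc,
                         wchar_t **argv, const wchar_t *optstring)
{
    const wchar_t *found;
    wchar_t option;

    assert(st != NULL && argv != NULL && optstring != NULL);

    if (*st->next == L'\0') {
        if (st->optind >= argc)
            return -1;
        /* Not an option at all, or a lone "-". */
        if (argv[st->optind][0] != L'-' || argv[st->optind][1] == L'\0')
            return -1;
        /* A bare "--" ends the option list. */
        if (wcscmp(argv[st->optind], L"--") == 0) {
            st->optind++;
            return -1;
        }
        st->next = &argv[st->optind++][1];
    }

    option = *st->next++;
    found = wcschr(optstring, option);
    if (found == NULL)
        return '_';                  /* option not in optstring */

    if (found[1] == L':') {          /* this option takes an argument */
        if (*st->next != L'\0') {
            st->optarg = st->next;   /* attached form: "-cfoo" */
            st->next = L"";
        }
        else if (st->optind < argc) {
            st->optarg = argv[st->optind++];   /* separate form: "-c foo" */
        }
        else {
            return '_';              /* argument is missing */
        }
    }
    return (int)option;
}
```

A caller would initialize the state once and then call `sketch_getopt` in a loop until it returns -1, much like the `while` loop in Py_Main.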
In these lines, only the options E, m and c are detected:
- if E is detected, Py_IgnoreEnvironmentFlag is incremented; this flag makes Python ignore environment variables such as PYTHONPATH and PYTHONHOME.
- once m or c is detected, the following parameter is the name of a module (for the m option) or the command to execute (for the c option), so the loop terminates.
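The logic of this first pass can be sketched without any CPython machinery. The helper `sketch_ignore_environment` below is hypothetical; it walks argv directly and assumes that -W and -X are the only other options taking an argument, which is what BASE_OPTS suggests.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical helper, not part of CPython: returns 1 if -E appears
   before any -c or -m, and 0 otherwise. */
static int sketch_ignore_environment(int argc, wchar_t **argv)
{
    int i;

    assert(argv != NULL);
    for (i = 1; i < argc; i++) {
        const wchar_t *p = argv[i];

        /* Stop at the first non-option argument, a lone "-" or "--". */
        if (p[0] != L'-' || p[1] == L'\0' || wcscmp(p, L"--") == 0)
            return 0;
        for (p++; *p != L'\0'; p++) {
            if (*p == L'c' || *p == L'm')
                return 0;   /* the rest belongs to the command or module */
            if (*p == L'E')
                return 1;   /* environment variables must be ignored */
            if (*p == L'W' || *p == L'X') {
                if (p[1] == L'\0')
                    i++;    /* the argument is the next argv entry */
                break;      /* the rest of this entry is the argument */
            }
        }
    }
    return 0;
}
```

With this sketch, `python -s -E -c pass` yields 1, while `python -c pass -E` yields 0 because -E comes after -c and belongs to the command's own arguments.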
Then, Python reads the PYTHONMALLOC environment variable and uses it to set up the memory allocators.
opt = Py_GETENV("PYTHONMALLOC");
if (_PyMem_SetupAllocators(opt) < 0) {
    fprintf(stderr,
            "Error in PYTHONMALLOC: unknown allocator \"%s\"!\n", opt);
    exit(1);
}
Valid allocators are pymalloc, pymalloc_debug, malloc, malloc_debug and debug. If you’d like to learn more about them, Objects/obmalloc.c is a good place to start.
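The check above can be imitated with a small stand-in. The functions `sketch_setup_allocators` and `sketch_check_pythonmalloc` are hypothetical: the first only validates the name against the list above and installs nothing, and the second writes the error message into a buffer instead of printing to stderr and calling exit(). Treating an unset or empty variable as "use the default" is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for _PyMem_SetupAllocators: it only validates
   the name and does not install any allocator. */
static int sketch_setup_allocators(const char *name)
{
    static const char *const valid[] = {
        "pymalloc", "pymalloc_debug", "malloc", "malloc_debug", "debug"
    };
    size_t i;

    /* Assumption: an unset or empty variable means "use the default". */
    if (name == NULL || *name == '\0')
        return 0;
    for (i = 0; i < sizeof(valid) / sizeof(valid[0]); i++) {
        if (strcmp(name, valid[i]) == 0)
            return 0;
    }
    return -1;
}

/* Mirrors the check in Py_Main, but stores the message in buf
   instead of writing to stderr and exiting. */
static int sketch_check_pythonmalloc(const char *opt, char *buf, size_t size)
{
    assert(buf != NULL && size > 0);
    buf[0] = '\0';
    if (sketch_setup_allocators(opt) < 0) {
        snprintf(buf, size,
                 "Error in PYTHONMALLOC: unknown allocator \"%s\"!\n", opt);
        return -1;
    }
    return 0;
}
```

Passing a name outside the list, for example "jemalloc", returns -1 and fills the buffer with the same message Python would print.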
Python then initializes hash randomization by calling _PyRandom_Init(); the PYTHONHASHSEED environment variable can be used to set the seed. After that, it resets the warning options and the option parser, so that the second pass processes all options from the start.
_PyRandom_Init();
PySys_ResetWarnOptions();
_PyOS_ResetGetOpt();
while ((c = _PyOS_GetOpt(argc, argv, PROGRAM_OPTS)) != EOF) {
    // ...
}
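As a side note on PYTHONHASHSEED: according to the Python documentation, its value is either the string random or a decimal integer in the range [0; 4294967295]. A minimal sketch of such a check could look like the code below. The helper `sketch_parse_hash_seed` is hypothetical and is not the code used by _PyRandom_Init; treating an unset or empty variable as random is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not CPython's code. Returns 0 on success and
   fills *use_random and *seed; returns -1 for an invalid value. */
static int sketch_parse_hash_seed(const char *env, int *use_random,
                                  unsigned long *seed)
{
    char *end;
    unsigned long value;

    assert(use_random != NULL && seed != NULL);
    *seed = 0;

    /* Assumption: unset or empty behaves like "random". */
    if (env == NULL || *env == '\0' || strcmp(env, "random") == 0) {
        *use_random = 1;
        return 0;
    }
    /* Reject signs and leading whitespace, which strtoul would accept. */
    if (*env < '0' || *env > '9')
        return -1;

    errno = 0;
    value = strtoul(env, &end, 10);
    if (*end != '\0' || errno == ERANGE || value > 4294967295UL)
        return -1;

    *use_random = 0;
    *seed = value;
    return 0;
}
```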
We finally reach the loop that parses all the arguments. The options are explained in order below.
Option c
-c cmd : program passed in as string (terminates option list)
if (c == 'c') {
    size_t len;
    /* -c is the last option; following arguments
       that look like options are left for the
       command to interpret. */
    len = wcslen(_PyOS_optarg) + 1 + 1;
    command = (wchar_t *)PyMem_RawMalloc(sizeof(wchar_t) * len);
    if (command == NULL)
        Py_FatalError(
           "not enough memory to copy -c argument");
    wcscpy(command, _PyOS_optarg);
    command[len - 2] = '\n';
    command[len - 1] = 0;
    break;
}
If we encounter a c option, option parsing stops: the remaining arguments are no longer treated as interpreter options. The argument that follows -c is copied, with a newline appended, as the command to be run.
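The length computation for the -c argument is easy to misread, so here is the same copy written with the standard library only. The helper `sketch_copy_command` is hypothetical; it uses malloc() in place of PyMem_RawMalloc() and returns NULL instead of calling Py_FatalError().

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper, not part of CPython. Returns a heap copy of arg
   with '\n' appended, or NULL if memory runs out. The caller frees it. */
static wchar_t *sketch_copy_command(const wchar_t *arg)
{
    size_t len;
    wchar_t *command;

    assert(arg != NULL);
    /* One extra slot for '\n' and one for the terminating null. */
    len = wcslen(arg) + 1 + 1;
    command = (wchar_t *)malloc(sizeof(wchar_t) * len);
    if (command == NULL)
        return NULL;
    wcscpy(command, arg);
    command[len - 2] = L'\n';
    command[len - 1] = L'\0';
    return command;
}
```

For the argument `print(1)`, which has 8 characters, the buffer holds 10 wide characters: the text, the newline and the terminator.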
Option m
-m mod : run library module as a script (terminates option list)
if (c == 'm') {
    /* -m is the last option; following arguments
       that look like options are left for the
       module to interpret. */
    module = _PyOS_optarg;
    break;
}
If we encounter an m option, option parsing stops as well. The argument that follows -m is taken as the name of the module to run, and any later arguments are left for the module to interpret.
Other options
-B : don't write .py[co] files on import; also PYTHONDONTWRITEBYTECODE=x
-d : debug output from parser; also PYTHONDEBUG=x
-E : ignore PYTHON* environment variables (such as PYTHONPATH)
-h : print this help message and exit (also --help)
-i : inspect interactively after running script; forces a prompt even
if stdin does not appear to be a terminal; also PYTHONINSPECT=x
-O : optimize generated bytecode slightly; also PYTHONOPTIMIZE=x
-OO : remove doc-strings in addition to the -O optimizations
-R : use a pseudo-random salt to make hash() values of various types be
unpredictable between separate invocations of the interpreter, as
a defense against denial-of-service attacks
-Q arg : division options: -Qold (default), -Qwarn, -Qwarnall, -Qnew
-s : don't add user site directory to sys.path; also PYTHONNOUSERSITE
-S : don't imply 'import site' on initialization
-t : issue warnings about inconsistent tab usage (-tt: issue errors)
-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
see man page for details on internal buffering relating to '-u'
-v : verbose (trace import statements); also PYTHONVERBOSE=x
can be supplied multiple times to increase verbosity
-V : print the Python version number and exit (also --version)
-W arg : warning control; arg is action:message:category:module:lineno
also PYTHONWARNINGS=arg
-x : skip first line o