NumPy Documentation and Installing NumPy
1. NumPy
NumPy brings the computational power of languages like C and Fortran to Python, a language much easier to learn and use.
1.1 NumPy documentation
https://numpy.org/doc/stable/index.html
2. NumPy in Chinese
NumPy is the fundamental package for scientific computing with Python.
2.1 NumPy reference manual (Chinese)
https://www.numpy.org.cn/reference/
2.2 NumPy Chinese documentation - GitHub
https://github.com/teadocs/numpy-cn
3. Installing NumPy
The only prerequisite for installing NumPy is Python itself. If you don’t have Python yet and want the simplest way to get started, we recommend you use the Anaconda Distribution: it includes Python, NumPy, and many other commonly used packages for scientific computing and data science.
prerequisite [priːˈrekwəzɪt]: n. a precondition, something that must be in place first; adj. required beforehand
3.1 pip & conda
NumPy can be installed with conda, with pip, with a package manager on macOS and Linux, or from source. The two main tools that install Python packages are pip and conda. Their functionality partially overlaps (e.g. both can install numpy), but they can also work together.
CONDA
If you use conda, you can install NumPy from the defaults or conda-forge channels:
# Best practice, use an environment rather than install in the base env
conda create -n my-env
conda activate my-env
# If you want to install from conda-forge
conda config --env --add channels conda-forge
# The actual install command
conda install numpy
PIP
If you use pip, you can install NumPy with:
pip install numpy
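Whichever tool was used, a quick way to confirm the install is to check that NumPy can be imported by the interpreter you intend to run. The following is a minimal sketch; the helper name `numpy_status` is made up for illustration and is not part of NumPy.

```python
import importlib.util
import sys


def numpy_status():
    """Return (installed, version) for NumPy in the current interpreter.

    Hypothetical helper for illustration; not part of NumPy.
    """
    # find_spec returns None when the package cannot be located
    if importlib.util.find_spec("numpy") is None:
        return False, None
    import numpy
    return True, numpy.__version__


installed, version = numpy_status()
print("interpreter:", sys.executable)
print("NumPy installed:", installed, "version:", version)
```

If the script reports that NumPy is missing even though an install succeeded, the install probably went to a different Python than the one running the script.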
The first difference is that conda is cross-language and it can install Python, while pip is installed for a particular Python on your system and installs other packages to that same Python install only. This also means conda can install non-Python libraries and tools you may need (e.g. compilers, CUDA, HDF5), while pip can’t.
The second difference is that pip installs from the Python Package Index (PyPI), while conda installs from its own channels (typically “defaults” or “conda-forge”). PyPI is by far the largest collection of packages; however, all popular packages are available for conda as well.
The third difference is that conda is an integrated solution for managing packages, dependencies and environments, while with pip you may need another tool (there are many!) for dealing with environments or complex dependencies.
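Because pip is tied to one particular Python, it helps to know which interpreter and which environment are active before installing anything. The sketch below uses only the standard library. `describe_environment` and `pip_install_command` are hypothetical helper names, and the `CONDA_DEFAULT_ENV` variable is assumed to be exported by `conda activate`, as it is in typical conda setups.

```python
import os
import sys


def describe_environment():
    """Summarize the active Python environment (hypothetical helper)."""
    return {
        "interpreter": sys.executable,
        # Inside a venv/virtualenv, sys.prefix differs from sys.base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
        # Assumption: conda activate exports the active environment name
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
    }


def pip_install_command(package):
    """Build a pip command bound to the running interpreter.

    Invoking pip as `<python> -m pip` targets the same Python install
    that is executing this script.
    """
    python = sys.executable or "python"  # fall back if the path is unknown
    return [python, "-m", "pip", "install", package]


print(describe_environment())
print(" ".join(pip_install_command("numpy")))
```

The command is only printed here; passing the list to `subprocess.run` would actually perform the install.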
3.2 NumPy packages & accelerated linear algebra libraries
NumPy doesn’t depend on any other Python packages; however, it does depend on an accelerated linear algebra library, typically Intel MKL or OpenBLAS. Users don’t have to worry about installing those (they’re automatically included in all NumPy install methods). Power users may still want to know the details, because the BLAS library in use can affect performance, behavior and size on disk:
- The NumPy wheels on PyPI, which are what pip installs, are built with OpenBLAS. The OpenBLAS libraries are included in the wheel. This makes the wheel larger, and if a user installs (for example) SciPy as well, they will now have two copies of OpenBLAS on disk.
- In the conda defaults channel, NumPy is built against Intel MKL. MKL is a separate package that will be installed in the user’s environment when they install NumPy.
- In the conda-forge channel, NumPy is built against a dummy “BLAS” package. When a user installs NumPy from conda-forge, that BLAS package then gets installed together with the actual library. This defaults to OpenBLAS, but it can also be MKL (from the defaults channel), or even BLIS or reference BLAS.
- The MKL package is a lot larger than OpenBLAS: it’s about 700 MB on disk, while OpenBLAS is about 30 MB.
- MKL is typically a little faster and more robust than OpenBLAS.
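To find out which BLAS library a particular NumPy build uses, NumPy offers `numpy.show_config()`, which prints build information including the BLAS/LAPACK details. A minimal sketch follows; `print_blas_info` is a hypothetical wrapper, and because the output format differs between NumPy versions, no sample output is shown.

```python
import importlib.util


def print_blas_info():
    """Print NumPy's build configuration, if NumPy is installed.

    Returns True when the configuration was printed, False otherwise.
    Hypothetical helper for illustration.
    """
    if importlib.util.find_spec("numpy") is None:
        print("NumPy is not installed in this interpreter")
        return False
    import numpy as np
    # show_config() prints build details, including the BLAS/LAPACK
    # libraries NumPy was built against (for example OpenBLAS or MKL)
    np.show_config()
    return True


printed = print_blas_info()
```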
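algebra [ˈældʒɪbrə]: n. algebra (the branch of mathematics)
against [əˈɡenst]: prep. in opposition to; contrary to; to the disadvantage of; in competition with; touching or leaning on; as protection from; with ... as a background; compared with; as a charge on or offset to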
Besides install sizes, performance and robustness, there are two more things to consider:
- Intel MKL is not open source. For normal use this is not a problem, but if a user needs to redistribute an application built with NumPy, this could be an issue.
- Both MKL and OpenBLAS will use multi-threading for function calls like np.dot, with the number of threads being determined by both a build-time option and an environment variable. Often all CPU cores will be used. This is sometimes unexpected for users; NumPy itself doesn’t auto-parallelize any function calls. It typically yields better performance, but can also be harmful, for example when using another level of parallelization with Dask, scikit-learn or multiprocessing.
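The environment-variable control mentioned above can be sketched as follows. The variable names `OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` are the ones commonly honored by OpenBLAS and MKL builds; which of them matters depends on how the BLAS library was built, so this example sets all three as an assumption. `limit_blas_threads` is a hypothetical helper name. The variables generally need to be set before NumPy is first imported in the process.

```python
import os

# Environment variables commonly honored by BLAS threading layers.
# They should be set before NumPy (and its BLAS) is first imported.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def limit_blas_threads(n_threads):
    """Request at most n_threads BLAS threads via environment variables."""
    if n_threads < 1:
        raise ValueError("n_threads must be >= 1")
    for name in THREAD_VARS:
        os.environ[name] = str(n_threads)


limit_blas_threads(1)  # e.g. one BLAS thread per worker process

try:
    import numpy as np  # imported only after the limits are in place
except ImportError:
    np = None

if np is not None:
    a = np.ones((200, 200))
    # The matrix product should now respect the requested thread limit,
    # provided NumPy had not already been imported in this process
    print(np.dot(a, a)[0, 0])
```

Limiting BLAS to one thread per process is a common choice when an outer layer such as multiprocessing already uses all cores, since it avoids oversubscribing the CPU.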