Python for Finance (Python金融大数据分析): Chapter 8, Performance Python, Notes

RealEmperor

Article link: https://blog.csdn.net/weixin_42018258/article/details/80874589

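This article explores high-performance techniques for Python in financial big data analysis, including Python paradigms, memory layout, parallel computing (for example, Monte Carlo algorithms), multiprocessing, dynamic compiling (for example, Numba), and static compiling with Cython. Examples show how these techniques improve computational efficiency, particularly in complex financial algorithms such as option valuation.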

Chapter 8: Performance Python

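A number of performance libraries can be used to speed up the execution of Python code:
• Cython: merges Python with C paradigms for static compilation.
• IPython.parallel: executes code/functions in parallel, locally or on a cluster.
• numexpr: fast numerical operations.
• multiprocessing: Python's built-in module for (local) parallel processing.
• Numba: dynamically compiles Python code for the CPU.
• NumbaPro: dynamically compiles Python code for multicore CPUs and GPUs.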

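We define a convenience function that makes it possible to systematically compare the performance of different functions executed on the same or on different data sets: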

def perf_comp_data(func_list, data_list, rep=3, number=1):
    """
    Compare the performance of different functions.
    :param func_list: list with function names as strings
    :param data_list: list with data set names as strings
    :param rep: number of repetitions of the whole comparison
    :param number: number of executions for every function
    :return: None (results are printed, fastest first)
    """
    from timeit import repeat
    res_list = {}
    for i, name in enumerate(func_list):
        # build e.g. "f1(a_py)" and import both names from the main script
        stmt = name + '(' + data_list[i] + ')'
        setup = "from __main__ import " + name + ', ' + data_list[i]
        results = repeat(stmt=stmt, setup=setup, repeat=rep, number=number)
        res_list[name] = sum(results) / rep
    # sort by average time so the fastest function comes first
    res_sort = sorted(res_list.items(), key=lambda item: item[1])
    for item in res_sort:
        rel = item[1] / res_sort[0][1]
        print('function: ' + item[0] + ', av. time sec: %9.5f, ' % item[1] + 'relative: %6.1f' % rel)
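
The helper above builds statement strings and imports the names from `__main__`, so it only works when the functions and data sets are defined in the top-level script or notebook. As a minimal sketch of an alternative, assuming the callables and data objects can be passed in directly, `timeit.repeat` also accepts a zero-argument callable as `stmt`. The name `perf_comp_callables`, the helper `loop_sum`, and the returned row format are illustrative assumptions, not part of the book's code.

```python
from timeit import repeat


def perf_comp_callables(funcs, datasets, rep=3, number=1):
    """Time each funcs[i](datasets[i]); return (name, avg_seconds, relative) rows.

    Hypothetical variant of perf_comp_data that takes callables and data
    objects instead of their names as strings.
    """
    averages = {}
    for func, data in zip(funcs, datasets):
        # timeit.repeat accepts a zero-argument callable as the statement
        results = repeat(stmt=lambda: func(data), repeat=rep, number=number)
        averages[func.__name__] = sum(results) / rep
    # fastest first, with times expressed relative to the fastest entry
    ranked = sorted(averages.items(), key=lambda item: item[1])
    fastest = ranked[0][1]
    return [(name, avg, avg / fastest) for name, avg in ranked]


def loop_sum(values):
    """Illustrative workload: sum with an explicit Python loop."""
    total = 0
    for v in values:
        total += v
    return total


if __name__ == '__main__':
    data = list(range(100000))
    for name, avg, rel in perf_comp_callables([sum, loop_sum], [data, data]):
        print('function: %s, av. time sec: %9.5f, relative: %6.1f' % (name, avg, rel))
```

Returning the rows instead of printing them keeps the timing logic reusable; the printing loop reproduces the report format of the original helper.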

8.1 Python Paradigms and Performance

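In finance, as in other scientific and data-intensive disciplines, numerical computations on large data sets can be quite time-consuming. As an example, suppose we want to evaluate a somewhat complex mathematical expression on an array of 500,000 numbers. We choose the expression in the formula below because each evaluation carries some computational burden; apart from that, it has no particular meaning.
Example mathematical expression: $y = \sqrt{|\cos(x)|} + \sin(2 + 3x)$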



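from math import cos, sin


# The expression translates easily into a Python function
def f(x):
    return abs(cos(x)) ** 0.5 + sin(2 + 3 * x)

# range gives us an efficient sequence of 500,000 numbers
# (a lazy range object in Python 3, a list object in Python 2)
I = 500000
a_py = range(I)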

# Standard Python function with an explicit loop
def f1(a):
    res = []
    for x in a:
        res.append(f(x))
    return res

# Iterator approach with an implicit loop (list comprehension)
def f2(a):
    return [f(x) for x in a]

# Iterator approach with an implicit loop, using eval
def f3(a):
    ex = 'abs(cos(x))**0.5+sin(2+3*x)'
    return [eval(ex) for x in a]

# NumPy vectorized implementation
import numpy as np
a_np = np.arange(I)
def f4(a):
    return np.abs(np.cos(a)) ** 0.5 + np.sin(2 + 3 * a)

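# The specialized library numexpr evaluates numerical expressions;
# it has built-in support for multithreaded execution
# numexpr single-threaded implementation
import numexpr as ne
def f5(a):
    ex = 'abs(cos(a))**0.5+sin(2+3*a)'
    ne.set_num_threads(1)
    return ne.evaluate(ex)

# numexpr multithreaded implementation
def f6(a):
    ex = 'abs(cos(a))**0.5+sin(2+3*a)'
    ne.set_num_threads(16)
    return ne.evaluate(ex)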
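
In the two numexpr functions above, `ne.evaluate` finds the operand `a` by looking it up in the calling frame, and `set_num_threads` changes a global setting that stays in effect after the call. The following is a minimal sketch of a more explicit variant, assuming numexpr may or may not be installed: it passes the operand through `local_dict` and restores the previous thread count afterwards. The function name `f_ne` and the NumPy fallback are illustrative assumptions, not part of the original notes.

```python
import numpy as np

try:
    import numexpr as ne
except ImportError:  # numexpr is treated as optional in this sketch
    ne = None

EXPR = 'abs(cos(a))**0.5 + sin(2 + 3*a)'


def f_ne(a, threads=1):
    """Evaluate EXPR on array a with numexpr, or with NumPy if numexpr is missing."""
    if ne is None:
        return np.abs(np.cos(a)) ** 0.5 + np.sin(2 + 3 * a)
    previous = ne.set_num_threads(threads)  # returns the previous setting
    try:
        # local_dict names the operand explicitly instead of relying on
        # numexpr looking up `a` in the caller's frame
        return ne.evaluate(EXPR, local_dict={'a': a})
    finally:
        ne.set_num_threads(previous)  # restore the global thread count


if __name__ == '__main__':
    a = np.linspace(0.0, 10.0, 1000)
    print(np.allclose(f_ne(a, threads=1), f_ne(a, threads=2)))
```

Whether more threads actually help depends on the array size and the machine; for small arrays the threading overhead can outweigh the gain.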

%%time
r1=f1(a_py)
r2=f2(a_py)
r3=f3(a_py)
r4=f4(a_np)
r5=f5(a_np)
r6=f6(a_np)
# Wall time: 35.1 s

# The NumPy function allclose makes it easy to check whether two ndarray(-like) objects contain the same data
np.allclose(r1,r2)
# True
np.allclose(r1,r3)
# True
np.allclose(r1,r4)
# True
np.allclose(r1,r5)
# True
np.allclose(r1,r6)
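
The pairwise checks above can also be collected into one helper. The sketch below is self-contained and deliberately small: it re-declares the expression, compares only a list-comprehension version with a NumPy version, and uses 1,000 values instead of 500,000 so that it runs quickly. The names `f_loop`, `f_vec`, and `check_against_reference` are illustrative assumptions.

```python
from math import cos, sin

import numpy as np


def f(x):
    return abs(cos(x)) ** 0.5 + sin(2 + 3 * x)


def f_loop(a):
    """Pure Python version using a list comprehension."""
    return [f(x) for x in a]


def f_vec(a):
    """NumPy vectorized version of the same expression."""
    return np.abs(np.cos(a)) ** 0.5 + np.sin(2 + 3 * a)


def check_against_reference(results, reference_name):
    """Return {name: max abs deviation from the reference result}.

    Raises ValueError if any result is not np.allclose to the reference.
    """
    reference = np.asarray(results[reference_name], dtype=float)
    deviations = {}
    for name, values in results.items():
        values = np.asarray(values, dtype=float)
        if not np.allclose(reference, values):
            raise ValueError(name + ' disagrees with ' + reference_name)
        deviations[name] = float(np.max(np.abs(reference - values)))
    return deviations


n = 1000  # small size for a quick check; the text uses 500,000
results = {'loop': f_loop(range(n)), 'vectorized': f_vec(np.arange(n))}
print(check_against_reference(results, 'loop'))
```

Reporting the maximum deviation alongside the pass/fail check shows how close the implementations really are, since `np.allclose` only tests against a tolerance.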
