GROMACS Tutorial - dcTMD 估算非平衡状态的下自由能

😇本教程为基于 Multisecond ligand dissociation dynamics from atomistic simulations 第二个模拟体系 benzamidine-trypsin 编写,意在指导读者运行蛋白质-配体的dcTMD(牵引动力学模拟)模拟体系和后续ΔG(x)的估算等。本教程所用到的软件/脚本均可在Stock Lab课题组的网站获得😇。“😇”内的内容为译者添加。

Introduction

该 python 包有助于根据耗散校正的牵引动力学模拟(dcTMD)方法分析目标分子动力学轨迹。TMD模拟通过constraint设置的拉力强制配体与蛋白质发生解结合事件。利用dcTMD,可以估计沿着反应坐标的自由能分布和摩擦系数。有关方法论概述,请参阅原作者的文章 https://pubs.acs.org/doi/full/10.1021/acs.jctc.8b00835

请注意:

此软件包仍处于测试阶段。如果遇到任何错误,还请予以汇报。同时如果您使用了该软件包,恳请您引用这些文章:

S. Wolf, and G. Stock, Targeted molecular dynamics calculations of free energy profiles using a nonequilibrium friction correction., Journal of chemical theory and computation (2018)

V. Tänzel, and M. Jäger, and S. Wolf, Dissipation Corrected Targeted Molecular Dynamics, in preparation (2022)

您可通过下述指令安装:

python3 -m pip install git+ssh://git@github.com/moldyn/dcTMD.git
#OR
pip install dcTMD

Step 1. Create trajectories with Gromacs

下载此链接的教程包 😇GitHub上的,可能需要魔法😇,解压:

tar -xzvf ./tutorial_files.tar

一般来说。对于您自己项目中的输入结构,我们建议您进行至少10ns长度的初始NPT平衡模拟 😇10ns雀食很保险😇。在这里,我们已经为你做了这件事,并产生了一个平衡结构。现在你将需要一堆具有不同初始速度分布的平衡轨迹(最佳在100-200个之间,但这里我们仅演示10个)。为此,使用以下脚本生成和运行具有不同初始速度分布的 .tpr 文件:

mkdir equib
cd equib/

for i in {000..009}
do
gmx grompp -f ../3ptb_AMBER99SB_ben_pushEQUIBRUN.mdp -c ../3ptb_AMBER99SB_ben.gro -r ../3ptb_AMBER99SB_ben.gro -p ../3ptb_AMBER99SB_ben.top -n ../3ptb_AMBER99SB_ben.ndx -o 3ptb_AMBER99SB_ben_pushEQUIBRUN_"$i".tpr -maxwarn 1 
gmx mdrun -v -deffnm 3ptb_AMBER99SB_ben_pushEQUIBRUN_"$i"
done

由于只有 0.1 ns 长度的模拟,因此它们应该在相当短的时间内完成。然后通过以下方式为生成牵引模拟:

cd ..
mkdir v0.001
cd v0.001/

for i in {000..009}
do
gmx grompp -f ../3ptb_AMBER99SB_ben_pushRUN_v0.001.mdp -c ../equib/3ptb_AMBER99SB_ben_pushEQUIBRUN_"$i".gro -p ../3ptb_AMBER99SB_ben.top -n ../3ptb_AMBER99SB_ben.ndx -o 3ptb_AMBER99SB_ben_pushRUN_0.001_"$i".tpr
gmx mdrun -v -deffnm 3ptb_AMBER99SB_ben_pushRUN_0.001_"$i"
done

请注意,符号_0.001代表以 0.001 nm/ps,即 1 m/s 进行牵引。根据我们目前的经验,这是一个最佳速度,在缓慢拉动和计算工作量最小之间进行了最佳权衡。最终3ptb_AMBER99SB_ben_pushRUN_0.001_*_pullf.xvg内容类似下图:
在这里插入图片描述

Step 2.1. dcTMD via Work

从约束拉动模拟中分析非平衡力-时间轨迹的一种方法是首先计算功,然后估计自由能。该方法在WorkEstimator类中实现,并且在计算上比计算力自相关函数(在ForceEstimator类别中实现)更高效,因为该类别通过力-时间轨迹的积分来处理工作数据,使得在获得相似结果的同时显著降低分辨率。因此,我们建议将这种方法用于大型数据集 😇以下内容请自行在Python环境中编写(例如使用 pycharm 或 jupyter notbook),原教程使用 jupyter notbook 进行演示,此处限于翻译环境不再展示每个 cell 的输出。需要注意的是,这个程序包也有指令界面,如果你更喜欢 bash 指令这样的操作方式,可以简单看看这一部分,然后去看Step 2.3. Command Line Interface 指令界面😇

0. Load packages and define variables

# load packages
import numpy as np
import dcTMD
from dcTMD.storing import WorkSet
from dcTMD.dcTMD import WorkEstimator
from dcTMD.utils import plotting
# define variables
velocity = 0.001
res = 1
verbose = True
temperature = 300

1. Create a work set

要计算自由能和摩擦估计,需要一个用于储存功(work)数据集(set)。它包含综合力-时间轨迹、生成包含文件名的数组。这可以通过函数dcTMD.io.load_pullf()来完成,该函数以 glob 模式或文件名内包含 pullf 的文件作为参数。

# load the files
pullf_files = '../../tests/testdata/pullf_filenames.dat' # or
pullf_files = '../../tests/testdata/*pullf.xvg'
filenames = dcTMD.io.load_pullf(pullf_files)

filenames 

随后功集是通过 WorkSet 实例来创建的。分辨率参数控制数据集的步长。在对力-时间轨迹进行积分之后执行缩减。对于长轨迹,例如3500000帧,建议将 resolution 分辨率值设置为 >1000,以确保不要超过您的硬件。

# create WorkSet instance
workset = WorkSet(velocity=velocity,
                  resolution=res,
                  verbose=False,
                  )
workset

# fit/fill workset
workset.fit(filenames)
# save workset
#dcTMD.storing.save('my_workset', workset)

2. Check normality of work distribution

dcTMD 需要满足的主要条件之一是功的正态分布。这可以通过不同的方法进行检查,例如绘制功-时间的轨迹、不同x位置的正态性检查、Kolmogorov-Smirnov 检验、Shapiro-Wilk 检验。注意:如果功的分布不正常,则会影响结果。路径分离是必要的。有关路径分离的理论,请参阅…😇原作者也没给出,您已进入未知领域🦹😇

# plot workset
import matplotlib.pyplot as plt
from dcTMD.utils import plotting

fig, ax = plt.subplots()
plotting.plot_worklines(workset.position_, workset.work_, ax)
plt.show()

# check if work distribution follows a normal distribution
from scipy.stats import kstest, shapiro
from dcTMD.utils import plotting

index = [5000, 15000]  #这个参数对应我们的模拟时间或反应距离(二者是正比关系嘛),这里即为 0.5nm-1.5nm
x = workset.position_

fig, axs = plotting.plot_worknormalitychecks(x, workset.work_, index)

for i, p in enumerate(index):
    # Shapiro-Wilk Test
    shapiro_test = shapiro(workset.work_[:,p])
    print(f'shapiro wilkins results at x={x[p]} is {shapiro_test}')
    # Kolmogorov-Smirnov Test
    kstest_test = kstest(workset.work_[:,p], 'norm')
    print(f'Kolmogorov-Smirnov results at x={x[p]} is {kstest_test}')

结果类似下图:
在这里插入图片描述

3. Derive estimates from workset and get result

此处需要先创建WorkEstimator实例,然后使用WorkEstimator实例拟合以前创建的功集。

# create WorkEstimator instance
workestimator = WorkEstimator(temperature)
# fit existing workset
# or load an existing workset
# workset = dcTMD.storing.load(my_workset)
workestimator.fit(workset)
vars(workestimator) #这句在实际应用中可以不写

在软件包中,有几个简单的绘图函数来帮助你获得结果的大致图像。例如plot_dcTMD_results()。而我们一般还要需要对摩擦估计进行平滑处理,这可以通过dcTMD.utils.smooting.gaussfilter_friction()dcTMD.WorkEstimator.smooth_friction(sigma, mode),sigma 是以 nm 为单位的高斯分步的标准偏差😇或者说平滑窗口大小😇,mode 参数决定了如何将输入阵列扩展到其边界之外。注意:使用大数据集和大平滑窗口可能导致计算时间过长。

# simply plot dcTMD results
fig, axs = plotting.plot_dcTMD_results(workestimator.position_,
                                      workestimator,
                                      workestimator.friction_)

plt.show()

# smooth friction and plot results
workestimator.smooth_friction(sigma=0.1, mode='nearest')

fig, axs = plotting.plot_dcTMD_results(workestimator.position_,
                                      workestimator,
                                      workestimator.friction_smooth_)
plt.show()

在这里插入图片描述

使用不同的平滑窗口大小和模式会显著更改结果。由于模式中的边界处理方式不同,mode=‘earest’会导致对右边界的高估(模拟结束),而mode=‘reflect’会导致左边界偏低。以下是一些示例:

# smooth friction and plot results

workestimator.smooth_friction(sigma=0.1, mode='nearest')

fig, axs = plt.subplots()

plotting.plot_Gamma(workestimator.position_,
                    workestimator.friction_smooth_,
                    axs,
                    label='nearest 0.1nm'
                    )

# using different smoothing windows and modes changes the results significantly
smooth_friction = dcTMD.utils.gaussfilter_friction(workestimator.friction_,
                                                   workestimator.position_,
                                                   sigma=.01,
                                                   mode='reflect',
                                                   )
axs.plot(workestimator.position_, smooth_friction, label='reflect .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(workestimator.friction_,
                                                   workestimator.position_,
                                                   sigma=.01, 
                                                   mode='nearest',
                                                   )
axs.plot(workestimator.position_, smooth_friction, label='nearest .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(workestimator.friction_,
                                                   workestimator.position_,
                                                   sigma=.1,
                                                   mode='reflect',
                                                   )
                                                   
axs.plot(workestimator.position_, smooth_friction, label='reflect .1nm')
axs.legend()

plt.show()

😇下图左侧为教程数据,中间为译者重新计算的数据,右侧为原作者已发表的文章中的数据,不知为何教程数据差别挺大的😇

在这里插入图片描述

4. Error estimation

结果的误差估计是通过自举法实现的,下面提供两种不同的模式:

n_resamples = 1000

# bootstrapping error in mode std
mode = 'std'
workestimator.estimate_free_energy_errors(n_resamples, mode)
x = workestimator.position_
dG = workestimator.dG_
s_dG = workestimator.s_dG_

fig, ax = plt.subplots()
plotting.plot_dG(x, dG, ax)
ax.fill_between(x, dG-s_dG, dG+s_dG, alpha=0.5)
plt.show()

# bootstrapping error in with confidence interval 
# this gives a lower and upperbound estimate
confidence_interval = 0.9
mode = confidence_interval
workestimator.estimate_free_energy_errors(n_resamples, mode)

x = workestimator.position_
dG = workestimator.dG_
s_dG = workestimator.s_dG_

fig, ax = plt.subplots()
plotting.plot_dG(x, dG, ax)
plt.fill_between(x, s_dG[0], s_dG[1], alpha=0.5)
plt.show()

结果如下图:

在这里插入图片描述

5. Save and load results

两种保存方式:

# save forceestimator instance
dcTMD.storing.save('my_workestimator', workestimator)

# save data as .npz and .dat file
outname = 'my_workestimator_results'
dcTMD.io.write_output(outname, workestimator)

results = np.load(f'{outname}_N{len(workestimator.names_)}_dG.npz')

results.files

Step 2.2. dcTMD via Force

😇因为两种方法较为相似,所以译者对这一部分进行了整合及概括。首先需要加载包和设定参数。随后类似于使用功作为输入,使用力作为输入时,为了计算自由能和摩擦估计,需要一个包含所有力-时间轨迹的力集。这可以通过函数dcTMD.io.load_pullf()来完成,该函数以 glob 模式或文件名内包含pullf的文件作为参数。随后ForceSet实例来创建力集;利用forceset.fit将此力集拟合为相应的功;再使用ForceEstimator完成自由能等的估算😇

# load packages
import numpy as np
from dcTMD.dcTMD import ForceEstimator
from dcTMD.storing import ForceSet, load
import dcTMD
import matplotlib.pyplot as plt
from dcTMD.utils import plotting

# define variables
velocity = 0.001
res = 1
verbose = True
temperature = 300

# load the files
pullf_files = '../../tests/testdata/pullf_filenames.dat'
# OR
pullf_files = '../../tests/testdata/*pullf.xvg'
filenames = dcTMD.io.load_pullf(pullf_files)

# create ForceSet instance
forceset = ForceSet(velocity=velocity,
                  resolution=res,
                  verbose=verbose,
                  )
# fit/fill workset
forceset.fit(filenames)
# save workset
#dcTMD.storing.save('my_forceset', forceset)

# plot workset
fig, ax = plt.subplots()
plotting.plot_worklines(forceset.position_, forceset.work_, ax)
plt.show()

# check if work distribution follows a normal distribution
from scipy.stats import kstest, shapiro
from dcTMD.utils import plotting
index = [5000, 15000]     #这个参数对应我们的模拟时间或反应距离(二者是正比关系嘛),这里即为 0.5nm-1.5nm
x = forceset.position_
fig, axs = plotting.plot_worknormalitychecks(x, forceset.work_, index)
for i, p in enumerate(index):
    # Shapiro-Wilk Test
    shapiro_test = shapiro(forceset.work_[:,p])
    print(f'shapiro wilkins results at x={x[p]} is {shapiro_test}')
    # Kolmogorov-Smirnov Test
    kstest_test = kstest(forceset.work_[:,p], 'norm')
    print(f'Kolmogorov-Smirnov results at x={x[p]} is {kstest_test}')

# Instantiate a ForceEstimator instance and fit it with the ForceSet instance
forceestimator = ForceEstimator(temperature)
forceestimator.fit(forceset)
vars(forceestimator)

# plot dcTMD results
fig, axs = plotting.plot_dcTMD_results(forceestimator.position_,
                                      forceestimator,
                                      forceestimator.friction_)
plt.show()
# smooth friction and plot results
forceestimator.smooth_friction(sigma=0.1, mode='nearest')
fig, axs = plotting.plot_dcTMD_results(forceestimator.position_,
                                      forceestimator,
                                      forceestimator.friction_smooth_)
plt.show()

# different smoothing setting
fig, axs = plt.subplots()
plotting.plot_Gamma(forceestimator.position_,
                    forceestimator.friction_smooth_,
                    axs,
                    label='nearest 0.1nm'
                    )
# using different smoothing windows and modes changes the results significantly
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   forceestimator.position_,
                                                   sigma=.01,
                                                   mode='reflect',
                                                   )
axs.plot(forceestimator.position_, smooth_friction, label='reflect .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   forceestimator.position_,
                                                   sigma=.01, 
                                                   mode='nearest',
                                                   )
axs.plot(forceestimator.position_, smooth_friction, label='nearest .01nm')
smooth_friction = dcTMD.utils.gaussfilter_friction(forceestimator.friction_,
                                                   forceestimator.position_,
                                                   sigma=.1,
                                                   mode='reflect',
                                                   )                                            
axs.plot(forceestimator.position_, smooth_friction, label='reflect .1nm')
axs.legend()
plt.show()

# save forceestimator instance
dcTMD.storing.save('my_forceestimator', forceestimator)
# OR save data as .npz and .dat file
outname = 'my_forceestimator_results'
dcTMD.io.write_output(outname, forceestimator)

results = np.load(f'{outname}_N{len(forceestimator.names_)}_dG.npz')
results.files

Step 2.3. Command Line Interface 指令界面

我们的软件包具有一个名为 “dcTMD” 的Python命令行界面(CLI)脚本。脚本加载力文件,生成功/力集,并计算平均功、耗散功、自由能和摩擦系数。输出保存为 .npz.dat 文件,可以使用--save_dataset选项分别保存功/力集。此外,使用--plot选项可以生成结果的概览图。CLI还提供了用于调试和误差分析的选项。要获得所有模块的概述,只需在shell中运行以下命令。

python -m dcTMD

内容如下:

Usage: python -m dcTMD [OPTIONS]

  ------------------------- |         dcTMD         |  ------------------------- 
  Calculate free energy and friction for given constraint force files.

  Analysis tools for dissipation-corrected targeted molecular dynamics, which
  is an enhanced sampling method to enforce rare events in biomolecular
  systems. When publishing results gained with this python package, please
  cite the following publications: (1) Tänzel, Victor and Jäger, Miriam and
  Wolf, Steffen in preparation. (2) Wolf, Steffen, and Gerhard Stock.
  "Targeted molecular dynamics calculations of free energy profiles using a
  nonequilibrium friction correction." Journal of chemical theory and
  computation 14.12 (2018): 6175- 6182.

Options:
  -m, --mode [work|force]  Use either work or force autocovariance function to
                           calculatedcTMD quantities.  [default: work;
                           required]
  -f, --file TEXT          Input: File containing list of all constraint force
                           file namesor glob pattern e.g."*.xvg" to generate a
                           list of all constraint force files using
                           glob.glob()  [required]
  -o, --outname PATH       Output: Path/prefix of output names.
  -T, --temperature FLOAT  Simulation temperature in K.  [required]
  -vel, --velocity FLOAT   Pulling velocity in nm/ps.  [required]
  --res INTEGER            Striding to reduce size of returned free energy and
                           friction  [default: 1]
  -s, --sigma FLOAT        Standard deviation of gaussian filter in nm.
  --resamples INTEGER      Number of resamples used in optional bootstrapping.
                           This is only available in mode work
  -v, --verbose            Enable verbose mode.  [default: False]
  -p, --plot               Plots free energy and smoothed friction.  [default:
                           False]
  -sd, --save_dataset      Save the Work/ForceSet class to file.  [default:
                           False]
  --help                   Show this message and exit.

例如:

python -m dcTMD -f '*.xvg' -o path/to/output/name  -T 290.5 -v 0.001 --verbose --plot --save_dataset

😇意味着将当前文件夹下所有以 .xvg 为后缀的文件导入为约束力文件,输出到path/to/output/name文件夹内,温度设置为 290.5,牵引速度为 0.001,verbose 模式进行平滑和误差分析等(像2.1. 3/4 一样,把所有设置都输出一遍),绘制概略图,保存功/力集,其余选项采用默认参数😇

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值