Modeling cross-regulatory influences on monolignol transcripts and proteins under single and combina

yujie2023

已于 2023-11-25 20:47:42 修改

阅读量80

点赞数 1

文章标签： matlab 算法

于 2023-11-25 19:49:10 首次发布

本文链接：https://blog.csdn.net/qq_57195341/article/details/134619034

版权

在 MATLAB 中，建立函数的基本语法如下：

function [output1, output2, ...] = functionName(input1, input2, ...)
    % 函数体，包括对输入进行处理的代码

    % 在函数体中可以使用输入参数 input1、input2，以及定义的局部变量

    % 计算结果，将结果存储在输出变量 output1、output2 中

    % 结束函数
end

解释：

function 关键字表示函数的声明。
[output1, output2, ...] 是可选的输出参数列表，用方括号括起来，如果函数没有输出参数，可以省略。
functionName 是函数的名称，自定义的标识符，遵循 MATLAB 的变量命名规则。
(input1, input2, ...) 是输入参数列表，如果函数没有输入参数，可以省略括号。
函数体内包含对输入参数进行处理的代码，以及计算结果并存储在输出变量中的代码。
函数体结束时，使用 end 关键字表示函数定义的结束。

例如，一个简单的 MATLAB 函数可以是这样的：

function result = squareNumber(x)
    % 计算输入参数 x 的平方
    result = x^2;
end

这个函数接受一个输入参数 x，计算它的平方，并将结果存储在变量 result 中。

sparse_maximum_likelihood_B.m

function[B,me]=sparse_maximum_likelihood_B(W,B,Y,K,Q,lambda_factor,lambda_factor_prev,sigma2,params)
%定义 MATLAB 函数 `sparse_maximum_likelihood_B
%该函数接受输入参数 `W, B, Y, K, Q, lambda_factor, lambda_factor_prev, sigma2, params`，并返回输出参数 `B, me`。
if(nargin==8) params=[];
         end
 % 如果输入参数的个数为 8，则将 `params` 初始化为空。
if(isfield(params,'maxiter')) maxiter=params.maxiter; else maxiter=1000;
 end
 %如果 `params` 结构体中包含字段 `maxiter`，则将其值赋给 `maxiter`；否则，默认设置 `maxiter` 为 1000。
[M,N]=size(Y);
[Y,meanY]=centerY(Y,K);

[M,N]=size(Y): 获取矩阵 Y 的大小，M 为行数，N 为列数。
[Y,meanY]=centerY(Y,K): 调用 centerY 函数对矩阵 Y 进行中心化处理，并返回中心化后的矩阵 Y 和均值 meanY。

IM=speye(M);

IM=speye(M): 创建大小为 MxM 的单位矩阵 IM。

Ylamb=zeros(M);
for i=1:M
    nontargs=K(i,:);
    Yc=Y{i}(:,nontargs)*Y{i}(:,nontargs)';
    Ylamb(i,:)=Yc(i,:);
end

Ylamb=zeros(M): 创建大小为 MxM 的零矩阵 Ylamb。
for i=1:M ... end: 循环遍历 M 行，对 Ylamb 进行计算。
nontargs=K(i,:): 获取 K 矩阵的第 i 行。
Yc=Y{i}(:,nontargs)*Y{i}(:,nontargs)': 计算矩阵 Y{i}(:,nontargs) 的外积。
Ylamb(i,:)=Yc(i,:): 将计算结果的第 i 行赋给 Ylamb。

lambda_max=max(max(abs((-sigma2*sum(K,2).*IM+Ylamb)./W)));

lambda_max=max(max(abs((-sigma2*sum(K,2).*IM+Ylamb)./W))): 计算正则化参数 lambda 的最大值。

lambda=lambda_max*lambda_factor;

lambda=lambda_max*lambda_factor: 计算正则化参数 lambda。

Dlambda=(2*lambda_factor-lambda_factor_prev)*lambda_max*W;
S=(abs(Q)>=Dlambda);

Dlambda=(2*lambda_factor-lambda_factor_prev)*lambda_max*W: 计算正则化参数的变化量。
S=(abs(Q)>=Dlambda): 创建一个与 Q 大小相同的矩阵 S，元素值为 abs(Q) >= Dlambda 的逻辑值。

for i=1:M/2
    S(i,i+M/2)=0;
end

for i=1:M/2 ... end: 将 S 矩阵的前半部分对角线元素设为零。

s=sum(S,2);
ei=zeros(M,1);
B=B.*S;

s=sum(S,2): 计算 S 矩阵每行元素的和。
ei=zeros(M,1): 创建大小为 Mx1 的零向量 ei。
B=B.*S: 将 B 矩阵中对应于 S 中非零元素的位置上的元素保留，其他位置设为零。

for k=1:maxiter
    Bold=B;
    for i=1:M
        nontargs=K(i,:);
        Nhat=sum(nontargs);
        if (s(i))
            ei(i)=1;
            zi=(IM-B)\ei;
            js_i=find(S(i,:)>0);
            for j=js_i
                m_ij=zi(j);
                B_old=B(i,j);
                if(j~=i)
                    lambdaW=lambda*W(i,j);
                    y_j=Y{i}(j,nontargs)';
                    BiT=B(i,:);
                    BiT(j)=0;
                    a_iT=(ei'-BiT)*Y{i}(:,nontargs);
                    r_ij=y_j'*y_j;
                    beta_ij=a_iT*y_j;
                    if(m_ij==0)
                        Bij=(beta_ij-lambdaW)/r_ij;
                        if Bij>0
                            B(i,j)=Bij;
                        else
                            Bij=(beta_ij+lambdaW)/r_ij;
                            if Bij<0
                                B(i,j)=Bij;
                            else
                                B(i,j)=0;
                            end
                        end
                    else
                        d_ij=1/m_ij+B(i,j);
                        theta_ijp=r_ij*d_ij+beta_ij-lambdaW;
                        k_ijp=d_ij*(beta_ij-lambdaW)-Nhat*sigma2;
                        q_ijp=theta_ijp^2-4*r_ij*k_ijp;
                        Bijpp=(1/(2*r_ij))*(theta_ijp+sqrt(q_ijp));
                        Bijpm=(1/(2*r_ij))*(theta_ijp-sqrt(q_ijp));
                        q_ijm=q_ijp +4*lambdaW*(beta_ij-r_ij*d_ij);
                        theta_ijm=theta_ijp+2*lambdaW;
                        Bijmm=(1/(

这段代码是一个用于网络推断的 MATLAB 脚本。以下是代码的主要步骤和解释：

load ../TranscriptProteinAbundances.mat % load lignin data

Ytable = Ymeanlines_imputed; % Mean of the experimental replicates
Xmask = Xmeanlines_mask; % indicates targeted transcripts

Y = Ytable{:,:}'; % Rows of Y are the transcripts and proteins, columns are the experiments
X = Xmask{:,:}';

[M, N] = size(Y); % M is the number of transcripts and proteins, N is the number of experiments

K = ~logical(X); % binary matrix size MxN, 1 when transcript/protein was not targeted in an experiment, 0 when it was

[Ysc, Asc] = scaleY(Y); % Scale each transcript and protein so that their abundances fall between 0 and 1

%% Set parameters
lambda_factors = 10.^(0:-.1:-3); % SML regularization parameters
params.Kcv = 5; % # of cross-validation folds
params.rho_factors = 10.^(-6:0.1:1); % Ridge regularization parameters
params.lambda_factors = lambda_factors;
params.maxiter = 1000; % max iterations for SML algorithm
params.cv_its = 5; % # of iterations of the kcv-fold cross-validations

%% Network inference
[ilambda_cv] = cross_validation_SML_B(Y, K, params); % Use cross-validation to determine SML regularization parameter
[rho_factor, sigma2] = cross_validation_ridge_B(Ysc, K, params); % Use cross-validation to determine ridge regularization parameter, and estimate sigma2 for SML

[BRhat, mueRhat] = constrained_ridge_B(Ysc, K, rho_factor); % Calculate ridge regression of Ysc using rho_factor from cross_validation

BR = diag(1./Asc) * BRhat * diag(Asc); % Un-scales calculated BRhat and mueRhat to use with Y
mueR = diag(1./Asc) * mueRhat;

W = 1./abs(BRhat); % Use BRhat to compute regularization weights to use with Ysc

Q = Inf; % Ignore discarding rules from SML

% Initialize variables for SML
lambda_factor_prev = 1;
BLhat = zeros(M);

for ilambda = 1:ilambda_cv
    [BLhat, mueLhat] = sparse_maximum_likelihood_B(W, BLhat, Ysc, K, Q, lambda_factors(ilambda), lambda_factor_prev, sigma2, params);    
    
    Q = Inf;
    lambda_factor_prev = lambda_factors(ilambda);
end % ilambda

SL = abs(sign(BLhat)); % Detected edges from SML algorithm
[BChat, mueChat] = constrained_ML_B(BLhat, SL, Ysc, K, sigma2, params); % Use constrainedML algorithm on just the identified edges. This solves for the weights of the identified edges without the l1-norm penalty

BC = diag(1./Asc) * BChat * diag(Asc); % Un-scale BChat and mueChat to use with Y
mueC = diag(1./Asc) * mueChat;

BCsparse = sparse(BC);

Lignin_modelSML.m

% rng_seed=1010;
% rng(rng_seed) %set seed for random number generator, for reproducibility
这段代码涉及 MATLAB 中的随机数生成器的种子设置，目的是为了可重复性。具体来说：

- `rng_seed=1010;`：设置了一个固定的种子值（1010），这个值是可自定义的，但在此代码中被设定为1010。
- `rng(rng_seed)`：使用上述设置的种子值初始化 MATLAB 的随机数生成器。通过设置种子，可以确保每次运行代码时生成的随机数序列是相同的，从而使结果可重复。
这在科学研究中很常见，特别是在涉及到随机性的算法或实验中。通过使用相同的种子值，研究人员可以确保其他人或未来的实验可以复制他们的结果。
%%% Load transcript and protein abundances

load ../TranscriptProteinAbundances.mat % load lignin data

Ytable=Ymeanlines_imputed; % Mean of the experimental replicates
Xmask=Xmeanlines_mask; % indicates targeted transcripts


Y=Ytable{:,:}'; % Rows of Y are the transcripts and proteins, columns are the experiments
X=Xmask{:,:}';

[M,N]=size(Y); % M is number of transcripts and proteins, N is number of experiments

K=~logical(X); % binary matrix size MxN, 1 when transcript/protein was not targeted in an experiment, 0 when it was

[Ysc,Asc]=scaleY(Y); %Scale each transcript and protein so that their abundances fall between 0 and 1


%%set parameters
lambda_factors=10.^(0:-.1:-3); %SML regularization parameters
params.Kcv=5; % # of cross-validation folds
params.rho_factors=10.^(-6:0.1:1); %Ridge regularization parameters
params.lambda_factors=lambda_factors;
params.maxiter=1000; % max iterations for SML algorithm
params.cv_its=5; % # of iterations of the kcv-fold cross-validations

%%network inference     
[ilambda_cv]=cross_validation_SML_B(Y,K,params); % Use cross-validation to determine SML regularization parameter
[rho_factor,sigma2]=cross_validation_ridge_B(Ysc,K,params); % Use cross-validation to determine ridge regularization parameter, and estimate sigma2 for SML

[BRhat,mueRhat]=constrained_ridge_B(Ysc,K,rho_factor); % Calculate ridge regression of Ysc using rho_factor from cross_validation

BR=diag(1./Asc)*BRhat*diag(Asc); % Un-scales calculated BRhat and mueRhat to use with Y
mueR=diag(1./Asc)*mueRhat;

W=1./abs(BRhat); % Use BRhat to compute regularization weights to use with Ysc

Q=Inf; % Ignore discarding rules from SML

% Initialize variables for SML
lambda_factor_prev=1;
BLhat=zeros(M);

for ilambda=1:ilambda_cv
    
    [BLhat,mueLhat]=sparse_maximum_likelihood_B(W,BLhat,Ysc,K,Q,lambda_factors(ilambda),lambda_factor_prev,sigma2,params);    
    
    Q=Inf;
    lambda_factor_prev=lambda_factors(ilambda);
end %ilambda

SL=abs(sign(BLhat)); % Detected edges from SML algorithm
[BChat,mueChat]=constrained_ML_B(BLhat,SL,Ysc,K,sigma2,params); % Use constrainedML algorithm on just the identified edges. This solves for the weights of the identifed edges without the l1-norm penalty

BC=diag(1./Asc)*BChat*diag(Asc); % Un-scale BChat and mueChat to use with Y
mueC=diag(1./Asc)*mueChat;

BCsparse=sparse(BC);

这段 MATLAB 代码执行了一系列操作，其中涉及到 Sparse Maximum Likelihood (SML) 以及一些网络推断的方法。以下是逐行的解释：

load ../TranscriptProteinAbundances.mat: 从文件 “TranscriptProteinAbundances.mat” 中加载转录本和蛋白质的丰度数据。
Ytable=Ymeanlines_imputed; Xmask=Xmeanlines_mask;: 从数据中提取均值，Ytable 包含实验复制品的平均值，Xmask 指示哪些转录本是被定向的。
Y=Ytable{:,:}'; X=Xmask{:,:}';: 将 Ytable 和 Xmask 转换为矩阵形式，其中 Y 包含转录本和蛋白质的行，实验的列，而 X 是一个二进制矩阵，指示哪些转录本是被定向的。
[M,N]=size(Y); K=~logical(X);: 获取矩阵 Y 的大小，K 是一个二进制矩阵，表示哪些转录本是未被定向的（值为1），哪些是被定向的（值为0）。
[Ysc,Asc]=scaleY(Y);: 将转录本和蛋白质的丰度进行缩放，确保它们的值在 0 到 1 之间。
lambda_factors=10.^(0:-.1:-3); params.Kcv=5; params.rho_factors=10.^(-6:0.1:1); params.lambda_factors=lambda_factors; params.maxiter=1000; params.cv_its=5;: 设置网络推断的参数，包括 SML 的正则化参数、交叉验证的折数等。
[ilambda_cv]=cross_validation_SML_B(Y,K,params);: 使用交叉验证确定 SML 正则化参数。
[rho_factor,sigma2]=cross_validation_ridge_B(Ysc,K,params);: 使用交叉验证确定岭回归的正则化参数，并估计 SML 方法中的 sigma2。
[BRhat,mueRhat]=constrained_ridge_B(Ysc,K,rho_factor);: 利用岭回归计算 Ysc 的回归结果，使用从交叉验证中得到的 rho_factor。
BR=diag(1./Asc)*BRhat*diag(Asc); mueR=diag(1./Asc)*mueRhat;: 将计算得到的 BRhat 和 mueRhat 进行反缩放，得到 BR 和 mueR。
W=1./abs(BRhat);: 使用 BRhat 计算用于 Ysc 的正则化权重。
Q=Inf;: 初始化丢弃规则。
lambda_factor_prev=1; BLhat=zeros(M);: 初始化 SML 所需的变量。
for ilambda=1:ilambda_cv: 迭代进行 SML 方法的计算。
[BLhat,mueLhat]=sparse_maximum_likelihood_B(W,BLhat,Ysc,K,Q,lambda_factors(ilambda),lambda_factor_prev,sigma2,params);: 调用 SML 方法进行稀疏最大似然估计。
SL=abs(sign(BLhat));: 从 SML 算法中得到的 BLhat 计算得到边的检测结果。
[BChat,mueChat]=constrained_ML_B(BLhat,SL,Ysc,K,sigma2,params);: 在仅包括被检测到的边的情况下，使用受限最大似然估计算法。
BC=diag(1./Asc)*BChat*diag(Asc); mueC=diag(1./Asc)*mueChat;: 将计算得到的 BChat 和 mueChat 进行反缩放，得到最终的结果。
BCsparse=sparse(BC);: 将 BC 转换为稀疏矩阵。

这段代码实现了一个复杂的网络推断过程，其中包括了 SML 方法以及其他一些技术。这些技术的细节可能涉及到统计学、机器学习和数学领域。如果你对其中的某一部分有具体的问题，我可以提供更详细的解释。

这个函数的目的是对输入矩阵 Y 进行缩放，使得每一行的元素范围都在 [0, 1] 之间。以下是函数的详细解释：

scaleY.m
function [Ysc, Ascaling] = scaleY(Y)
% Y is an M x N matrix with non-negative elements
% Scale Data so each row has the range [0 1]

% Find the maximum value along each row
Ymax = max(Y, [], 2);

% Calculate the scaling factor for each row
Ascaling = 1 ./ Ymax;

% Scale the data by multiplying each row by its corresponding scaling factor
Ysc = diag(Ascaling) * Y;

end

解释：

Y 是一个大小为 M x N 的矩阵，其中的元素都是非负数。
Ymax 计算了每一行的最大值。
Ascaling 是一个向量，包含了每一行的缩放因子。每个缩放因子是 1 除以对应行的最大值。
diag(Ascaling) 创建了一个对角矩阵，其中对角线上的元素是 Ascaling 中的元素。
Ysc 是经过缩放的矩阵，通过将每一行乘以对应的缩放因子得到。

因此，函数的输出 Ysc 是一个每行元素范围在 [0, 1] 之间的矩阵，而 Ascaling 是用于还原缩放的缩放因子。

Sparse_maximum_likelihood_B.m
这是一个用于稀疏极大似然估计（Sparse Maximum Likelihood，SML）的 MATLAB 函数。下面是对代码的详细解释：

输入参数解释：
- W: 正则化惩罚项的权重矩阵。
- B: 待估计的稀疏系数矩阵。
- Y: 观测数据矩阵。
- K: 二进制矩阵，表示哪些观测是非目标观测。
- Q: 二进制矩阵，表示哪些系数需要被忽略。
- lambda_factor: 正则化参数。
- lambda_factor_prev: 先前的正则化参数。
- sigma2: 方差。
- params: 一些其他参数，是一个结构体，包含 maxiter（最大迭代次数）。
函数主体解释：
- Y 和 meanY 被标准化，其中 meanY 存储了每个观测的平均值。
- 计算了一些矩阵，如 IM（单位矩阵）和 Ylamb（根据观测数据生成的矩阵）。
- 计算了正则化参数 lambda 和 Dlambda。
- 通过迭代进行 SML 估计，其中更新了系数矩阵 B，并在每次迭代中更新了一些辅助矩阵和向量。
- 在每次迭代中，通过解线性或二次方程来更新 B 的值。
- 最后，计算 delta_B，如果小于阈值则停止迭代。
输出解释：
- B: 最终的系数矩阵。
- me: 与 B 相关的估计均值。

总体而言，这个函数实现了一种通过迭代求解 SML 问题的方法，其中在每次迭代中通过解线性或二次方程来估计系数矩阵。

centerY.m
这是一个用于中心化数据的 MATLAB 函数。下面是对代码的详细解释：

输入参数解释：
- Y：原始观测数据矩阵。
- K：二进制矩阵，表示哪些观测是非目标观测。
函数主体解释：
- Y_bar 是一个单元格数组，其中每个元素 Y_bar{i} 是对应观测的非目标观测的均值向量。
- Yc 是一个单元格数组，其中每个元素 Yc{i} 是对应观测减去其均值向量得到的中心化观测数据矩阵。
中心化过程解释：
- 对于每个观测 i，计算非目标观测的数量 Nhat。
- 计算 Y_bar{i}，即对应观测的非目标观测的均值向量。
- 计算 Yc{i}，即将原始观测数据矩阵 Y 减去对应观测的均值向量。
输出解释：
- Yc：中心化后的观测数据矩阵数组，每个元素对应一个观测的中心化数据。
- Y_bar：均值向量数组，每个元素对应一个观测的非目标观测的均值向量。

总体而言，这个函数实现了中心化数据的操作，以便在后续处理中更好地处理观测数据。

Constrained_ML_B.m
这是一个 MATLAB 函数，用于执行约束最大似然估计。下面是对该函数的解释：

function [B,me]=constrained_ML_B(B,S,Y,K,sigma2,params)

% 检查是否提供了参数，如果没有，则创建一个空的参数结构
if(nargin==5)
    params=[];
end

% 设置默认的最大迭代次数
if(isfield(params,'maxiter'))
    maxiter=params.maxiter;
else
    maxiter=1000;
end

% 获取输入矩阵的大小
[M,~]=size(Y);

% 创建单位矩阵
IM=speye(M);

% 对输入数据进行中心化处理
[Y,meanY]=centerY(Y,K);

% 初始化一些变量
ei=zeros(M,1);
s=sum(S,2);
I=speye(M);

% 迭代求解
for k=1:(maxiter)
    Bold=B;
    for i=1:M
        % 获取未被目标的实验的索引
        nontargs=K(i,:);
        % 计算未被目标实验的数量
        Nhat=sum(nontargs);
        if (s(i))
            ei(i)=1; % ei 是单位向量
            
            % zi 用于计算 m_ij=cofactor_ij/det(I-B)
            zi=(I-B)\ei;
            js_i=find(S(i,:)>0);
            for j=js_i
                m_ij=zi(j);
                B_old=B(i,j);
                y_j=Y{i}(j,nontargs)';
                
                % 移除 B(i,j) 并获取 B\ij 的第 i 行
                BiT=B(i,:);
                BiT(j)=0;
                a_iT=(ei'-BiT)*Y{i}(:,nontargs);
                
                r_ij=y_j'*y_j;
                beta_ij=a_iT*y_j;
                
                if (m_ij==0) % 转到线性方程
                    B(i,j)=beta_ij/r_ij;
                else % 如果 m_ij ~= 0 转到二次方程
                    d_ij=1/m_ij+B(i,j);
                    theta_ij=r_ij*d_ij+beta_ij;
                    k_ij=d_ij*beta_ij-Nhat*sigma2;
                    q_ij=theta_ij^2-4*r_ij*k_ij;
                    Bijp=(1/(2*r_ij))*(theta_ij+sqrt(q_ij));
                    Bijm=(1/(2*r_ij))*(theta_ij-sqrt(q_ij));
                    candsBij=[Bijm, Bijp]';
                    Lss=sigma2*Nhat*log((d_ij-candsBij).^2)/2-0.5*sum((ones(length(candsBij),1)*a_iT-candsBij*Y{i}(j,nontargs)).^2,2);
                    [~,l_max]= max(Lss);
                    B(i,j)=candsBij(l_max);
                end %m_ij
                dB=B_old-B(i,j);
                zi=zi/(1+dB*m_ij);
            end%j
            % 现在对扰动-基因连接矩阵 F 进行坐标上升
            ei(i)=0;
        end %s(i)
    end %i
    
    % 计算 B 的变化
    delta_B=norm(Bold-B,'fro')/(norm(Bold,'fro')+1e-10);
    
    % 如果 B 的变化小于 1e-2，则停止求解 B
    if delta_B<1e-2
        for i=1:M
            me(i,1)=(IM(i,:)-B(i,:))*meanY{i}; % 计算与 B(i,:) 相关的 mue(i)
        end
        return
    end
end %k

% 计算与 B(i,:) 相关的 mue(i)
for i=1:M
    me(i,1)=(IM(i,:)-B(i,:))*meanY{i};
end

这个函数的目的是通过坐标上升法来执行约束最大似然估计，其中 B 是一个矩阵，S 是一个二值矩阵，Y 是一个细胞数组，K 是一个二值矩阵，sigma2 是方差参数，params 是可选参数。在迭代过程中，它尝试找到最大似然估计的解，同时满足一些约束条件。

Constrained_ridge_B.m
这是一个 MATLAB 函数，用于执行约束 Ridge 回归。下面是对该函数的解释：

function [B,me]=constrained_ridge_B(Y,K,rho_factor)

% 获取输入数据的大小
[M,~]=size(Y);

% 将输入数据以中心值为0的方式进行中心化处理
[Y,meanY]=centerY(Y,K);

% 初始化 B 矩阵和单位矩阵
B=zeros(M);
IM=speye(M);

for i=1:M
    % 如果是转录本，相关蛋白质不能进行调节
    if i<=M/2
        rows_rem=[i i+M/2];
        I=speye(M-2);
    else
        rows_rem=i;
        I=speye(M-1);
    end
    
    % 获取未被目标实验的索引
    nontargs=K(i,:);
    % 计算未被目标实验的数量
    Nhat=sum(nontargs);
    yi=Y{i}(i,nontargs)';
    Yi=Y{i}(:,nontargs)';
    
    % 移除相关的行
    Yi(:,rows_rem)=[];
    
    % 设置 Ridge 参数
    Pi=eye(Nhat);
    rho=rho_factor*norm(Yi'*Pi)^2;
    
    % 计算 Ridge 回归系数
    bi=(Yi'*Pi*Yi+rho*I)\(Yi'*(Pi*yi));

    % 更新 B 矩阵
    indexes=1:M;
    indexes(rows_rem)=[];
    B(i,indexes)=bi';
    
    % 计算与 B(i,:) 相关的 mue(i)
    me(i,1)=(IM(i,:)-B(i,:))*meanY{i};
end

这个函数的目的是通过 Ridge 回归执行约束最大似然估计，其中 Y 是一个细胞数组，K 是一个二值矩阵，rho_factor 是 Ridge 回归的参数。在迭代过程中，它尝试找到最大似然估计的解，同时满足一些约束条件。

Knockdown_visualizations.m

function []=Knockdown_Visualizations(g,TargConst,Yact_table,Xtable,Ypred,legend_flag)
if ~isfield(Ypred,'prevhalf')
    Ypred.prevhalf=zeros(size(Ypred.prevfull));
    prevhalf_flag=0;
else
    prevhalf_flag=1;
end

colors=[180, 180, 180; % Gray 1
    204, 0, 0; % Wolfpack Red
    253, 215, 38; % Hunt Yellow
    209, 73, 5; % Pyroman Flame
    66, 126, 147; % Innovation Blue
    65, 86, 161; % Bioindigo
    230 230 230]./255; % Gray 2


Experiments=Yact_table.Properties.RowNames;
GeneLabels=Yact_table.Properties.VariableNames;
Yact=Yact_table{:,:}';
[M,~]=size(Yact);

%%% Group experiments into their constructs
[Gc, Const]=findgroups(cellfun(@(x) x(6:8),Experiments,'UniformOutput',false));
[Gcl, ConstLines]=findgroups(cellfun(@(x) x(6:end-2),Experiments,'UniformOutput',false));

%%

i=find(strcmp(GeneLabels,['t' g])); %index of transcript for goi
ip=i+M/2; %index of protein for goi
TargConst_ind=find(cellfun(@(x) strcmp(x,TargConst),Const)); %indices of the coi

TargConstLine_ind=find(cellfun(@(x) contains(x,TargConst),ConstLines)); %indices for each line for the coi

WT_ind=find(cellfun(@(x) contains(x,'WT_'),Experiments));
Ywt=Yact_table{WT_ind,1:M}';
Ywt_avg=mean(Ywt,2);


TargGene_inds=find(Gc==TargConst_ind); % index for target gene
it=find(Xtable{TargGene_inds(1),:});
it_prot=it+M/2;
TargGeneNames=GeneLabels(it);

for n=1:length(TargConstLine_ind)
    jinds{n}=find(Gcl==TargConstLine_ind(n)); % indices coi
end


Ymeans=zeros(3,length(TargConstLine_ind)); % mean transcript of interest
Ymeans_p=zeros(4,length(TargConstLine_ind)); % mean protein of interest
Ymeans_t=zeros(length(it),length(TargConstLine_ind)); % mean targeted transcript(s)
Ymeans_tprot=zeros(length(it),length(TargConstLine_ind));% mean targeted protein(s)
for n=1:length(TargConstLine_ind)
    if numel(jinds{n})==1
        
        Ymeans_t(:,n)=Yact(it,jinds{n});
        Ymeans_tprot(:,n)=Yact(it_prot,jinds{n});
        Ymeans(:,n)=[Yact(i,jinds{n}) Ypred.full(i,jinds{n}) Ypred.prevfull(i,jinds{n})];
        Ymeans_p(:,n)=[Yact(ip,jinds{n}) Ypred.full(ip,jinds{n}) Ypred.prevfull(ip,jinds{n}) Ypred.prevhalf(ip,jinds{n})];
        
    else
        
        Ymeans_t(:,n)=mean(Yact(it,jinds{n})');
        Ymeans_tprot(:,n)=mean(Yact(it_prot,jinds{n})');
        Ymeans(:,n)=mean([Yact(i,jinds{n})' Ypred.full(i,jinds{n})' Ypred.prevfull(i,jinds{n})']);
        Ymeans_p(:,n)=mean([Yact(ip,jinds{n})' Ypred.full(ip,jinds{n})' Ypred.prevfull(ip,jinds{n})' Ypred.prevhalf(ip,jinds{n})']);
    end
end

% Calculate WTs
Ymeans_twt=mean([Yact(i,WT_ind)' Ypred.full(i,WT_ind)' Ypred.prevfull(i,WT_ind)']);
Ymeans_pwt=mean([Yact(ip,WT_ind)' Ypred.full(ip,WT_ind)' Ypred.prevfull(ip,WT_ind)' Ypred.prevhalf(ip,WT_ind)']);

% Combine Y_t and Y and Y_t and Y_tp
Ymean_trans=100*[Ymeans_twt; Ymeans']./Ywt_avg(i);
Ymean_prot=100*[Ymeans_pwt; Ymeans_p']./Ywt_avg(ip);

% sort lines by targeted knockdown level
[~, sort_it]=sort(sum(Ymeans_t./Ywt_avg(it),1),'descend');
sort_it=[1 1+sort_it];
Ymean_trans=Ymean_trans(sort_it,:);
Ymean_prot=Ymean_prot(sort_it,:);

wtjinds=[WT_ind jinds];
wtjinds=wtjinds(sort_it);

% plot transcript estimates
figure;
hold on

if prevhalf_flag
    Ymean_trans=[Ymean_trans Ymean_trans(:,1)];
    b=bar(Ymean_trans,'LineWidth',1);
    
    b(1).FaceColor=[.9 .9 .9];
    b(2).FaceColor=[245 194 190]./255;
    b(3).FaceColor=[245 228 190]./255;
    b(4).FaceColor=[255 204 153]./255;
    
    b(1).EdgeColor=colors(1,:);
    b(2).EdgeColor=colors(2,:);
    b(3).EdgeColor=colors(3,:);
    b(4).EdgeColor=colors(4,:);
    
else
    b=bar(Ymean_trans,'LineWidth',1);
    
    b(1).FaceColor=[.9 .9 .9];
    b(2).FaceColor=[245 194 190]./255;
    b(3).FaceColor=[245 228 190]./255;
    
    b(1).EdgeColor=colors(1,:);
    b(2).EdgeColor=colors(2,:);
    b(3).EdgeColor=colors(3,:);
end

pause(1)

for n=1:size(Ymean_trans,1)
    plot(b(1).XData(n)+b(1).XOffset,100*Yact(i,wtjinds{n})./Ywt_avg(i),'ko','MarkerFaceColor',colors(1,:),'MarkerSize',8)
    plot(b(2).XData(n)+b(2).XOffset,100*Ypred.full(i,wtjinds{n})./Ywt_avg(i),'ko','MarkerFaceColor',colors(2,:),'MarkerSize',8)
    plot(b(3).XData(n)+b(3).XOffset,100*Ypred.prevfull(i,wtjinds{n})./Ywt_avg(i),'ko','MarkerFaceColor',colors(3,:),'MarkerSize',8)
    if prevhalf_flag
        plot(b(4).XData(n)+b(4).XOffset,100*Ypred.prevhalf(i,wtjinds{n})./Ywt_avg(i),'ko','MarkerFaceColor',colors(4,:),'MarkerSize',8)
    end
end

wt_line=refline(0,100);
wt_line.LineStyle='--';
wt_line.LineWidth=0.75;
wt_line.Color='k';
ylabel('Abundance (% of WT)','FontSize',18)
set(gca,'XTick',1:length(TargConstLine_ind)+1,'XTickLabel',['WT'; ConstLines(TargConstLine_ind(sort_it(2:end)-1))],'FontSize',18)
set(gcf,'Color','w')
ylimits=ylim;
if ylimits(2)<120
    set(gca,'YLim',[0 120])
end
title(GeneLabels(i),'FontSize',22)
if legend_flag
    leg_text={'Exper.','New Model Est.', 'Old Model-S1 Est.'};
    if prevhalf_flag
        leg_text={'Exper.','New Model Est.', 'Old Model-S1 Est.','Old Model-S2 Est.'};
    end
    legend(leg_text,'Location','Best')
end

% Plot Protein Estimates
figure;
hold on
if prevhalf_flag
    b=bar(Ymean_prot,'LineWidth',1);
    
    b(1).FaceColor=[.9 .9 .9];
    b(2).FaceColor=[245 194 190]./255;
    b(3).FaceColor=[245 228 190]./255;
    b(4).FaceColor=[255 204 153]./255;
    
    b(1).EdgeColor=colors(1,:);
    b(2).EdgeColor=colors(2,:);
    b(3).EdgeColor=colors(3,:);
    b(4).EdgeColor=colors(4,:);
    
else
    b=bar(Ymean_prot(:,1:(end-1)),'LineWidth',1);
    
    b(1).FaceColor=[.9 .9 .9];
    b(2).FaceColor=[245 194 190]./255;
    b(3).FaceColor=[245 228 190]./255;
    
    b(1).EdgeColor=colors(1,:);
    b(2).EdgeColor=colors(2,:);
    b(3).EdgeColor=colors(3,:);
end

pause(1)

for n=1:size(Ymean_trans,1)
    plot(b(1).XData(n)+b(1).XOffset,100*Yact(ip,wtjinds{n})./Ywt_avg(ip),'ko','MarkerFaceColor',colors(1,:),'MarkerSize',8)
    plot(b(2).XData(n)+b(2).XOffset,100*Ypred.full(ip,wtjinds{n})./Ywt_avg(ip),'ko','MarkerFaceColor',colors(2,:),'MarkerSize',8)
    plot(b(3).XData(n)+b(3).XOffset,100*Ypred.prevfull(ip,wtjinds{n})./Ywt_avg(ip),'ko','MarkerFaceColor',colors(3,:),'MarkerSize',8)
    if prevhalf_flag
        plot(b(4).XData(n)+b(4).XOffset,100*Ypred.prevhalf(ip,wtjinds{n})./Ywt_avg(ip),'ko','MarkerFaceColor',colors(4,:),'MarkerSize',8)
    end
end

wt_line=refline(0,100);
wt_line.LineStyle='--';
wt_line.LineWidth=0.75;
wt_line.Color='k';
ylabel('Abundance (% of WT)','FontSize',18)
set(gca,'XTick',1:length(TargConstLine_ind)+1,'XTickLabel',['WT'; ConstLines(TargConstLine_ind(sort_it(2:end)-1))],'FontSize',18)
set(gcf,'Color','w')
ylimits=ylim;
if ylimits(2)<120
    set(gca,'YLim',[0 120])
end
title(GeneLabels(ip),'FontSize',22)
if legend_flag
    leg_text={'Exper.','New Model Est.', 'Old Model-S1 Est.'};
    if prevhalf_flag
        leg_text={'Exper.','New Model Est.', 'Old Model-S1 Est.','Old Model-S2 Est.'};
    end
    legend(leg_text,'Location','Best')
end

%% Plot Targeted Bar plots
colors_gray=gray(3*length(it)+2);

Ymeans_targ=100*[Ywt_avg(it)'; Ymeans_t']./Ywt_avg(it)';
Ymeans_targprot=100*[Ywt_avg(it_prot)'; Ymeans_tprot']./Ywt_avg(it_prot)';

Ymeans_targ=Ymeans_targ(sort_it,:);
Ymeans_targprot=Ymeans_targprot(sort_it,:);

% Plot Transcript Estimates
figure;
hold on
b=bar(Ymeans_targ,'LineWidth',1);
for nt=1:length(it)
    b(nt).FaceColor=colors_gray(2*length(it)+nt+1,:);
    b(nt).EdgeColor=colors_gray(3*nt+1,:);
end

pause(1)
for nt=1:length(it)
    for n=1:size(Ymean_trans,1)
        plot(b(nt).XData(n)+b(nt).XOffset,100*Yact(it(nt),wtjinds{n})./Ywt_avg(it(nt)),'ko','MarkerFaceColor',colors_gray(3*nt+1,:),'MarkerSize',8)
    end
end
wt_line=refline(0,100);
wt_line.LineStyle='--';
wt_line.LineWidth=0.75;
wt_line.Color='k';
ylabel('Abundance (% of WT)','FontSize',18)
set(gca,'XTick',1:length(TargConstLine_ind)+1,'XTickLabel',['WT';ConstLines(TargConstLine_ind(sort_it(2:end)-1))],'FontSize',18)
set(gcf,'Color','w')
ylimits=ylim;
if ylimits(2)<120
    set(gca,'YLim',[0 120])
end
title('Transcripts of Targeted Genes','FontSize',22)
if legend_flag
    legend(TargGeneNames,'Location','Best')
end

这是一个 MATLAB 函数，用于可视化实验结果和模型预测，特别是针对靶向基因的敲除。以下是对每一句代码的解释：

function []=Knockdown_Visualizations(g,TargConst,Yact_table,Xtable,Ypred,legend_flag): 定义了一个 MATLAB 函数 Knockdown_Visualizations，用于绘制实验结果和模型预测的可视化。接受多个输入参数，包括基因名 g、目标构建名 TargConst、实验数据表 Yact_table、实验设计表 Xtable、模型预测结果 Ypred，以及是否显示图例的标志 legend_flag。
if ~isfield(Ypred,'prevhalf'): 检查 Ypred 结构体中是否包含字段 ‘prevhalf’，如果没有，将其设置为零矩阵，并将标志 prevhalf_flag 设置为 0。
else: 如果 Ypred 结构体中包含 ‘prevhalf’ 字段，将 prevhalf_flag 设置为 1。
colors=[180, 180, 180; ...]/255;: 定义颜色矩阵，用于绘制图表中的不同元素的颜色。
Experiments=Yact_table.Properties.RowNames;: 获取实验数据表中的实验名称。
GeneLabels=Yact_table.Properties.VariableNames;: 获取实验数据表中的基因名称。
Yact=Yact_table{:,:}';: 将实验数据表的转置作为实验数据矩阵。
[M,~]=size(Yact);: 获取实验数据矩阵的大小，其中 M 为行数。
% Group experiments into their constructs: 将实验分组到它们的构建中。
[Gc, Const]=findgroups(cellfun(@(x) x(6:8),Experiments,'UniformOutput',false));: 使用 findgroups 函数，根据实验名称的第 6 到第 8 个字符将实验分组，并获取构建的名称。
[Gcl, ConstLines]=findgroups(cellfun(@(x) x(6:end-2),Experiments,'UniformOutput',false));: 同样，将实验分组到它们的构建线中，获取构建线的名称。
i=find(strcmp(GeneLabels,['t' g]));: 获取基因 g 对应的转录本在 GeneLabels 中的索引。
ip=i+M/2;: 获取对应的蛋白质的索引。
TargConst_ind=find(cellfun(@(x) strcmp(x,TargConst),Const));: 获取目标构建的索引。
TargConstLine_ind=find(cellfun(@(x) contains(x,TargConst),ConstLines));: 获取目标构建线的索引。
WT_ind=find(cellfun(@(x) contains(x,'WT_'),Experiments));: 获取野生型实验的索引。
Ywt=Yact_table{WT_ind,1:M}';: 获取野生型实验的实验数据。
Ywt_avg=mean(Ywt,2);: 计算野生型实验的平均实验数据。
TargGene_inds=find(Gc==TargConst_ind);: 获取目标基因的实验索引。
it=find(Xtable{TargGene_inds(1),:});: 获取目标基因的实验设计表索引。
it_prot=it+M/2;: 获取目标基因的蛋白质实验设计表索引。
TargGeneNames=GeneLabels(it);: 获取目标基因的名称。
for n=1:length(TargConstLine_ind): 开始循环，遍历目标构建线。
jinds{n}=find(Gcl==TargConstLine_ind(n));: 获取每个目标构建线中实验的索引。
Ymeans=zeros(3,length(TargConstLine_ind));: 初始化储存平均转录本数据的矩阵。
Ymeans_p=zeros(4,length(TargConstLine_ind));: 初始化储存平均蛋白质数据的矩阵。
Ymeans_t=zeros(length(it),length(TargConstLine_ind));: 初始化储存平均靶向转录本数据的矩阵。
Ymeans_tprot=zeros(length(it),length(TargConstLine_ind));: 初始化储存平均靶向蛋白质数据的矩阵。
for n=1:length(TargConstLine_ind): 开始循环，遍历目标构建线。
if numel(jinds{n})==1: 如果目标构建线中的实验数量为 1。
Ymeans_t(:,n)=Yact(it,jinds{n});: 将实验数据中靶向转录本的数据赋值给矩阵。
Ymeans_tprot(:,n)=Yact(it_prot,jinds{n});: 将实验数据中靶向蛋白质的数据赋值给矩阵。
Ymeans(:,n)=[Yact(i,jinds{n}) Ypred.full(i,jinds{n}) Ypred.prevfull(i,jinds{n})];: 将实验数据中转录本的数据和模型预测的数据赋值给矩阵。
Ymeans_p(:,n)=[Yact(ip,jinds{n}) Ypred.full(ip,jinds{n}) Ypred.prevfull(ip,jinds{n}) Ypred.prevhalf(ip,jinds{n})];: 将实验数据中蛋白质的数据和模型预测的数据赋值给矩阵。
else: 如果目标构建线中的实验数量大于 1。
Ymeans_t(:,n)=mean(Yact(it,jinds{n})');: 计算靶向转录本的平均值。

Model_predictions.m

function Ypred=Model_Predictions(Xact,Xtarg,Xfulltrans)
if nargin<3
    prevhalf_flag=0;
else
    prevhalf_flag=1;
    if size(Xfulltrans)~=[size(Xtarg,1)/2 size(Xtarg,2)]
        error('Wrong dimensions in Xfulltrans (full transcript profile for previous model).')
    end
end

if size(Xact)~=size(Xtarg)
    error('Xact and Xtarg must have same dimensions.')
end

load ../ModelDetails.mat
load ../TranscriptProteinAbundances.mat Ywt_avg

Ywt_avg=Ywt_avg{:,:}';

[M,Nact]=size(Xtarg);

% For Old Model:
% 'full' uses experimental transcript values for just the targeted and avg.
% WT for the un-targeted transcripts
muprevfull=[Ywt_avg(1:M/2,1); zeros(M/2,1)];
Xadj_prevfull=Xtarg;

if prevhalf_flag
    % 'half' uses experimental transcript values for all transcripts, not just targeted
    muprevhalf=zeros(M,1);
    Xadj_prevhalf=[Xfulltrans;zeros(M/2,Nact)];
end

% For New Model:
% Set proteins of targeted genes to percentage of their wild-type abundance
Ytranswtavg=Ywt_avg(1:M/2,1);
Yprotwtavg=Ywt_avg(1+M/2:M,1);
Yprotnewtarg=Yprotwtavg.*(Xtarg(1:M/2,:)./Ytranswtavg);

Xadj_full=[Xtarg(1:M/2,:); Yprotnewtarg]; % adjust targeted values to include the proteins

% Allocate space for variables
Ypred.prevhalf=zeros(M,Nact);
Ypred.prevfull=zeros(M,Nact);
Ypred.full=zeros(M,Nact);

for j=1:Nact
    
    Kappa=diag(~([Xact(1:M/2,j); Xact(1:M/2,j)])); %for experiment j, diagonal matrix with 0 for targeted transcripts/proteins, 1 for others
    Kappa_prev=diag([~Xact(1:M/2,j);ones(M/2,1)]);
    
    % predictions from previous model
    Ypred.prevhalf(:,j)=(eye(M)-Kappa_prev*Boldmodel)\(Kappa_prev*muprevhalf+Xadj_prevhalf(:,j));
    Ypred.prevfull(:,j)=(eye(M)-Kappa_prev*Boldmodel)\(Kappa_prev*muprevfull+Xadj_prevfull(:,j));
    
    % predictions from our proposed model
    Ypred.full(:,j)=(eye(M)-Kappa*B)\(Kappa*mue+Xadj_full(:,j));
end

这是一个 MATLAB 函数，用于进行模型预测。以下是对每一句代码的解释：

function Ypred=Model_Predictions(Xact,Xtarg,Xfulltrans): 定义了一个 MATLAB 函数 Model_Predictions，它接受三个输入参数 Xact、Xtarg 和 Xfulltrans，并返回预测结果 Ypred。
if nargin<3: 检查输入参数的数量是否小于 3。
prevhalf_flag=0;: 如果输入参数数量小于 3，则将 prevhalf_flag 设置为 0。
else: 如果输入参数数量大于等于 3，则执行以下步骤。
prevhalf_flag=1;: 将 prevhalf_flag 设置为 1，表示有第三个输入参数。
if size(Xfulltrans)~=[size(Xtarg,1)/2 size(Xtarg,2)]: 检查 Xfulltrans 的维度是否符合预期，如果不符合，则抛出错误。
error('Wrong dimensions in Xfulltrans (full transcript profile for previous model).'): 如果维度不符，抛出错误信息。
end: 结束条件判断。
if size(Xact)~=size(Xtarg): 检查 Xact 和 Xtarg 的维度是否相同，如果不相同，则抛出错误。
error('Xact and Xtarg must have same dimensions.'): 如果维度不同，抛出错误信息。
load ../ModelDetails.mat: 从文件 ../ModelDetails.mat 中加载模型细节。
load ../TranscriptProteinAbundances.mat Ywt_avg: 从文件 ../TranscriptProteinAbundances.mat 中加载转录本和蛋白质的丰度信息，将其赋值给 Ywt_avg。
Ywt_avg=Ywt_avg{:,:}';: 将 Ywt_avg 转置为行向量。
[M,Nact]=size(Xtarg);: 获取矩阵 Xtarg 的大小，其中 M 是行数，Nact 是列数。
% For Old Model:: 注释，说明以下是用于旧模型的代码。
% 'full' uses experimental transcript values for just the targeted and avg.: 注释，说明 ‘full’ 模式使用实验中靶向和平均的转录本值。
% WT for the un-targeted transcripts: 注释，说明未靶向的转录本使用野生型（WT）的值。
muprevfull=[Ywt_avg(1:M/2,1); zeros(M/2,1)];: 旧模型的 ‘full’ 模式中，使用野生型的转录本值，其中靶向的部分使用实验值，未靶向的部分使用零。
Xadj_prevfull=Xtarg;: 旧模型的 ‘full’ 模式中，将 Xtarg 赋值给 Xadj_prevfull。
if prevhalf_flag: 如果存在第三个输入参数。
muprevhalf=zeros(M,1);: 旧模型的 ‘half’ 模式中，所有的转录本都使用实验值，未靶向的部分使用零。
Xadj_prevhalf=[Xfulltrans;zeros(M/2,Nact)];: 旧模型的 ‘half’ 模式中，将 Xfulltrans 和零组成的矩阵赋值给 Xadj_prevhalf。
% For New Model:: 注释，说明以下是用于新模型的代码。
% Set proteins of targeted genes to percentage of their wild-type abundance: 注释，说明将靶向基因的蛋白质设置为其野生型丰度的百分比。
Ytranswtavg=Ywt_avg(1:M/2,1);: 获取野生型转录本的平均丰度。
Yprotwtavg=Ywt_avg(1+M/2:M,1);: 获取野生型蛋白质的平均丰度。
Yprotnewtarg=Yprotwtavg.*(Xtarg(1:M/2,:)./Ytranswtavg);: 根据靶向基因的转录本值，调整靶向基因的蛋白质值。
Xadj_full=[Xtarg(1:M/2,:); Yprotnewtarg];: 将调整后的靶向基因的转录本和蛋白质值组成新的矩阵，赋值给 Xadj_full。
% Allocate space for variables: 注释，说明为变量分配空间。
Ypred.prevhalf=zeros(M,Nact);: 初始化旧模型 ‘half’ 模式的预测结果矩阵。
Ypred.prevfull=zeros(M,Nact);: 初始化旧模型 ‘full’ 模式的预测结果矩阵。
Ypred.full=zeros(M,Nact);: 初始化新模型的预测结果矩阵。
for j=1:Nact: 开始循环，遍历所有实验。
Kappa=diag(~([Xact(1:M/2,j); Xact(1:M/2,j)]));: 创建一个对角矩阵，其中靶向部分的元素为 0，未靶向的部分为 1，用于新模型的预测。
Kappa_prev=diag([~Xact(1:M/2,j);ones(M/2,1)]);: 创建一个对角矩阵，其中 ‘half’ 模式中靶向和未靶向的元素分别为 0 和 1，用

TranscriptProteinModelPredictions.m

clear
clc

load ../TranscriptProteinAbundances.mat

%% To emulate experiments in paper
Ytable=Yallexps_imputed;
Xtable=Xallexps_mask;
[N,M]=size(Ytable);
Xmask=Xtable{:,:}';
Xtarg=Xmask.*Yallexps_imputed{:,:}';
Xfulltrans=Yallexps_imputed{:,1:M/2}';

Ypred=Model_Predictions(Xmask,Xtarg,Xfulltrans);

GeneLabels=Ytable.Properties.VariableNames;
Experiments=Ytable.Properties.RowNames;
[Gc, Const]=findgroups(cellfun(@(x) x(6:8),Experiments,'UniformOutput',false)); %Group experiments into their different constructs (targeted knockdowns)

potential_genes_of_interest=cellfun(@(x) x(2:end),GeneLabels(1:20),'UniformOutput',0); % List of potential genes of interest for visualization
potential_knockdowns_of_interest=Const; % List of potential constructs/targeted knockdowns for visualization

%% Create bar plots showing new and old model estimates (as seen in paper)
gene_of_interest='HCT1'; % See potential_genes_of_interest for other options
knockdown_of_interest='i29'; % See potential_knockdowns_of_interest for other options
legend_flag=1; %1 for legend on plots, 0 for no legend

Knockdown_Visualizations(gene_of_interest,knockdown_of_interest,Ytable,Xtable,Ypred,legend_flag)

这是一段MATLAB代码，用于模拟实验数据并进行可视化。我将逐行解释每一句：

clear: 清除工作区的所有变量。
clc: 清除命令窗口的内容。
load ../TranscriptProteinAbundances.mat: 从文件路径 ../TranscriptProteinAbundances.mat 中加载数据，其中包含转录本和蛋白质的丰度信息。
Ytable=Yallexps_imputed;: 将加载的数据中的表格 Yallexps_imputed 赋值给变量 Ytable。
Xtable=Xallexps_mask;: 将加载的数据中的表格 Xallexps_mask 赋值给变量 Xtable。
[N,M]=size(Ytable);: 获取表格 Ytable 的大小，N是行数，M是列数。
Xmask=Xtable{:,:}';: 将表格 Xtable 转置并转换为矩阵，赋值给 Xmask。
Xtarg=Xmask.*Yallexps_imputed{:,:}';: 将 Xmask 与原始数据的转录本信息相乘，得到目标矩阵 Xtarg。
Xfulltrans=Yallexps_imputed{:,1:M/2}';: 从原始数据中提取前一半的转录本信息，赋值给 Xfulltrans。
Ypred=Model_Predictions(Xmask,Xtarg,Xfulltrans);: 调用函数 Model_Predictions，使用 Xmask、Xtarg 和 Xfulltrans 作为参数，得到预测结果 Ypred。
GeneLabels=Ytable.Properties.VariableNames;: 获取表格 Ytable 的变量名，即基因标签。
Experiments=Ytable.Properties.RowNames;: 获取表格 Ytable 的行名，即实验标签。
[Gc, Const]=findgroups(cellfun(@(x) x(6:8),Experiments,'UniformOutput',false));: 根据实验标签的第6到第8个字符，将实验分组为不同的构造 (Const)，并返回分组的索引和构造名称。
potential_genes_of_interest=cellfun(@(x) x(2:end),GeneLabels(1:20),'UniformOutput',0);: 获取前20个基因的标签，去掉第一个字符（假设是 ‘t’），赋值给 potential_genes_of_interest。
potential_knockdowns_of_interest=Const;: 将构造的名称赋值给 potential_knockdowns_of_interest。
gene_of_interest='HCT1';: 选择一个感兴趣的基因，这里选择的是 ‘HCT1’。
knockdown_of_interest='i29';: 选择一个感兴趣的构造/靶向敲除，这里选择的是 ‘i29’。
legend_flag=1;: 设置一个标志，表示是否在绘图中显示图例，这里设置为显示。
Knockdown_Visualizations(gene_of_interest,knockdown_of_interest,Ytable,Xtable,Ypred,legend_flag): 调用函数 Knockdown_Visualizations，传递基因、构造、数据表格和预测结果等信息，生成相关的可视化图表。

yujie2023

关注

1
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
Modeling cross-regulatory influences on monolignol transcripts and proteins under single and combina

Modeling cross-regulatory influences on monolignol transcripts and proteins undersingle and combinatorial gene knockdowns in Populus trichocarpa
复制链接

扫一扫