R iterativeBMAsurv package: crossVal() function help documentation
http://www.biostatistic.net/thread-12526-1-1.html
(Source: 生物统计家园)
crossVal(iterativeBMAsurv)
R package containing crossVal(): iterativeBMAsurv

Cross Validation for Iterative Bayesian Model Averaging

----------Description----------

This function performs k runs of n-fold cross validation on a training dataset for survival analysis on microarray data, where k and n are specified by the user.

----------Usage----------

crossVal(exset, survTime, censor, diseaseType="cancer", nbest=10, maxNvar=25, p=100, cutPoint=50, verbose=FALSE, noFolds=10, noRuns=10)

----------Arguments----------

Argument: exset
Data matrix for the training set, where columns are variables and rows are observations. For gene expression data, the columns (variables) represent genes and the rows (observations) represent samples. The data is not assumed to be pre-sorted by rank.

Argument: survTime
Vector of survival times for the patient samples in the training set. Survival times are assumed to be in a uniform unit (e.g., months or days), and the length of this vector should equal the number of rows in exset.

Argument: censor
Vector of censor data for the patient samples in the training set. In general, 0 = censored and 1 = uncensored. The length of this vector should equal the number of rows in exset and the number of elements in survTime.

Argument: diseaseType
String denoting the type of disease in the training dataset (used for writing to file). The default is "cancer".

Argument: nbest
A number specifying the number of models of each size returned to bic.surv in the BMA package. The default is 10.

Argument: maxNvar
A number indicating the maximum number of variables used in each iteration of bic.surv from the BMA package. The default is 25.

Argument: p
A number indicating the maximum number of top univariate genes used in the iterative bic.surv algorithm. This number is assumed to be less than the total number of genes in the training data. A larger p usually requires more computation time, because more iterations of the bic.surv algorithm may be applied. The default is 100.

Argument: cutPoint
Threshold percent for separating high-risk from low-risk groups. The default is 50.

Argument: verbose
A boolean variable indicating whether or not to print interim information to the console. The default is FALSE.

Argument: noFolds
A number specifying the desired number of folds in each cross validation run. The default is 10.

Argument: noRuns
A number specifying the desired number of cross validation runs. The default is 10.
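
The length requirements stated for exset, survTime, and censor can be checked before starting a long cross validation job. The following is a minimal sketch in base R; check_cv_inputs is a hypothetical helper and the data are synthetic stand-ins, neither is part of iterativeBMAsurv.

```r
# Hypothetical helper (not part of iterativeBMAsurv): checks that the three
# training inputs line up the way the argument descriptions require.
check_cv_inputs <- function(exset, survTime, censor) {
  stopifnot(is.matrix(exset) || is.data.frame(exset))
  n <- nrow(exset)
  if (length(survTime) != n) stop("survTime must have one entry per row of exset")
  if (length(censor) != n) stop("censor must have one entry per row of exset")
  if (!all(censor %in% c(0, 1))) stop("censor must be coded 0 (censored) or 1 (uncensored)")
  invisible(TRUE)
}

# Synthetic stand-in data: 6 samples (rows) by 4 genes (columns).
set.seed(1)
exset <- matrix(rnorm(24), nrow = 6, dimnames = list(NULL, paste0("gene", 1:4)))
survTime <- c(12, 30, 7, 45, 22, 16)   # all in the same unit, e.g. months
censor <- c(1, 0, 1, 1, 0, 1)          # 0 = censored, 1 = uncensored

ok <- check_cv_inputs(exset, survTime, censor)
```

A mismatch (for example, a survTime vector that is one element short) stops with an error message instead of failing later inside a fold.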

----------Details----------

This function performs k runs of n-fold cross validation, where k and n are specified by the user through the noRuns and noFolds arguments respectively. For each run of cross validation, the training set, survival times, and censor data are re-ordered according to a random permutation. For each fold of cross validation, 1/n of the data is set aside to act as the validation set. In each fold, the iterateBMAsurv.train.predict.assess function is called to carry out a complete run of survival analysis. This means the univariate ranking measure for this cross validation function is Cox Proportional Hazards Regression; see iterateBMAsurv.train.wrapper to experiment with alternate univariate ranking methods. With each run of cross validation, the survival analysis statistics are saved.
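
The resampling scheme described above (a fresh random permutation per run, then 1/n of the samples held out per fold) can be sketched in base R as follows. This is only an illustration of the index bookkeeping; make_cv_folds is a hypothetical helper, no model is fitted, and the package's own implementation may partition the samples differently.

```r
# Sketch of the resampling scheme only; no model is fitted here.
make_cv_folds <- function(nSamples, noFolds = 10, noRuns = 10) {
  lapply(seq_len(noRuns), function(run) {
    perm <- sample(nSamples)                              # fresh random permutation per run
    foldId <- rep(seq_len(noFolds), length.out = nSamples)
    split(perm, foldId)                                   # validation-row indices, one element per fold
  })
}

set.seed(42)
nSamples <- 20
folds <- make_cv_folds(nSamples, noFolds = 2, noRuns = 1)

for (run in seq_along(folds)) {
  for (fold in seq_along(folds[[run]])) {
    testIdx <- folds[[run]][[fold]]
    trainIdx <- setdiff(seq_len(nSamples), testIdx)
    # At this point the package would train on trainIdx and assess on testIdx.
    cat("run", run, "fold", fold, ":",
        length(trainIdx), "train /", length(testIdx), "test\n")
  }
}
```

Every sample appears in exactly one validation fold per run, so each run uses all of the training data for assessment once.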

----------Value----------

The output of this function is a series of files giving information on cross validation results. The file beginning with 'foldresults' contains information for every fold in the form of a 2 x 2 table indicating the number of test samples in each category (high-risk and censored, high-risk and uncensored, low-risk and censored, low-risk and uncensored). This file also gives the accumulated percentage-uncensored statistic from each run. The file beginning with 'runresults' gives the total number of test samples assigned to each category, along with the percentage uncensored across the entire run. The end of this file contains the same information, averaged across all runs. The file beginning with 'stats' gives the statistics from each fold, including the p-value, chi-square statistic, and variance matrix. Finally, the file beginning with 'avg_p_value_chi_square' gives the overall means and standard deviations of the p-values and chi-square statistics across all runs and all folds.
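
To make the 2 x 2 table concrete, the sketch below builds one from made-up risk scores and censor statuses for a single fold. Several things here are assumptions for illustration: assign_risk_group is a hypothetical helper, the rule of comparing each score with the cutPoint-th percentile of the same scores is only one plausible reading of the cutPoint argument, and "percentage uncensored" is computed within each risk group. The layout of the actual output files is not reproduced.

```r
# Hypothetical helper: label samples high- or low-risk by comparing each risk
# score with the cutPoint-th percentile of the scores (an assumed rule).
assign_risk_group <- function(scores, cutPoint = 50) {
  threshold <- quantile(scores, probs = cutPoint / 100, names = FALSE)
  factor(ifelse(scores > threshold, "high-risk", "low-risk"),
         levels = c("high-risk", "low-risk"))
}

# Made-up risk scores and censor status for 8 test samples in one fold.
riskScore <- c(2.1, 1.4, -0.3, 0.1, 1.9, -1.2, -0.8, 0.7)
censor <- factor(c(1, 1, 0, 1, 0, 0, 0, 1), levels = c(0, 1),
                 labels = c("censored", "uncensored"))

riskGroup <- assign_risk_group(riskScore, cutPoint = 50)
foldTable <- table(riskGroup, censor)   # 2 x 2 counts for this fold
print(foldTable)

# Percentage of uncensored samples within each risk group.
pctUncensored <- 100 * prop.table(foldTable, margin = 1)[, "uncensored"]
print(round(pctUncensored, 1))
```

A useful predictor should place a clearly larger share of uncensored samples (observed events) in the high-risk row than in the low-risk row.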

----------Note----------

The BMA package is required. Also, smaller training sets may lead to cross validation folds where all test samples are assigned to one risk group, or where all samples fall in the same censor category (all censored or all uncensored). In this case, the fold is skipped, and cross validation proceeds from the next fold. This particular error is evidenced by a missing fold result in the output files. All averages are calculated as if this fold had never occurred.
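
The skip condition in this note amounts to checking that a fold's test samples span both risk groups and both censor categories. A minimal sketch, with is_degenerate_fold as a hypothetical helper that is not part of the package:

```r
# Hypothetical check mirroring the condition described in the Note:
# a fold is unusable if only one risk group or one censor category is present.
is_degenerate_fold <- function(riskGroup, censor) {
  length(unique(riskGroup)) < 2 || length(unique(censor)) < 2
}

is_degenerate_fold(c("high", "high", "high"), c(1, 0, 1))   # only one risk group
is_degenerate_fold(c("high", "low", "high"), c(1, 1, 1))    # all samples uncensored
is_degenerate_fold(c("high", "low", "high"), c(1, 0, 1))    # both groups and both categories present
```

Using more samples per fold (a smaller noFolds) makes such folds less likely on small training sets.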

----------References----------

Iterative Bayesian Model Averaging for Survival Analysis. Manuscript in Progress.
Bayesian model selection in social research (with Discussion). Sociological Methodology 1995 (Peter V. Marsden, ed.), pp. 111-196, Cambridge, Mass.: Blackwells.
Bayesian Model Averaging in Proportional Hazard Models: Assessing the Risk of a Stroke. Applied Statistics 46: 433-448.
Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data. Bioinformatics 21: 2394-2402.

----------See Also----------

iterateBMAsurv.train.predict.assess, iterateBMAsurv.train.wrapper, iterateBMAsurv.train, singleGeneCoxph, predictBicSurv, predictiveAssessCategory, trainData, trainSurv, trainCens

----------Examples----------

library(BMA)
library(iterativeBMAsurv)
data(trainData)
data(trainSurv)
data(trainCens)

## Perform 1 run of 2-fold cross validation on the training set, using p=10 genes and nbest=5 for fast computation

cv <- crossVal(exset=trainData, survTime=trainSurv, censor=trainCens, diseaseType="DLBCL", noRuns=1, noFolds=2, p=10, nbest=5)

## Upon completion of this function, all relevant output files will be in the working R directory.
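
After a run, the output files can be located by the name prefixes given in the Value section. The sketch below uses only base R; find_cv_outputs is a hypothetical helper, and nothing is assumed about the rest of each file name or its extension.

```r
# Hypothetical helper: list files whose names start with the prefixes named
# in the Value section. Suffixes and extensions are not assumed.
find_cv_outputs <- function(dir = getwd()) {
  prefixes <- c("foldresults", "runresults", "stats", "avg_p_value_chi_square")
  pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
  list.files(dir, pattern = pattern)
}

# Lists matching files in the current working directory (may be empty).
find_cv_outputs()
```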
