Classification of Optimization Algorithms in Mathematical Modelling: A Must-Know New Clustering Algorithm for Disease Modelling

This article explains SuStaIn (Subtype & Stage Inference), a new method for clustering disease data by both subtype and stage. It explains the concept, summarises the maths, and provides a link to the Python code.

Classic clustering algorithms like K-Means and Gaussian Mixture Model (GMM) are great for modelling data when we want to find cross-sectional subtypes (aka clusters). This kind of subtyping is used a lot in medicine. A well-known general example is that of subtyping diabetes into “Type I” and “Type II” using a single blood sugar measurement. This can help doctors decide whether to prescribe insulin injections or lifestyle changes.

Figure 1. Subtypes of a disease by phenotype using cross-sectional biodata; for example, a single measurement of blood sugar or a medical image. An example would be diabetes, which is subtyped into “Type I” and “Type II”. Image created by author.
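
To make the idea of cross-sectional subtyping concrete, here is a minimal sketch of K-Means on a single measurement per patient. The function name `kmeans_1d`, the initialisation scheme, and the measurement values are all made up for illustration; they are not clinical data and not part of SuStaIn.

```python
from statistics import mean

def kmeans_1d(values, k=2, iters=100):
    """Plain Lloyd-style K-Means on scalar data; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start (an assumption of this sketch): spread the
    # initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(v - centroids[j])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(mean(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # converged: assignments no longer change the centroids
        centroids = new_centroids
    return centroids, labels

# Invented single measurements (arbitrary units), one per patient.
measurements = [5.1, 5.4, 4.9, 5.6, 5.0, 9.8, 10.4, 11.1, 9.5, 10.9]
centroids, labels = kmeans_1d(measurements, k=2)
print(centroids)
print(labels)
```

Each patient ends up with one cluster label and nothing else: there is no notion of where the patient sits along a disease course, which is the limitation discussed further on.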

Grouping a disease by stage is also very useful in medicine, this time for modelling disease progression. For example, a model for grouping cancer into stages 1–4 was developed using longitudinal data (multiple measurements from the same person over time). Once developed, however, the model lets doctors determine which stage a patient is at using only a single cross-sectional measurement (e.g., tumour size in millimetres). Knowing the stage of cancer may help doctors decide whether radiotherapy or chemotherapy is needed.

Figure 2. Grouping disease by stage, using longitudinal data which measures progression over time. An example would be cancer, which is usually grouped into stages 1–4. Image created by author.
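
As a rough sketch of how a fitted staging model can be applied to one cross-sectional measurement, the snippet below maps a tumour size to a stage using fixed cut-offs. The cut-off values and the names `STAGE_CUTOFFS_MM` and `stage_from_size` are hypothetical, invented purely for illustration; they are not clinical staging criteria.

```python
from bisect import bisect_right

# Hypothetical cut-offs in millimetres (invented for this sketch only).
# In a real staging model these would come from fitting longitudinal data.
STAGE_CUTOFFS_MM = [20.0, 50.0, 70.0]

def stage_from_size(size_mm):
    """Return a stage in 1..4 from a single tumour-size measurement."""
    # bisect_right counts how many cut-offs the measurement has reached.
    return bisect_right(STAGE_CUTOFFS_MM, size_mm) + 1

for size in (12.0, 20.0, 55.0, 80.0):
    print(size, "mm -> stage", stage_from_size(size))
```

Note that every patient is placed on the same single scale here, whatever form of the disease they have.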

The downside of these kinds of staging models is that they assume all patients have the same type of the disease. That is, they account for disease progression but not for disease subtypes.

Conversely, the cross-sectional subtyping described earlier explains subtypes but not progression. That is, it assumes all patients are at the same stage.

So what if we want to do both, i.e. find subtypes of a disease based on how it progresses over time, and create that model using only cross-sectional data?

Introducing the Z-Score SuStaIn (Subtype & Stage Inference) Algorithm

Z-Score SuStaIn is an unsupervised machine-learning technique that identifies population subgroups (clusters) with distinct patterns of disease progression based on biomarkers. This is shown abstractly in Figure 3 below.

Figure 3. The circles represent groups of biomarkers. The colours (green, orange, blue) represent different subtypes, i.e., different progression patterns. Note: the subtypes produced in SuStaIn are not based on clinically defined phenotypic disease subtypes (such as Type I and Type II diabetes), but rather on how they each progress over time. Image created by author.
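
The "Z-Score" in the name suggests that biomarkers are expressed as z-scores. A common convention, assumed here rather than taken from this article, is to standardise each biomarker against a control group, so that a value reads as "how many control standard deviations from normal". The helper `z_scores` and the numbers below are hypothetical and only illustrate that preprocessing step, not the SuStaIn algorithm itself.

```python
from statistics import mean, stdev

def z_scores(patient_values, control_values):
    """Standardise patient measurements against a control group."""
    mu = mean(control_values)      # control-group mean
    sigma = stdev(control_values)  # control-group sample standard deviation
    return [(v - mu) / sigma for v in patient_values]

# Invented values for one biomarker (arbitrary units).
controls = [98.0, 102.0, 100.0, 101.0, 99.0]
patients = [100.0, 104.0, 96.0]
print(z_scores(patients, controls))
```

A z-score of 0 means the patient looks like a typical control on that biomarker; larger magnitudes mean a larger departure from the control range.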

Biomarker: “Any substance, structure or process that can be measured in the body or its products and can influence or predict the incidence of outcome or disease” (WHO, 2011). This includes everything from blood sugar measurements and heart rate to
