In March 2020, to help data scientists working on COVID-19 diagnostic tools, TrainingData.io provided a free collaborative workspace preloaded with an open-source dataset of chest X-ray and chest CT images. We would like to thank our users for their contributions; we learned a lot from them. TrainingData.io has now launched a drag-and-drop user experience to bring the power of AI to every data scientist, hospital, and clinic around the world fighting COVID-19.
In the fight against COVID-19, medical staff have four kinds of diagnostic data available to them: a) RT-PCR test results, b) antibody test results, c) chest X-ray imaging, and d) chest CT imaging. RT-PCR and antibody test results are used as the first step in detecting COVID-19. Chest X-ray and chest CT imaging are used to observe the progression of the disease through the lungs of a patient diagnosed with COVID-19.
Lung damage in asymptomatic cases: researchers have studied the clinical patterns of asymptomatic infections and published their findings in Nature Medicine [1]. They found that asymptomatic cases of COVID-19 can show long-term lung damage. In one such study, chest CT exams of asymptomatic patients showed "striped shadows" and, in some cases, "ground-glass opacities", which are a sign of lung inflammation.
Because of the sheer size of the population affected by this virus, the ability to study lung damage in asymptomatic cases depends on readily available software tools that automatically detect and visualize COVID-19 infection in chest CT exams.
COVID-19 Ground-Glass Opacities (GGOs) & Consolidations in Chest CT Data
As the virus progresses through a COVID-19 patient's body, fluid builds up in the tiny air sacs of the lungs, called alveoli [2]. The presence of this fluid causes inflammation of the lungs, and the growth of that inflammation can be observed in X-ray and CT imaging. It shows up first as ground-glass opacities (GGOs), which are followed by consolidations.
Medical staff need criteria for deciding when to put a patient on oxygen therapy or a ventilator, and when to take a recovering patient off the ventilator. Visualizing GGO and consolidation patterns in CT imaging plays an important role in helping them make these decisions.
Segmentation of GGOs/Consolidations for COVID-19 on TrainingData.io
Starting with a chest CT dataset from patients diagnosed with COVID-19 by PCR tests, data scientists need to create two sets of segmentation masks: one for the lungs and one for the ground-glass opacities (the COVID-19 infection). TrainingData.io provides a privacy-preserving, parallel, distributed framework for outsourcing annotation work to multiple radiologists. All slices in a CT exam can be annotated accurately at the pixel level and visualized in the 3D annotation client provided by TrainingData.io.
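To make the two kinds of masks concrete, here is a minimal sketch of how a lung mask and a GGO mask might be combined once they are exported as arrays. It assumes both masks are binary NumPy volumes of the same shape (slices x rows x columns); the function name `infection_fraction` and the toy data are hypothetical and are not part of TrainingData.io's tooling.

```python
import numpy as np


def infection_fraction(lung_mask: np.ndarray, ggo_mask: np.ndarray) -> float:
    """Return the fraction of lung voxels that are also labeled GGO/consolidation."""
    lung = lung_mask.astype(bool)
    ggo = ggo_mask.astype(bool)
    if lung.shape != ggo.shape:
        raise ValueError("lung and GGO masks must have the same shape")
    lung_voxels = lung.sum()
    if lung_voxels == 0:
        return 0.0  # no lung annotated, so nothing to measure
    # Count only infection voxels that fall inside the lung mask.
    return float((lung & ggo).sum() / lung_voxels)


# Toy volume (hypothetical data): 2 slices of 4x4 pixels.
lung_mask = np.zeros((2, 4, 4), dtype=np.uint8)
lung_mask[:, 1:3, :] = 1      # 2 slices * 2 rows * 4 columns = 16 lung voxels

ggo_mask = np.zeros_like(lung_mask)
ggo_mask[0, 1, 0:2] = 1       # 2 infected voxels inside the lung
ggo_mask[1, 0, 0] = 1         # 1 voxel outside the lung, ignored

fraction = infection_fraction(lung_mask, ggo_mask)
print(f"{fraction:.3f}")      # 0.125
```

Restricting the count to voxels inside the lung mask is a design choice in this sketch: it keeps stray annotations outside the lung from inflating the measured extent of infection.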
Semi-automatic AI-assisted Segmentation of GGOs/Consolidations
TrainingData.io provides a segmentation ML model that generates masks of ground-glass opacities and consolidations for an input chest CT exam. This model can be used to seed the initial segmentation, and radiologists can then correct the automatic results.
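One way to gauge how much a radiologist had to change a model-seeded mask is to compare the seed with the corrected mask using the Dice coefficient. The sketch below is an illustration only: it assumes both masks are binary NumPy arrays of the same shape, and the names `dice_coefficient`, `seed`, and `corrected` are hypothetical rather than anything exposed by TrainingData.io.

```python
import numpy as np


def dice_coefficient(a: np.ndarray, b: np.ndarray) -> float:
    """Dice overlap between two binary masks: 2*|A and B| / (|A| + |B|)."""
    a = a.astype(bool)
    b = b.astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treat as perfect agreement
    return float(2.0 * (a & b).sum() / total)


# Hypothetical model output used to seed the annotation: a 2x2 blob (4 pixels).
seed = np.zeros((4, 4), dtype=np.uint8)
seed[1:3, 1:3] = 1

# Hypothetical radiologist correction of the seed.
corrected = seed.copy()
corrected[1, 1] = 0        # remove one false-positive pixel
corrected[3, 1:3] = 1      # add two pixels the model missed

dice = dice_coefficient(seed, corrected)
changed_pixels = int((seed != corrected).sum())
print(f"Dice: {dice:.3f}, pixels changed: {changed_pixels}")
```

A Dice value close to 1 would suggest the seed needed little correction; a low value would suggest the model is a poor fit for that exam.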
From Annotated Chest CT Dataset to ML (Segmentation) Model in a Few Clicks
Why would a regional hospital or clinic need an ML model for segmenting GGOs and consolidations that is trained on its own dataset? The answer lies in the science of viral mutation. COVID-19 is caused by a virus called SARS-CoV-2. This virus has been found to mutate quickly, with a large proportion of its genetic diversity present in all affected countries. Different mutations of the virus may affect the local population in different ways.
Once the CT dataset and the segmentation masks are ready, a data scientist has to re-train the existing machine learning model to better suit the local demographics. This can be done with a few clicks in the TrainingData.io web application.
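The idea behind re-training can be sketched as fine-tuning: start from existing parameters and continue gradient descent on local data. The toy example below is not TrainingData.io's method. It swaps in a one-feature logistic-regression pixel classifier as a stand-in for a real segmentation network, and the "pretrained" parameters, the synthetic intensities, and all names are assumptions made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(w, b, x, y):
    """Mean binary cross-entropy of the pixel classifier sigmoid(w*x + b)."""
    p = np.clip(sigmoid(w * x + b), 1e-7, 1 - 1e-7)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def fine_tune(w, b, x, y, lr=0.5, steps=200):
    """Continue gradient descent from existing parameters (w, b) on local data."""
    for _ in range(steps):
        grad = sigmoid(w * x + b) - y      # dLoss/dLogit for cross-entropy
        w = w - lr * np.mean(grad * x)
        b = b - lr * np.mean(grad)
    return w, b


# Synthetic stand-in for local data: normalized pixel intensities, where
# infected pixels (label 1) are assumed to be brighter than healthy ones.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-0.5, 0.3, 500), rng.normal(0.8, 0.3, 500)])
y = np.concatenate([np.zeros(500), np.ones(500)])

w0, b0 = 1.0, -1.0                         # hypothetical "pretrained" parameters
loss_before = bce_loss(w0, b0, x, y)
w1, b1 = fine_tune(w0, b0, x, y)
loss_after = bce_loss(w1, b1, x, y)
print(f"loss before: {loss_before:.3f}, after fine-tuning: {loss_after:.3f}")
```

Starting from the existing parameters rather than from scratch is what distinguishes fine-tuning here; with a real segmentation network the same loop would run over annotated CT slices and their masks instead of a single intensity feature.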
Note: GGOs/consolidations generated on TrainingData.io are for research purposes only and are not meant for clinical diagnosis.
[1] https://www.nature.com/articles/s41591-020-0965-6.pdf
[2] https://www.webmd.com/lung/what-does-covid-do-to-your-lungs
This post was first published on the TrainingData.io blog.
The first post in the series: https://towardsdatascience.com/covid-19-imaging-dataset-chest-xray-ct-for-annotation-collaboration-5f6e076f5f22