How Big Data Can Help Save the World

Our ability to collect data far outpaces¹ our ability to fully utilize it, yet those data may hold the key to solving some of the biggest global challenges facing us today.

outpaces [ˌaʊtˈpeɪs] 超过

utilize [ˈju:təlaɪz] 利用

我们收集数据的能力远远超过¹我们充分利用数据的能力,然而,这些数据可能正是解决当今一些最重大的全球性挑战的关键。


Take, for instance, the frequent outbreaks of waterborne illnesses as a consequence of war or natural disasters. The most recent example can be found in Yemen, where roughly 10,000 new suspected cases of cholera are reported each week, and history is riddled with¹ similar stories. What if we could better understand the environmental factors that contributed to the disease, predict which communities are at higher risk, and put in² place protective measures to stem the spread?

suspected [səs’pektɪd] 怀疑,疑似

cholera [ˈkɒlərə] 霍乱

riddle [ˈrɪdl] 谜语,筛分 riddled 充满

factor [ˈfæktə(r)] 因素

比如,战争或自然灾害常常导致水源性疾病频繁爆发。最近的例子发生在也门,那里每周新报告约一万例疑似霍乱病例,而且历史上(充满)¹类似的事例。如果我们能更好地理解导致该病的环境因素,预测哪些社区风险更高,并(采取)²保护性措施来遏制传播,将会怎么样呢?


Answers to these questions and others like them could potentially help us avert catastrophe.

avert [əˈvɜ:t] 避免

这些问题以及类似问题的答案,可能有助于我们避免灾难的发生。


We already collect data related to¹ virtually everything, from birth and death rates to crop yields and traffic flows. IBM estimates that each day, 2.5 quintillion bytes of data are generated. To put that in perspective: that’s the equivalent of all the data in the Library of Congress being produced more than 166,000 times per 24-hour period. Yet we don’t really harness the power of all this information. It’s time that changed, and thanks to recent advances in data analytics and computational services, we finally have the tools to do it.

virtually [ˈvɜ:tʃuəli] 无形中,几乎,事实上

quintillion [kwɪnˈtɪljən] (美、法)百万的三次方,(英、德)百万的五次方

perspective [pəˈspektɪv] 透镜,观点,洞察力

equivalent [ɪˈkwɪvələnt] 相当的,等价的

harness [ˈhɑ:nɪs] 控制,马具

我们已经在收集与¹几乎所有事物有关的数据,从出生率和死亡率到农作物产量和交通流量。IBM公司估计每天产生2.5×10¹⁸字节的数据。换个直观的说法:这相当于每24小时把美国国会图书馆的全部数据生成16.6万多次。然而我们并没有真正发挥所有这些信息的力量。是时候改变这种状况了,而且得益于近来数据分析和计算服务的进步,我们终于有了做到这一点的工具。
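The comparison above can be checked with a little arithmetic. The sketch below uses only the two figures quoted in the article; the size of one "Library of Congress" copy is not stated there, so the value it derives (roughly 15 TB) is merely what those two figures imply. The variable names are illustrative.

```python
# Figures quoted in the article (IBM estimate and the Library of Congress comparison).
BYTES_PER_DAY = 2.5e18            # 2.5 quintillion bytes, short scale (10**18)
LIBRARY_COPIES_PER_DAY = 166_000  # "more than 166,000 times per 24-hour period"

# Implied size of one Library of Congress copy.
# This is a derived figure, not one given in the article.
implied_library_bytes = BYTES_PER_DAY / LIBRARY_COPIES_PER_DAY
implied_library_tb = implied_library_bytes / 1e12  # decimal terabytes

print(f"Implied size of one copy: about {implied_library_tb:.1f} TB")
```

Because the article says "more than 166,000 times", the implied size is an upper bound on whatever estimate the comparison was based on.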


As a data scientist for Los Alamos National Laboratory, I study data from wide-ranging, public sources to identify patterns in hopes of¹ being able to predict trends that could be a threat to global security. Multiple data streams are critical because the ground-truth² data (such as surveys) that we collect is often delayed, biased, sparse, incorrect or, sometimes, nonexistent.

patterns ['pætənz] 方式,模式

bias [ˈbaɪəs] 偏见

sparse [spɑ:s] 稀疏的

作为洛斯阿拉莫斯国家实验室(Los Alamos National Laboratory)的数据科学家,我研究来自各种公共来源的数据,以识别模式,希望(寄希望于)¹能够预测可能对全球安全构成威胁的趋势。多个数据流是至关重要的,因为我们收集的基本事实²数据(比如调查)常常是延迟的、有偏差的、稀疏的、不正确的,有时甚至是不存在的。


For example, knowing mosquito incidence in communities would help us predict the risk of mosquito-transmitted disease such as dengue, the leading cause of illness and death in the tropics. However, mosquito data at a global (and even national) scale are not available.

mosquito [məˈski:təʊ] 蚊子

incidence [ˈɪnsɪdəns] 发生率,影响范围

tropics ['trɒpɪks] 热带地区

scale [skeɪl] 规模,测量

比如,了解社区中蚊子的出现率将有助于我们预测登革热等蚊媒疾病的风险,登革热是导致热带地区疾病和死亡的首要原因。然而,目前还没有全球(甚至全国)规模的蚊虫数据。


To address¹ this gap, we’re using other sources such as satellite imagery, climate data and demographic information to estimate dengue risk. Specifically, we had success predicting the spread of dengue in Brazil at the regional, state and municipality level using these data streams as well as clinical surveillance data and Google search queries that used terms related to the disease. While our predictions aren’t perfect, they show promise. Our goal is to combine information from each data stream to further refine our models and improve their predictive power.

demographic [ˌdemə’ɡræfɪk] 人口统计学

regional [ˈri:dʒənl] 区域性,地区的

municipality [mju:ˌnɪsɪˈpæləti] 自治市

clinical [ˈklɪnɪkl] 临床的

surveillance [sɜ:ˈveɪləns] 监督,监视,检测

query [ˈkwɪəri] 询问,查询

term [tɜ:m] 术语

refine [rɪˈfaɪn] 改善,提炼

为了弥补(解决)¹这一差距,我们正在利用卫星图像、气候数据和人口信息等其他来源来估计登革热风险。具体来说,我们成功地利用这些数据流、临床监测数据和使用与疾病有关的术语的谷歌搜索查询,预测了登革热在巴西的地区、州和市一级的蔓延。虽然我们的预测并不完美,但它们显示出了希望。我们的目标是将来自每个数据流的信息结合起来,以进一步完善我们的模型并提高它们的预测能力。


Similarly, to forecast the flu season, we have found that Wikipedia and Google searches can complement clinical data. Because the rate of people searching the internet for flu symptoms often increases during their onset, we can predict a spike in cases¹ where clinical data lags.

complement [ˈkɒmplɪment] 补充

symptom ['sɪmptəm] 征兆,症状

onset [ˈɒnset] 发作,开端

spike [spaɪk] 猛增,尖状物

同样,为了预测流感季节,我们发现维基百科和谷歌搜索可以补充临床数据。由于人们在互联网上搜索流感症状的比率往往在症状刚出现时上升,在临床数据滞后的情况下,我们可以预测病例¹的激增。
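The idea that search activity rises before clinical counts do can be illustrated with a lagged correlation. The sketch below is not the authors' method. It is a minimal illustration on made-up weekly numbers, in which the case series is constructed to trail the search series by two weeks. The function names `pearson` and `best_lead` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def best_lead(search, cases, max_lead):
    """Find how many weeks ahead search volume best matches later case counts."""
    scores = {}
    for lead in range(max_lead + 1):
        # Pair search[t] with cases[t + lead] and score the alignment.
        x = search[: len(search) - lead]
        y = cases[lead:]
        scores[lead] = pearson(x, y)
    return max(scores, key=scores.get), scores


# Synthetic weekly series, invented for illustration (not real surveillance data).
search = [5, 6, 9, 15, 24, 30, 26, 18, 11, 7, 5, 5]
# By construction, reported cases follow search volume two weeks later.
cases = [40, 41] + [8 * s for s in search[:-2]]

lead, scores = best_lead(search, cases, max_lead=4)
print(f"Search volume leads reported cases by {lead} weeks")
```

On real data the relationship would be noisier, and a strong correlation at some lead would only suggest, not prove, that searches are a useful early signal.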


We’re using these same concepts to expand our research beyond disease prediction to better understand public sentiment. In partnership with¹ the University of California, we’re conducting a three-year study using disparate data streams to understand whether opinions expressed on social media map to² opinions expressed in surveys.

sentiment [ˈsentɪmənt] 感情,观点

conduct [kənˈdʌkt] 进行,实施,传导,带领

disparate [ˈdɪspərət] 完全不同的

我们正运用同样的思路,将研究从疾病预测扩展到更好地理解公众情绪。我们正与加州大学合作¹开展一项为期三年的研究,利用不同的数据流来了解社交媒体上表达的观点是否与调查中表达的观点一致²。


For example, in Colombia, we are conducting a study to see whether social media posts about the peace process between the government and FARC, the socialist guerilla movement, can be ground-truthed with survey data. A University of California, Berkeley researcher is conducting on-the-ground surveys¹ throughout Colombia (including in isolated rural areas) to poll citizens about the peace process. Meanwhile, at Los Alamos, we’re analyzing social media data and news sources from the same areas to determine if they align with² the survey data.

socialist [ˈsəʊʃəlɪst] 社会主义的

guerilla [gə’rɪlə] 游击队

throughout [θru:ˈaʊt] 自始至终,遍及…地域,在…期间

poll [pəʊl] 民意调查

align [əˈlaɪn] 使成一线

例如,在哥伦比亚,我们正在进行一项研究,看看关于政府和FARC(社会主义游击队运动)之间和平进程的社交媒体帖子是否可以用调查数据来证实。加州大学伯克利分校的一名研究员正在哥伦比亚各地(包括偏远的农村地区)进行实地调查¹,调查公民对和平进程的看法。与此同时,在洛斯阿拉莫斯,我们正在分析来自同一地区的社交媒体数据和新闻来源,以确定它们是否与调查数据一致²。


If we can demonstrate that social media accurately captures a population’s sentiment, it could be a more affordable, accessible and timely alternative to what are otherwise expensive and logistically challenging surveys. In the case of¹ disease forecasting, if social media posts did indeed serve as² a predictive tool for outbreaks, those data could be used in educational campaigns to inform citizens of the risk of an outbreak (due to vaccine exemptions, for example) and ultimately reduce that risk by promoting protective behaviors (such as washing hands, wearing masks, remaining indoors, etc.).

demonstrate [ˈdemənstreɪt] 证明,游行

alternative [ɔ:lˈtɜ:nətɪv] 替代的,可供选择的

logistically [ləˈdʒɪstɪkli] 在后勤(组织安排)上

campaign [kæmˈpeɪn] 战役,运动

vaccine [ˈvæksi:n] 疫苗

exemption [ɪgˈzempʃn] 免除

如果我们能证明社交媒体能准确捕捉公众情绪,相较于昂贵且组织实施困难的调查而言,它就可以成为一种更实惠、更易获取、更及时的替代方法。就¹疾病预测而言,如果社交媒体帖子确实能充当²预测疾病爆发的工具,这些数据就可以用于宣传教育活动,告知公众疾病爆发的风险(例如由疫苗豁免导致的风险),并最终通过提倡保护性行为(如洗手、戴口罩、待在室内等)来降低这种风险。


All of this illustrates the potential for big data to solve big problems. Los Alamos and other national laboratories that are home to some of the world’s largest supercomputers have the computational power augmented by machine learning and data analysis to take this information and shape it into a story that tells us not only about one state or even nation, but the world as a whole. The information is there; now it’s time to use it.

illustrate [ˈɪləstreɪt] 说明

augment [ɔ:gˈment] 增强

所有这些都表明了用大数据解决大问题的潜力。洛斯阿拉莫斯和其他国家实验室拥有一些世界上最大的超级计算机,其计算能力又借助机器学习和数据分析得到增强,能够利用这些信息,把它整理成一个故事,告诉我们的不只是一个州乃至一个国家的情况,而是整个世界的情况。信息就在那里,现在是使用它的时候了。
