Can Machine Learning Solve Problems in Education?


By Dave Kearney and Zach Wasilew


There are a few essential items you need to solve a problem with machine learning: a straightforward problem to solve, a bunch of data that illustrates the problem, and a human to organize the data so that machine learning can solve the problem. Oh, and a bunch of computing power. It seems straightforward, right? Not so fast! Jumping those hurdles can be challenging. Machine learning could significantly ease many difficulties in education, but not all problems fit the mold.


As an example, let’s look at how machine learning has been used to solve photo recognition. A straightforward assignment for a human would be to identify objects in a photograph. The average human would have difficulty recognizing cancer in an X-ray or CT scan, but identifying a wine glass or a dog in a photo is more manageable. So let’s stick with simple.


We can easily have humans look at thousands of pictures and tag (or label) them with the objects that appear: this picture has a dog; this picture has a cat. Data scientists know how to organize tagged photos and use machine learning algorithms to generate models that capture what the human labelers recognized. A data scientist “trains” a machine learning model to recognize items in a picture by feeding the algorithm many past examples. After training on many cases, the model becomes more adept at finding objects until, eventually, it can become faster and more accurate than the humans who taught it.
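To make the idea of training on labeled examples concrete, here is a minimal sketch in Python. It is not how Apple or Google recognize photos; it assumes each photo has already been reduced to two made-up numeric features (the names and values are hypothetical) and uses a very simple nearest-centroid classifier in place of a real image model.

```python
from collections import defaultdict
from math import dist


def train(examples):
    """Average the feature vectors of each label into one centroid per label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical hand-made features: (ear pointiness, snout length), each 0 to 1.
# In a real system these would come from the pixels, not from a person.
labeled_photos = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.3, 0.8), "dog"),
    ((0.2, 0.9), "dog"),
]

model = train(labeled_photos)
print(predict(model, (0.85, 0.25)))  # an unlabeled photo that resembles the cats
print(predict(model, (0.25, 0.70)))  # an unlabeled photo that resembles the dogs
```

The humans supply the labels, `train` summarizes them, and `predict` applies that summary to photos nobody has tagged. More labeled examples generally give the model a better summary to work from.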


Anyone who has used Apple Photos or Google Photos knows the result of this technology. Instead of spending hours and hours looking for pictures of Aunt Matilda for her birthday album, you teach an application what Aunt Matilda looks like with a few pictures, and the app finds the rest (most of the time).


Straightforward Problems

Machine learning cannot solve every problem. At least not yet. A rule of thumb is, “if you can teach an intern to do a repetitive task, you can probably help that task with machine learning.” For instance, you can train most adult humans to drive a car, obey a bunch of rules, and avoid hitting things. The fact that we are close to developing driverless cars powered by machine learning illustrates this point. Driving is a task that is on the way to a solution with machine learning.


Another example is finding new antibiotics. Researchers at MIT recently found that they could train a machine learning model on characteristic data from known antibiotics and use it to predict which compounds might work as new medicines. As a further example, hedge funds have for years used ML on past financial data to predict future prices.


The main idea is that the problem and its data need to be somewhat structured and repeatable. With enough time, a person could use the existing data to predict probable outcomes in the future.
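As a rough illustration of using structured past data to predict a probable future outcome, the sketch below fits a straight line to a handful of past observations and extends it one step. The numbers are invented for the example, and a least-squares line is only one of the simplest possible models, not the method used in any of the cases mentioned above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, perfectly regular past data: one measurement per week.
weeks = [1, 2, 3, 4, 5]
values = [10.0, 12.0, 14.0, 16.0, 18.0]

slope, intercept = fit_line(weeks, values)
forecast = slope * 6 + intercept  # predicted value for week 6
print(f"slope={slope:.2f} intercept={intercept:.2f} forecast={forecast:.2f}")
```

A person with a pencil could do the same arithmetic, which is the point of the rule of thumb: the data is structured, the pattern repeats, and the machine simply does it faster and at a larger scale.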


Large Quantities of Clean, Labeled Data

Just labeling (or tagging) a few pictures of Aunt Matilda is not enough data to inform a machine learning model. You only have to tag a few because Apple and Google already trained their models using thousands (if not millions) of photos. To prepare a model, you generally need a large amount of example data with different and distinct outcomes. Some people use a rule of thumb that more than 5,000 samples are required. However, that threshold depends on the problem you’re solving. Training an ML program to recognize blood cell anomalies, for example, might take thousands of examples, while teaching a car to drive through New York City would require millions.


For machine learning to work, the data also needs to be clean and accurately labeled. For example, let’s say I’m trying to predict the number of COVID-19 patients that will arrive in the county of San Francisco next week. The information is essential to estimate the staffing, beds, and supplies I will need across all hospitals. I have a bunch of data coming from different sources that utilize different formats; some of the tables have fields that are named differently or use words instead of numbers. In some instances, there is no consistency in the size or scope of the data collected and available.


All of the columns must contain consistent data before we can perform any kind of automated data analysis. Machine learning is lousy at taking completely unstructured data lacking uniformity and making sense out of it. That’s why you need a data scientist.
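The kind of cleanup described above can be sketched as follows. This is a hedged example, not a real hospital data pipeline: the field names, the alias table, and the sample records are all hypothetical, chosen only to show differently named fields and words used in place of numbers being brought into one consistent shape.

```python
# Hypothetical mapping from the field names each source uses to one standard name.
FIELD_ALIASES = {
    "patients": "patient_count",
    "num_patients": "patient_count",
    "patient_count": "patient_count",
    "date": "date",
    "report_date": "date",
}

# Small lookup for sources that spell numbers out as words.
NUMBER_WORDS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}


def parse_count(value):
    """Convert an int, a digit string, or a number word into an int."""
    if isinstance(value, int):
        return value
    text = str(value).strip().lower()
    if text in NUMBER_WORDS:
        return NUMBER_WORDS[text]
    return int(text)  # raises ValueError if the value cannot be understood


def clean_record(raw):
    """Rename known fields to standard names and normalize the patient count."""
    cleaned = {}
    for key, value in raw.items():
        field = FIELD_ALIASES.get(key.strip().lower())
        if field is None:
            continue  # drop fields we do not recognize
        if field == "patient_count":
            value = parse_count(value)
        cleaned[field] = value
    return cleaned


# Three made-up sources reporting the same kind of fact in three different ways.
raw_records = [
    {"Report_Date": "2020-08-03", "Patients": "seven"},
    {"date": "2020-08-04", "num_patients": " 12 "},
    {"date": "2020-08-05", "patient_count": 9, "notes": "night shift"},
]

cleaned_records = [clean_record(r) for r in raw_records]
for record in cleaned_records:
    print(record)
```

Deciding what goes into the alias table, and what to do with values that cannot be parsed, is exactly the judgment a data scientist brings; the code only applies those decisions consistently.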


What is a Data Scientist?


Structuring the data so that the machines can make sense of it is no simple task. You need someone who understands how to organize the sets of data and, in some cases, restate the problem. A data scientist, among other things, is a person with an understanding of statistics and a computer science background who can analyze, process, and model data. They can then interpret the results to create actionable plans for companies and other organizations.


The thing about data scientists today is their scarcity. Silicon Valley is vacuuming up anyone who knows anything about data science. The number of positions open for data scientists is increasing dramatically, but the number of folks graduating from college with this specialized knowledge is not keeping up with demand. Several online education companies like Coursera and Udemy are providing online certificate programs to help meet the demand. The scarcity of talent is why data scientists are fast becoming some of the highest-paid people in the industry.


Crunching Data Requires Computer Power


The amount of computing power required by AI/Machine Learning around the world is skyrocketing. We hear the phrase “data is the new oil” more frequently. Companies are finding that there is real value in “mining” their data stores. But that computing power could call for a lot of shiny new hardware to make the data scientists happy.


Luckily, you don’t need to own dedicated resources to enable Machine Learning. Major cloud providers like AWS, Google, and Microsoft will allow you to quickly rent the cloud computing you need to process the problem and keep your work discreet and confidential. You can rent the power you need for just as long as the project on which you are working requires it.


Bringing it Back to Education

There are many problems in education that machine learning can help solve, and there are problems that aren’t yet ready. Let’s consider an example that could be in reach: teaching math to kindergarteners. We know that kids optimally learn math in a variety of different modalities and at different speeds. We also know there are many different ways to teach math. Generally, however, schools and governments tend to choose a curriculum that helps a statistical majority of kids. Then they train the teachers and administrators on that specific curriculum and approach. If a child who does not learn “that way” is fortunate, the student has a teacher who modifies the curriculum specifically for them; unfortunately, that leaves behind many children.


What if, instead, we could collect, aggregate, and organize the data on how thousands of kindergarten kids learn math: how they approach the math problems, the problems they find comfortable, and the problems they find hard? Educators working with data scientists and machine learning algorithms could create a customized learning model and a plan for each child. Then, the administrators and educators need to use the ML feedback and, with fidelity, provide a personalized approach to the children who need it. Thus, together, educators and AI/Machine Learning can improve education for students.


If you start to look at education as a data problem and consider the collection of data to be a priority, you can begin to see the low-hanging fruit in our schools. Using machine learning, among many other possibilities, we could identify learning disabilities earlier, intervene where kids are falling behind, and help kids identify career paths suited to their success. In short, we can help all kids thrive in a system that today leaves so many behind. With customized plans that work for each child, no one gets left behind.


Next Installment: The Barriers to Education

Let’s examine real-world education problems impeding the least fortunate kids amongst us. How can ML and AI fuel better outcomes and support equity in education?

Dave Kearney

At Kury.us, Dave’s deep background in technology, data, and software intersects with his passion for improving the lives of children at risk. As an innovator and serial entrepreneur, he brings a fresh perspective to many of education’s most pressing issues. He is currently on the board of NapaLearns, a non-profit leader in innovation and education. Also, his many roles have included technology consultant and CTO in the for-profit education arena. Dave finds solutions that close the resource and education gap for children at risk.

Zach Wasilew

Zach has spent years in the business of education, helping educational organizations expand access to more students, improve educational outcomes, grow enrollment, and ensure that the institution thrives. He has helped build higher ed institutions, charter schools serving K-12, and early childhood education centers; a majority of the students from these institutions have been identified as at-risk. At Kury.us, Zach combines these entrepreneurial experiences with data to tailor solutions for each child individually and for the institutions collectively, so that the students and their families succeed.


Originally published at https://kury.us on August 12, 2020.


Source: https://medium.com/swlh/can-machine-learning-solve-problems-in-education-kury-us-29e096a2fa3
