Managing Infodemics: Using Data Science to Slow the Spread of Misinformation

With more people than ever relying on social media to stay updated on current events, hosting companies have an ethical responsibility to defend against false information. Disinformation, a type of misinformation that is intended to manipulate and mislead, can create unrest and panic. Other types of misinformation, such as rumors and hoaxes, also have the potential to bring mental and physical harm to unwary readers if left unchecked. The key to stopping the spread of misinformation is to act swiftly, because it tends to travel very quickly. In fact, studies show that falsehoods spread far faster than the truth (source). Social media companies have put protocols in place to limit the virality of inaccurate content, but these only take effect once the content has been reviewed by third-party fact-checking partners. The focus is therefore on rapid assessment of veracity. Technology companies have shown remarkable ingenuity here, notably by using Machine Learning algorithms to complement fact-checking programs in identifying inaccurate content. However, this is not yet a complete solution. In this article, we'll study the process and explore how it might evolve.

How Misinformation is Identified

Figure: Fact-Checking Program workflow

The process of evaluating a piece of content's accuracy begins with an internal screening for potential falsehood. This screening uses Automation and Machine Learning models to pick up various signals. If the content is flagged as potential misinformation, it is routed to fact-checking partners for further review. After manual research and/or consultation with the primary source, the partner assigns a content rating. The rating tells the social media company whether action needs to be taken. The rating also serves as training data that helps the Machine Learning models become better at catching misinformation in the future. Machine Learning contributes to the process in the following ways:

  • The prediction models significantly reduce the number of reviews that third-party fact-checking partners need to perform
  • Detecting duplicate or near-duplicate content frees up capacity for fact-checking partners to review new instances of misinformation
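
To make the near-duplicate point concrete, here is a minimal sketch of one common way to catch reworded copies of content that has already been reviewed: word shingles compared with Jaccard similarity. This is an illustration only, not the method any particular platform uses. The function names, the shingle size of 3, and the 0.5 threshold are all assumptions chosen for the example.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles of a text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_near_duplicate(new_post, reviewed_posts, threshold=0.5):
    """Return the most similar already-reviewed post, or None if below the threshold.

    `threshold` is a hypothetical cut-off; a real system would tune it on labeled data.
    """
    new_sh = shingles(new_post)
    best_post, best_score = None, 0.0
    for post in reviewed_posts:
        score = jaccard(new_sh, shingles(post))
        if score > best_score:
            best_post, best_score = post, score
    return best_post if best_score >= threshold else None


if __name__ == "__main__":
    reviewed = [
        "Drinking hot water every hour cures the virus, doctors say",
        "Local council approves new bike lanes downtown",
    ]
    print(find_near_duplicate("drinking hot water every hour cures the virus doctors say!!!", reviewed))
    print(find_near_duplicate("Stock markets closed higher on Friday", reviewed))
```

A post that matches an already-rated item can inherit that rating instead of going back to a human reviewer. At platform scale, pairwise comparison like this would be too slow, so a production system would likely need an approximate index such as MinHash with locality-sensitive hashing.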

It's quite a robust process, but not one without challenges. The main challenges are:

  • The large and growing number of active users makes the platform a target for coordinated propaganda attacks, which brings urgency and a heavy workload to the fact-checking program
  • The scarcity of verified deceptive content to use as a corpus for training predictive classification models is a roadblock for Machine Learning methods. This is exacerbated by the desire for narrower categories of "truthiness": each category requires different treatment, so the available data is diluted across more classes
  • "Bad actors" who hide misleading context behind genuine content are hard to detect. For example, a meme can use text layered on top of a photo or video to form deceitful content
  • Satire may be misunderstood by people and is even more difficult for computers to interpret
Figure: Monthly Active Users continue to grow as social media becomes the dominant medium for people to get news

A Closer Look at the Screening Process

Figure: Automation and Machine Learning look for signals to screen content

In Development

Technology companies are working to improve this process by significantly expanding the databases that help them build Artificial Intelligence to combat sophisticated attacks such as "deep fakes" and "weaponized memes". The effectiveness of the algorithms and models largely depends on having a diverse data set to train on. Fortunately, with wide collaboration on data sharing across the technology community, the models are becoming better at understanding content. Nevertheless, this is a work in progress.

Recommendations

Some measures are worth exploring for immediate improvement. One recommendation that I'm exploring is to prioritize and specialize the content sent to third-party fact-checkers. We can run A/B tests that compare review turnaround time and overall virality to measure the impact of these measures.

  • Prioritization: review dangerous content that has a propensity to spread before it goes viral
  • Specialization: direct content to third-party fact-checkers within their area of expertise to cut the amount of time required for review
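
The two recommendations above can be sketched as a small routing layer in front of the fact-checkers. This is a hypothetical illustration, not an existing system. The fields `harm_score` and `shares_per_hour`, the priority formula (their product), and the `"general"` fallback queue are all assumptions made for the example.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FlaggedItem:
    item_id: str
    topic: str              # hypothetical topic label from an upstream classifier
    harm_score: float       # hypothetical estimate of potential harm, 0..1
    shares_per_hour: float  # hypothetical estimate of how fast the item is spreading


def priority(item):
    """Assumed priority: potential harm weighted by spread velocity."""
    return item.harm_score * item.shares_per_hour


class ReviewRouter:
    """Keeps one priority queue per fact-checker specialty."""

    def __init__(self, specialties):
        self.specialties = set(specialties)
        self._queues = defaultdict(list)
        self._counter = 0  # tie-breaker so heap entries never compare items directly

    def submit(self, item):
        """Specialization: route by topic, falling back to a general queue."""
        queue = item.topic if item.topic in self.specialties else "general"
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(self._queues[queue], (-priority(item), self._counter, item))
        self._counter += 1
        return queue

    def next_for(self, queue):
        """Prioritization: hand out the most urgent item in a queue, or None if empty."""
        heap = self._queues.get(queue)
        if not heap:
            return None
        return heapq.heappop(heap)[2]


if __name__ == "__main__":
    router = ReviewRouter({"health", "politics"})
    router.submit(FlaggedItem("a", "health", harm_score=0.2, shares_per_hour=500))
    router.submit(FlaggedItem("b", "health", harm_score=0.9, shares_per_hour=1000))
    router.submit(FlaggedItem("c", "sports", harm_score=0.5, shares_per_hour=10))
    print(router.next_for("health").item_id)   # the fast-spreading, high-harm item
    print(router.next_for("general").item_id)  # no sports specialist, so it went to "general"
```

In an A/B test, one group of flagged items could go through a router like this while the control group is reviewed in arrival order, with turnaround time and eventual reach compared between the two groups.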

Summary

Infodemics are a disease that plagued us long before the recent health crisis. Without proper management, they can do tremendous harm to our society. Thankfully, there are technological tools to help us mitigate those risks. We reviewed the fact-checking process and, specifically, how Machine Learning is being applied in this use case.

Translated from: https://towardsdatascience.com/managing-infodemics-slowing-the-spread-of-misinformation-b8b74e3e2618
