《Toward Multi-Modal Approach for Identification and Detection of Cyberbullying in Social Networks》

Index of the multi-modal paper-reading series



Article link

1.MEANING OF THE PAPER TITLE

Multi-modal identification and detection of cyberbullying in social networks

2.ABSTRACT

Given the widespread use of social networks in people’s everyday lives, cyberbullying has emerged as a major threat, especially affecting younger users on these platforms. This matter has generated significant societal apprehensions. Prior studies have primarily concentrated on analyzing text in relation to cyberbullying. However, the dynamic nature of cyberbullying covers many goals, communication platforms, and manifestations. Conventional text analysis approaches are not effective in dealing with the wide range of bullying data seen in social networks. In order to tackle this difficulty, our suggested multi-modal detection approach integrates data from diverse sources including photos, videos, comments, and temporal information from social networks. In addition to textual data, our approach employs hierarchical attention networks to record session features and encode various media information. The resulting multi-modal cyberbullying detection platform provides a comprehensive approach to address this emerging kind of cyberbullying. By conducting experimental analysis on two actual datasets, our framework exhibits greater performance in comparison to many state-of-the-art models. This highlights its effectiveness in dealing with the intricate nature of cyberbullying in social networks.

3.INDEX TERMS

Cyberbullying, multi-modality, social media, hierarchy attention.

4.INTRODUCTION

  1. Young people mostly utilize social networking as their main platform for social engagement. However, the widespread prevalence of cyberbullying on digital platforms presents a significant danger to the welfare of young individuals [1]. Over 40% of American teenagers have experienced cyberbullying on social media, according to data from the White House and the American Psychological Association [2]. Recent British research has highlighted the extent of the problem, revealing that cyberbullying is more common than bullying in real-life situations; specifically, 12% of students reported experiencing cyberbullying. The prevalence of social cyberbullying occurrences is increasing annually, transforming it into a complex dilemma that involves various targets, channels, and manifestations. This rising problem has caused significant adverse effects on the physical and mental well-being of victims, leading some to contemplate suicide [3]. In addition to being a distressing experience for individuals, it has evolved into a significant public health issue, prompting increased research efforts in psychology and computer science. This study aims to understand the attributes of cyberbullying and, ultimately, to find efficient approaches to detect and tackle instances of bullying on social networks.
  2. In the domain of automated cyberbullying identification, where there is a high occurrence of harmful verbal attacks, current efforts mostly focus on examining textual characteristics. Several methods have been created to classify text and identify instances of cyberbullying. Cyberbullying is defined as when people or groups repeatedly post offensive or violent content on social media using embedded devices with the intention of hurting or upsetting other people. However, depending exclusively on the examination of textual characteristics, one faces difficulties in determining if the content is aimed at certain persons or groups without contextual information. Moreover, the existence of unsuitable visual information within conventional text-based material presents a potential hazard on social media platforms. Hence, it is crucial to highlight the significance of essential information contained in diverse types of social media, such as photographs, videos, comments, and social networks.
  3. Present efforts centered around multi-modal information tend to give priority to particular modalities. Comments are commonly perceived as concise exchanges about a specific topic. The study [4] utilized contextual information to improve the understanding of the whole context and the determination of conduct; nevertheless, it failed to consider the interplay between individual comments while striving to comprehend the correlation between them. Soni [5] proposed an alternative method that incorporated visual attributes to overcome the constraints of textual elements. Although these methods demonstrate improved performance compared to text analysis alone, they do not overcome the limitations associated with single-mode information. Additionally, cyberbullying displays important traits such as persistence and the gradual recurrence of hostile acts [2]. A further difficulty in cyberbullying detection is therefore to identify multi-modal bullying material quickly enough to interrupt the conversation and avert secondary harm.
  4. In response to these evolving forms of cyberbullying, we reframe the task as determining whether a post relates to a bullying topic by using textual, visual, and additional meta-information. Our novel Multi-Modal Cyberbullying Detection (MMCD) framework is introduced to address the aforementioned issues. To consistently recognize the various instances of cyberbullying on social networks, this framework combines textual, visual, and other meta-information. In particular, we assume that offensive comments tend to accompany posts involved in cyberbullying. Utilizing Hierarchical Attention Networks (HAN) [6], we evaluate the significance of each comment by modeling it and then encode visual and other meta-data. To improve cyberbullying detection performance, these traits and the textual content are combined. Among this work's principal contributions are:
    • We provide an innovative view of the complete problem-solving process, based on the joint combination and processing of multi-modal data, to successfully handle the various types of cyberbullying.
    • We developed an original multi-modal framework for cyberbullying detection. This framework independently models textual, visual, and other information. It incorporates a self-attention-based BiLSTM model, a HAN model operating at the word and comment levels, and additional embedding techniques. The integration of these components enables efficient information merging, contributing to the effective resolution of the complex issue of cyberbullying.
    • We acquired multi-modal data from prominent social media websites, namely Twitter and Facebook, to validate the efficacy of our approach. Additionally, we conduct a thorough investigation into the impact of multi-modal data on cyberbullying.

5.LITERATURE REVIEW

A significant amount of earlier work on cyberbullying detection has focused on text feature analysis as a way to identify bullying behaviors. Emotional analysis and text classification are common methods that employ N-gram models, BoW, and TF-IDF [7], [8], [9]. Classification methods such as Random Forest, Support Vector Machine (SVM), Logistic Regression, and Naive Bayes have proven effective in handling these features [8], [10], [11], [12]. Specifically, Chavan and Shylaja [13] detected bullying behavior by extracting variables such as TF-IDF, BoW, and a bullying lexicon using SVM and logistic regression. Aside from text features, network attributes have also been studied [14]. Academics have studied a variety of social network metrics, including the volume of tweets, geographic distribution, and the strength of users' social networks. Using the nature of social networks, Chelmis et al. [15] developed a system to detect cases of bullying. Algaradi et al. developed a detection model that effectively combines metrics related to networks, user behavior, and tweet content. To improve the overall performance of their system, Cheng et al. [11] designed a complex heterogeneous network that included metadata such as user profiles, photos, videos, time, location, and comments. With the post-vector representations learned by network embedding, they then employed SVM and Random Forest classification techniques. Together, these models offer a thorough strategy for identifying cyberbullying and help address the problem of weak text features.
As an end-to-end approach, deep learning also shows improved text representation capabilities [16]. Text classification has come a long way thanks to convolutional neural networks (CNN) [17], recurrent neural networks (RNN) [18], and hybrid architectures like RCNN, which combines CNN and RNN [19]. Misspelling is a frequent occurrence on social media and is often used in bullying texts to avoid discovery. Park and colleagues [20] tackled this problem by implementing a hybrid model that smoothly connected character- and word-level convolutional neural networks for efficient categorization. Zhang et al. [21] proposed a new way to encode text that combined convolution layers with a Gated Recurrent Unit (GRU) to include both structural and sequence information.
Cyberbullying detection systems now use attention techniques to highlight key terms. Zhang et al. [23] presented an attention-mechanism-based bidirectional RNN (BiRNN) model [22] to detect bullying text. Word weights were adjusted using the attention mechanism in this model, which also integrated contextual data via the BiRNN. Deep models similar to these methods have relied heavily on specific meta-information. By merging latent representations from text and meta-information, Founta et al. [24] developed a hybrid model. Yafooz et al. [25] employed transfer learning approaches to detect and classify cyberbullying against kids on social media. They used two Arabic datasets collected from YouTube videos and applied several pretrained models; the best accuracy was recorded with the AraBERT model, reaching 95% and 96% on the two datasets. Similarly, Alhejaili et al. [26] detected hate speech using machine learning classifiers.
Owing to social media's diversity, some researchers have broadened their textual analysis by integrating visual data. Soni et al. [5] made an effort to derive visual cues to fill in the gaps left by the absence of text. Li et al. [4] examined the semantics of child comments on related issues to gain a richer comprehension of context by utilizing the parent-child links between comments. Also, the study [27], which built a time-dependent hierarchical attention network to gather comment features, shows that document classification techniques have been applied to comments. These techniques highlight the advantages of multi-modal data for cyberbullying detection. Current research developments in cyberbullying detection include feature fusion and feature extraction from multimodal datasets.

6.PROBLEM FORMULATION

A corpus of N social media posts is denoted by P = {P1, P2, P3, ..., PN}. Each post Pi consists of textual content Ti, a set of comments Ci, a media object Mi (such as an image or a video), and additional information Oi (such as timestamps, user profiles, likes, and shares). A post is therefore represented as Pi = {Ti, Ci, Mi, Oi}.
Within each post, the number of comments is given by the size of the set Ci; the first comment in Ci is denoted C(1)i and its length by l(1)i. Each post Pi is assigned a binary label Yi ∈ {0, 1}, where 1 indicates bullying behavior and 0 the opposite. This notation allows us to distinguish and examine the individual comments within each post, providing insight into their structure and content, and in our cyberbullying detection framework it serves as the reference for measuring and analyzing the length characteristics of the first comment in a session.
We present a function F that learns bullying behavior by taking into account the context, comments, media, and other pertinent data in order to formalize the cyberbullying detection process. This can be stated as follows:

$$Y_i = F(T_i, C_i, M_i, O_i)$$
In this context, the binary label denoted as Yi signifies whether harassing behavior was present (1) or absent (0) in the given post Pi. The objective of the function F is to identify the correlations that exist between the different components of the post and the likelihood of cyberbullying.
The function F is subjected to training, validation, and evaluation to acquire knowledge and identify instances of bullying behavior. During the training process, labeled data instances are utilized to optimize the model parameters using techniques such as gradient descent. Afterward, the function is verified on distinct data to ensure its applicability to various scenarios and optimize the hyperparameters. Ultimately, the system's performance is assessed by measuring metrics such as accuracy and precision on a separate test dataset. This process guarantees that F accurately detects harassing behavior in various posts and situations, offering valuable information about its sensitivity and performance.
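As a concrete illustration of this formulation, the minimal Python sketch below represents a post Pi = {Ti, Ci, Mi, Oi} together with its binary label Yi and the interface a learned detector F would expose. The names and fields here are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Post:
    """One social-media post Pi = {Ti, Ci, Mi, Oi} with its binary label Yi."""
    text: str                                   # Ti: textual content of the post
    comments: List[str]                         # Ci: the session's comments
    media_tag: str                              # Mi: media descriptor, e.g. "portrait" or "scenery"
    meta: Dict[str, str] = field(default_factory=dict)  # Oi: timestamp, user profile, likes, shares
    label: int = 0                              # Yi: 1 = bullying, 0 = non-bullying

# F maps a post to a predicted probability of bullying; training fits its parameters.
DetectorF = Callable[[Post], float]

def evaluate_accuracy(f: DetectorF, posts: List[Post], threshold: float = 0.5) -> float:
    """Accuracy of F on a labeled test set; precision and F1 are computed analogously."""
    correct = sum(int((f(p) >= threshold) == bool(p.label)) for p in posts)
    return correct / len(posts)
```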

7.MULTI-MODAL APPROACH FOR CYBERBULLYING DETECTION

  1. In this section, we present the MMCD architecture in detail. The model operates through two separate processes, encoding and decoding. During the encoding phase, several components are used to encode the different types of data: a BiLSTM-based topic-oriented encoder, a comment-based hierarchical attention mechanism, a media embedding layer, and additional meta-information embedding layers. In addition, the comments encoder takes into account the sequential structure of the collection of comments.

  2. The decoding procedure utilizes a multilayer perceptron (MLP) to train on the multi-modal data separately, after which the multi-modal data is integrated for comprehensive training. Figure 1 depicts the schematic representation of the proposed Multi-Modal Cyberbullying Detection framework.
    [Figure 1: Schematic of the proposed Multi-Modal Cyberbullying Detection (MMCD) framework]

  3. The MMCD architecture incorporates multiple modalities, such as text, comments, media, and metadata, which are individually processed and combined to improve cyberbullying detection. The textual content is encoded using BiLSTM networks, effectively capturing both forward and backward sequential dependencies in sentences. Attention mechanisms are utilized to assess the importance of individual words, highlighting relevant information during the encoding process. Comments are processed through a hierarchical attention mechanism based on document classification: bidirectional GRU models sequentially encode words, utilizing attention mechanisms to emphasize significant words in comments. The media data is encoded using one-hot encoding and then passed through a multi-layer perceptron to extract features efficiently. Simultaneously, metadata encompassing timestamps and user profiles is incorporated to furnish supplementary context. Attention mechanisms play a vital role across the modalities, allowing the model to concentrate on relevant information during encoding. In the decoding phase, the encoded representations from the various modalities are combined, and fully connected units adjust the weights of the vectors from each modality. The MMCD architecture thus analyzes and integrates the various modalities comprehensively, improving its ability to detect cyberbullying.

A. MULTI-MODAL ENCODER

BiLSTM networks play a key role in natural language processing, particularly in tasks that analyze sentence sequences. The Long Short-Term Memory (LSTM) unit, an improvement over the recurrent neural network (RNN), introduces three basic gating components: the input gate $i_t$, the forget gate $f_t$, and the output gate $o_t$. The states of these gates depend on the previous state $h_{t-1}$, where $h_t$ denotes the state at time $t$. Equation 1 computes the input gate by applying the sigmoid function $\sigma$ to the sum of the product of the input vector $x_t$ with the weight matrix $W_{xi}$, the product of $h_{t-1}$ with $W_{hi}$, and the bias vector $b_i$:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{1}$$

Equation 2 computes $f_t$ by applying the sigmoid function to the sum of $x_t$ multiplied by $W_{xf}$, $h_{t-1}$ multiplied by $W_{hf}$, and $b_f$:

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{2}$$

Equation 3 computes $o_t$ by applying the sigmoid function to the sum of $x_t$ multiplied by $W_{xo}$, $h_{t-1}$ multiplied by $W_{ho}$, and $b_o$:

$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{3}$$

Here $x_t$ is the word-embedding vector at time $t$; $W_{xi}$, $W_{xf}$, $W_{xo}$, $W_{hi}$, $W_{hf}$, $W_{ho}$ are the corresponding weight matrices, and $b_i$, $b_f$, $b_o$ are the biases.

The hidden layer $h_t$ is determined by combining the candidate memory $\tilde{c}_t$ with the current cell state $c_t$. The candidate memory is obtained by applying the hyperbolic tangent to the sum of the products of $x_t$ with $W_{xc}$ and of $h_{t-1}$ with $W_{hc}$, along with the bias $b_c$ (the matrices $W_{xc}$ and $W_{hc}$ in Equation 6 denote weights, and $b_c$ denotes the bias term):

$$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{6}$$

Equation 4 gives the current cell state $c_t$ as the element-wise product of the forget gate $f_t$ with the previous cell state $c_{t-1}$, added to the element-wise product of the input gate $i_t$ with the candidate state $\tilde{c}_t$:

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{4}$$

Equation 5 expresses the hidden state $h_t$ as the element-wise product of the output gate $o_t$ with the hyperbolic tangent of the cell state $c_t$:

$$h_t = o_t \odot \tanh(c_t) \tag{5}$$

We employ a BiLSTM to encode the textual content, capturing sentence properties from both the forward and the backward direction. The forward hidden state of the $i$-th word, denoted $\overrightarrow{h}_i$, is computed by applying the LSTM to the embedding vector $x_i$ for each word $i$ in the sentence, with $i$ ranging from 1 to $n$ (Equation 7); Equation 8 gives the backward hidden states $\overleftarrow{h}_i$ for $i$ ranging from $n$ to 1:

$$\overrightarrow{h}_i = \overrightarrow{\mathrm{LSTM}}(x_i), \quad i = 1, \dots, n \tag{7}$$

$$\overleftarrow{h}_i = \overleftarrow{\mathrm{LSTM}}(x_i), \quad i = n, \dots, 1 \tag{8}$$

In addition, a self-attention mechanism is applied at the word level to improve the detection of negative phrases. The details of the bidirectional LSTM used for sentence modeling are given in Algorithm 1.

[Algorithm 1: Bidirectional LSTM for sentence modeling]
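For readers who prefer code to notation, the sketch below shows one way such a BiLSTM encoder with word-level self-attention could be written in Keras (the library listed in the implementation environment). It is a minimal illustration, not the authors' exact architecture; the layer sizes, the scoring layer used for attention, and all names are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_text_encoder(vocab_size=20000, embed_dim=300, max_len=100, lstm_units=128):
    tokens = layers.Input(shape=(max_len,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(tokens)
    # forward and backward LSTM passes (Equations 7 and 8), concatenated per word
    h = layers.Bidirectional(layers.LSTM(lstm_units, return_sequences=True))(x)
    # word-level self-attention: score each hidden state, normalize over the sentence,
    # and pool the weighted hidden states into a single post representation
    scores = layers.Dense(1, activation="tanh")(h)
    alpha = layers.Softmax(axis=1)(scores)
    context = layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, alpha])
    return Model(tokens, context, name="topic_text_encoder")

# encoder = build_text_encoder()
# encoder.summary()
```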

B. EMBEDDING OF COMMENTS USING HIERARCHICAL ATTENTION NETWORKS

The method for analyzing and incorporating comments into the cyberbullying detection model entails several essential steps. Initially, comments undergo preprocessing, including tokenization and embedding techniques, to standardize and enhance their representation. Each word within a comment is embedded using an embedding matrix, and bidirectional GRU models are employed to encode the words sequentially, resulting in hidden vectors representing the comments at the word level. Subsequently, a hierarchical attention mechanism is applied to capture the contextual significance of words within comments. This mechanism evaluates the importance of each word in the context of the entire comment, enabling the model to focus on salient words while encoding comment-level information. Additionally, an MLP with a hidden layer further refines the representation of the hidden vectors obtained from the attention mechanism, enhancing the model's ability to capture intricate relationships between words and generate a comprehensive representation of the comments. The hierarchical attention encoding process is formalized through an algorithmic approach, delineating the sequential steps involved in embedding and encoding comments using the hierarchical attention mechanism. Through these steps, our method systematically analyzes and integrates comments into the cyberbullying detection model, facilitating effective capture of the nuanced structure and content of social media comments.
Our comment encoder builds on the Hierarchical Attention Network (HAN) approach originally developed for document classification. This work highlights the significance of the attention mechanism at both the word level and the comment level, which enables the HAN model to prioritize significant content throughout the encoding of the document. We apply word-level attention within each comment and subsequently apply attention at the comment level, processing the comments with a hierarchical attention architecture. A Bidirectional Gated Recurrent Unit (GRU) encodes information at both the word level and the comment level. The GRU, like the LSTM, is a form of RNN; it consists of two gates, the update gate and the reset gate. Let C be a set of comments, where L is the total number of comments, and let ci be the i-th comment in C, consisting of Li words denoted wit, where t ranges from 1 to Li. To represent these words, we use an embedding matrix We and compute the embedding xit = We wit. Next, we feed the words of comment i into the bidirectional GRU for encoding:
$$\overrightarrow{h}_{it} = \overrightarrow{\mathrm{GRU}}(x_{it}), \quad t = 1, \dots, L_i$$

$$\overleftarrow{h}_{it} = \overleftarrow{\mathrm{GRU}}(x_{it}), \quad t = L_i, \dots, 1$$
where the forward hidden states for the words $w_{i1}$ through $w_{iL_i}$ are denoted $\overrightarrow{h}_{it}$ and the backward hidden states $\overleftarrow{h}_{it}$. We combine them by concatenating the hidden vectors from both directions: $h_{it} = [\overrightarrow{h}_{it}; \overleftarrow{h}_{it}]$. Because each word influences a comment differently, the model uses an attention mechanism to re-weight these word vectors so that the comment representation is generated mainly from the important words. More precisely, we employ a multilayer perceptron with one hidden layer to extract a higher-level hidden representation:
$$u_{it} = \tanh(W_w h_{it} + b_w)$$

where $W_w$ denotes the word-level weight matrix and $b_w$ the word-level bias. The similarity between $u_{it}$ and a word-level context vector $u_w$ is then quantified, and the resulting weights are normalized:

$$\alpha_{it} = \frac{\exp(u_{it}^{\top} u_w)}{\sum_{t} \exp(u_{it}^{\top} u_w)}$$

The details of the hierarchical attention encoding are given in Algorithm 2.
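A hedged Keras sketch of this two-level scheme is given below: a word-level bidirectional GRU with attention turns each comment into a vector, and a comment-level bidirectional GRU with attention turns the comment sequence into a session vector. Layer sizes and names are illustrative assumptions, not the paper's exact settings.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def attention_pool(h, units):
    # score every time step, softmax-normalize over time, and return the weighted sum
    u = layers.Dense(units, activation="tanh")(h)
    scores = layers.Dense(1)(u)
    alpha = layers.Softmax(axis=1)(scores)
    return layers.Lambda(lambda t: tf.reduce_sum(t[0] * t[1], axis=1))([h, alpha])

def build_comment_encoder(vocab_size=20000, embed_dim=300,
                          max_comments=20, max_words=30, gru_units=64):
    # word level: bidirectional GRU plus attention turns one comment into a vector
    words_in = layers.Input(shape=(max_words,), dtype="int32")
    x = layers.Embedding(vocab_size, embed_dim)(words_in)
    h_words = layers.Bidirectional(layers.GRU(gru_units, return_sequences=True))(x)
    word_encoder = Model(words_in, attention_pool(h_words, gru_units))

    # comment level: apply the word encoder to every comment, then attend over comments
    comments_in = layers.Input(shape=(max_comments, max_words), dtype="int32")
    comment_vecs = layers.TimeDistributed(word_encoder)(comments_in)
    h_comments = layers.Bidirectional(layers.GRU(gru_units, return_sequences=True))(comment_vecs)
    session_vec = attention_pool(h_comments, gru_units)
    return Model(comments_in, session_vec, name="han_comment_encoder")
```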

C. ALTERNATIVE EMBEDDING APPROACH

Our first step is to generate a one-hot encoding using media tags such as text, scenery, portraits, and others to encode media-related information. To reduce the complexity of the one-hot encoding, we utilize a multi-layer perceptron to extract features efficiently.
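A minimal sketch of this media branch, assuming a small hypothetical tag vocabulary, could look as follows in Keras; the tag list, layer sizes, and activations are illustrative only.

```python
import numpy as np
from tensorflow.keras import layers, Model

# hypothetical tag vocabulary; the actual tags come from the datasets' media annotations
MEDIA_TAGS = ["text", "scenery", "portrait", "animal", "meme", "other"]

def build_media_encoder(num_tags=len(MEDIA_TAGS), hidden_dim=32, out_dim=16):
    one_hot = layers.Input(shape=(num_tags,), dtype="float32")   # one-hot media tag
    h = layers.Dense(hidden_dim, activation="relu")(one_hot)     # MLP compresses the sparse encoding
    v_m = layers.Dense(out_dim, activation="relu")(h)
    return Model(one_hot, v_m, name="media_encoder")

# usage: encode a post tagged as "portrait"
vec = np.eye(len(MEDIA_TAGS))[[MEDIA_TAGS.index("portrait")]]
media_features = build_media_encoder()(vec)
```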

D. MULTI-MODAL DECODER

We propose that various modalities have distinct contributions to the identification of bullying conduct. Expanding on this idea, we present a new method to modify the weights of the vectors of different modalities throughout the decoding stage.
During the encoding stage, we retrieve data from several modalities and represent them with vectors of different dimensions: $v_t$ for the text, $v_c$ for the comments, $v_m$ for the media, and $v_o$ for the other meta-information. During decoding, we use a separate fully connected unit for each modality and compute the hidden layer of each unit, denoted $h_{dt}$, $h_{dc}$, $h_{dm}$, and $h_{do}$, from the corresponding weight matrices $W_{dt}$, $W_{dc}$, $W_{dm}$, $W_{do}$ and biases $b_{dt}$, $b_{dc}$, $b_{dm}$, $b_{do}$. In this way, the size of each hidden vector is adjusted to match the importance of the modality's information. The output of each hidden vector is then computed with a further weight matrix $W$ and bias $b$, yielding the output vectors $out_t$, $out_c$, $out_m$, and $out_o$. Concatenating these individual outputs produces $out = [out_t; out_c; out_m; out_o]$, and the final decoding is performed on the vector $out$. The multi-modal decoding procedure is summarized in Algorithm 3.

[Algorithm 3: Multi-modal decoder]
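The sketch below illustrates this decoding idea in Keras: each modality vector passes through its own fully connected unit before the outputs are concatenated and fed to a final classifier. The dimensions, activations, and the sigmoid output layer are assumptions made for illustration, not the paper's reported configuration.

```python
from tensorflow.keras import layers, Model, Input

def build_mmcd_head(text_dim=256, comment_dim=128, media_dim=16, meta_dim=8):
    v_t = Input(shape=(text_dim,), name="text_vec")
    v_c = Input(shape=(comment_dim,), name="comment_vec")
    v_m = Input(shape=(media_dim,), name="media_vec")
    v_o = Input(shape=(meta_dim,), name="meta_vec")

    # a separate fully connected unit per modality, so each modality's weight is adjusted independently
    outs = []
    for v, units in [(v_t, 64), (v_c, 64), (v_m, 16), (v_o, 8)]:
        h = layers.Dense(units, activation="relu")(v)              # hidden layer h_d*
        outs.append(layers.Dense(units, activation="relu")(h))     # output vector out_*

    fused = layers.Concatenate()(outs)                 # out = [out_t; out_c; out_m; out_o]
    y = layers.Dense(1, activation="sigmoid")(fused)   # binary bullying / non-bullying score
    return Model([v_t, v_c, v_m, v_o], y, name="mmcd_decoder")
```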

8.IMPLEMENTATION ENVIRONMENT

The study aims to evaluate and categorize content sourced from Instagram, Vine, and Twitter. The hardware infrastructure comprises an Intel Core i7-6700 central processing unit (CPU) with 18 gigabytes of random access memory (RAM), running the Windows 10 operating system. The major programming language used is Python version 3.6, and development takes place in the Visual Studio Code (VS Code) integrated development environment. The project's technological stack includes essential libraries such as Matplotlib for data visualization, NLTK for text analysis, Keras for neural network development, Pandas for efficient data preparation, Sklearn for machine learning classification, and TweetInvi for interfacing with the Twitter API. This comprehensive framework supports efficient analysis and categorization of information across these social media networks. Table 1 summarizes the implementation environment of the proposed model.
[Table 1: Implementation environment of the proposed model]

A. DATASET DESCRIPTION

For our experimental assessments, we utilized datasets obtained from well-known social media platforms: Vine and Instagram, platforms focused on sharing short videos and photos, and Twitter, a widely used microblogging and social networking service. The datasets are openly available and cover a wide range of data types, such as tweets, text, pictures, and others.

1) VINE DATASET

The dataset is referred to as the Vine dataset [28]. Vine is a social media platform that enables users to share brief, six-second looping video clips. The dataset has 970 posts, of which 666 demonstrate normal conduct and 268 demonstrate bullying behavior. Every Vine post comes with its content, user comments, and video tags, which together provide a comprehensive dataset for research.

2) INSTAGRAM DATASET

Instagram is a widely recognized social media platform that allows users to share photographs and videos. The dataset comprises 2,218 posts, of which 678 were classified as instances of abuse and 1,540 as normal [29]. Additionally, a combined total of 155,260 comments is associated with these posts. Detailed data, such as user profiles, image annotations, timestamps, and user feedback, is accessible for review.

3) TWITTER DATASET

The information gathered from the popular microblogging site Twitter includes, among other things, a heterogeneous collection of tweets, hashtags, users, and locations (see references [30], [31], [32]). This dataset includes examples of both friendly and hostile behavior. Every tweet in the dataset has links to a variety of components, such as user profiles, textual content, hashtags, timestamps, and user feedback. Out of the 30,000 posts included in the dataset, 14,250 were found to contain abusive content, and the remaining 15,750 were categorized as typical.

9.EXPERIMENTAL ANALYSIS AND RESULTS

This section evaluates our methods using experiments on three real-world datasets from Twitter, Vine, and Instagram. The data is split 80/20 into training and evaluation sets. We use accuracy (ACC) and F1 scores to assess performance. Word embedding is performed with a pre-trained GloVe model, which encodes words as 300-dimensional vectors.
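For reference, the sketch below shows a common way to load pre-trained GloVe vectors into an embedding matrix and to apply the 80/20 split and the two reported metrics with scikit-learn. The file path and helper names are illustrative; the actual preprocessing of the datasets is not described at this level of detail in the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

def load_glove(path="glove.840B.300d.txt", dim=300):
    """Map each word to its pre-trained 300-dimensional GloVe vector."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            values = line.rstrip().split(" ")
            word, coefs = " ".join(values[:-dim]), values[-dim:]
            vectors[word] = np.asarray(coefs, dtype="float32")
    return vectors

def build_embedding_matrix(word_index, glove, dim=300):
    """Rows follow the tokenizer's word_index; out-of-vocabulary words stay zero."""
    matrix = np.zeros((len(word_index) + 1, dim))
    for word, idx in word_index.items():
        if word in glove:
            matrix[idx] = glove[word]
    return matrix

# 80/20 split and the two reported metrics, assuming features X and labels y are prepared:
# X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ... train the model and obtain preds on X_test ...
# print(accuracy_score(y_test, preds), f1_score(y_test, preds))
```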

A. BASELINE APPROACH

We benchmarked our cyberbullying detection technique against several baseline methodologies in order to thoroughly assess its efficacy. For this, classic machine learning algorithms such as Naive Bayes, Random Forest, Support Vector Machines (SVMs), and Logistic Regression were used. We adjusted these models to use TF-IDF vectors for text representation at the word and character levels in order to accommodate the variety of text forms. Furthermore, we added psychological insights from the Linguistic Inquiry and Word Count (LIWC) instrument to our models.
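A scikit-learn sketch of these baselines is given below. The toy corpus is a placeholder, and appending LIWC features is omitted; only the TF-IDF plus classic-classifier pattern is illustrated.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import LinearSVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import MultinomialNB

# tiny placeholder corpus; the real inputs are the labeled posts from the three datasets
texts = ["you did a great job today", "nobody likes you, just leave"]
labels = [0, 1]

classifiers = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": LinearSVC(),
    "random_forest": RandomForestClassifier(n_estimators=200),
    "naive_bayes": MultinomialNB(),
}

for name, clf in classifiers.items():
    model = Pipeline([
        # word-level TF-IDF; a character-level variant would use analyzer="char_wb", ngram_range=(2, 5)
        ("tfidf", TfidfVectorizer(analyzer="word", ngram_range=(1, 2))),
        ("clf", clf),
    ])
    model.fit(texts, labels)
    print(name, model.predict(["you are such a loser"]))
```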

B. RESULTS

  1. We evaluate the effectiveness of different models on the Twitter, Instagram, and Vine datasets using accuracy and F1 scores as metrics. Given the uneven data distribution in these datasets, we prioritize evaluating F1 scores. The results shown in Table 2 highlight how well MMCD performs compared to the other models, as evidenced by its better F1 and ACC scores.
    [Table 2: Accuracy and F1 scores of MMCD and the baseline models on the three datasets]
  2. On the Twitter dataset, MMCD outperforms the top baseline model, with a 2.9% higher F1 score and a 2.5% higher ACC score. The substantial improvement in F1 matters more than the modest ACC gain and is particularly helpful for identifying instances of cyberbullying. Notably, on the Vine dataset, the MMCD model also demonstrates superior performance, with a 1.9% higher F1 score and a 0.3% higher ACC score than the best baseline. These results underscore the advantages of our approach over previous models, emphasizing enhanced accuracy and stability in the detection of cyberbullying.
  3. The MMCD model integrates temporal components, annotations, and the framework of social media interactions. However, it fails to consider media data, such as information associated with images and videos. The approach of Cheng et al., on the other hand, depends on manually designed textual characteristics and takes into account the occurrence of keywords in comments; these characteristics have specific constraints. The results validate the efficacy of utilizing media data to detect occurrences of cyberbullying, highlighting the higher proficiency of deep learning compared to traditional approaches in extracting distinctive characteristics. The Hierarchical Attention Network (HAN) surpasses existing deep learning models, exhibiting exceptional performance in terms of both F1 scores and ACC scores across a wide range of datasets. This highlights the effectiveness of attention mechanisms in hierarchically encoding textual information. The findings emphasize the significance of comments and the communal element in recognizing instances of cyberbullying, with attention processes playing a pivotal role in improving the accuracy of detection. The LSTM model with attention demonstrates exceptional efficacy when applied to the Twitter dataset, outperforming the plain LSTM model: the statistically significant improvements of 2.7% in F1 scores and 2.3% in ACC values indicate notable enhancements in performance. Incorporating attention processes enhances the stability and accuracy of the model. In addition, the MMCD model surpasses the HAN model by highlighting the differing significance of posts and comments in detecting cyberbullying on social media, hence boosting the efficacy of the detection method.
  4. Furthermore, our MMCD model exhibits greater performance in comparison to the Text-CNN model, which encounters difficulties when there is a dearth of sequential information in the text. Traditional approaches to identifying cyberbullying, regardless of its language attributes, face challenges in achieving a high degree of efficacy. Conversely, the Random Forest model demonstrates higher performance in comparison to the other classic classifiers, indicating that each feature serves a unique purpose across different classifiers. This underscores the possibility of attaining superior results through the amalgamation of varied attributes.
  5. The findings of this study support the notion that the MMCD model's emphasis on attention processes and its incorporation of comprehensive features play a significant role in increasing the efficiency of cyberbullying detection. The importance of jointly considering social and textual factors highlights the intricacy of this undertaking in a practical social media setting, as shown in Figures 2, 3, and 4.
    [Figures 2, 3, and 4]

C. PARAMETER ANALYSIS

  1. We explore a variety of embedding techniques throughout the training phase in order to assess the impact of various word embedding strategies on our model. First, we choose a range of pre-trained models with varying corpus sizes, including en-word2vec-300, en-glove-6b-300d, en-glove-42b-300d, and en-glove-840b-300d.
  2. The results in Figures 5 and 6 show how the size of the corpus affects how well pre-trained word embeddings work. As the corpus size grows, the GloVe model consistently achieves higher F1 and accuracy scores.
    [Figures 5 and 6: F1 and accuracy scores with different pre-trained word embeddings]
  3. The en-glove-840b-300d model achieves superior performance compared to the en-glove-42b-300d model on the Instagram dataset, with an improvement of 0.16% in F1 scores and 2.71% in ACC scores. The Twitter dataset shows a similar pattern, with the en-glove-840b-300d model outperforming the en-glove-42b-300d model by 0.18% in F1 scores and 2.46% in ACC scores.
  4. Improvements are also evident when analyzing the Vine dataset, where the en-glove-840b-300d model outperforms the en-glove-42b-300d model by 0.66% in F1 scores and 1.5% in ACC scores. Once again, this pattern is observed on the Twitter dataset, where the en-glove-840b-300d model shows superior performance compared to the en-glove-42b-300d model, with improvements of 0.72% in F1 scores and 1.25% in ACC values.
  5. The larger corpus size is responsible for these benefits, as it includes a wider range of language. The 840b model, which includes more than 10% additional pre-trained words on both datasets compared to the 42b model, is essential for improving performance. Compared to the en-word2vec-300 model, the GloVe models consistently perform better on all datasets.
  6. The consistent performance gains of the GloVe models, and of the en-glove-840b-300d version in particular, emphasize their appropriateness for tasks that demand a subtle comprehension of language, such as identifying instances of cyberbullying on social media platforms like Instagram, Vine, and Twitter. The results emphasize the significance of choosing a word embedding model trained on a larger and more diversified collection of texts, which improves the overall effectiveness of the framework when applied to various social media situations.
  7. To examine the sensitivity and influence of the learning rate, we systematically vary its value and assess the impact on overall performance, with a specific emphasis on F1 scores.
  8. Figure 7 indicates the robustness of our model across a wide range of learning rates, although performance is not optimal when the learning rate is very high or very low. A learning rate that is too high impedes precise parameter updates, leading to sub-optimal results; an extremely low learning rate cannot learn all the necessary information within a restricted number of iterations, also resulting in less than optimal performance. Notwithstanding these factors, our model exhibits strong performance across a broad spectrum of learning rates, offering room for fine-tuning according to specific goals; a minimal sweep sketch is given after this list.
    [Figure 7: Model performance across different learning rates]
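A minimal version of such a learning-rate sweep is sketched below; the stand-in model, the candidate rates, and the commented metric bookkeeping are assumptions for illustration rather than the authors' exact protocol.

```python
from tensorflow.keras import layers, Model, Input
from tensorflow.keras.optimizers import Adam

def stand_in_model(input_dim=32):
    """Small stand-in for the full MMCD network; only the compile/fit pattern matters here."""
    x = Input(shape=(input_dim,))
    y = layers.Dense(1, activation="sigmoid")(layers.Dense(16, activation="relu")(x))
    return Model(x, y)

results = {}
for lr in [1e-1, 1e-2, 1e-3, 1e-4, 1e-5]:
    model = stand_in_model()
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    # history = model.fit(X_train, y_train, validation_split=0.1, epochs=10)
    # results[lr] = max(history.history["val_accuracy"])   # F1 can be computed on a held-out set
```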

10.DISCUSSION AND COMPARATIVE ANALYSIS

The simulation results presented in the previous section highlight the effectiveness of our proposed model in accurately differentiating cyberbullying from non-cyberbullying content. By utilizing cutting-edge deep learning methods, our model demonstrates outstanding efficiency and precision, even when handling large datasets. The training process is robust and produces highly effective modality representations. One notable advantage is the skillful integration of different modalities, which enhances overall performance. Our method differs from previous studies that only use one type of data: we combine four main types of data (text, comments, media, and metadata), which leads to a significant improvement in performance, as shown in Table 3. It is worth mentioning that models using only two modalities consistently show worse results, which further emphasizes the superiority of our approach. Our model demonstrates better results than existing state-of-the-art models, as evidenced by higher accuracy and F-measure scores across the Twitter, Vine, and Instagram datasets. On the Twitter dataset, our model achieves an accuracy of 92.1% and an F-measure of 86.40%, surpassing other methods. Comparable levels of performance are consistently seen on the Vine and Instagram datasets, with accuracy scores of 83.80% and 86.41%, respectively, along with the corresponding F-measure scores. On Instagram, our model demonstrates an accuracy of 86.41% and an F-measure of 86%, highlighting its superiority over current cutting-edge models [34], [35], [36], [37]. The results highlight the consistent superiority of our method across different datasets and modalities, confirming its effectiveness in accurately distinguishing cyberbullying from non-cyberbullying content.
[Table 3: Comparison of MMCD with state-of-the-art models and different modality combinations]
Moreover, the proposed framework utilizes distinct hyperparameters for each dataset to enhance its performance. For example, the Twitter dataset uses a learning rate of 0.01, a batch size of 128, and an embedding dimension of 512. These hyperparameters are selected to strike a balance between the complexity of the model and the efficiency of computational processes in order to achieve optimal training and inference. The Vine dataset utilizes a learning rate of 0.001, a batch size of 64, and an embedding dimension of 300; the same hyperparameters are used for the Instagram dataset. The careful selection of these hyperparameters enhances the model's capacity to efficiently acquire and apply patterns from various datasets, resulting in strong performance in detecting cyberbullying content.
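These per-dataset settings can be captured in a small configuration dictionary such as the illustrative sketch below; only the three reported values are included, and the helper name is hypothetical.

```python
# reported hyperparameters per dataset
HYPERPARAMS = {
    "twitter":   {"learning_rate": 0.01,  "batch_size": 128, "embedding_dim": 512},
    "vine":      {"learning_rate": 0.001, "batch_size": 64,  "embedding_dim": 300},
    "instagram": {"learning_rate": 0.001, "batch_size": 64,  "embedding_dim": 300},
}

def config_for(dataset: str) -> dict:
    """Look up the training configuration used for a given dataset."""
    return HYPERPARAMS[dataset.lower()]
```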
Table 4 presents the model's performance on three separate datasets: Vine, Instagram, and Twitter. Each dataset contains both occurrences of cyberbullying and non-cyberbullying behavior. Within the Vine dataset, 85 instances of cyberbullying are correctly identified by the model out of a total of 100 instances. However, the model mistakenly classifies 15 instances as non-cyberbullying, leading to an error rate of 15%. The Twitter dataset, consisting of 10,500 instances of confirmed cyberbullying, is accurately identified by the model in 8,500 cases. However, the model incorrectly classifies 1,000 cases as non-cyberbullying, resulting in a 10% error rate. Regarding the Facebook dataset, the model accurately detects 12,000 occurrences of cyberbullying out of the total 15,000 confirmed instances. However, it incorrectly classifies 3,000 instances as non-cyberbullying, leading to a 20% rate of misclassification.
[Table 4: Correct and incorrect detections of cyberbullying per dataset]
The MMCD model's incapacity to integrate media data is a significant constraint that could have implications for the thoroughness of the outcomes. Media content, such as images and videos, frequently includes vital contextual information that can impact the understanding of text-based content. The MMCD model's omission of media data in the analysis may result in neglecting significant cues and subtleties that could enhance the accuracy of comprehending cyberbullying behavior. As a result, the accuracy of the model's predictions may be affected, resulting in incorrect categorizations or incomplete evaluations of cyberbullying cases. This constraint highlights the significance of thoroughly incorporating various data modalities in future versions of the MMCD model to guarantee a more comprehensive and nuanced approach to detecting cyberbullying.
The proposed model's ethical considerations revolve around privacy preservation, potential biases, and unintended consequences. Ensuring user privacy and data protection is paramount, especially when analyzing sensitive information from social media platforms. Additionally, mitigating biases in the model's training data and outputs is essential to prevent discriminatory outcomes or false accusations.

11.CONCLUSION

In this research, we introduce an innovative and reliable framework for detecting cyberbullying that utilizes three modules to extract distinct information from the different modalities inside a social network. The initial module utilizes a bidirectional LSTM with attention techniques to effectively capture the intrinsic features of posts. To conduct a detailed study of post comments, we propose the use of hierarchical attention networks, which operate dynamically at both the word and comment levels. Furthermore, to effectively encode meta-information, including video and image content, our methodology incorporates a multilayer perceptron. The sophisticated and flexible cyberbullying detection system created with this methodology was put to the test by carefully analyzing three real datasets gathered from social platforms.
Subsequent investigations into the amalgamation of diverse information modalities for the purpose of cyberbullying detection are of paramount importance. In the context of social media, this necessitates a meticulous analysis of the intricate interrelationships among diverse fields of information. An in-depth comprehension of emerging trends and patterns is crucial for modeling new kinds of cyberbullying behavior. In addition, to improve the precision and usefulness of cyberbullying detection models, future endeavors should focus on improving current approaches, investigating sophisticated deep learning structures, and integrating real-time data processing capabilities. Moreover, a more in-depth examination of the socio-psychological factors related to cyberbullying, together with explainable AI methods, can support the creation of cyberbullying detection systems that are both more efficient and more socially accountable.
