Mood, Modality and Dialogue Sentiment

In this article, we will see how verbal functional categories are used in customer dialogue text and how these categories contribute to the semantics, especially the text sentiment.

The verb phrase of a sentence can carry a great deal of its meaning. Sometimes it hints at the sentiment all by itself, even when one does not see the rest of the context words, so verb phrases contribute important features to sentiment analysis models. For example, cross out all the words that are not part of any VP in the following customer reviews:

The product isn't working properly. 
I didn't like this product.
I'm not satisfied with the product quality at all.

To charge a sentence with meaning, many languages let the verb take different inflections, such as tense and person. Moreover, most of the time we also want to express our feelings and opinions about how the action proposed by the verb happened: are we sure, did we see the action with our own eyes, do we think it's likely or unlikely?

This is more of a semantic capability, so one usually needs more than verb inflection and more than grammatical constructions. The function of the verb is a broad topic, but I will explain some basic concepts before the statistical parts. You can skip to the next section if you have this background.

Tense is a grammatical realization of time by means of verbal inflection. English has two tenses: past and present. Future is not a tense, since it has no inflectional marker; it is rather a time. Future time is formed either with will or with adverbials such as tomorrow, 8 o'clock or next week/month. As you see, tense is a grammatical concept, whereas time is a semantic concept.

Another concept is aspect, a grammatical category that reflects how the action given by the verb happened with respect to time. English has two aspects:

action complete:    perfective   has moved, had moved
action in progress: progressive  is moving, was moving

One can summarize tense and aspect as follows:

             present     past
perfective   has moved   had moved
progressive  is moving   was moving

Voice in English is either passive or active, as we learnt in high school. Passive voice has further semantic subcategories, but in this article, we will stay at high school level grammar 😉

Mood is a grammatical category indicating whether a verb expresses a fact (indicative mood) or conditionality (subjunctive mood). Some examples are:

The sun rises at 6 o'clock here.                         indicative
It is important the manager be informed of the changes.  subjunctive

Mood is grammatical and is associated with two semantic notions: modality and illocution. The illocution of a sentence can be thought of as its sentence type:

Go there!                 imperative
Do you want to go there?  interrogative
God save the queen!       optative
I will see it.            declarative

Modality is a semantic notion related to the speaker's opinion and belief about the event's believability, obligatoriness, desirability or reality. Modality in English can be expressed by modal verbs (will/would, can/could, may/might, shall/should, must), modal adverbs (maybe, perhaps, possibly, probably), some subordinate clauses including (wish, it's time, possible, probable, chance, possibility), some modal nouns (decree, demand, necessity, requirement, request) or some modal adjectives (advisable, crucial, imperative, likely, necessary, probable, possible).

I would love you if things were different                irrealis
You may go                                               permission
I may come with you too                                  possibility
I might come with you too                                possibility
I must go                                                obligation
He must be earning good money                            necessity
I can ride a bike                                        ability
I can come with you too                                  possibility
It is possible that we might see big changes around us.  possibility
It might be the truth                                    doubt
I'm sure they'll come                                    confidence
Lights are on, so he must be in the office               evidentiality

From now on, we will see the different verbal features that customers use to interact with conversational agents and how those usages lead to different semantics.

Let's begin with Chris, our in-car voice assistant, and see some typical user utterances in automotive conversational AI.

Chris datasets include many imperative sentences:

navigate
navigate home
start navigation
stop navi
play Britney Spears
play music
send a message
read my messages

Sometimes the utterance consists only of a noun phrase:

music
die navigation
new messages

Particles are always a part of any voice assistant dialogue dataset:

yes
no
please
yes please

Of course, some cursing and insults are included, some in the form of sarcasm:

you suck
you are dumb
you are miserable
a**chloch
you are so intelligent (!)

Chris is a driver assistant, so it is pretty normal that the utterances are succinct and to the point. This is not due to rudeness or roughness; one simply needs to keep speech short while driving. Compare the following two sentences: obviously, the first one is easier to say if you are driving:

Hey Chris, drive me home
Hey Chris, shall we drive home together?

Imperative sentences are very common in spoken language understanding (SLU); they definitely do not imply rudeness, nor are they related to any speaker sentiment. Nothing really interesting here: the first group of utterances has verbs in the imperative mood, active voice and unmarked aspect. No modal verbs, no modal expressions, no past tense. In this case, sentiment for a voice assistant is better calculated from the speech signal.

[Figure: Chris being polite and intelligent at the same time]

Spoken language may not be very exciting so far, so let's switch to written language, which allows longer sentences and hence more verb forms 😄 I used the Women's E-Commerce Clothing Reviews dataset to explore the usage of verbal features. I will use the lovely spaCy Matcher (definitely not just because I am a contributor 😄) in this section. The dataset includes user reviews and ratings of purchases from an e-commerce website.

Before starting, let's recall the POS tags related to verbs, as we will mostly go over verbs. English verbs have five forms: the base (VB and VBP), -s (VBZ), -ing (VBG), past (VBD), and past participle (VBN). Again, future time has no marker. The modal verbs can, could, might, may, will and would take the tag MD.

Voice

Let's begin with voice. Patterns that match the passive voice are is/was adverb* past-participle-verb and have/has/had been adverb* past-participle-verb. The corresponding Matcher patterns can be:

{"TEXT": {"REGEX": "^(is|was)$"}}, {"POS": "ADV", "OP": "*"}, {"TAG": "VBN"}
and
{"LEMMA": "have"}, {"TEXT": "been"}, {"POS": "ADV", "OP": "*"}, {"TAG": "VBN"}

The first pattern is is/was, followed by any number of adverbs, then a past participle verb. The regex is anchored with ^ and $ because REGEX is applied as a search within the token text, so an unanchored (is|was) would also match tokens such as this or wash. POS is for UD POS tags and TAG is for the extended (fine-grained) POS tags. The second pattern is similar: have, has and had are all represented by the lemma have.

I'll first import spaCy, load the English model, then add these two rules to the Matcher object:

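import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab)

# is/was + optional adverbs + past participle; anchors keep "this" or "wash" from matching
pass1 = [{"TEXT": {"REGEX": "^(is|was)$"}}, {"POS": "ADV", "OP": "*"}, {"TAG": "VBN"}]
# have/has/had + been + optional adverbs + past participle
pass2 = [{"LEMMA": "have"}, {"TEXT": "been"}, {"POS": "ADV", "OP": "*"}, {"TAG": "VBN"}]

# spaCy v3 signature; on spaCy v2 this was matcher.add("pass1", None, pass1)
matcher.add("pass1", [pass1])
matcher.add("pass2", [pass2])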
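
To try the two passive patterns without downloading a model, here is a minimal sketch that runs them on a hand-annotated sentence. It assumes spaCy v3, where Matcher.add takes a list of patterns. The sentence, its tags, POS values and lemmas are typed in by hand purely for illustration (in the article they come from the en_core_web_sm pipeline), and the helper name passive_spans is hypothetical.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")  # blank pipeline: vocabulary and tokenizer only
matcher = Matcher(nlp.vocab)
matcher.add("pass1", [[{"TEXT": {"REGEX": "^(is|was)$"}},
                       {"POS": "ADV", "OP": "*"},
                       {"TAG": "VBN"}]])
matcher.add("pass2", [[{"LEMMA": "have"}, {"TEXT": "been"},
                       {"POS": "ADV", "OP": "*"},
                       {"TAG": "VBN"}]])


def passive_spans(doc):
    """Return (rule name, matched text) for every passive construction found."""
    return [(doc.vocab.strings[match_id], doc[start:end].text)
            for match_id, start, end in matcher(doc)]


# Hand-annotated stand-in for nlp("this was completely ruined").
doc = Doc(
    nlp.vocab,
    words=["this", "was", "completely", "ruined"],
    tags=["DT", "VBD", "RB", "VBN"],
    pos=["PRON", "AUX", "ADV", "VERB"],
    lemmas=["this", "be", "completely", "ruin"],
)
print(passive_spans(doc))
```

Counting the passive constructions in a review is then just len(passive_spans(doc)).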

Then I run the Matcher against the dataset. Here are some passive voice examples, from both positive and negative reviews:

one wash and this was ruined!
washed them according to directions and they were ruined.
this could not have been returned faster
i kept it anyway because the xs has been sold out, and got it taken in a bit.
it is simply stunning and indeed appears to have been designed by an artist.
would buy this again in several different colors if they were offered
if these were presented in other colors, i would buy those as well

How does the number of passive voice verbs in a review correlate with the review rating? First of all, let's see the rating distribution of the reviews:

[Figure: Review rating distribution. Obviously, many customers are satisfied.]

Next, we see the distribution of passive voice verb counts in reviews. Many reviews do not include the passive voice at all, some have one passive verb, and few have more than one passive construction.

Does the number of passive voice verbs correlate with the review rating? Judging from the plots below, no (checking the heatmap alone is enough; it points to no correlation at all).

[Figures: Heatmap, jointplot and violin plot for review rating vs. passive verb count]

This is no surprise when looking at the corpus sentences: the passive voice can be "designed by a famous designer" or "they are returned". When referring to the clothes, how an item is designed, tailored or made can be negative or positive; that it is returned, ruined or presented can be negative or positive as well.
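
The correlation check described in this section can be sketched in a few lines of plain Python. This is only an illustration of the Pearson coefficient that a correlation heatmap typically displays; the two lists are made-up toy values, not numbers from the review dataset, and the function name pearson is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy values, NOT taken from the review dataset.
ratings = [5, 4, 1, 5, 2, 3]
passive_counts = [0, 1, 1, 0, 0, 1]
print(round(pearson(ratings, passive_counts), 3))
```

A value close to 0 means the passive count tells us little about the rating, while values near 1 or -1 indicate a strong linear relationship. The function divides by zero if either list is constant, so it assumes both columns actually vary.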

Tense and Aspect

Let's see how verb tense and aspect correlate with the review rating. Remember that past and present tense are easy to count (by looking at the verb inflection), whereas future is not really a tense since there is no inflection. We will count future time occurrences by counting occurrences of will, going to and time adverbials.

This time we can build the tense-aspect table with Matcher patterns:

[Figure: Tense and aspect, this time with spaCy Matcher patterns]

I will also count the present perfect progressive ("have been doing") and the past perfect progressive ("had been doing"); they contribute to both the perfective and the progressive aspect counts, for the present and past tenses respectively.
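
As a rough sketch of what such tense and aspect patterns could look like, the block below defines one Matcher rule per tense/aspect combination. These patterns are my own assumption, not the article's originals; the names (HAVE_PRES, aspect_counts and so on) are hypothetical, spaCy v3 is assumed, and the example sentence is annotated by hand so that no trained model is needed.

```python
from collections import Counter

import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

# Reusable token descriptions (assumed patterns, not the article's originals).
HAVE_PRES = {"LEMMA": "have", "TAG": {"IN": ["VBP", "VBZ"]}}
HAVE_PAST = {"LEMMA": "have", "TAG": "VBD"}
BE_PRES = {"LEMMA": "be", "TAG": {"IN": ["VBP", "VBZ"]}}
BE_PAST = {"LEMMA": "be", "TAG": "VBD"}
BEEN = {"TEXT": "been"}
ADVS = {"POS": "ADV", "OP": "*"}
VBN = {"TAG": "VBN"}
VBG = {"TAG": "VBG"}

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
matcher.add("present_perfective", [[HAVE_PRES, ADVS, VBN]])
matcher.add("past_perfective", [[HAVE_PAST, ADVS, VBN]])
# "have been doing" contains no finite form of "be", so it gets its own pattern.
matcher.add("present_progressive", [[BE_PRES, ADVS, VBG], [HAVE_PRES, BEEN, ADVS, VBG]])
matcher.add("past_progressive", [[BE_PAST, ADVS, VBG], [HAVE_PAST, BEEN, ADVS, VBG]])


def aspect_counts(doc):
    """Count matches per tense/aspect label in one document."""
    return Counter(doc.vocab.strings[match_id] for match_id, _, _ in matcher(doc))


# Hand-annotated stand-in for nlp("i have been waiting").
doc = Doc(
    nlp.vocab,
    words=["i", "have", "been", "waiting"],
    tags=["PRP", "VBP", "VBN", "VBG"],
    pos=["PRON", "AUX", "AUX", "VERB"],
    lemmas=["I", "have", "be", "wait"],
)
print(aspect_counts(doc))
```

In this sketch "have been waiting" is counted once as present perfective (have + been) and once as present progressive (have been + waiting), which mirrors the double counting described above.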

Here are some examples of the tenses and aspects used in the reviews:

I love, love, love this jumpsuit. it's fun, flirty, and fabulous! every time i wear it, i get nothing but great compliments!
fits nicely! i'm 5'4, 130lb and pregnant so i bough t medium to grow into.
I have been waiting for this sweater coat to ship for weeks and i was so excited for it to arrive. this coat is not true to size and made me look short and squat.
I have been searching for the perfect denim jacket and this it!
I had been eyeing this coat for a few weeks after it appeared in the email, and i finally decided to purchase it to treat myself.

What about future time? Since there is no morphological marker, we can get help from will, going to, plan to, in 2/5/10 days, next week/month/summer and the day after tomorrow.

The corresponding Matcher patterns can be:

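future_modal = [{"TEXT": "will", "TAG": "MD"}]
# A token pattern matches one token at a time, so multi-word cues get one dict per word
future_plan = [{"LOWER": {"IN": ["plan", "planning"]}}, {"LOWER": "to"}]
future_going = [{"LOWER": {"IN": ["am", "is", "are"]}}, {"LOWER": "going"}, {"LOWER": "to"}]
time_expr1 = [{"LOWER": {"IN": ["next", "oncoming"]}}, {"LOWER": {"IN": ["week", "month", "year", "summer", "winter", "autumn", "fall"]}}]
time_expr2 = [{"LOWER": "the"}, {"LOWER": "day"}, {"LOWER": "after"}, {"LOWER": "tomorrow"}]
time_expr3 = [{"LOWER": "in"}, {"LIKE_NUM": True}, {"LOWER": {"REGEX": "^(day|week|month|year)s?$"}}]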

and examples from the corpus are:

sadly will be returning, but i'm sure i will find something to exchange it for!
I love this shirt because when i first saw it, i wasn't sure if it was a shirt or dress. since it is see-through if you wear it like a dress you will need a slip or wear it with leggings.
Just ordered this in a small for me (5'6", 135, size 4) and medium for my mom (5'3", 130, size 8) and it is gorgeous - beautifully draped, all the weight/warmth i'll need for houston fall and winter, looks polished snapped or unsnapped. age-appropriate for both my mom (60's) and myself (30's). will look amazing with skinny jeans or leggings.
This will be perfect for the mild fall weather in texas
There's no extra buttons to replace the old one with and i'm worried more of the coat is going to fall apart.
This is going to be my go to all season.
i plan to wear it out to dinner for my birthday and to a house party on new years day....
i am planning to exchange this and hoping it doesn't happen again
it is nice addition to my wardrobe and i am planning to wear it to the multiple occasion
this is one of those rare dresses that looks good on me now and will still look good on me in 6 months when i've got a huge belly.
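
Counting such future-time cues per review could look like the sketch below. It is an illustration under assumptions rather than the article's exact code: the multi-word cues are written as one token pattern per word, spaCy v3 is assumed, the helper name future_count is hypothetical, and the example (a shortened version of the last review above) is tagged by hand instead of by a trained model.

```python
import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")  # blank pipeline: vocabulary and tokenizer only, no download
matcher = Matcher(nlp.vocab)
matcher.add("future", [
    [{"TEXT": "will", "TAG": "MD"}],
    [{"LOWER": {"IN": ["plan", "planning"]}}, {"LOWER": "to"}],
    [{"LOWER": {"IN": ["am", "is", "are"]}}, {"LOWER": "going"}, {"LOWER": "to"}],
    [{"LOWER": "in"}, {"LIKE_NUM": True}, {"LOWER": {"REGEX": "^(day|week|month|year)s?$"}}],
])


def future_count(doc):
    """Number of future-time cues matched in one document."""
    return len(matcher(doc))


# Hand-tagged stand-in for nlp("it will still look good in 6 months");
# only the TAG layer is needed because no pattern uses POS or LEMMA.
doc = Doc(
    nlp.vocab,
    words=["it", "will", "still", "look", "good", "in", "6", "months"],
    tags=["PRP", "MD", "RB", "VB", "JJ", "IN", "CD", "NNS"],
)
print(future_count(doc))
```

Note that a surface pattern like "is going to" cannot tell future reference from plain motion ("is going to the store"), so these counts are approximate.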

According to the counts below, customers used the past tense a lot. The present tense is also widely used, whereas future time is used once or twice in each review.

[Figures: tense, aspect and future-time counts]

Here are the corresponding histograms:

[Figures: Corresponding histograms]

According to the heatmap below, usage of the present tense and of future time is not really correlated with the rating; both negative and positive reviews include these verb forms. However, the past tense looks slightly negatively correlated: more usage of the past tense means a worse rating. The perfective and progressive aspects do not look very bright either; they are also slightly negatively correlated.

[Figure: Heatmap for tense and aspect]

The ridgeline plot below is informative: better reviews tend to spike towards zero usage of the past tense, whereas unhappier customers show a smoother distribution of past tense usage. A possible explanation is that these customers complained a lot: "package came late", "waistline didn't fit", "I couldn't zip", "I didn't like it"; whereas happy customers look to the future 😄 We all look to the future when we are happier, don't we? 😉

[Figure: Ridgeline plot of past tense usage by review rating]

Mood and Modality

As we saw, modality is a semantic notion, and the same modal verb can express different modalities. As an example, let's see the different semantics that could can introduce:

i love that i could dress it up for a party, or down for work.
possibility
the straps are very pretty and it could easily be nightwear too.
possibility
this is a light weight bra, could be a little more supportive. pretty color, with nice lines.
irrealis
I bought this and like other reviews, agree that the quality probably could be better, but i still love it enough to keep.
irrealis
got it on sale, but it still could've been cheaper.
irrealis
Bought a large, could barely pull up over my butt.
ability

could is correlated with both negative and positive sentiment. Possibility looks to be on the positive side, whereas irrealis can be either positive or negative.

What about couldn't? It is a whole other story. The examples below show how much semantic richness couldn't brings to both negative and positive sentiment, even though almost all examples involve only one type of modality:

so small in fact that i could not zip it up!
ability
i was so excited to get his dress for my wedding shower and then i couldn't wear it :(
ability
i really tried because the fabric is wonderful and the shirt is light and breezy for summer, i just couldn't make it work
ability
i simply couldn't resist!
i could not be more pleased and regret not having bought this item earlier, since i would have enjoyed wearing it during the holidays.
i could not be happier with the purchase and the keyhole in the back is a beautiful detail.
emphasizing opinion
i also thought that this was very heavy for a maxi dress and could not imagine wearing it in 80 degree weather.
ability
i think it's maybe a little too long (or short, i can't figure it out) but i couldn't pass up this skirt because the pattern is so pretty.
ability
i just wish it was more of a blue denim blue but nonetheless, i could not walk away from the store without this.
ability

would and wouldn't might feel different, but statistically they behave similarly to could and couldn't:

[Figure: Modals and review rating correlation matrix]

Irrealis occurs in both negative and positive reviews. Consider:

maybe if i weren't as small chested this wouldn't be an issue for me. i definitely recommend this tee.
the neckline wouldn't even stay up on my upper body, it was that loose.

It is then no surprise that the presence of would/wouldn't/could/couldn't does not give away much about the review sentiment.

The corresponding Matcher pattern would be:

[{"TEXT": {"REGEX": "(would|could|can|might|may)"}, "TAG": "MD"}]

Here MD is the modal verb tag, and we exclude will.
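
Since could and couldn't were discussed separately above, here is a small sketch of how the modal pattern might be used to count affirmative and negated modals per review. Treating a following "not" or "n't" token as negation is my own simplification, the helper name modal_counts is hypothetical, spaCy v3 is assumed, and the example sentence is invented and tagged by hand.

```python
from collections import Counter

import spacy
from spacy.matcher import Matcher
from spacy.tokens import Doc

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)
# Same modal pattern as above, with the regex anchored to whole tokens.
matcher.add("modal", [[{"TEXT": {"REGEX": "^(would|could|can|might|may)$"}, "TAG": "MD"}]])


def modal_counts(doc):
    """Count each modal, labelling it as negated when "not" or "n't" follows."""
    counts = Counter()
    for _, start, end in matcher(doc):
        modal = doc[start].lower_
        following = doc[end].lower_ if end < len(doc) else ""
        negated = following in ("not", "n't")
        counts[modal + (" (negated)" if negated else "")] += 1
    return counts


# Hand-tagged stand-in for nlp("i couldn't zip it but it could work");
# spaCy's English tokenizer splits "couldn't" into "could" and "n't".
doc = Doc(
    nlp.vocab,
    words=["i", "could", "n't", "zip", "it", "but", "it", "could", "work"],
    tags=["PRP", "MD", "RB", "VB", "PRP", "CC", "PRP", "MD", "VB"],
)
print(modal_counts(doc))
```

One caveat: the English tokenizer splits "can't" into "ca" and "n't", so a pattern that looks for the text can misses that contraction unless "ca" is added to the alternatives.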

Dear readers, here we reach the end of this article. We had a great time with spaCy, didn't we (as usual)? 😃 We process huge amounts of data every day, but sometimes we forget to read what is written in the corpus. Language is not just a bunch of words coming together; it has many aspects, both statistical and linguistic. Today we enjoyed both. Next time we will continue with statistical discourse for voice assistants. Until then, you can always visit Chris at https://chris.com and me at https://duygua.github.io. Meanwhile stay happy, safe and tuned!

Palmer, F. (2001), Mood and Modality (2nd ed., Cambridge Textbooks in Linguistics). Cambridge: Cambridge University Press. doi:10.1017/CBO9781139167178

Translated from: https://towardsdatascience.com/mood-modality-and-dialogue-sentiment-b06cd36eca88
