[Repost] The Mathematics Of Beauty

 

[In online dating, a woman who wants to receive more messages from men should post more divisive photos rather than ones everyone agrees are merely nice. This conclusion comes from OkCupid's data scientists; see the original post here. OkCupid bills itself as the Google of the online dating industry, so they are worth learning from.]

[Permalink: http://www.cnblogs.com/breezedeus/archive/2012/10/28/2744002.html. Please credit the source when reposting.]

 

The Mathematics Of Beauty

January 10th, 2011 by Christian Rudder


This post investigates female attractiveness, but without the usual photo analysis stuff. Instead, we look past a woman's picture, into the reaction she creates in the reptile mind of the human male.

Among the remarkable things we'll show:

[Image: a preview of the findings discussed below]

Fair warning: we're about to objectify women, big-time. The whole purpose of this blog is to analyze OkCupid's data, and without a little bit of objectification that's impossible. Men will get their turn under the microscope soon enough. As usual, none of this (with the exception of the celebrity examples) is my opinion. All data is collected from actual user activity.

 

Let's start at the beginning.

All people, but especially guys, spend a disproportionate amount of energy searching for, browsing, and messaging our hottest users. As I've noted before, a hot woman receives many times the messages an average-looking woman gets, and 25× as many as an ugly one. Getting swamped with messages drives users, especially women, away. So we have to analyze and redirect this tendency, lest OkCupid become sausageparty.com.

Every so often we run diagnostic plots like the one below, showing how many messages a sampling of 5,000 women, sorted by attractiveness, received over the last month.

[Figure: messages received over the last month vs. attractiveness, for a sample of 5,000 women]

These graphs are adjusted for race, location, age, profile completeness, login activity, and so on—the only meaningful difference between the people plotted is their looks. After running a bunch of these, we began to ask ourselves: what else accounts for the wide spread of the x's, particularly on the "above-average" half of the graph? Is it just randomness?
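For readers who want to see what a diagnostic plot like this looks like in code, here is a minimal sketch in Python. The data it plots is synthetic and the variable names are mine, not OkCupid's; it only reproduces the shape of the exercise: take a sample of women, score their attractiveness, and plot the messages each received.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Synthetic stand-in for the real sample: 5,000 "women" with an
# attractiveness score in [0, 1] and a noisy, increasing message count.
attractiveness = rng.uniform(0, 1, 5000)
messages = rng.poisson(lam=2 + 30 * attractiveness**2)

plt.scatter(attractiveness, messages, marker="x", s=8, alpha=0.3)
plt.xlabel("attractiveness (average vote)")
plt.ylabel("messages received in the last month")
plt.title("Messages vs. attractiveness (synthetic sample of 5,000)")
plt.show()
```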

What is it about her:

[Photo: the first woman]

that gets more attention than her:

[Photo: the second woman]

...even though according to our users, they're both good-looking?

Not all 7s are the same

It turns out that the first step to understanding this phenomenon is to go deeper into the mathematically different ways you can be attractive.

For example, using the classic 10-point 'looks' scale, let's say a person's a 7. It could be that everyone who sees her thinks exactly that: she's pretty cute.

[Figure: a ratings distribution where every vote is a 7]

But something extreme like this could just as easily be going on:

[Figure: a polarized ratings distribution with the same average of 7]

If all we know is that she is a 7, there's no way to tell. Maybe for some guys our hypothetical woman is the cat's pajamas and for the rest she's the cat Garfield. Who knows?

As it turns out, this distribution of opinions is very important.
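To make "not all 7s are the same" concrete, here is a tiny sketch with two made-up vote sets that share an average of 7 on the 10-point scale but differ completely in spread. The numbers are invented purely for illustration.

```python
import numpy as np

# Two hypothetical sets of ten votes on the classic 10-point scale.
consensus = np.array([7] * 10)              # everyone agrees: she's a 7
polarized = np.array([10] * 5 + [4] * 5)    # cat's pajamas vs. cat Garfield

for name, votes in [("consensus", consensus), ("polarized", polarized)]:
    print(f"{name}: mean = {votes.mean():.1f}, std = {votes.std():.2f}")

# Both averages are 7.0; only the standard deviation reveals the split
# opinion that a single "she's a 7" hides.
```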

Celebrity photos: to titillate and inform

Let's look at what the ratings distribution might be for a couple famous people. I imagine that for, say, the actress Kristen Bell it would be roughly like this:

[Figure: a hypothetical ratings distribution for Kristen Bell]

Ms. Bell is universally considered good-looking, but it's not like she's a supermodel or anything. She would probably get a few votes in the 'super hot' range, lots around 'very attractive', and almost none at the 'unattractive' end of the graph.

Compare her to Megan Fox, who might rate like this:

[Figure: a hypothetical ratings distribution for Megan Fox]

On the far right, you have the many dudes who think she's the sexiest thing ever. On the far left, you have the small number of people who have seen her movies.

Unlike Ms. Bell, Ms. Fox produces a strong reaction, even if it's sometimes negative.

Real People

Now let's look back at the two real users from before, this time with their own graphs. OkCupid uses a 1 to 5 star system for rating people, so the rest of our discussion will be in those terms. All the users pictured were generous and confident enough to allow us to dissect their experience on our site, and we appreciate it. Okay, so we have:

[Figures: the two women's 1-to-5-star vote distributions, side by side]

As you can see, though the average attractiveness for the two women above is very close, their vote patterns differ. On the left you have consensus, and on the right you have split opinion.

To put a fine point on it:

  • Ms. Left is, in an absolute sense, considered slightly more attractive
  • Ms. Right was also given the lowest rating 142% more often
  • yet Ms. Right gets noticeably more messages

When we began pairing other people of similar looks and profiles, but different message outcomes, this pattern presented itself again and again. The less-messaged woman was usually considered consistently attractive, while the more-messaged woman often created variation in male opinion. Here are a couple more examples:

[Figures: two more pairs of women with similar average attractiveness but different vote patterns and message counts]

We felt like we were on to something, so, being math nerds, we put on sweatpants. Then we did some work.

Our first result was to compare the standard deviation of a woman's votes to the messages she gets. We found that the more men disagree about a woman's looks, the more they like her. I've plotted the deviation vs. messages curve below, again including some examples.

The women along the graph are near the 80th percentile in overall attractiveness. You can click the tiny thumbnails to expand them.

[Thumbnails: example profiles plotted along the deviation-vs-messages curve]

As you can see, a woman gets a better response from men as men become less consistent in their opinions of her.
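A rough sketch of the underlying calculation: compute each woman's vote standard deviation and correlate it with her message count. The data below is synthetic (and built so the relationship holds), the field names are mine, and the real analysis also controlled for average attractiveness and site activity.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic stand-in: 200 women, ~50 votes each on the 1-5 star scale,
# plus a message count that (by construction) grows with vote spread.
rows = []
for woman_id in range(200):
    spread = rng.uniform(0.2, 1.8)                      # how polarizing she is
    votes = np.clip(np.round(rng.normal(3.5, spread, 50)), 1, 5)
    rows.append({"woman_id": woman_id,
                 "vote_std": votes.std(),
                 "messages": rng.poisson(5 + 8 * votes.std())})

df = pd.DataFrame(rows)

# The post's first result: messages rise with disagreement about her looks.
print("corr(vote_std, messages) =", round(df["vote_std"].corr(df["messages"]), 2))
```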

Our next step was to analyze a woman's actual vote pattern of 1s, 2s, 3s, 4s, and 5s:

[Figures: analysis of the vote patterns (counts of 1s through 5s) and their relationship to messages]

If You're Into Algebra

We did a regression on the votes for and messages to a sample of 43,000 women. To keep everything consistent, all the women were straight, between the ages of 20 and 27, and lived in the same city. The formula given in the body of the post was the best-fit we found on our second regression, after dropping the m3 term because its p-value was very near 1.

msgs is the number of messages the woman received during the observation period. The constant k reflects her overall level of site activity. For this equation, R² = .28, which isn't great in a lab or on a problem set, but is actually very good in a real-world environment.
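If you want to try the kind of regression the sidebar describes, here is a hedged sketch using statsmodels. The table is synthetic, the coefficients used to generate it are invented purely so there is something to fit, and the real analysis also restricted the sample by orientation, age, and city.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2)

# Synthetic stand-in for the 43,000-woman sample: m1..m5 are counts of
# 1- through 5-star votes; msgs is messages received. The generating
# coefficients below are made up and are not the post's estimates.
n = 43_000
m = pd.DataFrame(rng.poisson(lam=[8, 12, 20, 12, 6], size=(n, 5)),
                 columns=["m1", "m2", "m3", "m4", "m5"])
signal = 5 + 0.4 * m["m1"] - 0.5 * m["m2"] - 0.1 * m["m4"] + 0.9 * m["m5"]
msgs = rng.poisson(lam=np.clip(signal, 0.1, None))

X = sm.add_constant(m)          # the intercept plays the role of k
fit = sm.OLS(msgs, X).fit()
print(fit.summary())            # inspect p-values; drop terms with p near 1
print("R^2 =", round(fit.rsquared, 2))
```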

This required a bit more math and is harder to explain with a simple line-chart. Basically, we derived a formula to predict the amount of attention a woman gets, based on the curve of her votes. With this we can translate what guys think of a woman's looks into how much attention she actually gets.

The equation we arrived at might look opaque, but when we get into it, we'll see it says some funny things about guys and how they decide which women to hit on.

[Image: the best-fit formula for msgs in terms of m1, m2, m4, m5 and the constant k]

The most important thing to understand is that the m's are the men voting on her looks: m1 counts her one-star votes, m2 her two-star votes, and so on up to m5, making up her graph, like so:

[Figure: a vote graph annotated with the m1 through m5 terms]

And those m's with positive numbers in front contribute to messaging; the ones with negative numbers subtract from it. Here's what this formula is telling us:

The more men who say you're hot, the more messages you get.

How we know this—the .9 in front of m5 is the biggest positive number, meaning that the guys who think you're amazing (voting you a perfect '5') are the strongest contributors to your messaging income. This is certainly an expected result and gives us some indication our formula is making sense.

Men who think you're cute actually subtract from your message count.

How we know this—because the coefficient in front of m4 is negative (−.1). This tells us that guys giving you a '4', who are actually rating you as above average-looking, are taking away from the messages you get. Very surprising. In fact, when you combine this with the positive number in front of the m1 term, our formula says that, statistically speaking:

If someone doesn't think you're hot, the next best thing for them to think is that you're ugly.


This is a pretty crazy result, but every time we ran the numbers—changing the constraints, trying different data samples, and so on—it came back to stare us in the face.
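To see what the formula implies in practice, here is a small sketch that plugs two hypothetical vote patterns with the same average rating into a stand-in version of it. Only the +.9 on m5 and the −.1 on m4 are quoted in the post; the other coefficients below are placeholders chosen to match the signs the post describes, so the exact numbers are illustrative.

```python
def predicted_msgs(m1, m2, m3, m4, m5, k=0.0):
    """Stand-in for the post's formula: m1..m5 are counts of 1- to 5-star
    votes, k is the woman's activity constant. Only the +0.9 (m5) and
    -0.1 (m4) coefficients come from the post; the rest are placeholders."""
    return 0.4 * m1 - 0.5 * m2 + 0.0 * m3 - 0.1 * m4 + 0.9 * m5 + k

# Two hypothetical women, 100 votes each, identical 3.0-star average:
consensus = dict(m1=0, m2=10, m3=80, m4=10, m5=0)    # everyone thinks she's cute
polarized = dict(m1=40, m2=5, m3=10, m4=5, m5=40)    # love her or hate her

print("consensus:", predicted_msgs(**consensus))     # lower predicted score
print("polarized:", predicted_msgs(**polarized))     # far higher, same average
```

The raw outputs only matter relative to each other (and to the constant k); the point is that the polarized vote pattern, despite the identical average, scores far higher than the consensus one.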

What We Think Is Going On

So this is our paradox: when some men think you're ugly, other men are more likely to message you. And when some men think you're cute, other men become less interested. Why would this happen? Perhaps a little game theory can explain:

Suppose you're a man who's really into someone. If you suspect other men are uninterested, it means less competition. You therefore have an added incentive to send a message. You might start thinking: maybe she's lonely... maybe she's just waiting to find a guy who appreciates her... at least I won't get lost in the crowd... These small thoughts, plus the fact that you really think she's hot, prod you to action. You send her the perfectly crafted opening message.

"sup"

On the other hand, a woman with a preponderance of '4' votes, someone conventionally cute, but not totally hot, might appear to be more in-demand than she actually is. To the typical man considering her, she's obviously attractive enough to create the impression that other guys are into her, too. But maybe she's not hot enough for him to throw caution (and grammar) to the wind and send her a message. It's the curse of being cute.

The overall picture looks something like this:

[Figure: the overall relationship between vote spread and messages]

Finally: What This Could Mean To You

I don't assume every woman cares if guys notice her or not, but if you do, what does all the above analysis mean in practical terms?

Well, fundamentally, it's hard to change your overall attractiveness (the big single number we were talking about at the beginning). However, the variance you create is under your control, and it's simple to maximize:

Take whatever you think some guys don't like—and play it up.

As you've probably already noticed, women with tattoos and piercings seem to have an intuitive grasp of this principle. They show off what makes them different, and who cares if some people don't like it. And they get lots of attention from men.

[Photos: example profiles of women with tattoos and piercings]

But our advice can apply to anyone. Browsing OkCupid, I see so many photos that are clearly designed to minimize some supposedly unattractive trait—the close-cropped picture of a person who's probably overweight is the classic example. We now have mathematical evidence that minimizing your "flaws" is the opposite of what you should do. If you're a little chubby, play it up. If you have a big nose, play it up. If you have a weird snaggletooth, play it up: statistically, the guys who don't like it can only help you, and the ones who do like it will be all the more excited.

 


Journalists, please write press@okcupid.com for more information.

Reposted from: https://www.cnblogs.com/breezedeus/archive/2012/10/28/2744002.html
