Estimating house effects
A recent Opinium poll showed a tie in vote intentions between the main parties. An earlier YouGov poll estimated a seven-point Conservative lead over Labour. Both internet panel polls ran on similar dates: 24–25 August for YouGov, and 26–28 August for Opinium.
This article looks at estimating systematic differences between polling companies.
On the house
Researchers in political science may refer to polling companies as ‘houses’.
The choices that market research companies make can affect their published vote intentions. A house effect is the systematic difference arising from those choices of method.
When we compare a company's polls to actual election results, the systematic error we find is its ‘house bias’. This is not about political bias or partisanship. Unlike in some other countries, polling companies in Britain do not align with political parties. There are no Conservative companies and Labour companies.
Major market research companies sign up to the Market Research Society’s code of conduct. They also follow the British Polling Council's transparency rules.
Researchers make several choices that can affect headline estimates:
Target population: Some companies survey the whole United Kingdom. Others survey adults in Great Britain only, which excludes Northern Ireland.
Survey mode: Social desirability can influence responses to surveys with interviewers. Internet panels can vary in quality.
Question wording and order: Researchers word their vote intention questions in different ways. The inquiry into the 2015 polls concluded that question wording and order did not contribute to that election's polling miss.
Response options: Companies choose which parties to offer as answers to their question. There are often differences in prompts for new parties.
Weighting: Weights affect how closely the final sample resembles the population. Using recalled past votes to weight samples can lead to issues with false recall. Different companies can use different weighting targets.
Turnout modelling: There are many different ways to move from a sample of adults to voters.
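To make the weighting choice concrete, here is a minimal sketch of post-stratification on a single variable. The age groups, counts, target shares and support figures are all hypothetical, and real polling companies weight on several variables at once, so this shows the idea rather than any company's method.

```python
# Hypothetical sample counts and population targets by age group
# (illustrative numbers, not taken from any real poll).
sample_counts = {"18-34": 200, "35-54": 350, "55+": 450}
target_shares = {"18-34": 0.28, "35-54": 0.33, "55+": 0.39}


def poststratification_weights(sample_counts, target_shares):
    """Return one weight per group: target share divided by sample share."""
    n = sum(sample_counts.values())
    return {g: target_shares[g] / (sample_counts[g] / n) for g in sample_counts}


weights = poststratification_weights(sample_counts, target_shares)

# Hypothetical share of each group intending to vote for one party.
support = {"18-34": 0.30, "35-54": 0.40, "55+": 0.50}

total = sum(sample_counts.values())
unweighted = sum(sample_counts[g] * support[g] for g in support) / total

weighted_total = sum(sample_counts[g] * weights[g] for g in support)
weighted = sum(sample_counts[g] * weights[g] * support[g] for g in support) / weighted_total

print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

In this made-up sample, older respondents are over-represented, so weighting pulls the party's estimated share down. A different set of targets would give a different headline figure, which is one route to a house effect.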
A simple model of house effects
Relative to other companies, how much higher or lower are one company’s estimates?
I look at 97 vote intention polls taken from 8th January to 28th August 2020. The research companies are members of the British Polling Council.
For this model, I favour simplicity and speed. A modelling approach is necessary to report uncertainty. For each party, the models have three parts:
Company-weighted average: a constant representing the average reading. Each company has the same weight. One company (Opinium) published 25 polls, whilst others have only four.
A smooth function of time: Vote intentions can change. I use a smooth function of the number of days since the 2019 General Election. That smooth function must average to zero across the study period.
The house effect: for each company, there is an adjustment relative to the ‘average company’ vote intention over time. Those static adjustments are our statistics of interest.
As these terms sum together, this is a generalised additive model.
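One way to write the three parts down for a single party is sketched here. The notation is my own assumption rather than taken from the original analysis: $y_i$ is the party's share in poll $i$, $t_i$ is the number of days since the 2019 General Election, $c(i)$ is the company that ran the poll, and the sum-to-zero constraint on the house effects is one possible way to make each company count equally.

```latex
y_i = \alpha + f(t_i) + h_{c(i)} + \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}(0, \sigma^2),
\qquad \sum_i f(t_i) = 0,
\qquad \sum_c h_c = 0
```

Here $\alpha$ is the company-weighted average, $f$ is the smooth function of time, and $h_c$ is the static house effect for company $c$.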
Suppose Redfield & Wilton Strategies estimate a Conservative lead of three points. On average, Opinium would show a lead of around two points. In contrast, YouGov would estimate a lead of about five points.
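The arithmetic behind that example can be sketched as a lookup of offsets. The offsets here are rounded values backed out from the three leads quoted above, expressed relative to Redfield & Wilton Strategies; they are not the fitted model coefficients, and the function name is hypothetical.

```python
# Approximate Conservative-lead offsets (in points) relative to
# Redfield & Wilton Strategies, backed out from the example in the text:
# a three-point lead there is about two for Opinium and about five for YouGov.
LEAD_OFFSET_VS_RW = {
    "Redfield & Wilton Strategies": 0.0,
    "Opinium": -1.0,
    "YouGov": 2.0,
}


def expected_lead(observed_lead, observed_company, target_company,
                  offsets=LEAD_OFFSET_VS_RW):
    """Translate one company's lead into another company's expected lead."""
    baseline = observed_lead - offsets[observed_company]
    return baseline + offsets[target_company]


for company in LEAD_OFFSET_VS_RW:
    lead = expected_lead(3.0, "Redfield & Wilton Strategies", company)
    print(f"{company}: {lead:.0f}")
```

These are expected values only. Any single poll will also carry sampling error on top of the offset.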
Due to house effects and sampling error, two polls on the same dates may have different results. A sample is not always an exact miniature of the population, so the estimated lead can vary through sampling error alone.
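A rough sense of how much a lead can move through sampling error alone comes from the multinomial variance of a difference in two shares. This sketch assumes simple random sampling, which internet panel polls do not strictly satisfy, and the shares and sample size are hypothetical. Weighting and other design features usually widen the interval.

```python
import math


def lead_standard_error(p_a, p_b, n):
    """Standard error of (p_a - p_b) when both shares come from one
    multinomial sample of size n (simple random sampling assumed)."""
    variance = (p_a * (1 - p_a) + p_b * (1 - p_b) + 2 * p_a * p_b) / n
    return math.sqrt(variance)


# Hypothetical poll: both parties on 40% in a sample of 2,000.
se = lead_standard_error(0.40, 0.40, 2000)
margin_95 = 1.96 * se  # approximate 95% margin of error on the lead

print(f"SE of lead: {100 * se:.1f} points; 95% margin: {100 * margin_95:.1f} points")
# prints "SE of lead: 2.0 points; 95% margin: 3.9 points"
```

The covariance term matters: because the two shares come from the same sample, a gain for one party tends to be a loss for the other, so the lead is noisier than either share on its own.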
Electorates can be volatile too, with large changes in opinion over short periods.
Since house effects are relative to the industry average, this is a measure of centrality. Centrality does not mean accuracy. We can only judge accuracy of vote intention polls after an election.
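To illustrate what ‘relative to the industry average’ means when companies publish at different rates, this sketch compares a company-weighted average with a plain average over polls. The company names and shares are hypothetical, and the sketch ignores the time trend that the full model includes.

```python
from statistics import mean

# Hypothetical Conservative shares (%), not real poll data.
polls = {
    "Company A": [44, 45, 43, 44, 45, 44, 43, 44],  # publishes often
    "Company B": [41, 42],  # publishes rarely
    "Company C": [40, 41],
}

# Average within each company first, so every company counts once.
company_means = {c: mean(shares) for c, shares in polls.items()}
company_weighted = mean(list(company_means.values()))

# Pooling all polls lets the most frequent pollster dominate.
poll_weighted = mean([x for shares in polls.values() for x in shares])

# Deviations from the company-weighted average, which sum to zero.
relative_to_average = {c: m - company_weighted for c, m in company_means.items()}

print(company_weighted, poll_weighted)
print(relative_to_average)
```

In this made-up data the pooled average sits a point above the company-weighted one, purely because Company A polls more often. Neither average tells us which company is closest to the truth.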
Limitations and other approaches
There are many limitations with this simple approach. The house effect is static: it cannot vary in time. The same choices could inflate one party’s share earlier in the Parliament, but lower it later.
I cannot feed in prior beliefs about realistic sizes of house effects. The models assume Normally distributed errors. Three companies have only four polls, so that assumption may not be appropriate. There is no adjustment for sample size and its influence on sampling error.
Due to the small number of polls, some of the ‘house effects’ may be sampling error that happens not to balance out. When flipping a fair coin, there is no guarantee of balanced results.
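The coin analogy can be made exact with the binomial distribution. As a loose parallel, treat each poll's sampling error as equally likely to fall above or below the truth; that is a simplifying assumption for illustration, not part of the model.

```python
from math import comb


def prob_exactly_balanced(n_flips):
    """Probability that a fair coin gives exactly n_flips / 2 heads."""
    if n_flips % 2:
        return 0.0  # an odd number of flips can never split evenly
    return comb(n_flips, n_flips // 2) / 2 ** n_flips


print(prob_exactly_balanced(4))  # prints 0.375
```

With only four flips, an even split happens less than half the time, so a company with four polls can easily show an apparent lean that is just chance.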
The models for each party are also independent of one another. That ignores a real constraint: because shares sum to 100%, a company overestimating one party must be underestimating another.
Other possible approaches include:
Dirichlet regression: Vote intention shares sum to 100%. We can use Dirichlet regression to enforce that constraint.
Bayesian regression modelling using Stan: We may have prior beliefs about house effects. We could use this information, updating our beliefs with the available data.
State-space models: Underlying vote intention can move from day to day, which these models typically treat as a random walk. The ‘walking’ intention starts near the previous election result.
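The state-space idea in the last item can be sketched as a simulation: a latent share takes a small random step each day, and each poll observes it with a house effect plus noise. This only generates data under assumed values; it does not estimate anything, and every number and company name below is hypothetical.

```python
import random


def simulate_polls(start_share, n_days, daily_sd, house_effects, poll_sd, seed=0):
    """Simulate a latent random-walk vote share and one noisy poll per day."""
    rng = random.Random(seed)

    # Latent vote intention: each day equals the previous day plus a small step.
    latent = [start_share]
    for _ in range(n_days - 1):
        latent.append(latent[-1] + rng.gauss(0, daily_sd))

    # Observation: true share + static house effect + sampling noise.
    companies = list(house_effects)
    polls = []
    for day, true_share in enumerate(latent):
        company = rng.choice(companies)
        observed = true_share + house_effects[company] + rng.gauss(0, poll_sd)
        polls.append((day, company, observed))
    return latent, polls


latent, polls = simulate_polls(
    start_share=44.0,  # hypothetical starting share, in percent
    n_days=30,
    daily_sd=0.2,
    house_effects={"House A": 1.0, "House B": -1.0},  # hypothetical
    poll_sd=1.5,
)
print(polls[:3])
```

Fitting such a model means working backwards from the polls to the latent path and the house effects, for example with a Kalman filter or a Bayesian sampler.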
Different models may produce different estimates.
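As a small illustration of how the modelling choice changes the answer, here is a conjugate Normal update for one company's house effect. It is a hand-rolled sketch rather than Stan code, it treats the poll noise as known, and the prior, noise level and poll deviations are all hypothetical.

```python
def posterior_house_effect(poll_deviations, prior_mean, prior_sd, noise_sd):
    """Conjugate Normal update for one company's house effect.

    poll_deviations: each poll's deviation from the industry trend, in points.
    noise_sd: assumed known standard deviation of a single poll's noise.
    Returns the posterior mean and posterior standard deviation.
    """
    n = len(poll_deviations)
    sample_mean = sum(poll_deviations) / n
    prior_precision = 1 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, post_precision ** -0.5


# Hypothetical: four polls averaging +2 points, a prior centred on zero
# with sd 1 point, and poll noise with sd 2 points.
post_mean, post_sd = posterior_house_effect(
    [1.0, 3.0, 2.5, 1.5], prior_mean=0.0, prior_sd=1.0, noise_sd=2.0
)
print(f"raw mean: 2.00, posterior mean: {post_mean:.2f}, posterior sd: {post_sd:.2f}")
```

With only four polls, the prior pulls the raw two-point average halfway back towards zero. A company with many more polls would be shrunk far less.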
The list of published polls was taken from Wikipedia, as of 1st September 2020. I also checked polling company archives. I used Prof Simon Wood’s mgcv package for generalised additive models. The data file and R code are on GitHub and R Pubs.
Translated from: https://medium.com/swlh/estimating-house-effects-5c465f2aca87