propensity score mating and application with R language

最新推荐文章于 2024-06-21 08:00:00 发布

keep-hungry

最新推荐文章于 2024-06-21 08:00:00 发布

阅读量343

点赞数

分类专栏： R 计量文章标签： r语言开发语言

本文链接：https://blog.csdn.net/weixin_43718786/article/details/125098117

版权

R 同时被 2 个专栏收录

14 篇文章 22 订阅

订阅专栏

计量

6 篇文章 5 订阅

订阅专栏

Instructions

For this assignment we will use data from Lalonde (1986),
that aimed to evaluate the impact of National
Supported Work (NSW) Demonstration, which is a labor training program, on
post-intervention income levels. Interest is in estimating the causal effect of
this training program on income.

First load the packages tableone and Matching:

install.packages(‘tableone’)
install.packages(‘Matching’)
install.packages(‘MatchIt’)
library(‘tableone’)
library(‘Matching’)

Now load the lalonde data (which is in the MatchIt package):

library(MatchIt)
data(lalonde)

The data have n=614 subjects and 10 variables

age age in years.

educ years of schooling.

black indicator variable for blacks.

hispan indicator variable for Hispanics.

married indicator variable for marital
status.

nodegree indicator variable for high school
diploma.

re74 real earnings in 1974.

re75 real earnings in 1975.

re78 real earnings in 1978.

treat an indicator variable for treatment
status.

The outcome is
re78 – post-intervention income.

The treatment is
treat – which is equal to 1 if the subject received the labor training and equal
to 0 otherwise.

The potential confounding
variables are: age, educ, black, hispan, married, nodegree, re74, re75.

standardized differences for all of the confounding variables

What is the standardized difference for married (to nearest hundredth)?

black <- ifelse(lalonde$race == 'black', 1, 0)
hispan <- ifelse(lalonde$race == 'hispan', 1, 0)
white <- ifelse(lalonde$race == 'white', 1, 0)
# add dummy variables
mydata<-data.frame(lalonde,black,hispan,white)

# covariates we will use
xvars<-c('age','educ','black','hispan', 'white','married','nodegree','re74','re75')
table1<- CreateTableOne(vars=xvars,strata="treat", data=mydata, test=FALSE)
 print(table1,smd=TRUE)

在这里插入图片描述

Fit apropensity score model.

Use a logistic regression model, where the outcome is treatment. Include the 8 confounding variables in the model as predictors, with no interaction terms or non-linear terms (such as squared terms). Obtain the propensity score for each subject.

What are the minimum and maximum values of the estimated
propensity score?

psmodel<-glm(treat~ age+educ+black+hispan+white+married+nodegree+re74+re75,  family=binomial(),data=mydata)
pscore<-psmodel$fitted.values
max(pscore)
min(pscore)

carry out propensity score matching using the Match function.

Before using the Match function, first do:

set.seed(931139)

Setting the seed will ensure that you end up with a matched data set that is the same as the one used to create the solutions.

Use options to specify pair matching, without replacement, no caliper.

Match on the propensity score itself, not logit of the propensity score. Obtain the standardized differences for the matched data.

set.seed(931139)
# use logic regression
psmodel<-glm(treat~ age+educ+black+hispan+white+married+nodegree+re74+re75,  family=binomial(),data=mydata)
pscore<-psmodel$fitted.values
# matching depends on propensity score
psmatch<-Match(Tr=mydata$treat,X = pscore,replace=FALSE)
matched<-mydata[unlist(psmatch[c("index.treated","index.control")]), ]
xvars<-c('age','educ','black','hispan', 'white','married','nodegree','re74','re75')
matchedtab1<-CreateTableOne(vars=xvars, strata ="treat", data=matched, test = FALSE)
print(matchedtab1, smd = TRUE)

在这里插入图片描述

Re-do the matching, but use a caliper this time

Set the caliper=0.1 in the options in the Match function.

Again, before running the Match function, set the seed:

set.seed(931139)

psmatch<-Match(Tr=mydata$treat,X = pscore,replace=FALSE,caliper = 0.1)
matched<-mydata[unlist(psmatch[c("index.treated","index.control")]), ]
xvars<-c('age','educ','black','hispan', 'white','married','nodegree','re74','re75')
matchedtab1<-CreateTableOne(vars=xvars, strata ="treat", data=matched, test = FALSE)
print(matchedtab1, smd = TRUE)

For the matched data, what is the mean of real earnings in 1978 for treated subjects minus the mean of real earnings in 1978 for untreated subjects?

Use the matched data set (from propensity score matching with caliper=0.1) to carry out the outcome analysis.

#outcome analysis
y_trt = matched$re78[matched$treat==1]
 y_con = matched$re78[matched$treat==0]

#pairwise difference
diffy = y_trt-y_con

#paired t-test
t.test(diffy)

keep-hungry

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
1
评论
propensity score mating and application with R language

For this assignment we will use data from Lalonde (1986),that aimed to evaluate the impact of NationalSupported Work (NSW) Demonstration, which is a labor training program, onpost-intervention income levels.
复制链接

扫一扫