Synth MATLAB Code (11/07/2006) written for MATLAB 7.0
by Alberto Abadie, Alexis Diamond, and Jens Hainmueller (all Harvard University)
Contact: jhainm@harvard.edu
关于合成控制法简介请详细阅读一文读懂合成控制法 (Synthetic Control Method)操作及Stata应用 Synth MATLAB使用Abadie和Gardeazabal(2003)以及Abadie、Diamond和Hainmueller(2006)中开发的汇总数据,在比较案例研究中实现因果推理的合成控制方法。 1 合成控制法Matlab操作之数据介绍39 by 39 data matrix to run the example:
First row contains state numbers:Alabama 1; Arkansas 2; Colorado 4; Connecticut 5; Delaware 6;Georgia 7; Idaho 8; Illinois 9; Indiana 10; Iowa 11; Kansas 12;Kentucky 13; Louisiana 14; Maine 15; Minnesota 16; Mississippi 17; Missouri 18; Montana 19; Nebraska 20; Nevada 21; New Hampshire 22; New Mexico 23; North Carolina 24; North Dakota 25; Ohio 26; Oklahoma 27;Pennsylvania 28; Rhode Island 29; South Carolina 30; South Dakota 31; Tennessee 32; Texas 33; Utah 34; Vermont 35; Virginia 36; West Virginia 37; Wisconsin 38; Wyoming 39; California
Predictors are stored in rows 2 to 8:
row 2: income, row 3: retail price, row 4: percent_15-19; row 5: beer
consumption (all averaged 1980 to 1988);row 6: smoking 1988, row 7: smoking 1980; row 8: smoking 1975;
Outcome Data (smoking consumption in packs per capita) is stored in
rows 9 to 39 for the years 1970, 1971,...,2000
2 合成控制法Matlab操作代码 1、导入数据load MLAB_data.txt;data = MLAB_data;
% California is state no 3, stored in the last column no 39index_tr = [39];% 38 Control states are 1,2 & 4,5,...,38, stored in columns 1 to 38index_co = [1:38];% Predcitors are stored in rows 2 to 8index_predict = [2:8];% Outcome Data is stored in rows 9 to 39; for 1970, 1971,...,2000index_Y = [9:39];
3、Define Matrices for Predictors
% X0 : 7 X 38 matrix (7 smoking predictors for 38 control states)X0 = data(index_predict,index_co);% X1 : 10 X 1 matrix (10 crime predictors for 1 treated states)X1 = data(index_predict,index_tr);% Normalization (probably could be done more elegantly)bigdata = [X0,X1];divisor = std(bigdata');scamatrix = (bigdata' * diag(( 1./(divisor) * eye(size(bigdata,1))) ))';X0sca = scamatrix([1:size(X0,1)],[1:size(X0,2)]);X1sca = scamatrix(1:size(X1,1),[size(scamatrix,2)]);X0 = X0sca;X1 = X1sca;clear divisor X0sca X1sca scamatrix bigdata;
4、Define Matrices for Outcome Data
% Y0 : 31 X 38 matrix (31 years of smoking data for 38 control states)Y0 = data(index_Y,index_co);% Y1 : 31 X 1 matrix (31 years of smoking data for 1 treated state)Y1 = data(index_Y,index_tr);% Now pick Z matrices, i.e. the pretreatment period% over which the loss function should be minmized% Here we pick Z to go from 1970 to 1988 % Z0 : 19 X 38 matrix (31 years of pre-treatment smoking data for 38 control states)Z0 = Y0([1:19],1:38);% Z1 : 19 X 1 matrix (31 years of pre-treatment smoking data for 1 treated state)Z1 = Y1([1:19],1);
5、Now we implement Optimization
% Check and maybe adjust optimization settings if necessaryoptions = optimset('fmincon')% Get Starting Values s = std([X1 X0]')';s2 = s; s2(1)=[];s1 = s(1);v20 =((s1./s2).^2);[v2,fminv,exitflag] = fmincon('loss_function',v20,[],[],[],[],... zeros(size(X1)),[],[],options,X1,X0,Z1,Z0);display(sprintf('%15.4f',fminv));v = [1;v2];% V-weightsv% Now recover W-weightsD = diag(v);H = X0'*D*X0;f = - X1'*D*X0;options = optimset('quadprog')[w,fval,e]=quadprog(H,f,[],[],ones(1,length(X0)),1,zeros(length(X0),1),ones(length(X0),1),[],options);w = abs(w);% W-weightsw
6、绘制结果图
Y0_plot = Y0*w;years = [1970:2000]'; plot(years,Y1,'-', years,Y0_plot,'--');axis([1970 2000 0 150]);xlabel('year');ylabel('smoking consumtion per capita (in packs)');legend('Solid Real California','Dashed Synthetic California',4);
7、结果为:
相关权重为: