Basic Econometrics
Individual Assignment
This is an individual assignment where you must work alone. You must submit an electronic copy of your assignment in Canvas in pdf, doc or docx format along with your R code. Hard copies will not be accepted. Show your calculations (if any) and answer the questions in clear, full sentences. Log refers to the natural logarithm!
Use the dataset: WDI_2350.RData
Extreme weather events and patterns are being recorded with increasing frequency across the globe. Their occurrence has had a negative impact on the welfare of households, businesses and governments, which has highlighted the pivotal role that environmental quality plays in human development and wellbeing. Studying the major drivers of CO2 emissions can help policymakers make better-informed decisions. Assume that the outgoing environmental research officer had started working on an econometric model to assess some of the drivers of CO2 emissions. As the incoming research officer, your job is to finish this research. Your variables of interest are:
CO2 = CO2 emissions (metric tons per capita) [EN.ATM.CO2E.PC]
RGDPpc = GDP per capita (constant 2015 US$) [NY.GDP.PCAP.KD]
Urban_pop = Urban population (% of total population) [SP.URB.TOTL.IN.ZS]
Manu = Industry (including construction), value added (constant 2015 US$) [NV.IND.TOTL.KD]
Elec = Access to electricity (% of population) [EG.ELC.ACCS.ZS]
The expected relationship between each explanatory variable and CO2 emissions:
RGDPpc = The more developed a country is, the higher its consumption per capita and the higher its expected CO2 emissions. Whether this relationship is linear, or even reverses at high GDP per capita levels, is hotly debated. See the literature on the Environmental Kuznets Curve.
Urban_pop = When a country has a larger share of its population living in cities, we expect higher CO2 emissions from production and consumption (livestock, fossil fuels, etc.).
Manu = A larger manufacturing sector in a country is likely associated with more emissions.
Elec = A large proportion of countries generate electricity from non-renewable resources such as oil, natural gas and coal, which contribute to CO2 emissions.
All data originate from the World Bank (WDI).
Please assess whether the above variables are truly associated with CO2 emissions, and if yes, how. Answer the following questions:
1) Use R to run a simple OLS regression with the natural log of CO2 as your dependent variable and the natural log of real GDP per capita as your explanatory variable. Specifically, run the following model:
You will have to take the natural log of RGDPpc and CO2 using R!
ln(CO2) = β0 + β1 ln(RGDPpc) + u (Eq. 1)
a. Provide your R output and write out your estimated regression equation. 1 + 1 marks
b. What are the associated degrees of freedom? State whether standard normal critical values can be used. 1 mark
c. What is a major problem this regression is likely to have, and which Gauss-Markov assumption is violated? 2 marks
Subtotal: 5 marks
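As a hedged illustration of the workflow Question 1 asks for, the sketch below fits Eq. 1 on simulated data. The data frame `sim` and its columns `CO2` and `RGDPpc` are assumptions made for this example; the object and column names inside WDI_2350.RData may differ, and the simulated estimates say nothing about the real results.

```r
# Minimal sketch on simulated data; names and values are assumptions,
# not the contents of WDI_2350.RData.
set.seed(1)
n <- 100
sim <- data.frame(RGDPpc = exp(rnorm(n, mean = 9, sd = 1)))
sim$CO2 <- exp(-8 + 1.0 * log(sim$RGDPpc) + rnorm(n, sd = 0.5))

# log() in R is the natural logarithm by default
sim$ln_CO2 <- log(sim$CO2)
sim$ln_RGDPpc <- log(sim$RGDPpc)

# Simple OLS regression corresponding to Eq. 1
fit1 <- lm(ln_CO2 ~ ln_RGDPpc, data = sim)
print(summary(fit1))

# Residual degrees of freedom: observations minus estimated coefficients
print(df.residual(fit1))
```

The coefficient table printed by `summary()` supplies the numbers for the estimated regression equation, and `df.residual()` reports the degrees of freedom used for the t-tests.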
2) Use R to run a cross-sectional regression of the natural log of CO2 on the natural log of RGDPpc, the squared term of the natural log of RGDPpc, the natural log of Manu, Urban_pop and Elec for the listed countries as follows (please note the natural logs and construct these in R as needed):
ln(CO2) = β0 + β1 ln(RGDPpc) + β2 [ln(RGDPpc)]^2 + β3 ln(Manu) + β4 Urban_pop + β5 Elec + u (Eq. 2)
a. Present your regression results in a table below (R output): 5 marks
b. Interpret the constant (2.5 marks) and its p-value (1.5 marks). 4 marks
c. Interpret the coefficient of Industry value added (log Manu) and its p-value (1.5 marks each). 3 marks
d. Interpret the coefficient of urban population and its p-value (1.5 marks each). 3 marks
e. Interpret the coefficient of electricity access and its p-value (1.5 marks each). 3 marks
f. Interpret the R-squared of the regression. 2 marks
g. Is the relationship between CO2 and RGDPpc U-shaped or inverted U-shaped? 2 marks
h. Multicollinearity:
Use R to compute the correlation between:
· Level real GDP per capita and level Industry (including construction), value added (1 mark)
· Level real GDP per capita and level Access to electricity (1 mark)
· Can these independent variables be regressed in the same model? Explain the concept of multicollinearity and its effect on the estimated coefficients. (2 marks) 4 marks
i. Would you use a quadratic term? What do your results say about the “Environmental Kuznets Curve” effect? Do they support it or not? (Hint: Google the Environmental Kuznets Curve concept.) Write 1 paragraph. 4 marks
Subtotal: 30 marks
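The sketch below shows one way Eq. 2 and the correlations in part h could be set up in R. It runs on simulated data: the data frame `sim`, its column names and every coefficient used to generate it are assumptions for illustration only and do not reflect the contents of WDI_2350.RData or the expected answers.

```r
# Minimal sketch on simulated data; all names and values are assumptions.
set.seed(2)
n <- 120
sim <- data.frame(
  RGDPpc    = exp(rnorm(n, mean = 9, sd = 1)),
  Manu      = exp(rnorm(n, mean = 23, sd = 2)),
  Urban_pop = runif(n, min = 15, max = 100),
  Elec      = runif(n, min = 10, max = 100)
)
ln_gdp <- log(sim$RGDPpc)
sim$CO2 <- exp(-20 + 3.5 * ln_gdp - 0.15 * ln_gdp^2 + 0.05 * log(sim$Manu) +
                 0.005 * sim$Urban_pop + 0.01 * sim$Elec + rnorm(n, sd = 0.4))

# Eq. 2: I() makes the formula treat ^2 as arithmetic squaring
fit2 <- lm(log(CO2) ~ log(RGDPpc) + I(log(RGDPpc)^2) + log(Manu) +
             Urban_pop + Elec, data = sim)
print(summary(fit2))

# Pairwise correlations in levels, as requested in part h
print(cor(sim$RGDPpc, sim$Manu))
print(cor(sim$RGDPpc, sim$Elec))
```

With real data that contain missing values, `cor()` returns `NA` unless its `use` argument is set, for example `use = "complete.obs"`.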
3) Present functioning R code that reproduces the results.
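A reproducible script would normally begin by loading the dataset. The sketch below demonstrates the `save()` and `load()` round trip on a temporary file, because the names of the objects stored in WDI_2350.RData are not stated here; the object name `wdi_demo` and its values are hypothetical.

```r
# Minimal sketch; the object name `wdi_demo` and its values are hypothetical.
tmp <- tempfile(fileext = ".RData")
wdi_demo <- data.frame(CO2 = c(1.2, 4.5), RGDPpc = c(1500, 22000))
save(wdi_demo, file = tmp)
rm(wdi_demo)

# load() restores the saved objects and invisibly returns their names
loaded <- load(tmp)
print(loaded)

# Inspect the structure of the first restored object
str(get(loaded[1]))
```

For the assignment itself, the temporary file would be replaced by the path to WDI_2350.RData, and the names returned by `load()` show which objects the file contains.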