HW1 Fall 24R

Latest recommended article published on 2024-09-17 23:15:58

wx___codinghelp

Article tags: python, artificial intelligence, programming languages

Article link: https://blog.csdn.net/wx___codinghelp/article/details/142265806

1. The file Crash.xls contains daily returns on the Dow Jones Industrial Average for 47 consecutive trading days in 1990.

A) Make a time series plot of the returns, and describe any patterns you see.

B) Note that the 24th observation corresponds to a 7.15% drop (let’s call it a “crash”). Calculate the Z-score for this observation, that is, calculate how many standard deviations the observation is from the mean.

C) From the time series plot in A), do you notice anything interesting about the return for the day immediately following the crash? Calculate the Z-score for this observation.

D) Construct a boxplot of the returns. Do you detect any outliers?

E) The variable “Pre-Post” is an indicator variable which is 1 for returns before the crash, and 2 for returns on or after the day of the crash. Construct a side-by-side boxplot for these two sets of returns. What conclusion do you draw about the volatility of returns following a crash? Explain how you can see this same pattern in the time series plot.
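The Z-score calculation in parts B) and C) can be sketched as follows. This is only a minimal illustration: the `returns` list is made-up data, not the contents of Crash.xls, and the function name `z_scores` is hypothetical. It assumes the sample standard deviation (n - 1 denominator) and includes the crash day itself when computing the mean and standard deviation; your course may use a different convention.

```python
from statistics import mean, stdev

def z_scores(values):
    """Return (x - mean) / sample standard deviation for each x."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in values]

# Hypothetical daily returns in percent; NOT the Crash.xls data.
returns = [0.4, -0.3, 0.1, -7.15, 2.9, -1.2, 0.8]

z = z_scores(returns)
crash_index = returns.index(min(returns))  # position of the largest drop
print(f"crash z-score: {z[crash_index]:.2f}")
print(f"next-day z-score: {z[crash_index + 1]:.2f}")
```

A Z-score far below -2 flags the crash as unusual relative to the rest of the sample; the same function applied to the following day answers part C).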

2. The file EPSReturn.xls contains data on the stock returns and earnings per share (EPS) for 52 major companies. The EPS values for the companies are those announced in December 1997, while the stock returns are calculated for January 1998. Make a scatterplot of the returns versus the EPS values. (So, Y=Return, X=EPS). Does the plot suggest any relationship between the two variables? What relationship would you have expected? Which company had the highest value of EPS? Was the return for this company large as well?
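As a numerical companion to the scatterplot, one could compute the Pearson correlation between EPS and return and look up the company with the highest EPS. The sketch below uses made-up values, not the EPSReturn.xls data; the tuple layout `(name, eps, return_pct)` and the function name `pearson_r` are assumptions for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical (name, EPS, January return in percent); NOT the real data.
companies = [("A", 1.2, 3.0), ("B", 4.5, -1.0), ("C", 2.3, 2.0), ("D", 0.8, 0.5)]

eps = [c[1] for c in companies]
rets = [c[2] for c in companies]
r = pearson_r(eps, rets)
top = max(companies, key=lambda c: c[1])  # company with the highest EPS

print(f"correlation: {r:.2f}")
print(f"highest EPS: {top[0]} (EPS={top[1]}, return={top[2]}%)")
```

A correlation near zero would agree with a scatterplot that shows no clear trend; the plot itself remains the primary tool the problem asks for.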

3. The file DraftLottery.xls contains data on the 1970 Draft Lottery, carried out by the United States Selective Service to determine who would be drafted that year for military service in Vietnam. The lottery applied to all eligible men aged 19 to 26 on January 1, 1970, and represented an attempt on the part of the US government to expose these men fairly to the risk of being drafted. A total of 366 capsules, one for each day of the year, were placed in a vat. First, the January capsules were placed in the vat, then the ones for February, etc., with the ones for December going in last. The vat was then mixed (by turning it around for several minutes), and the capsules were then drawn out, one by one. The first date drawn (September 14th) was assigned rank 1, the second date drawn (April 24th) was assigned rank 2, and so on. Those eligible for the draft who were born on September 14th were called first, followed by those born on April 24th, and so on. The first column of the dataset contains the day of the year (1-366), the second column contains the rank for that day obtained in the lottery, and the third column contains the month for the given date. So, for example, the first day of the year (Jan 1) received rank 305, and occurred in the first month.

A) Make a scatterplot of Rank (the Y variable) versus DayofYr (the X variable). Does this plot show any obvious patterns?

B) Construct side-by-side boxplots of Rank versus Month. Describe any patterns you see. Which month seems to be systematically receiving the lowest rankings (and therefore the largest chance of being drafted)? Try to explain this phenomenon in terms of the description given above of the way the lottery was carried out.

C) On January 4th, 1970, The New York Times ran an article, “Statisticians Charge Draft Lottery Was Not Random.” Do you think the lottery was random? Explain.
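The side-by-side boxplots in part B) compare the distribution of ranks across months. A rough numerical counterpart is to group the ranks by month and compare the monthly means and medians, as sketched below. Only the first row (Jan 1, rank 305) comes from the problem statement; every other row is invented for illustration, and the function name `summarize_by_month` is hypothetical.

```python
from collections import defaultdict
from statistics import mean, median

def summarize_by_month(rows):
    """rows: iterable of (day_of_year, rank, month).
    Returns {month: (mean rank, median rank)}."""
    groups = defaultdict(list)
    for _day, rank, month in rows:
        groups[month].append(rank)
    return {m: (mean(r), median(r)) for m, r in sorted(groups.items())}

# First row is from the problem text; the rest are made-up placeholders.
rows = [
    (1, 305, 1), (2, 200, 1), (3, 250, 1),
    (335, 100, 12), (336, 50, 12), (337, 150, 12),
]

summary = summarize_by_month(rows)
for month, (avg, med) in summary.items():
    print(f"month {month:2d}: mean rank {avg:.1f}, median rank {med}")
```

Under a fair lottery each month's typical rank should sit near the middle of 1-366, so a steady drift in these summaries from early to late months would be worth explaining.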

4. The file NormTemp.xls contains data on body temperatures for 130 randomly selected subjects. The first column (Temp) contains the temperatures themselves. For each subject, this temperature, in degrees Fahrenheit, represents an average of several readings taken over the course of two consecutive days. The second column (Gender) is 1 for male, 2 for female, and the third column (HeartRate) is measured in beats per minute. Here, we focus on Temp.

A) Make a histogram of Temp. Does the data seem to have a reasonably bell-shaped distribution? Do you see any outliers?

B) What do you think the population mean is for body temperatures? (Presumably, you’ve been hearing this number since you were very young! If you were raised on Celsius, convert to Fahrenheit using F=9/5*C+32.)

C) Based on the histogram, does the sample mean seem to be reasonably close to the “known” population mean? You don’t actually need to calculate the sample mean for this problem, just look at the histogram. (Hint: If a distribution is symmetrical, then the mean is the center of symmetry).

D) Use Descriptive Statistics to calculate the sample mean. What is the value of the sample mean? Is it reasonably close to the “known” population mean?

E) Based on the Descriptive Statistics output, give numerical values for the median, range and interquartile range. Do these numbers suggest symmetry or skewness of the distribution of temperatures?
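The quantities asked for in parts B), D) and E) can be sketched in a few lines. The temperatures below are made up, not the NormTemp.xls data, and the function names are hypothetical. Note that `statistics.quantiles` uses the "exclusive" quartile method by default, so its quartiles may differ slightly from those reported by a spreadsheet's Descriptive Statistics tool.

```python
from statistics import mean, median, quantiles

def celsius_to_fahrenheit(c):
    """Convert Celsius to Fahrenheit using F = 9/5 * C + 32."""
    return 9 / 5 * c + 32

def describe(values):
    """Mean, median, range and interquartile range of a sample."""
    q1, _q2, q3 = quantiles(values, n=4)  # default 'exclusive' method
    return {
        "mean": mean(values),
        "median": median(values),
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

# Hypothetical temperatures in degrees Fahrenheit; NOT the real data.
temps = [97.1, 97.8, 98.0, 98.2, 98.3, 98.6, 98.8, 99.4]

summary = describe(temps)
print(f"37 C = {celsius_to_fahrenheit(37):.1f} F")
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A mean close to the median, with the quartiles roughly equidistant from the median, is consistent with a symmetric distribution; a mean pulled well away from the median points toward skewness.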

5. The file Copies.xls contains data on the number of copies made on self-service copying machines at a copy center, each day for 44 days.

A) Make a boxplot of the number of copies, and identify any outlier values. On which days did these outliers occur?

B) Using Descriptive Statistics, find the mean and standard deviation of the number of copies.

C) Delete the outlier values, by clicking in the cells corresponding to the outliers and using the backspace key (NOT the delete key!). Recompute the mean and standard deviation. Which number changes more?
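The steps in problem 5 can be sketched as below. This assumes the common boxplot convention that flags values more than 1.5 times the interquartile range beyond the quartiles; the software used in the course may apply a different rule or quartile method. The `copies` list is invented, not the Copies.xls data, and `iqr_outliers` is a hypothetical helper name.

```python
from statistics import mean, stdev, quantiles

def iqr_outliers(values):
    """Return (day, value) pairs lying outside the 1.5 * IQR fences.
    Days are numbered from 1 in the order the values are given."""
    q1, _q2, q3 = quantiles(values, n=4)  # default 'exclusive' method
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [(day, x) for day, x in enumerate(values, start=1)
            if x < low or x > high]

# Hypothetical daily copy counts; NOT the real data.
copies = [310, 295, 320, 305, 300, 315, 290, 1200, 325, 298]

outliers = iqr_outliers(copies)
outlier_days = {day for day, _ in outliers}
kept = [x for day, x in enumerate(copies, start=1) if day not in outlier_days]

print("outliers (day, value):", outliers)
print(f"with outliers:    mean={mean(copies):.1f}, sd={stdev(copies):.1f}")
print(f"without outliers: mean={mean(kept):.1f}, sd={stdev(kept):.1f}")
```

Comparing the two printed lines shows how strongly each statistic reacts to the removed values, which is the comparison part C) asks about.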

6. A random sample of 100 prices of three-bedroom houses in a particular city that were recently sold has a sample mean and a sample standard deviation of $525k and $25k respectively.

A) According to the Empirical Rule, within what price range would you expect 68% of the homes to fall?

B) According to the Empirical Rule, within what price range would you expect 95% of the homes to fall?

C) According to the Empirical Rule, within what price range would you expect about 99.7% of the homes to fall?
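The Empirical Rule arithmetic for problem 6 is mean ± k standard deviations for k = 1, 2, 3, which the rule associates with roughly 68%, 95% and 99.7% of observations when the distribution is approximately bell-shaped. A minimal sketch, with prices in thousands of dollars and a hypothetical function name:

```python
def empirical_rule_ranges(mean, sd):
    """Return the mean +/- 1, 2 and 3 standard deviation intervals."""
    bands = ((1, "68%"), (2, "95%"), (3, "99.7%"))
    return {label: (mean - k * sd, mean + k * sd) for k, label in bands}

# Sample mean and standard deviation from the problem, in $ thousands.
ranges = empirical_rule_ranges(525, 25)

for label, (low, high) in ranges.items():
    print(f"{label}: ${low}k to ${high}k")
# 68%:   $500k to $550k
# 95%:   $475k to $575k
# 99.7%: $450k to $600k
```

These intervals are only as good as the bell-shape assumption; house prices are often right-skewed, so the percentages should be read as approximations.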

7. During World War II, many economists, mathematicians and statisticians were members of Columbia University’s Statistics Research Group, which did high-level consulting work for the armed forces. As part of this group’s work, statistician Abraham Wald was asked where to place armor on planes. It seemed obvious to the aircraft engineers that armor was needed at the places most frequently hit, as found in a large sample of battle-proven airplanes. After studying the bullet holes in a sample of returning planes, Wald concluded that armor should be placed where bullet holes were least frequently found in these planes.