Problem
Based on initial research into publicly available data sets of interest, I aim to explore the Internet as a household necessity. As reliance on technology grows, being able to predict Internet access among households in the United States would be valuable. Technology corporations are transitioning to subscription-based, service-oriented business models, further increasing the value of Internet access within a household (CB Insights, n.d.). Furthermore, academic institutions and corporate workplaces each rely on digital connectivity as a means to communicate, create, retrieve, and share (Kelley, 2013; University of Birmingham, 2019). Such a study would therefore be valuable to Internet service providers, government officials, technology corporations, and infrastructure professionals alike. Assessing Internet connectivity among the domestic populace is achieved through household data collection, of which the national census remains the foremost public resource. The primary indicator of Internet access within a household is the possession of an Internet subscription (obtained through an Internet service provider). Thus, I plan to predict the percentage of households with Internet subscriptions as a function of median income, average weekly hours worked, and the number of individuals with bachelor’s degrees, using a state-by-state unit of analysis.
Model
This study will utilize a multiple linear regression model. With three independent variables and a single dependent variable, this model is appropriate for a predictive effort in this context. The variables, all of which are numeric, are as follows:
- Independent Variables (x):
  - Median Income ($ - US Dollars)
  - Average Weekly Hours Worked (Hours)
  - Number of Individuals with Bachelor’s Degrees (Count)
- Dependent Variable (y):
  - Households with Internet Subscriptions (% of Households in the State)
To retain a clear scope and control over the data, I will utilize a state-by-state unit of analysis. As such, the resulting data set will consist of 50 rows of data, one per state. Example snippets from my data curation efforts can be found in the data curation section of this document. In addition, I incorporate existing data sets for this study, all of which are cited under the references section.
This model can be represented by the following regression equation, where i indexes the state and B0 through B3 are the coefficients to be estimated:
(% of Households with Internet Subscriptions)i = B0 + B1 (Median Income ($))i + B2 (Average Weekly Hours Worked)i + B3 (Number of Individuals with Bachelor’s Degrees)i
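To make the regression equation concrete, the following is a minimal sketch of how the coefficients B0 through B3 could be estimated by ordinary least squares. It is illustrative only: the 50 rows are randomly generated placeholders rather than census figures, and the variable names and "true" coefficients are assumptions chosen for the example, not values from this study.

```python
import numpy as np

# Synthetic stand-in data: 50 rows, one per state. These values are randomly
# generated for illustration only and are NOT census figures.
rng = np.random.default_rng(0)
n_states = 50
median_income = rng.uniform(45_000, 90_000, n_states)   # US dollars
weekly_hours = rng.uniform(36.0, 41.0, n_states)         # hours
bachelors_count = rng.uniform(5e4, 5e6, n_states)        # count of individuals

# Design matrix: a column of ones for the intercept B0, then one column
# per independent variable (B1, B2, B3).
X = np.column_stack(
    [np.ones(n_states), median_income, weekly_hours, bachelors_count]
)

# Hypothetical coefficients, used only to generate a plausible response
# (% of households with Internet subscriptions) plus random noise.
assumed_beta = np.array([30.0, 4e-4, 0.5, 1e-6])
pct_internet = X @ assumed_beta + rng.normal(0.0, 1.0, n_states)

# Ordinary least squares: choose beta_hat to minimize the sum of squared
# residuals between observed and fitted percentages.
beta_hat, *_ = np.linalg.lstsq(X, pct_internet, rcond=None)

# Coefficient of determination (R^2) as a simple measure of fit.
fitted = X @ beta_hat
ss_res = np.sum((pct_internet - fitted) ** 2)
ss_tot = np.sum((pct_internet - pct_internet.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

for name, value in zip(["B0", "B1", "B2", "B3"], beta_hat):
    print(f"{name} = {value:.6g}")
print(f"R^2 = {r_squared:.3f}")
```

With the real state-level data, the three synthetic arrays would be replaced by the curated columns. Because the independent variables sit on very different scales (dollars, hours, counts), the raw coefficients are not directly comparable in magnitude; standardizing the variables before fitting is one common way to address this.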
Analytical Approach
- Relationships: Given the significant impact of the Internet on technological, economic, and global development, I expect clear relationships between the aforementioned variables. Given the cost of Internet subscriptions, median income could limit the share of households with subscriptions in lower-income states. Likewise, the average hours worked in a week could dictate the necessity of the Internet as a fundamental household resource in a given state. Lastly, education, namely higher education at the undergraduate level, could have implications for the willingness of households to obtain an Internet subscription.
- Informed Estimate of Findings: Based on the technological development of the past few decades, in addition to the current trend toward subscription-based, service-oriented business models, I predict that each of the aforementioned independent variables affects the likelihood of household Internet subscriptions. Differences in median income could logically explain differences in subscription percentages by state: lower-income households may be inhibited from obtaining such subscriptions, while households in wealthier states may face fewer barriers. Working more hours in the modern economy may indicate a greater reliance on the Internet for day-to-day operations. Once a typical workday concludes, workers may continue ‘remote working’ from their homes, which requires an Internet subscription. In communities with lower average work hours, such reliance on the Internet may not be as evident. Lastly, those with higher education may recognize greater potential in what the Internet offers society. As such, these groups may be keen on obtaining Internet subscriptions in an effort to further optimize their connected lifestyles. This could imply that the more individuals with bachelor’s degrees in a given state, the larger the percentage of households that possess Internet subscriptions. Together, these are the outcomes I anticipate as this data analysis effort progresses.