IB1500 Data Analysis for Management

IB1500

Foundations of Data Analysis for Management

Project on Data Analysis Resit, 2023-24

Assignment Instructions

All assignments must be submitted ONLINE via my.wbs by 12pm (midday) UK time on the date displayed against this assessment.

Please ensure that you have inserted a completed assignment coversheet, which must be included as the first page of your script. This should include your Student ID number, but not your name.

Word Limit

3000 word limit.

Word Count Policy

WBS has a school-wide policy on word counts. This is strictly enforced to ensure consistency across modules and programmes. You can find more information about this policy in your Student Handbook under Academic Practice - 7i. Word count policy.

This is a strict limit not a guideline: any piece submitted with more words than the limit will result in the excess not being marked.

Academic Practice

Please ensure you read the full guidelines for Academic Practice in the Undergraduate Handbook and ensure you understand them. If in doubt, please seek clarification in advance of your submission. This includes important information on:

.    Cheating, plagiarism and collusion

.    Correct referencing

.    Using internet sources in assessments

.    Academic writing

.    English Language support

.    Word count policy

When you submit this assignment online, you will be required to tick a declaration box indicating that the work involved is entirely your own. Each assignment will be put through plagiarism software to identify any collusion or inadequate referencing of materials used from different sources.  Please do not submit images of your typed work unless you have been specifically requested to do so.

We would consider taking action if your work:

1. is too reliant on the words of particular authors (rather than presenting your ideas in your own words), or uses the ideas or words of an author without referencing them or putting their words in quotation marks (plagiarism).

2. suggests that you have worked very closely with another student or students (unless explicitly asked to do so by your Module Leader/Tutor) (collusion).

3. includes unreferenced work that you have previously submitted for any accredited course of study (unless explicitly asked to do so by your Module Leader/Tutor) (self-plagiarism).

The Use of Artificial Intelligence (AI)

The University recognises an increasing number of technologies such as Artificial Intelligence and that they may be applicable in your completing this assessment. The assessment brief sets out specific requirements or restrictions, and your student handbook has further guidance and advice.

You are reminded that the inappropriate use of such a technology may constitute a breach of University policy, such as the Proofreading Policy or Regulation 11 (Academic Integrity). If you breach these policies, it may have significant consequences for your studies. Please make sure you read and understand the assessment brief and how AI may or may not be used.

If a generative AI or similar is permitted and has been used you MUST make clear why you used such a tool or service, what you used it for and you will be obliged to confirm that you take sole intellectual ownership of any submitted work. As appendices, and as part of your submitted work, you must provide screenshots of the question and the AI-generated response, alongside an explanation of how the content has been utilised. You should note the relevant reference alongside each screenshot.

When you submit you must complete (physically or electronically) a declaration. This requires you to explain the use of any AI. Failure to disclose at the point of submission may be prejudicial in any later investigations should they arise.

For this assessment the use of AI is: Prohibited

You MUST NOT use any generative Artificial Intelligence in this assessment unless specifically authorised for reasonable adjustments. You MAY use non-generative tools such as a spell-check, basic grammar check (non-generative), calculator or similar. If you have any doubts about a tool or service you plan to use please contact the module leader.

Extensions and Self-certification

Late submissions will incur a penalty of 5% for every 24-hour period after the due date and time, i.e. this begins one minute after the submission deadline (beginning at 12.01pm).

Requests for specific extensions (of up to 15 days) which are typically for longer and more serious concerns must be submitted via my.wbs ideally 72 hours BEFORE the deadline. Extensions can only be approved if you clearly detail your circumstances and provide supporting documentation (or a reason as to why you cannot provide the supporting documentation at the time) as set out in the Mitigating Circumstances Policy.

Self-certification is a university-wide policy whereby you are permitted an automatic extension of 5 working days on eligible written assessed work without the need for evidence. WBS permits self-certification for all types of written, assessed works such as essays and dissertations. It is not permitted for exams, course tests, or presentations.

You can self-certify twice within each year of study, starting from the anniversary of your course start date. This will cover all eligible written assessments that fall within the self-certification period, as long as they have not previously had an extension applied. To find out further details about the self-certification policy please see: https://my.wbs.ac.uk/-/academic/20778/item/id/1244460/ .

If you wish to self-certify for an extension of 5 working days, please select 'Self-certification' in the Extension Type field. If you wish to request a longer extension than 5 working days, please leave the Extension Type as 'Standard'.

Your assignment instructions begin below.

**You must NOT use the same media and data you used for the initial Project for this module.**

This project consists of two parts.

In part 1, your task is to analyse a piece of media discussing a scientific study using the concepts from the module. This part consists of four sections. You will need to pick the piece of media yourself.

In part 2, your task is to analyse a dataset using the knowledge acquired in the module. This part consists of four sections. You will need to pick the data yourself.

What follows are specific instructions for the sections in each part. For each part/section, there is a suggested approximate word count. The word count is approximate because you don’t need to follow it precisely as long as the overall count is 3000 or under.

Your submission will consist of two documents: the essay and an Excel file that supports your analysis.

An essay without the Excel file will be considered, but it will receive lower marks on the Technical Capability component. The files should be clearly signposted: the essay should follow the structure outlined below and the Excel file should follow the same structure (see the example Excel file on the module page). There should be a clear correspondence between the Excel file and the essay: where appropriate, assign numbers and titles to tables and figures so that they can be easily referenced in the Excel file. You can calculate statistics by hand or use the Analysis ToolPak where appropriate. The marker should be able to judge how the results reported in the essay were achieved. To do so, make sure that all formulas are readable (i.e., do not simply paste in numbers from somewhere else) and, if the ToolPak is used, clearly indicate that with “Computed using ToolPak.”

Note on the word count: The following items are NOT counted towards the word count:

.     Section/subsection names

.     Appendix (the tables and figures in the appendix)

.     Footnotes (footnotes should be used only when strictly necessary; no essential information should be put in footnotes)

.     Numbers in the text

.     Text in the infographic

Please make sure to provide the overall word count at the beginning of your project.

PLEASE SEE THE EXAMPLE ESSAY AND EXCEL FILE ON THE MODULE PAGE TO GET AN IDEA OF WHAT THIS PROJECT SHOULD LOOK LIKE.

See more detailed instructions for each part/section below.

Part 1: Analysis of a news article (1000 words)

Section 1.1. Data based on the news article (200 words)

In this section, please describe the data from the study covered in the news article. Use the following structure (see Week 2 materials):

Question -> Target Population -> Study Population -> Sample -> Data.

Provide a description of each part of the chain and note any information that you are not sure about from the article (i.e., when the article does not give enough information for a precise description). For this section use ONLY the article itself. Do not look up information about the research paper or anything else mentioned in the article.

Do not forget to provide the link to the article.

Section 1.2. Data based on the research article (200 words)

In this section, provide descriptions following the same structure but now using the research paper that the article discusses. If you identified any missing information in the previous section, attempt to fill in the gaps using the paper. Make sure to reference the part of the paper that supports your statements (pages are enough).

Do not forget to provide a link to the paper.

Section 1.3. Data quality (500 words)

In this section, given your analysis above, discuss the data quality of the study. Specifically, discuss measurement, sampling, and external validity. Identify any issues there might be and explain how they could affect the result of the study. For example, it is not enough to say that self-report can have issues; you have to identify the issues and explain why they matter.

Issues of measurement: Identify the variables of interest, how they were measured and whether this measurement is appropriate or not.

Issues of sampling: Using the sample and the study population from the previous analysis, discuss whether the sampling might have introduced a bias into the data (or not).

Issues of external validity: Discuss whether the results of the study can be applied to the target population or other populations.

Section 1.4. Conclusion (100 words)

Provide a short conclusion about the quality of the data and the conclusions drawn by the article/study.

Part 2: Analysis of data

Section 2.1: Data (600 words)

In this section, provide a short description of your data. If you are using data licensed under Creative Commons, here you should also provide a proper reference of the data. See the specific license on proper attribution. Provide the following information:

.     Information about the sample size,

.     Information about the origin of the data (e.g., it was a survey done by someone or it is observational data on a particular topic).

Section 2.1.1. Descriptive Statistics

Provide a short description of ALL the variables you are USING in this project. This means that you don’t need to provide descriptions of all the variables in the data set but only the variables you are using in the project (in any section of the project). Provide the following information:

.     A table with the following columns where you define the variables and determine their type. This table should be in the appendix.

Variable | Definition | Data Type

.     A table with the following columns where you provide descriptive statistics for all the continuous variables:

Table X. Description of Continuous Variables.

Variable | Mean | Median | Mode | SD | Minimum | Maximum

.     A table with the following columns where you provide descriptive statistics for each of the categorical variables (don’t forget to provide the absolute counts alongside the percentages). If your categorical variable has more than 5 outcomes, provide information only for the top 5:

Table X. Description of Variable Y. Overall sample size Z.

Value | Count | %

.     For each variable:

o  Create a histogram (if it is continuous) or a bar chart (if it is categorical; the same rule applies as before for variables with more than 5 outcomes). The graphs should be in the appendix and appropriately titled and labelled.

o  Determine the skew of the variable. Comment on which descriptive statistics would be the most appropriate for this variable.

o  Comment on whether there are any outliers; what they might be (mistakes, uncommon observations); what you did with them (if you did anything).

o  Comment on any strange/interesting patterns you see in the graphs and what the most likely explanation for them is.

The Excel file should contain all the calculations of the descriptive statistics and the histograms. Specifically, any formulas used should be clickable (i.e., the marker should be able to read the formula used); if the ToolPak is used, that should be clearly noted with “Computed using ToolPak.”
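
As an illustration only, the statistics requested for a continuous variable can be sketched in a few lines of Python. This is not part of the required Excel workflow; the data and the function name `describe` are hypothetical. In Excel, the corresponding built-in functions are AVERAGE, MEDIAN, MODE.SNGL, STDEV.S, MIN and MAX.

```python
import statistics


def describe(values):
    """Return the summary statistics requested for a continuous variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),   # first most common value if tied
        "sd": statistics.stdev(values),    # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


# Hypothetical data: one variable for 8 respondents, with one large value.
spend = [12, 15, 15, 18, 20, 22, 25, 95]
summary = describe(spend)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# A mean well above the median suggests right (positive) skew, in which
# case the median is usually the more representative statistic.
right_skewed = summary["mean"] > summary["median"]
print("right-skewed" if right_skewed else "not right-skewed")
```

Here the single large value pulls the mean above the median, which is the kind of pattern the skew and outlier comments above ask you to discuss.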

Section 2.1.2. Data Quality

Assess the quality of your data on three levels: measurement, data sampling, and external validity. Make sure to note how those issues affect the conclusions you can draw from your analysis.

First, consider the measurement in your data. What do you think they wanted to measure? What does the data actually measure?

Second, consider the study population and how the sample was selected. Are there any issues with who ended up in this sample?

Finally, consider whether whatever you can conclude from this data can be extended to larger populations.

Here you don’t need to provide very detailed descriptions of all possible biases and issues that each variable might suffer from. You should rather concentrate on the most important problems relating to the sampling of data and the validity of your conclusions. For example, suppose you identified a possibility of selection bias in some of your data. Then you should explain how selection bias affects your data and why that matters for your results.

Section 2.2: Confidence Intervals and Hypothesis Testing (550 words)

In this section, you will calculate confidence intervals and test your hypotheses. For each subsection, write a QUESTION you are trying to answer with your analysis.

In this section, there should not be any formulas. You should only provide the value of the statistic and the p-value and clearly show what these results mean for your question.

The Excel file should contain your estimations for the tests (the values of statistics and p-values) and should clearly indicate the data that was used in the tests.

Section 2.2.1. Confidence Interval Estimation

Question: Formulate your question here

Calculate two confidence intervals for two different means or proportions in your data and conclude whether the difference between the means/proportions is statistically reliable (think of answering the question “What did I learn from doing this analysis?”).
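
For illustration, a minimal Python sketch of two 95% confidence intervals for proportions is given here. It assumes the normal-approximation (Wald) interval, which may differ from the method taught in the module; the counts and the function name `proportion_ci` are hypothetical, and the assessed calculation still belongs in the Excel file.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    p = successes / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Hypothetical counts: 120 of 200 in group A and 90 of 200 in group B.
ci_a = proportion_ci(120, 200)
ci_b = proportion_ci(90, 200)
print(f"Group A: [{ci_a[0]:.3f}, {ci_a[1]:.3f}]")
print(f"Group B: [{ci_b[0]:.3f}, {ci_b[1]:.3f}]")

# Non-overlapping 95% intervals indicate a reliable difference; overlapping
# intervals are inconclusive rather than proof of no difference.
overlap = ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]
print("intervals overlap" if overlap else "intervals do not overlap")
```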

Section 2.2.2. T-test

Question: Formulate your question here

Conduct a 5% significance t-test (a one or two-sample version). Make conclusions from the test concerning your question (think of answering the question “What did I learn from doing this analysis?”).

NOTE: You need to compare similar things here. For example, means of the same variable for different subgroups. DO NOT COMPARE APPLES TO ORANGES.
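
As a sketch of what a two-sample comparison computes, the Python below calculates the t statistic and degrees of freedom for the unequal-variance (Welch) version of the test. The choice of the Welch version, the data and the function name `welch_t` are assumptions made for illustration. The p-value is not computed here; in Excel it can be obtained with T.TEST(range_a, range_b, 2, 3) for a two-tailed, unequal-variance test.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return the Welch two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    va = variance(sample_a) / n_a
    vb = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (n_a - 1) + vb ** 2 / (n_b - 1))
    return t, df


# Hypothetical data: the SAME variable measured in two subgroups.
group_a = [52, 48, 55, 60, 47, 53]
group_b = [45, 50, 44, 49, 42, 46]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Both samples measure the same variable, in line with the note above about comparing similar things.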

Section 2.2.3. Chi-square test

Question: Formulate your question here

Conduct a 5% significance chi-square test (one or two samples). Make conclusions from the test concerning your question (think of answering the question “What did I learn from doing this analysis?”).

NOTE: Chi-square test is for CATEGORICAL VARIABLES. This means that it applies to COUNTS of cases, not means of continuous variables.
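
To illustrate that the test works on counts, here is a minimal Python sketch of a chi-square test of independence for a 2x2 table. It assumes no continuity correction and uses the closed-form p-value that holds only for 1 degree of freedom; the counts and the function name `chi_square_2x2` are hypothetical.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Chi-square test of independence for a 2x2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the p-value has a closed form.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: rows are two groups, columns are yes/no answers.
counts = [[30, 10],
          [10, 30]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4g}")
print("reject H0 at 5%" if p_value < 0.05 else "do not reject H0 at 5%")
```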

Section 2.3: Regression Analysis (550 words)

Question: Formulate your question here (i.e., pick the dependent and the independent variables)

In this section, you will conduct a regression analysis. Specifically, estimate a multiple regression. First, provide a question that you are trying to answer with this regression. Then for each independent variable explain why you included it in the regression and what you think the sign of the coefficient would be. DO THIS BEFORE YOU RUN THE REGRESSION.

Section 2.3.1. Results

Discuss the results of the regression analysis here. Specifically, discuss the significance and sign of EVERY independent variable and whether your predictions from the section above were supported. Moreover, provide the following table of the regression results in the Appendix:

               | Coefficient | 95% CI
Coefficient    | XXX         | [XX,XX]
R-squared      | XX          |
# Observations | XXX         |

Note: * p-value < 0.1, ** p-value < 0.05, *** p-value < 0.01
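
For illustration, the coefficients and R-squared in a table like this can be reproduced with a minimal Python sketch of ordinary least squares. The data and the function names `solve` and `ols` are hypothetical, and the sketch does not compute the confidence intervals or p-values, which in Excel come from the Regression tool in the Analysis ToolPak.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(xs, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ...; return (coefficients, R-squared)."""
    rows = [[1.0] + list(x) for x in xs]  # prepend the intercept column
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Hypothetical data: two independent variables and one dependent variable.
predictors = [(1, 4), (2, 3), (3, 7), (4, 5), (5, 9), (6, 6)]
outcome = [10.2, 10.9, 19.1, 17.8, 26.0, 23.1]
coefficients, r_squared = ols(predictors, outcome)
print("intercept and slopes:", [round(c, 3) for c in coefficients])
print("R-squared:", round(r_squared, 3))
```

The sign of each slope can then be compared with the prediction you made before running the regression.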

Section 2.3.2. Model Fit and Regression Assumptions

In this section discuss ALL the following points:

.     Model Fit

.     Normality of Residuals

.     Heteroskedasticity

.     Multicollinearity

.     Non-linearity

.     Reverse Causality

.     Omitted variable bias

.     Correlation vs Causation

.     Comment on whether you identified any issues and what you should do if you wanted to improve your model.

Section 2.4: Infographic (300 words)

In this section, present your infographic. Your infographic should consist of 3-5 graphs and 3-5 numbers. Don’t forget to provide statistical information (such as 95% CIs) in support of the claims.

To accompany the infographic, please provide its description: what is the main topic of the infographic and why did you choose it? Also, provide a short description of each element of the infographic and its purpose. Each element should represent a specific statistical claim and you should provide statistics that support that claim. For example, if you are making a claim about two means, you should provide 95% CI information and/or the result of a t-test.

Remember, infographics are about telling stories visually, so make sure to clearly state which QUESTIONS your infographic is trying to answer. Moreover, the textual component should be minimal. The text should only provide the basic context for the statistics and graphs.

The Excel file should show how you estimated each statistic. It should also show how you calculated confidence intervals for means and/or proportions mentioned in this section.

Below is guidance on the elements of the infographic:

MESSAGE - There should be a central question explicitly addressed in the infographic.

GRAPHS & STATISTICS - Appropriate statistics and graphs should be selected for the data. The graphs and statistics should help convey the main message.

LAYOUT - Infographic should be structured and spaced well, with a clear conceptual basis for the organizational scheme. Where appropriate, spatial relationships should be used to convey meaning and show signs of creativity.

AESTHETICS - Colours should be used to convey meaning and be aesthetically appealing. The infographic as a whole should be aesthetically pleasing. The main purpose of the infographic is to convey a message and tell a story. This means that this is an artistic medium, so if this is something you enjoy, put your skills to good use here.

CLARITY - Text and visuals should work together so that each is enhanced by the other.

SOURCES - Sources of data and ideas drawn from outside the course are clearly provided, as are the ways they were used.
