**Subject Code & Title :** BUS105 Computing **Assessment Type :** Assignment

Instructions for the computing assignment: a Word file worth 18% of your final grade and an Excel file worth 2% of your final grade.

**Overview :-** The materials that must be used in the assignment are provided on Moodle:

1. An Excel file with the datasets for all the students. Each student must follow the instructions and get 4 datasets using their student number, so each student will have different datasets.

**The following material briefly discusses where to find the material for the assignment**

**Submission :-**

1. Students must submit a Word file (worth 18%) AND an Excel file (worth 2%) to Moodle.

2. The Word file needs to be submitted to the Turnitin link. Instructions are given on page 6.

3. The Word file needs a cover page.

4. The Word file needs the answers to the 9 questions given in full detail later in this document. A vital part of answering the questions is using the data sets and the data set summarizer.

5. The Excel file needs to be submitted to the assignment drop box.

6. The Excel file should have the student’s 3 datasets and summaries NOT made by the automatic data set summarizer. Summarize the data sets using Pivot Tables and the scatter plot. (Instructions for submitting the Excel file are given on page 8.)

The Computing assignment also includes 5 preparation quizzes worth 1% each. These preparation quizzes are on Moodle.

Instructions for the major part of the assignment: the Word file, worth 18% of your final grade, which you submit to Turnitin.

**Overview :-** You need to submit a Word file with the answers to 9 questions. The first 8 questions are about the datasets; the last question is a paraphrasing task.

You will use your datasets and the automatic data set summarizer to get the descriptive statistics that are used in questions 1 to 5 and the inferential statistics that are used in questions 6 to 8. To check that you have correctly obtained your datasets, check that both p-values are correct when you investigate both categorical variables (questions 6 to 8). There will be videos on Moodle explaining how to check that you have properly obtained your sample.

**Summary of the datasets (questions 1 to 8 are about the datasets)**

**Data set 1**

Results of version 1 of a pregnancy test

The variables are “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”

**Data set 2**

Results of version 2 of a pregnancy test

The variables are “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”

**Data set 3**

Daily flight cancelations at airline ABC

The variables are “Which country?” (Country A or Country B), “Number of tests in one week” and “Number of people needing tech support”

**Question 1 :**

a) Paste data set 1 into an appropriate data set summarizer. Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” using the sample. This lets you check the accuracy of version 1 of the pregnancy test.

b) Use part a) to describe the relationship between the two variables using one of the following numbers. Choose the correct option:

1. The difference between sample means

2. The difference between sample proportions

3. The correlation coefficient r

Your description of the relationship between the variables should also describe the relationship using plain English.

c) Paste data set 2 into an appropriate data set summarizer. Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” using the sample. This lets you check the accuracy of version 2 of the pregnancy test.

d) Use the answer to part c) to describe the relationship between the two variables using one of the following numbers. Choose the correct option:

1. The difference between sample means

2. The difference between sample proportions

3. The correlation coefficient r

Your description of the relationship between the variables should also describe the relationship using plain English.

e) Which version of the pregnancy test is better, version 1 or version 2? Give a reason for your answer. You can use the answers to parts b) and d) as a way of deciding which version is better. You do not have to decide which is worse, false positives or false negatives.
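As a rough illustration of what "the difference between sample proportions" measures for two categorical variables, here is a minimal Python sketch. The counts are made up and are NOT taken from any student's data set, and the names `counts` and `proportion_positive` are hypothetical. Your own answer should still come from the data set summarizer.

```python
# Hypothetical 2x2 table of counts: (reality, test result) -> number of people.
# These numbers are invented for illustration only.
counts = {
    ("pregnant", "positive"): 45,
    ("pregnant", "negative"): 5,
    ("not pregnant", "positive"): 10,
    ("not pregnant", "negative"): 40,
}


def proportion_positive(table, reality):
    """Proportion of the given 'reality' group whose test result is positive."""
    positive = table[(reality, "positive")]
    total = positive + table[(reality, "negative")]
    return positive / total


p_pregnant = proportion_positive(counts, "pregnant")
p_not_pregnant = proportion_positive(counts, "not pregnant")

# A large difference means the test result depends strongly on reality.
difference = p_pregnant - p_not_pregnant

print(f"Proportion positive among pregnant:     {p_pregnant:.2f}")
print(f"Proportion positive among not pregnant: {p_not_pregnant:.2f}")
print(f"Difference between sample proportions:  {difference:.2f}")
```

In this made-up example a difference close to 1 would indicate a very accurate test, while a difference close to 0 would indicate that the test result tells you little about reality.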

**Question 2**

Paste the first two variables of data set 3 into a data set summarizer.

a) Paste the descriptive sample statistics below. The descriptive statistics let you investigate the relationship between the variables “Which country?” and “Number of tests?” using the sample.

b) Use the answer to part a) to describe the relationship by using one of the following numbers. Select the correct option:

i. The difference between sample means

ii. The difference between sample proportions

iii. The correlation coefficient r

You should also describe the relationship in plain English.

c) Paste in the graph that shows the predicted shape of the histograms if the variables are normally distributed, and compare the centres and the spreads.

d) Suppose you know the quantitative variable is normally distributed for both groups. Make a comment about part c).

**Question 3 :**

Paste the last two variables of data set 3 into a data set summarizer.

a) Paste the descriptive statistics into the Word file. The descriptive sample statistics let you investigate the relationship between the variables “Number of tests?” and “Number of people needing tech support?” using the sample. Paste in the graph as well.

b) Looking at the graph, does there appear to be one linear relationship or two linear relationships?

c) Repeat part a), but this time only paste in the information from country A. Still select the columns “Number of tests?” and “Number of people needing tech support?”, but do NOT select any of the rows that are from country B.

d) Using the output from part c), describe the relationship between the variables using one of the following numbers. Select the correct option:

* The difference between sample means

* The difference between sample proportions

* The correlation coefficient r

Your description of the relationship should also include some plain English.

e) Using the information in part c), write an equation that lets you predict the number of people needing tech support, Y, given the number of tests.

f) Use the equation from part e) to predict the number of people needing tech support if the number of tests is 1000.
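To show how the correlation coefficient r and a prediction equation fit together for two quantitative variables, here is a minimal Python sketch using the usual least-squares formulas. The data values are invented and the function name `pearson_r_and_line` is hypothetical; the data set summarizer remains the tool you should use for your own answers.

```python
from math import sqrt

# Hypothetical (number of tests, number needing tech support) values.
# These are invented for illustration and are not assignment data.
tests = [200, 400, 600, 800, 1000]
support = [12, 19, 33, 41, 50]


def pearson_r_and_line(x, y):
    """Return (r, slope, intercept) for the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


r, slope, intercept = pearson_r_and_line(tests, support)
print(f"r = {r:.3f}")
print(f"Predicted Y = {intercept:.2f} + {slope:.4f} * (number of tests)")

# Prediction for a chosen number of tests (here 1000).
predicted = intercept + slope * 1000
print(f"Predicted number needing tech support at 1000 tests: {predicted:.1f}")
```

A value of r close to 1 or -1 suggests the points lie close to a single straight line, which is when a prediction from the equation is most reasonable.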

**Question 4**

Note that you need the output from question 2 to answer this question.

**a) Just considering the information from country A**

i) What is the estimate of the population mean number of tests?

ii) What is the standard error of this estimate?

**b) Just considering the information from country B**

i) What is the estimate of the population mean number of tests?

ii) What is the standard error of this estimate?
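For reference, the estimate of a population mean is the sample mean, and its standard error is commonly computed as the sample standard deviation divided by the square root of the sample size. A minimal Python sketch, assuming that formula and using invented values (the lists and the function name `mean_and_standard_error` are hypothetical):

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical weekly numbers of tests for each country (invented values).
country_a = [120, 135, 150, 110, 140]
country_b = [90, 100, 95, 105, 85]


def mean_and_standard_error(values):
    """Return (sample mean, standard error of the mean) for a list of numbers."""
    n = len(values)
    sample_mean = mean(values)
    # stdev() is the sample standard deviation (divides by n - 1).
    standard_error = stdev(values) / sqrt(n)
    return sample_mean, standard_error


for name, values in (("Country A", country_a), ("Country B", country_b)):
    estimate, se = mean_and_standard_error(values)
    print(f"{name}: estimate of population mean = {estimate:.1f}, standard error = {se:.2f}")
```

A smaller standard error means the sample mean is a more precise estimate of the population mean.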

**Question 5**

Note that you need the output from question 1 to answer this question

a) For version 1 of the test find a 95% confidence interval for the proportion of pregnant women that test positive

b) For version 2 of the test find a 95% confidence interval for the proportion of pregnant women that test positive
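One common way to build a 95% confidence interval for a proportion is the normal-approximation formula, sample proportion plus or minus 1.96 standard errors. The Python sketch below assumes that formula; the data set summarizer may use a slightly different method, so treat this only as an illustration. The counts and the function name `proportion_confidence_interval` are hypothetical.

```python
from math import sqrt


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion (z = 1.96 gives about 95%)."""
    p_hat = successes / n
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return p_hat - margin, p_hat + margin


# Hypothetical example: 45 of 50 pregnant women tested positive (invented numbers).
lower, upper = proportion_confidence_interval(45, 50)
print(f"95% confidence interval: ({lower:.3f}, {upper:.3f})")
```

The interval gives a range of plausible values for the population proportion; a narrower interval means a more precise estimate.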

**Question 6**

Paste data set 1 into an appropriate data set summarizer

a) Paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Reality, pregnant or not pregnant?” and “Test result, positive or negative?” if you consider the whole population.

b) Comment on the confidence interval.

c) Comment on the p-value.
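To give a feel for where a p-value for two categorical variables can come from, here is a minimal Python sketch of a two-sided two-proportion z-test with a pooled standard error. This is an assumption about the method: the data set summarizer may use a different test, so its numbers need not match. The counts and the function name `two_proportion_z_test` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of 'no difference between the two population proportions'."""
    p1 = x1 / n1
    p2 = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Probability of a result at least this extreme if there is no relationship.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 45 of 50 pregnant and 10 of 50 not pregnant tested positive.
z, p_value = two_proportion_z_test(45, 50, 10, 50)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value is strong evidence for the claim that there is a relationship in the whole population; a large p-value means the sample gives little evidence for that claim.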

**Question 7**

Paste the first two variables of data set 3 into an appropriate data set summarizer.

a) Paste in the inferential statistics that measure evidence for the claim that there is a relationship between the variables “Which country?” and “Number of tests?” if you consider the whole population.

b) Comment on the confidence interval.

c) Comment on the p-value.

**Question 8**

Paste the last two variables of data set 3 into an appropriate data set summarizer.

a) Paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Number of tests?” and “Number of people needing tech support?” if you consider the whole population.

Hint: inferential statistics measure evidence for a claim.

b) Just using the information from country A, paste in the computer output that measures evidence for the claim that there is a relationship between the variables “Number of tests?” and “Number of people needing tech support?” if you consider the whole population.

c) Which case has the lower standard error, the output from part a) or the output from part b)?

d) In both part a) and part b) the computer is trying to find a single linear relationship between the variables. Based on your previous work, in which case is the output trustworthy?

e) Comment on the confidence interval in part b).

f) Comment on the p-value in part b).

**Question 9 :**

Paraphrase one or more of the concepts in one or more of the videos from the list on the next page and explain how the concept (or concepts) is useful in business. A total of 400 words is enough. An easy way to keep the Turnitin match low is to give a brief overview of a few different videos, because it is easier to use your own words when you give a brief overview.

Include screenshots of the video and explain how the image helps explain the message in the video.

An example of a screenshot from a video 2 semesters ago:

An example of explaining the screenshot:

The video was talking about how inferential statistics involves taking a sample to make an estimate of the population. In the screenshot above you can see a lady getting a sample of the chips to test the fat and salt content, and this can be used to make an estimate for the whole population.

**Upload the Word file to the Turnitin link on Moodle**

**Instructions for the Excel file**

This is worth 2% of your final grade.

You have to use the Excel commands discussed below and not the data set summarizer. However, you should check that your summaries are the same as the output from the data set summarizer you used in the Word file. If you have different information you will get at most 1 out of 2.

You need to cut and paste just your data sets into a new Excel file and follow the instructions below. DO NOT use a cover page for the Excel file. You must check that you have the correct sample.

Note that you do not have to use Excel to make the summaries; you can use Google Sheets.

A) Select all of data set 1 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Reality, pregnant or not pregnant?” and “Test result, positive or negative?”

B) Select the first two variables of data set 3 and use Excel Pivot Table commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Which country?” and “Number of tests in one week?”

C) Select the last 2 variables of data set 4 and use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “Number of tests in one week?” and “Number of people that need tech support?”

D) Upload the Excel file with the pivot tables and scatter plot to the assignment drop box.
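If it helps to see what a pivot table of counts is doing for two categorical fields, the following minimal Python sketch performs the same kind of cross-tabulation. It is only an illustration and does not replace the Excel or Google Sheets pivot tables you must submit. The rows are invented and the function name `cross_tabulate` is hypothetical.

```python
from collections import Counter

# Hypothetical rows of (reality, test result); invented for illustration only.
rows = [
    ("pregnant", "positive"),
    ("pregnant", "positive"),
    ("pregnant", "negative"),
    ("not pregnant", "negative"),
    ("not pregnant", "negative"),
    ("not pregnant", "positive"),
]


def cross_tabulate(pairs):
    """Count how often each (row field, column field) pair occurs, like a pivot table of counts."""
    return Counter(pairs)


table = cross_tabulate(rows)
for reality in ("pregnant", "not pregnant"):
    positive = table[(reality, "positive")]
    negative = table[(reality, "negative")]
    share = positive / (positive + negative)
    print(f"{reality:>12}: positive={positive}, negative={negative}, proportion positive={share:.2f}")
```

In a spreadsheet pivot table, the first field would go in the rows, the second field in the columns, and the values would be a count of records.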

