Hello Peers, today we are sharing the answers to every week’s assessments and quizzes for the Inferential Statistics course offered on Coursera, totally free of cost ✅✅✅. This is a certification course open to every interested student.

If you can’t find this course for free, you can apply for financial aid to take it at no cost.

Coursera is an online learning platform that offers many courses, including free ones. These courses come from various recognized universities, where industry experts and professors teach in a clear and understandable way.

Here, you will find Inferential Statistics Exam Answers in Bold Color which are given below.

These answers were updated recently and are 100% correct ✅. They cover every week’s quizzes, assessments, and the final exam of the Inferential Statistics course from Coursera Free Certification Course.

Use “Ctrl+F” to find the answer to any question. Mobile users can tap the three dots in the browser and choose the “Find” option to do the same.

The most prevalent statistical inference methods, including those for numerical and categorical data, are covered in this course. You will learn how to set up and carry out hypothesis tests, as well as how to evaluate p-values and communicate the results of your analysis in a manner that can be understood by customers or the general public.

You will learn how to give estimates of numbers in a way that reflects the uncertainty of the quantity of interest by using a large number of data samples. This will allow you to better analyze the data.

You will be walked through the process of installing and using R and RStudio, both of which are free statistical software packages. You will use these packages for lab exercises as well as a final project.

This course addresses the essential principles necessary to evaluate and report results for both categorical and numerical data, and it presents practical tools for undertaking data analysis.

SKILLS YOU WILL GAIN

• Statistical Inference
• Statistical Hypothesis Testing
• R Programming

Course Apply Link – Inferential Statistics

#### Quiz 1: Week 1 Practice Quiz Answers

Q1. Suppose we are interested in studying how much chocolate is consumed by Coursera students, measured in grams per week. After surveying 500 students, we calculate an average of 175 grams per week with a standard deviation of 195 grams per week. Which of the following is not necessarily true?

• A histogram of the samples will be skewed to the right.
• $\bar{x} = 175$, $s = 195$
• A point estimate for the population standard deviation is 195.
• μ=175, σ=195

Q2. Which of the following is false?

• Standard error measures the variability in means of samples of the same size taken from different populations.
• As the sample size increases, the variability of the sampling distribution decreases.
• Standard error computed based on a sample standard deviation will always be lower than the standard deviation of that sample.
• In order to reduce the standard error by half, sample size should be increased by a factor of 4.

Q3. The ages of pennies at a particular bank follow a nearly normal distribution with mean 10.44 years with standard deviation 9.2 years. Say you take random samples of 30 pennies, find the mean age in each sample, and plot the distribution of these means. Which of the following are the best estimates for the center and spread of this distribution?

• mean = 10.44/30 = 0.348, standard error = $(9.2/30)^2 = 0.094$
• mean = 10.44, standard error = 9.2/30 = 0.31
• mean = 10.44, standard error = 9.2
• mean = 10.44, standard error = $9.2/\sqrt{30} = 1.68$
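
The arithmetic behind Q3 can be checked with a few lines of Python. This is a minimal sketch that assumes the usual result that a sampling distribution of means is centered at the population mean with spread σ/√n; the variable names are only illustrative.

```python
import math

# Values stated in the question: population mean and SD of penny ages,
# and the size of each random sample.
mu = 10.44
sigma = 9.2
n = 30

# The sampling distribution of the mean is centered at the population mean,
# and its spread (the standard error) is sigma / sqrt(n).
center = mu
standard_error = sigma / math.sqrt(n)

print(f"center = {center}, standard error = {standard_error:.2f}")
```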

Q4. Which of the following is true about sampling distributions?

• Sampling distributions get closer to normality as the sample size increases.
• Sampling distributions are always nearly normal.
• Shape of the sampling distribution is always the same shape as the population distribution, no matter what the sample size is.
• Sampling distribution of the mean is always right skewed since means cannot be smaller than 0.

Q5. To get an estimate of consumer spending in the U.S. following the Thanksgiving holiday, 436 randomly sampled American adults were surveyed. Their daily spending for the six-day period following Thanksgiving averaged \$84.71. A 95% confidence interval based on this sample is (\$80.31, \$89.11). Which of the following are true?

I. We are 95% confident that the average spending of the 436 American adults in this sample is between \$80.31 and \$89.11.

II. If we collected many random samples of the same size and calculated a confidence interval for daily spending for each sample, then we would expect 95% of the intervals to contain the true population parameter.

III. We are 95% confident that the average spending of all American adults is between \$80.31 and \$89.11.

• I and II
• I and III
• II and III
• I, II, and III
• None

Q6. Which of the following is false about confidence intervals? All else held constant,

• as the standard deviation of the sample increases, the width increases.
• as the confidence level increases, the width decreases.
• as the sample mean increases, the margin of error stays constant.
• as the sample size increases, the margin of error decreases.
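
To see how each quantity in Q6 moves the interval, here is a small sketch using the normal-model margin of error z* × s/√n. The numbers fed into it are made-up illustration values, not data from the course, and `margin_of_error` is a helper name chosen for this example.

```python
import math
from statistics import NormalDist

def margin_of_error(s, n, confidence):
    """Margin of error z* x s / sqrt(n) for a mean, using the normal model."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z_star * s / math.sqrt(n)

# Hypothetical baseline: s = 10, n = 100, 95% confidence.
base = margin_of_error(s=10, n=100, confidence=0.95)
larger_sd = margin_of_error(s=20, n=100, confidence=0.95)
higher_conf = margin_of_error(s=10, n=100, confidence=0.99)
larger_n = margin_of_error(s=10, n=400, confidence=0.95)

print(f"baseline:           {base:.3f}")
print(f"larger s:           {larger_sd:.3f}")
print(f"higher confidence:  {higher_conf:.3f}")
print(f"larger n:           {larger_n:.3f}")
```

The sample mean does not appear anywhere in the formula, so changing it shifts the interval without changing its width.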

Q. Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender, for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters, and the table shows sample statistics calculated based on this sample. Which of the following is not necessarily true?

• The point estimate for the population mean is 171.1 cm.
• The population mean is 171.1 cm.
• The sample median is 170.3 cm.
• The sample mean is 171.1 cm.

Q3. For the standard deviation σ or s and the standard error SE, which of the following is the correct set of descriptions?

s: variability in sample data

SE: variability in point estimates from different samples of the same size and from same population

σ: variability in population data

Q1. We want to estimate the average coffee intake of Coursera students, measured in cups of coffee. A survey of 1,000 students yields an average of 0.55 cups per day, with a standard deviation of 1 cup per day. Which of the following is not necessarily true?

• μ=0.55, σ=1
• $\bar{x} = 0.55$
• The sample distribution is right skewed.
• 0.55 is a point estimate for the population mean.

Q2. Researchers studying anthropometry collected various body and skeletal measurements for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters. If the 507 individuals are a simple random sample – and let’s assume they are – then the sample mean is a point estimate for the mean height of all active individuals. What measure do we use to quantify the variability of such an estimate? Compute this quantity using the data from this sample and choose the best answer below.

• standard deviation = 0.417
• standard error = 0.019
• mean squared error = 0.105
• standard deviation = 0.019
• standard error = 0.417

Q3. Students are asked to count the number of chocolate chips in 22 cookies for a class activity. They found that the cookies on average had 14.77 chocolate chips with a standard deviation of 4.37 chocolate chips. After collecting the data, a student reports the standard error of the mean to be 0.93 chocolate chips. What is the best way to interpret the student’s result?

• 0.93 chocolate chips is a measure of the variability we’d expect in calculations of the mean number of chocolate chips if we took repeated random samples of 22 cookies.
• The student either made a calculation error or his result is meaningless, because it does not make sense to talk about 0.93 chocolate chips.
• 0.93 is the standard deviation of the number of chocolate chips in a chocolate chip cookie.
• 0.93 chocolate chips is a measure of the variability in the mean number of chocolate chips across all chocolate chip cookies.
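
The student's 0.93 in Q3 can be reproduced from the numbers in the question. A minimal sketch, assuming the standard formula SE = s/√n:

```python
import math

# Values stated in the question.
s = 4.37   # sample standard deviation of chips per cookie
n = 22     # number of cookies in the sample

# Standard error of the mean: how much the sample mean would vary
# across repeated random samples of the same size.
standard_error = s / math.sqrt(n)

print(f"SE = {standard_error:.2f} chocolate chips")
```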

Q4. Four plots are presented below. The plot at the top is a distribution for a population. The mean is 60 and the standard deviation is 18. Also shown below is a distribution of

(1) a single random sample of 500 values from this population,

(2) a distribution of 500 sample means from random samples, each of size 18,

(3) a distribution of 500 sample means from random samples, each of size 81.

Determine which plot (A, B, or C) is which.

• (1) one sample, n = 500 – Plot B; (2) 500 samples, n = 18 – Plot C; (3) 500 samples, n = 81 – Plot A
• (1) one sample, n = 500 – Plot C; (2) 500 samples, n = 18 – Plot A; (3) 500 samples, n = 81 – Plot B
• (1) one sample, n = 500 – Plot A; (2) 500 samples, n = 18 – Plot B; (3) 500 samples, n = 81 – Plot C
• (1) one sample, n = 500 – Plot A; (2) 500 samples, n = 18 – Plot C; (3) 500 samples, n = 81 – Plot B
• (1) one sample, n = 500 – Plot C; (2) 500 samples, n = 18 – Plot B; (3) 500 samples, n = 81 – Plot A

Q5. The General Social Survey (GSS) is a sociological survey used to collect data on demographic characteristics and attitudes of residents of the United States. In 2010, the survey collected responses from over a thousand US residents. The survey is conducted face-to-face with an in-person interview of a randomly-selected sample of adults. One of the questions on the survey is “For how many days during the past 30 days was your mental health, which includes stress, depression, and problems with emotions, not good?”

Based on responses from 1,151 US residents, the survey reported a 95% confidence interval of 3.40 to 4.24 days in 2010. Given this information, which of the following statements would be most appropriate to make regarding the true average number of days of “not good” mental health in 2010 for US residents?

• For all US residents in 2010, there is a 95% probability that the true average number of days of “not good” mental health is between 3.40 and 4.24 days.
• For all US residents in 2010, based on this 95% confidence interval, we would reject a null hypothesis stating that the true average number of days of “not good” mental health is 5 days.
• There is not sufficient information to calculate the margin of error of this confidence interval.
• For these 1,151 residents in 2010, we are 95% confident that the average number of days of “not good” mental health is between 3.40 and 4.24 days.

Q. A random sample of 100 runners who completed the 2012 Cherry Blossom 10 mile run yielded an average completion time of 95 minutes. A 95% confidence interval calculated based on this sample is 92 minutes to 98 minutes. Which of the following is false based on this confidence interval?

• The margin of error of this confidence interval is 3 minutes.
• We are 95% confident that the true average finishing time of all runners who completed the 2012 Cherry Blossom 10 mile run is between 92 minutes and 98 minutes.
• Based on this 95% confidence interval, we would reject a null hypothesis stating that the true average finishing time of all runners who completed the 2012 Cherry Blossom 10 mile run is 90 minutes.
• 95% of the time the true average finishing time of all runners who completed the 2012 Cherry Blossom 10 mile run is between 92 minutes and 98 minutes.
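
The numbers in the Cherry Blossom question can be unpacked directly from the interval. This sketch assumes the usual correspondence between a 95% confidence interval and a two-sided test at the 5% level: a null value outside the interval would be rejected.

```python
# Interval endpoints stated in the question (minutes).
lower, upper = 92.0, 98.0

# The interval is point estimate +/- margin of error.
point_estimate = (lower + upper) / 2
margin = (upper - lower) / 2

# A null value is rejected (two-sided, 5% level) when it falls outside the interval.
null_value = 90.0
reject_null = not (lower <= null_value <= upper)

print(f"point estimate = {point_estimate}, margin of error = {margin}")
print(f"reject H0: mu = {null_value}? {reject_null}")
```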

Q6. Suppose we collected a sample of size n = 100 from some population and used the data to calculate a 95% confidence interval for the population mean. Now suppose we are going to increase the sample size to n = 300. Keeping all else constant, which of the following would we expect to occur as a result of increasing the sample size?

I. The standard error would decrease.

II. Width of the 95% confidence interval would increase.

III. The margin of error would decrease.

• II and III
• I and III
• I and II
• I, II, and III
• None
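
A quick way to check the three statements in Q6 is to compute the quantities for both sample sizes. The sketch below holds a hypothetical sample standard deviation of 1 constant (the question gives no value) and uses the normal critical value; `interval_parts` is a helper name made up for this example.

```python
import math
from statistics import NormalDist

def interval_parts(s, n, confidence=0.95):
    """Return (standard error, margin of error, interval width) for a mean."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = s / math.sqrt(n)
    me = z_star * se
    return se, me, 2 * me

s = 1.0  # hypothetical sample SD, held constant for both sample sizes
se_100, me_100, width_100 = interval_parts(s, 100)
se_300, me_300, width_300 = interval_parts(s, 300)

print(f"n = 100: SE = {se_100:.4f}, ME = {me_100:.4f}, width = {width_100:.4f}")
print(f"n = 300: SE = {se_300:.4f}, ME = {me_300:.4f}, width = {width_300:.4f}")
```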

Q7. Researchers investigating characteristics of gifted children collected data from schools in a large city on a random sample of thirty-six children who were identified as gifted children soon after they reached the age of four. The following histogram shows the distribution of the ages (in months) at which these children first counted to 10 successfully. Also provided are some sample statistics.

Calculate a 90% confidence interval for the average age at which gifted children first count to 10 successfully. Choose the closest answer.

• (30.49, 30.89)
• (29.50, 31.88)
• (29.28, 32.10)
• (30.12, 31.26)

#### Quiz 3: Week 1 Lab Answers

Q1. Which of the following is false?

• The distribution of areas of houses in Ames is unimodal and right-skewed.
• 50% of houses in Ames are smaller than 1,499.69 square feet.
• The middle 50% of the houses range between approximately 1,126 square feet and 1,742.7 square feet.
• The IQR is approximately 616.7 square feet.
• The smallest house is 334 square feet and the largest is 5,642 square feet.

Q2. Suppose we took two more samples, one of size 100 and one of size 1000. Which would you think would provide a more accurate estimate of the population mean?

• Sample size of 50
• Sample size of 100
• Sample size of 1000

Q3. How many elements are there in this object called sample_means_small?

• 0
• 3
• 25
• 100
• 5,000

Q4. Which of the following is true about the elements in the sampling distributions you created?

• Each element represents a mean square footage from a simple random sample of 10 houses.
• Each element represents the square footage of a house.
• Each element represents the true population mean of square footage of houses.

Q5. It makes intuitive sense that as the sample size increases, the center of the sampling distribution becomes a more reliable estimate for the true population mean. Also as the sample size increases, the variability of the sampling distribution _.

• decreases
• increases
• stays the same

Q6. Which of the following is false?

• The variability of the sampling distribution with the smaller sample size (sample_means50) is smaller than the variability of the sampling distribution with the larger sample size (sample_means150).
• The means for the two sampling distributions are roughly similar.
• Both sampling distributions are symmetric.

#### Quiz 1: Week 2 Practice Quiz Answers

Q1. Read the following scenario and then, from the choices that follow, choose the correct set of hypotheses for the scenario:

Since 2008, chain restaurants in California have been required to display calorie counts of each menu item. Prior to menus displaying calorie counts, the average calorie intake of diners at a restaurant was 1100 calories. After calorie counts started to be displayed on menus, a nutritionist collected data on the number of calories consumed at this restaurant from a random sample of diners. Do these data provide convincing evidence of a difference in the average calorie intake of diners at this restaurant?

• $H_0: \mu = 1100$; $H_A: \mu < 1100$
• $H_0: \bar{x} = 1100$; $H_A: \bar{x} < 1100$
• $H_0: \mu = 1100$; $H_A: \mu > 1100$
• $H_0: \mu = 1100$; $H_A: \mu \neq 1100$

Q2. Which of the following is the correct definition of the p-value?

• P($H_0$ true | observed data)
• P($H_0$ true)
• P($H_0$ true | $H_A$ false)
• P(observed or more extreme sample statistic | $H_0$ true)

Q3. One-sided alternative hypotheses are phrased in terms of:

• < or >
• ≤ or ≥
• ≈ or =

Q4. A Type 2 error occurs when the null hypothesis is

• rejected when it is false
• rejected when it is true
• not rejected when it is false
• not rejected when it is true

Q5. True / False: Decreasing the significance level ($\alpha$) will increase the probability of making a Type 1 error.

• True
• False

#### Quiz 2: Week 2 Quiz Answers

Q1. A study suggests that the average college student spends 2 hours per
week communicating with others online. You believe that this is an
underestimate and decide to collect your own sample for a hypothesis
test. You randomly sample 60 students from your dorm and find that on
average they spent 3.5 hours a week communicating with others online.
Which of the following is the correct set of hypotheses for this
scenario?

• $H_0: \mu = 2$; $H_A: \mu > 2$
• $H_0: \bar{x} = 2$; $H_A: \bar{x} > 2$
• $H_0: \bar{x} = 2$; $H_A: \bar{x} < 2$
• $H_0: \mu = 3.5$; $H_A: \mu < 3.5$
• $H_0: \mu = 2$; $H_A: \mu < 2$

Q2. Which of the following is the correct definition of the p-value?

• P(observed or more extreme sample statistic | $H_0$ true)
• P($H_0$ true | $H_A$ false)
• P(observed sample statistic | $H_0$ true)
• P($H_0$ true | observed sample statistic)

Q3. Two-sided alternative hypotheses are phrased in terms of:

• < or >
• ≈ or =
• ≤ or ≥

Q4. A Type 1 error occurs when the null hypothesis is

• rejected when it is true
• rejected when it is false
• not rejected when it is true
• not rejected when it is false

Q5. A statistician is studying blood pressure levels of Italians in the
age range 75-80. The following is some information about her study:

I. The data were collected by responses to a survey conducted by email, and no measures were taken to get information from those who did not respond to the initial survey email.

II. The sample observations only make up about 4% of the population.

III. The sample size is 2,047.

IV. The distribution of sample observations is skewed – the skew is easy to see, although not very extreme.

The researcher is ready to use the Central Limit Theorem (CLT) in the main part of her analysis. Which aspect of her study is most likely to prevent her from using the CLT?

• (I), because the sample may not be random and hence observations may not be independent.
• (II), because she only has data from a small proportion of the whole population.
• (III), because the sample size is too small compared to all Italians in the age range 75-80.
• (IV), because there is some skew in the sample distribution.

Q6. SAT scores are distributed with a mean of 1,500 and a standard
deviation of 300. You are interested in estimating the average SAT score
of first year students at your college. If you would like to limit the
margin of error of your 95% confidence interval to 25 points, at least
how many students should you sample?

• 392
• 393
• 553
• 554
• 13,830
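
The sample size in Q6 comes from solving the margin-of-error formula z* × σ/√n ≤ ME for n, then rounding up because n must be a whole number. A minimal sketch with the values from the question, assuming the normal critical value for 95% confidence:

```python
import math
from statistics import NormalDist

# Values stated in the question.
sigma = 300          # standard deviation of SAT scores
target_margin = 25   # desired margin of error, in points
confidence = 0.95

# Critical value for a two-sided 95% interval (about 1.96).
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Solve z* x sigma / sqrt(n) <= target_margin for n, then round up.
n_exact = (z_star * sigma / target_margin) ** 2
n_required = math.ceil(n_exact)

print(f"n must be at least {n_required} (exact solution {n_exact:.2f})")
```

Rounding down would leave the margin of error slightly above 25 points, which is why the ceiling is used.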

Q7. The significance level in hypothesis testing is the probability of

• rejecting a null hypothesis
• rejecting an alternative hypothesis
• rejecting a true null hypothesis
• failing to reject a true null hypothesis
• failing to reject a false null hypothesis

Q8. The nutrition label on a bag of potato chips says that a one ounce
(28 gram) serving of potato chips has 130 calories and contains ten
grams of fat, with three grams of saturated fat. A random sample of 35
bags yielded a sample mean of 134 calories with a standard deviation of
17 calories. We are evaluating whether these data provide convincing
evidence that the nutrition label does not provide an accurate measure
of calories in the bags of potato chips at the 10% significance level.
Which of the following is correct?

• The p-value is approximately 16%, which means we should fail to reject the null hypothesis and determine that these data do not provide convincing evidence the nutrition label does not provide an accurate measure of calories in the bags of potato chips.
• The p-value is approximately 16%, which means we should reject the
null hypothesis and determine that these data provide convincing
evidence the nutrition label does not provide an accurate measure of
calories in the bags of potato chips.
• The p-value is approximately 8%, which means we should reject the
null hypothesis and determine that these data provide convincing
evidence the nutrition label does not provide an accurate measure of
calories in the bags of potato chips.
• The p-value is approximately 8%, which means we should fail to reject the null hypothesis and determine that these data do not provide convincing evidence the nutrition label does not provide an accurate measure of calories in the bags of potato chips.
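
The p-value in Q8 can be approximated as follows. This sketch assumes a two-sided test (the question asks whether the label is "not accurate", with no direction) and uses the normal model for the test statistic, which is an approximation for a sample of 35.

```python
import math
from statistics import NormalDist

# Values stated in the question.
null_mean = 130      # calories on the label
sample_mean = 134
s = 17
n = 35
alpha = 0.10

# Test statistic: how many standard errors the sample mean is from the null value.
standard_error = s / math.sqrt(n)
z = (sample_mean - null_mean) / standard_error

# Two-sided p-value: probability of a statistic at least this extreme in either tail.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_null}")
```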

#### Quiz 3: Week 2 Lab Answers

Q1. My distribution should be similar to others’ distributions who also collect random samples from this population, but it is likely not exactly the same since it’s a random sample.

• True
• False

Q2. For the confidence interval to be valid, the sample mean must be normally distributed and have standard error $\frac{s}{\sqrt{n}}$. Which of the following is not a condition needed for this to be true?

• The sample is random.
• The sample size, 60, is less than 10% of all houses.
• The sample distribution must be nearly normal.

Q3. What does “95% confidence” mean?

• 95% of the time the true average area of houses in Ames, Iowa, will be in this interval.
• 95% of random samples of size 60 will yield confidence intervals that contain the true average area of houses in Ames, Iowa.
• 95% of the houses in Ames have an area in this interval.
• 95% confident that the sample mean is in this interval.

Q4. What proportion of 95% confidence intervals would you expect to capture the true population mean?

• 1%
• 5%
• 95%
• 99%

Q5. What is the appropriate critical value for a 99% confidence level?

• 0.01
• 0.99
• 1.96
• 2.33
• 2.58
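
Critical values like the one in Q5 come from the inverse of the standard normal CDF: for a confidence level C, the cutoff leaves (1 − C)/2 in each tail. A small sketch using Python's standard library (`critical_value` is a helper name made up here):

```python
from statistics import NormalDist

def critical_value(confidence):
    """z* such that the middle `confidence` share of the standard normal lies within +/- z*."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {critical_value(level):.2f}")
```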

Q6. We would expect 99% of the intervals to contain the true population mean.

• True
• False

#### Quiz 1: Week 3 Practice Quiz Answers

Q1. Consider the width of two bootstrap confidence intervals constructed based on the same sample. One of the intervals is constructed at a 90% confidence level and the other is constructed at a 95% confidence level. Which of the following is true?

• The 95% interval is wider.
• There is not enough information to determine which interval is wider.
• The 90% interval is wider.
• The intervals are the same size.

Q2. Which of the following is not a situation where the paired test is preferred?

• Compare pre- (beginning of semester) and post-test (end of semester) scores of students.
• Compare artery thicknesses at the beginning of a study and after 2 years of taking Vitamin E.
• Assess gender-related salary gap by comparing salaries of randomly sampled men and women.
• Assess effectiveness of a diet regimen by comparing the before and after weights of subjects.

Q3. You’ve just read a study that investigated the difference in brain sizes between EU and US citizens, based on data from random samples from both populations. At the 5% significance level the study failed to reject the null hypothesis that EU and US citizens have (on average) brains of equal size. Which of the following is true regarding a 99% confidence interval for the difference in brain sizes?

• The interval does not contain 0.
• The interval contains 0.
• Without more information, it is impossible to know whether the interval contains 0.
• Since the data come from samples and not populations, no conclusions can be made.

Q4. The figure below shows three unimodal and symmetric curves, which assignment is most plausible?

• solid: $t_{df = 1}$, dashed: $t_{df = 5}$, dotted: normal
• solid: $t_{df = 5}$, dashed: $t_{df = 1}$, dotted: normal
• solid: normal, dashed: $t_{df = 1}$, dotted: $t_{df = 5}$
• solid: normal, dashed: $t_{df = 5}$, dotted: $t_{df = 1}$

Q5. We are testing the following hypotheses:

$H_0: \mu = 3$

$H_A: \mu > 3$

The sample size is 18. The test statistic is calculated as T = 0.5. What is the p-value?

• greater than 0.1
• less than 0.005
• between 0.01 and 0.025
• less than 0.01
• between 0.05 and 0.1
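
For Q5, the p-value is the upper-tail area of a t distribution with n − 1 = 17 degrees of freedom. Python's standard library has no t distribution, so the sketch below integrates the t density numerically with Simpson's rule; `t_pdf` and `t_upper_tail` are helper names made up for this example, and a statistics package would normally do this step for you.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0: one half minus the area on [0, t] (Simpson's rule)."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values stated in the question: n = 18, T = 0.5, one-sided alternative (mu > 3).
n = 18
df = n - 1
p_value = t_upper_tail(0.5, df)

print(f"df = {df}, p-value = {p_value:.3f}")
```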

Q6. What does ANOVA mean?

• Aardvarks not over vanilla ants
• Analysis of variance
• Assessment of orthogonal variation
• Assessment of null observed variability

Q7. Which of the following is not a condition required for comparing means across multiple groups using ANOVA?

• The variability across the groups should be about equal.
• The data within each group should be nearly normal.
• The means of each group should be roughly equal.
• The observations should be independent within and across groups.

Q8. Which of the following looks most like an F distribution?

#### Quiz 2: Week 3 Quiz Answers

Q1. People of different ages were asked to stand on a “force platform” and maintain a stable upright position. The “wiggle” of the board in the forward-backward direction is recorded; more wiggle corresponds to less balance. The participants are divided into two age groups: young and elderly. The average wiggle among elderly people was 26.33 mm, and the average among young people was 18.125 mm. The bootstrap distribution for the difference in means is shown below, based on 100 bootstrap samples. Of the following choices, which is the most accurate 90% bootstrap confidence interval for the true difference in means?

• (5 mm, 15 mm)
• (3.75 mm, 15 mm)
• (3 mm, 17 mm)
• (2.5 mm, 18 mm)

Q2. Which of the following is false regarding paired data?

• Each observation in one data set is subtracted from the average of the other data set’s observations.
• Two data sets of different sizes cannot be analyzed as paired data.
• In a paired analysis we first subtract the paired observations from each other, and then do inference on the differences.
• Each observation in one data set has a natural correspondence with exactly one observation from the other data set.

Q3. Which of the following is false about bootstrap and sampling distributions?

• Both distributions get narrower as the standard deviation decreases.
• Both distributions are created by sampling with replacement from the population.
• Both distributions are comprised of sample statistics.
• Bootstrap distributions are centered at the sample statistic, sampling distributions are centered at the population parameter.

Q4. Your friend, who took statistics a few years ago, recently read a study which examined whether there is any difference between the average birth weights of babies born to smoking mothers vs. non-smoking mothers. Your friend asked you to remind him what it means when the study says “a 95% confidence interval for the difference between the average birth weight from non-smoking mothers and smoking mothers ($\mu_{non} - \mu_{smoke}$) is 0.2 to 0.9 pounds.” Of the following possible responses to your friend’s question, which is true according to the study?

• The study’s authors are 95% confident that babies born to non-smoking mothers are on average 0.2 to 0.9 pounds lighter than babies born to smoking mothers.
• The study data does not provide convincing evidence (at 5% significance level) of a difference between the average birth weight from smoking mothers and non-smoking mothers.
• The study’s authors are 95% confident that babies born to non-smoking mothers are on average 0.2 to 0.9 pounds heavier than babies born to smoking mothers.
• The study data does not provide convincing evidence (at 10% significance level) of a difference between the average birth weight from smoking mothers and non-smoking mothers.

Q5. An insurance company wants to estimate (using a confidence interval) its average claim amount using data from 20 randomly selected claims. Which of the following is false?

• The critical t-score, $t^\star$, has 19 degrees of freedom.
• A confidence interval based on this sample is not accurate since the sample size is small.
• If the distribution of the sampled claim amounts is not extremely skewed, a T interval is appropriate.
• The confidence interval can also be calculated using bootstrapping.

Q6. The figure below shows three t-distribution curves. Which curve has the highest degrees of freedom?

• Solid
• dashed
• dotted

Q7. Air quality measurements were collected in a random sample of 25 country capitals in 2013, and then again in the same cities in 2014. We would like to use these data to compare average air quality between the two years. Which of the following tests is the most appropriate?

• independent samples t-test with two-sided alternative hypothesis
• paired t-test with one-sided alternative hypothesis
• paired t-test with two-sided alternative hypothesis
• independent samples t-test with one-sided alternative hypothesis

Q8. We are testing the following hypotheses:

$H_0: \mu = 0.5$

$H_A: \mu \neq 0.5$

The sample size is 26. The test statistic is calculated as T = 2.485. What is the p-value?

• between 0.005 and 0.01
• between 0.02 and 0.05
• between 0.01 and 0.02

Q9. When doing an ANOVA, you observe large differences in means between groups. Within the ANOVA framework this would most likely be interpreted as:

• Evidence strongly favoring the alternative hypothesis.
• Evidence strongly favoring the null hypothesis.
• Evidence revealing which group mean is different from the others.
• None of the above

Q10. Which of the following is not a condition required for comparing means across multiple groups using ANOVA?

• The variability across the groups should be about equal.
• The observations should be independent within and across groups.
• There should be at least 10 successes and 10 failures.
• The data within each group should be nearly normal.

Q11. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam.

Which of the following is the correct degrees of freedom for an F-test for evaluating if the average test scores are different for the different teaching methods?

• $df_G = 45$, $df_E = 4$
• $df_G = 40$, $df_E = 4$
• $df_G = 4$, $df_E = 44$
• $df_G = 5$, $df_E = 45$
• $df_G = 4$, $df_E = 40$
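
The degrees of freedom in Q11 follow from the standard one-way ANOVA bookkeeping: k − 1 for groups, n − k for error, and n − 1 in total. A minimal sketch with the numbers from the question:

```python
# Values stated in the question.
k = 5          # number of teaching methods (groups)
n_total = 45   # total number of students (9 per method)

df_group = k - 1          # between-group degrees of freedom
df_error = n_total - k    # within-group (error) degrees of freedom
df_total = n_total - 1    # total degrees of freedom

print(f"df_G = {df_group}, df_E = {df_error}, df_T = {df_total}")
```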

Q12. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam. We are interested in finding out if the average test scores are different for the different teaching methods. Which of the following is the appropriate set of hypotheses?

• $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$; $H_A: \mu_1 \neq \mu_2 \neq \mu_3 \neq \mu_4 \neq \mu_5$
• $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$; $H_A$: at least one $\mu_i$ is different
• $H_0: s_{between} = s_{within}$; $H_A: s_{between} \neq s_{within}$
• $H_0: \mu_{between} = \mu_{within}$; $H_A: \mu_{between} \neq \mu_{within}$
• $H_0: \mu_{between} \neq \mu_{within}$; $H_A: \sigma_{between} \neq \sigma_{within}$

Q13. Researchers studying people’s sense of smell devised a measure of smelling ability. A higher score on this scale means the subject can smell better. A random sample of 36 people (18 male and 18 female) were involved in the study. The average score for the males was 10 with a standard deviation of 3.4 and the average score for the females was 11 with a standard deviation of 2.7. Which of the following is the correct standard error for the test evaluating whether the males and females have differing smelling abilities, on average? Choose the closest answer.

• 0.801
• 1.023
• 3.504
• 1.047
• 0.724
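
The standard error in Q13 is for a difference between two independent sample means, which combines the two groups' variances: √(s₁²/n₁ + s₂²/n₂). A minimal sketch with the values from the question:

```python
import math

# Values stated in the question.
s_male, n_male = 3.4, 18
s_female, n_female = 2.7, 18

# Standard error of the difference between two independent sample means.
standard_error = math.sqrt(s_male**2 / n_male + s_female**2 / n_female)

print(f"SE = {standard_error:.3f}")
```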

Q14. A study compared five different methods for teaching descriptive statistics. The five methods were traditional lecture and discussion, programmed textbook instruction, programmed text with lectures, computer instruction, and computer instruction with lectures. 45 students were randomly assigned, 9 to each method. After completing the course, students took a 1-hour exam. We are interested in finding out if the average test scores are different for the different teaching methods.

How many pairwise tests would we need to do in order to compare all pairs of means to each other?

• 3
• 20
• 10
• 4
• 5
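
The count of pairwise comparisons among k groups is "k choose 2". A minimal sketch that counts and enumerates the pairs follows; the short method labels are shorthand invented here for the five methods named in the question.

```python
import math
from itertools import combinations

# Shorthand labels (assumed names) for the five teaching methods in the question.
methods = ["lecture", "prog_text", "prog_text_lecture", "computer", "computer_lecture"]

# Closed-form count: k * (k - 1) / 2
n_pairs = math.comb(len(methods), 2)

# Explicit enumeration of every unordered pair, as a cross-check.
pairs = list(combinations(methods, 2))

print(n_pairs, len(pairs))
```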

#### Quiz 3: Week 3 Lab Quiz Answers

Q1. There are 1,000 cases in this data set. What do the cases represent?

• The days of the births
• The births
• The fathers of the children
• The hospitals where the births took place

Q2. How many mothers are we missing weight gain data from?

• 0
• 13
• 27
• 31

Q3. Make side-by-side boxplots of habit and weight. Which of the following is false about the relationship between habit and weight?

• Median birth weight of babies born to non-smoker mothers is slightly higher than that of babies born to smoker mothers.
• Range of birth weights of babies born to non-smoker mothers is greater than that of babies born to smoker mothers.
• Both distributions are extremely right skewed.
• The IQRs of the distributions are roughly equal.

Q4. What are the hypotheses for testing if the average weights of babies born to smoking and non-smoking mothers are different?

• $H_0: \mu_{\text{smoking}} = \mu_{\text{non-smoking}}$; $H_A: \mu_{\text{smoking}} > \mu_{\text{non-smoking}}$
• $H_0: \mu_{\text{smoking}} = \mu_{\text{non-smoking}}$; $H_A: \mu_{\text{smoking}} \neq \mu_{\text{non-smoking}}$
• $H_0: \bar{x}_{\text{smoking}} = \bar{x}_{\text{non-smoking}}$; $H_A: \bar{x}_{\text{smoking}} \neq \bar{x}_{\text{non-smoking}}$
• $H_0: \bar{x}_{\text{smoking}} = \bar{x}_{\text{non-smoking}}$; $H_A: \bar{x}_{\text{smoking}} > \bar{x}_{\text{non-smoking}}$
• $H_0: \mu_{\text{smoking}} \neq \mu_{\text{non-smoking}}$; $H_A: \mu_{\text{smoking}} = \mu_{\text{non-smoking}}$

Q5. Change the type argument to “ci” to construct and record a confidence interval for the difference between the weights of babies born to smoking and non-smoking mothers. Which of the following is the best interpretation of the interval?

• We are 95% confident that babies born to nonsmoker mothers are on average 0.05 to 0.58 pounds lighter at birth than babies born to smoker mothers.
• We are 95% confident that the difference in average weights of babies whose moms are smokers and nonsmokers is between 0.05 to 0.58 pounds.
• We are 95% confident that the difference in average weights of babies in this sample whose moms are smokers and nonsmokers is between 0.05 to 0.58 pounds.
• We are 95% confident that babies born to nonsmoker mothers are on average 0.05 to 0.58 pounds heavier at birth than babies born to smoker mothers.

Q6. Calculate a 99% confidence interval for the average length of pregnancies (weeks). Note that since you’re doing inference on a single population parameter, there is no explanatory variable, so you can omit the x variable from the function. Which of the following is the correct interval?

• (38.0892 , 38.5661)
• (6.9779 , 7.2241)
• (38.0952 , 38.5742)
• (38.1526 , 38.5168)

Q7. Now, a non-inference task: Determine the age cutoff for younger and mature mothers. Use a method of your choice. What is the maximum age of a younger mom and the minimum age of a mature mom, according to the data?

• The maximum age of younger moms is 34 and minimum age of mature moms is 35.
• The maximum age of younger moms is 35 and minimum age of mature moms is 36.
• The maximum age of younger moms is 33 and minimum age of mature moms is 34.
• The maximum age of younger moms is 32 and minimum age of mature moms is 33.

#### Quiz 1: Week 4 Practice Quiz

Q1. Suppose you want to construct a confidence interval for a population proportion. Which of the following, if it were true, would prevent you from being able to assume that the distribution of the sample proportion is nearly normal?

• n = 104. Out of these 104 there are only a few successes (15), but relatively many failures (89).
• n = 104. These observations are a simple random sample and make up less than 10% of the population.
• n = 104. Out of these 104 there are only a few failures (7), but relatively many successes (97).
• None of these options.

Q2. In 2013, Edward Snowden leaked details of top-secret NSA spying activities to the media. A poll conducted by USA TODAY / Pew Research Center asked 1,504 people in the U.S. whether Snowden’s leaks have helped or harmed the public interest. 53% of respondents answered “helped the public interest”. You want to test whether a majority of people in the U.S. believe he helped the public interest. Which of the following is the correct set of hypotheses?

• $H_0: \rho = 0.53$; $H_A: \rho > 0.53$
• $H_0: \rho = 0.5$; $H_A: \rho > 0.5$
• $H_0: \rho < 0.5$; $H_A: \rho > 0.5$
• $H_0: \rho = 0.53$; $H_A: \rho < 0.53$

Q3. In response to complaints from residents about too many (about 15%) of the cars passing by the local school speeding, the police started closely monitoring traffic. You want to check if the police’s efforts had an effect on the prevalence of speeding in this area. One day you observe 560 different cars pass by the school, and find that 70 of them were speeding. You calculate a p-value of 0.0976. Assuming the cars are representative of all cars that drive by the school, which of the following is true?

• If in fact the police’s efforts had an effect, the probability of getting a random sample of 560 cars where 70 or less cars are speeding is 0.0976.
• If in fact the police’s efforts didn’t have an effect, the probability of getting a random sample of 560 cars where 70 or less cars are speeding is 0.0976.
• If in fact the police’s efforts didn’t have an effect, the probability of getting a random sample of 560 cars where 70 cars are speeding is 0.0976.
• If in fact the police’s efforts didn’t have an effect, the probability of getting a random sample of 560 cars where 70 or less or 98 or more cars are speeding is 0.0976.
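
The p-value quoted in Q3 can be approximately reproduced with a normal approximation under the null value of 15%. This is a minimal sketch, not the course's own code: `normal_cdf` is a hypothetical helper built on `math.erf`, and the two-sided form is an assumption drawn from the "had an effect" wording of the question.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, x, p0 = 560, 70, 0.15          # sample size, speeding cars, null proportion
p_hat = x / n                     # observed proportion
se = math.sqrt(p0 * (1 - p0) / n) # SE uses the null value in a hypothesis test
z = (p_hat - p0) / se

# Two-sided p-value: outcomes at least as far from p0 in either direction.
p_two_sided = 2 * normal_cdf(-abs(z))

# Count equally far above the expected count under the null (assumes n * p0 is whole).
upper_count = 2 * round(n * p0) - x

print(round(p_two_sided, 4), upper_count)
```

The mirrored count shows which upper-tail outcome is "as extreme" as observing 70 speeding cars.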

Q4. When do we use the pooled proportion in calculation of the standard error of the difference of two proportions, $SE(\hat{p}_1 - \hat{p}_2)$?

• when constructing a confidence interval for p1 − p2
• when comparing p1 and p2 using a theoretical approach, and the null hypothesis is H0 : p1 − p2 = (some value other than 0)
• when using a randomization test to compare p1 − p2
• when comparing p1 and p2 using a theoretical approach, and the null hypothesis is H0 : p1 − p2 = 0

Q5. Rock-paper-scissors is a hand game played by two or more people where players choose to sign either ‘rock’, ‘paper’, or ‘scissors’ with their hands. We would like to test if players choose between these three options randomly, or if certain options are favored above others. What hypothesis test should we conduct to answer this research question?

• Chi square test of independence
• Compare two means
• Compare two proportions
• Chi square test of goodness of fit

Q6. When doing a hypothesis test on a single proportion (i.e. for one categorical variable), we have studied how to calculate the p-value for the hypothesis test, beginning with generating simulated samples. Which of the following is the best description for how you should generate the simulated samples, and why?

• Generate simulated samples based on the alternative hypothesis because that is the hypothesis we’re trying to prove when doing the hypothesis test.
• Generate simulated samples based on the null hypothesis because we need to see how extreme our observed data looks if the null hypothesis were really true.
• Generate simulated samples based on the alternative hypothesis because we need to see how extreme our observed data looks if the alternative hypothesis were really true.
• Generate simulated samples based on the null hypothesis because that is the hypothesis we’re trying to prove when doing the hypothesis test.

Q7. True or false: In calculation of the required sample size for a given margin of error of the confidence interval for a population proportion, we should use p= 0.5 if we don’t have any knowledge about the characteristics of the population.

• True
• False

Q8. Suppose in a population 20% of people wear contact lenses. What is the expected shape of the sampling distribution of proportion of contact lens wearers in random samples of 1000 people from this population?

• uniform
• nearly normal
• left-skewed
• right-skewed

Q9. True/False: When the success-failure condition is not met, we should use a T test to compare two proportions.

• True
• False

#### Quiz 2: Week 4 Quiz Answers

Q1. Which of the following is not required for the distribution of the sample proportion to be nearly normal?

• There should be at least 10 failures.
• Observations should be independent.
• There should be at least 10 successes.
• Sample size should be at least 30 and the population distribution should not be extremely skewed.

Q2. When checking conditions for calculating a confidence interval for a proportion, you should use which number of successes and failures?

• Expected (based on the null value)
• Not applicable. The number of successes and failures (observed or otherwise) is not part of the conditions required for calculating a confidence interval for a proportion.
• Depends on the context
• Observed

Q3. In May 2011, Gallup asked 1,721 students in grades five through twelve if their school teaches them about money and banking. Researchers are interested in finding out if a majority of students receive such education. Which of the following is the correct set of hypotheses?

• $H_0: p < 0.5$; $H_A: p > 0.5$
• $H_0: \mu = 0.5$; $H_A: \mu > 0.5$
• $H_0: \hat{p} = 0.5$; $H_A: \hat{p} \neq 0.5$
• $H_0: p = 0.5$; $H_A: p > 0.5$

Q4. The campaign manager for a congressional candidate claims that the candidate has more than 50% support from the district’s electorate. A newspaper collects a simple random sample of 500 likely voters in this district and estimates the support for this candidate to be 52%. The p-value for the hypothesis test evaluating the campaign manager’s claim is 0.19. Which of the below is correct?

• If in fact 50% of likely voters support this candidate, the probability of obtaining a random sample of 500 likely voters where 52% or more support the candidate is 0.19.
• The data provide convincing evidence for the campaign manager’s claim.
• 95% of random samples of size 500 will estimate the support for this candidate to be 52%.
• The success-failure condition is not met, so this p-value is not reliable.
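
The p-value of 0.19 stated in Q4 can be approximately recovered with a one-sided normal approximation. The sketch below assumes the null value p = 0.5 and an upper-tail alternative, as implied by the "more than 50%" claim; `normal_cdf` is a hypothetical helper, not a course function.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p_hat, p0 = 500, 0.52, 0.5

# Success-failure check uses expected counts under the null in a hypothesis test.
expected_successes = n * p0
expected_failures = n * (1 - p0)

se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
p_upper = 1 - normal_cdf(z)  # P(sample proportion >= 0.52 | p = 0.5)

print(expected_successes, expected_failures, round(p_upper, 2))
```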

Q5. Gallup conducts an annual poll of U.S. residents. Approximately 1,000 residents across all 50 states and Washington D.C. are asked “Do you believe the use of marijuana should be made legal?” The distribution of responses by date of survey is shown in the table below. Imagine a hypothesis test evaluating whether there is a difference from 2012 to 2013 between proportions of “yes” responses. Using the information in the table below, calculate the standard error for this hypothesis test. Choose the closest answer.

• 0.5798
• 0.00048
• 0.4754
• 0.022
• 0.5274

Q6. “In statistical inference for proportions, standard error (SE) is calculated differently for hypothesis tests and confidence intervals.” Which of the following is the best justification for this statement?

• Because if we used the same method for hypothesis tests as we did for confidence intervals, the calculation would be impossible.
• Because statistics is full of arbitrary formulas.
• Because in hypothesis testing we’re interested in the variability of the true population distribution, and in confidence intervals we’re interested in the variability of the sampling distribution.
• Because in hypothesis testing, we assume the null hypothesis is true, hence we calculate SE using the null value of the parameter. In confidence intervals, there is no null value, hence we use the sample proportion(s).

Q7. At the beginning of a semester an anonymous survey was conducted on students in a statistics class. Two of the questions on the survey were about gender and whether or not students have equal, more, or less energy in the afternoon compared to the morning. Below are the results.

What test should we perform to see if gender and energy level are associated?

• F test
• Comparing two proportions
• ANOVA
• Z test
• Comparing two means
• hypothesis test for a single mean
• Chi-square test of goodness of fit
• Chi-square test of independence

Q8. A variety of studies suggest that 10% of the world population is left-handed. It is also claimed that artists are more likely to be left-handed. In order to test this claim we take a random sample of 40 art students at a college and find that 6 of them (15%) are left handed. Which of the following is the correct set-up for calculating the p-value for this test?

• In a bag place 40 chips, 6 red and 34 blue. Randomly sample 40 chips, with replacement, and record the proportion of red chips in the sample. Repeat this many times, and calculate the proportion of samples where at least 10% of the chips are red.
• Randomly sample 40 non-art students, and record the number of left-handed students in the sample. Repeat this many times and calculate the proportion of samples where at least 15% of the students are left-handed.
• Roll a 10-sided die 40 times and record the proportion of times you get a 1. Repeat this many times, and calculate the proportion of simulations where the sample proportion is 10% or more.
• Roll a 10-sided die 40 times and record the proportion of times you get a 1. Repeat this many times, and calculate the proportion of simulations where the sample proportion is 15% or more.
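
A die-rolling simulation of this kind can be sketched in a few lines. The function name, the seed, and the number of repetitions below are illustrative assumptions; the die has 10 sides so that a "1" mimics the 10% null rate, and each simulated sample is compared with the observed count of 6 out of 40.

```python
import random

def simulate_p_value(n=40, observed_count=6, sides=10, reps=10_000, seed=1):
    """Share of simulated samples with at least `observed_count` ones."""
    rng = random.Random(seed)
    extreme = 0
    for _ in range(reps):
        # One simulated sample: roll the die n times and count the 1s.
        ones = sum(1 for _roll in range(n) if rng.randint(1, sides) == 1)
        if ones >= observed_count:
            extreme += 1
    return extreme / reps

p_sim = simulate_p_value()
print(p_sim)
```

Because the estimate comes from random draws, it will vary slightly with the seed and the number of repetitions.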

Q9. True or false: The $\chi^2$ statistic is always non-negative.

• True
• False

Q10. 80% of Americans start the day with a cereal breakfast. Based on this information, determine if the following statement is true or false.

“The sampling distribution of the proportions of Americans who start the day with a cereal breakfast in random samples of size 40 is right skewed.”

• True
• False
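
One way to reason about the shape in Q10 is to check the success-failure condition and then look at the sign of the binomial skewness, $(1 - 2p)/\sqrt{np(1-p)}$. The sketch below uses hypothetical helper names and the values from the question (n = 40, p = 0.8); a negative skewness indicates a left-skewed distribution, a positive one a right-skewed distribution.

```python
import math

def success_failure(n, p, minimum=10):
    """Expected successes and failures, and whether both reach the minimum."""
    successes = n * p
    failures = n * (1 - p)
    return successes, failures, (successes >= minimum and failures >= minimum)

def binomial_skewness(n, p):
    """Skewness of a binomial count (same sign as for the sample proportion)."""
    return (1 - 2 * p) / math.sqrt(n * p * (1 - p))

successes, failures, ok = success_failure(40, 0.8)
skew = binomial_skewness(40, 0.8)

print(round(successes), round(failures), ok, round(skew, 3))
```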

Q11. At a stop sign, some drivers come to a full stop, some come to a ‘rolling stop’ (not a full stop, but slow down), and some do not stop at all. We would like to test if there is an association between gender and type of stop (full, rolling, or no stop). We collect data by standing a few feet from a stop sign and taking note of type of stop and the gender of the driver. What are the hypotheses for testing for an association between gender and type of stop?

• H0: Males and females are equally likely to come to a full stop. HA: Males and females are not equally likely to come to a full stop.
• H0: Gender and type of stop are associated. HA: Gender and type of stop are independent.
• H0: Males and females are equally likely to come to a rolling stop. HA: Males are more likely than females to come to a rolling stop.
• H0: Gender and type of stop are independent. HA: Gender and type of stop are associated.

Q12. Does Weight Watchers work? Researchers randomly divided 500 people into two equal-sized groups. One group spent 6 months on the Weight Watchers program. The other group received a pamphlet about controlling portion sizes. At the end of the study 35% of the subjects in the pamphlet group and 55% of the subjects in the Weight Watchers group had lost at least 10 pounds. To test whether Weight Watchers is more effective for weight loss than pamphlets, a statistician used an index card to represent each subject in the study and wrote whether or not the subject lost at least 10 pounds on the index card. He then shuffled these cards together, and dealt them into two equal-sized groups. Which of the following best describes the expected result?

• The difference between the proportions of cards indicating whether or not the subject lost at least 10 pounds will be about 0.
• The difference between the proportions of cards indicating whether or not the subject lost at least 10 pounds will be about 20%.
• If Weight Watchers was effective, the difference between the proportions of cards indicating whether or not the subject lost at least 10 pounds will be more than 20%.
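
The card-shuffling procedure in Q12 can be mimicked with a short simulation. This is a sketch under stated assumptions: the total of 225 "lost at least 10 pounds" cards is derived from the percentages in the question (0.35 × 250 + 0.55 × 250), and the function name, seed, and number of shuffles are arbitrary choices made here.

```python
import random

def shuffled_difference(rng, n_total=500, n_lost=225):
    """Shuffle all cards, deal two equal groups, return the difference in proportions."""
    cards = [1] * n_lost + [0] * (n_total - n_lost)  # 1 = lost at least 10 pounds
    rng.shuffle(cards)
    half = n_total // 2
    group_a, group_b = cards[:half], cards[half:]
    return sum(group_a) / half - sum(group_b) / half

rng = random.Random(42)
diffs = [shuffled_difference(rng) for _ in range(2000)]
mean_diff = sum(diffs) / len(diffs)

print(round(mean_diff, 3))
```

Shuffling breaks any link between group membership and outcome, so the simulated differences scatter around a common center that can be compared with the observed 20-point gap.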

#### Quiz 3: Week 4 Lab Quiz Answers

Q1. How many people were interviewed for this survey?

• A poll conducted by WIN-Gallup International surveyed 51,917 people from 57 countries
• A poll conducted by WIN-Gallup International surveyed 52,000 people from 57 countries
• A poll conducted by WIN-Gallup International surveyed 51,000 people from 57 countries
• A poll conducted by WIN-Gallup International surveyed 51,927 people from 57 countries

Q2. Which of the following methods were used to gather information?

• Face to face
• Telephone
• Internet
• All of the above

Q3. In the first paragraph, several key findings are reported. These percentages appear to be sample statistics.

• False
• True

Q4. The title of the report is “Global Index of Religiosity and Atheism”. To generalize the report’s findings to the global human population, we must assume that the sample was a random sample from the entire population. This does seem to be a reasonable assumption.

• True
• False

Q5. What does each row of Table 6 correspond to?

• Religions
• Countries
• Individual Persons

Q6. What does each row of atheism correspond to?

• Countries
• Individual Persons
• Religions

Q7. Using the command below, create a new dataframe called us12 that contains only the rows in atheism associated with respondents to the 2012 survey from the United States. Next, calculate the proportion of atheist responses. [TRUE / FALSE] This percentage agrees with the percentage in Table 6.

• True
• False

Q8. Based on the R output, what is the margin of error for the estimate of the proportion of atheists in the US in 2012?

• The margin of error for the estimate of the proportion of atheists in the US in 2012 is 0.0135.
• The margin of error for the estimate of the proportion of atheists in the US in 2012 is 0.025.
• The margin of error for the estimate of the proportion of atheists in the US in 2012 is 0.05.

Q9. Which of the following is false about the relationship between p and ME?

• The ME reaches a minimum at p = 1
• The ME reaches a minimum at p = 0
• The most conservative estimate for calculating a confidence interval occurs when p is set to 1
• The ME is maximized when p = 0.5
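
The relationship between p and the margin of error, $ME = z^{*}\sqrt{p(1-p)/n}$, can be tabulated directly. In the sketch below the sample size n = 1000 is a hypothetical value chosen only for illustration, and z = 1.96 is the usual critical value for 95% confidence.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Margin of error for a single proportion at the given critical value."""
    return z * math.sqrt(p * (1 - p) / n)

n = 1000  # hypothetical sample size, for illustration only
grid = [i / 10 for i in range(11)]  # p = 0.0, 0.1, ..., 1.0
me_by_p = {p: margin_of_error(p, n) for p in grid}

# Value of p on the grid at which the margin of error is largest.
p_at_max = max(me_by_p, key=me_by_p.get)

for p, me in me_by_p.items():
    print(f"p = {p:.1f}  ME = {me:.4f}")
```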

Q10. There is convincing evidence that Spain has seen a change in its atheism index between 2005 and 2012.

• False
• True

Q11. There is convincing evidence that the United States has seen a change in its atheism index between 2005 and 2012.

• False
• True

Q12. If in fact there has been no change in the atheism index in the countries listed in Table 4, in how many of those countries would you expect to detect a change (at a significance level of 0.05) simply by chance? Hint: Type 1 error.

• 5
• 1
• 1.95
• 0

Q13. Suppose you’re hired by the local government to estimate the proportion of residents that attend a religious service on a weekly basis. According to the guidelines, the estimate must have a margin of error no greater than 1% with 95% confidence. You have no idea what to expect for p. How many people would you have to sample to ensure that you are within the guidelines? Hint: Refer to your plot of the relationship between p and margin of error. Do not use the data set to answer this question.

• 2401 people
• 9604 people
• At least 2401 people
• At least 9604 people
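
Solving $ME = z^{*}\sqrt{p(1-p)/n}$ for n gives $n \geq z^{*2}\,p(1-p)/ME^{2}$. A minimal sketch of that calculation follows, assuming z = 1.96 for 95% confidence and the conservative p = 0.5; the function name is hypothetical, and exact fractions are used so that rounding up is not affected by floating-point error.

```python
import math
from fractions import Fraction

def required_sample_size(margin, p=Fraction(1, 2), z=Fraction("1.96")):
    """Smallest whole n whose margin of error does not exceed `margin`."""
    n_exact = z**2 * p * (1 - p) / margin**2
    return math.ceil(n_exact)

# Margin of error of 1% with the conservative choice p = 0.5.
n_required = required_sample_size(Fraction("0.01"))
print(n_required)
```

Because any smaller sample would exceed the allowed margin, the result is a minimum rather than an exact requirement.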

### Conclusion  