STAT 250 Spring 2019 Data Analysis Assignment 2
Your submitted document should include the following items. Points will be deducted if the
following are not included.
1. Type your Name and STAT 250 with your correct section number (e.g. STAT 250-xxx)
right justified and then Data Analysis Assignment #2 centered on the top of page 1
corresponding number and subpart. Keep the answers in order. Do not include the
4. Generate all requested graphs and tables using StatCrunch.
5. Upload your document onto Blackboard as a Word (docx) file or pdf file using the link
6. You may not work with other individuals on this assignment. It is an honor code
violation if you do.
Elements of good technical writing:
Use complete and coherent sentences to answer the questions.
Graphs must be appropriately titled and should refer to the context of the question.
Graphical displays must include labels with units if appropriate for each axis.
Units should always be included when referring to numerical values.
When making a comparison you must use comparative language, such as “greater than”, “less
than”, or “about the same as.”
Ensure that all graphs and tables appear on one page and are not split across two pages.
Type all mathematical calculations when directed to compute an answer ‘by-hand.’
Pictures of actual handwritten work are not accepted on this assignment.
When writing mathematical expressions into your document you may use either an equation
editor or common shortcuts such as:
√x can be written as sqrt(x), p̂ can be written as p-hat, and x̄
can be written as x-bar.
Problem 1: Game Spinner
We will be comparing empirical (relative frequencies based on an observation of a real-life
process) to theoretical (long-run relative frequency) probabilities. We will use StatCrunch to
simulate this process using a board game spinner three times so that we can determine the total
number of spaces moved in three turns. The board game spinner looks like the image below.
The spinner is equally likely to land on any given section.
a) Build a probability distribution table for the result of a single spin. Present this
b) Simulate using the spinner 3 times by following the steps below:
Step 1: Open up the data set “Spinner Options”. This contains the value on the 8
spinner locations, each option of which is equally likely to occur.
Step 2: Click on Applets → Spinner (the second blue box at the top).
Step 3: For the “Labels” option, select Spaces. For the “Weights” option, select
Weights.
Step 4: Select Compute!
Step 5: A new window labeled “Spinner experiment” should open. Click the Spin
button and StatCrunch will simulate using the spinner once. Click this button 2 more
times so that you have 3 results.
Step 6: Click the Analyze button. The data from your 3 spins should now be stored as
a column in StatCrunch.
Step 7: Resize the Spinner experiment window so that it contains 1 row of 3 spins.
c) In the Spinner experiment window, click the Reset button. Simulate another 3 spins and
store the results in StatCrunch by clicking the Analyze button. Copy an image of the
times to produce a total of four sets of three spins.
d) To simulate 3 spins 100 times and find the total number of spaces moved for each, use
the following steps:
Step 1: Under Data → Simulate → select Custom
Step 2: Under “Values in:”, select Spaces. Under “Weights in:”, select Weights.
Step 3: Under “Number of rows and columns:”, enter 3 for Rows and 95 for Columns.
Step 4: Select Compute! You should now have 100 (including 2 from use of the
applet) columns with information for 3 spins.
Step 5: From here go to Stat → Summary Stats → Columns
Step 6: Select all columns except for the Spaces and Weights columns (to do this, click
on your first “Spins” in the select column(s) box, hold the Shift key, scroll down, and
select “Custom95”). You should see 100 columns selected in the white box.
Step 7: Under “Statistics:”, select only Sum.
Step 8: Under “Output:”, check the box for Store in data table.
Step 9: Click Compute!
Make a properly titled and labeled relative frequency histogram out of the resulting Sum
(d).
e) Use your results in part (d) to find the empirical probability of moving 10 or more spaces
in 3 spins. Show the calculation for this empirical probability and state your probability
as a decimal rounded to three decimal places.
f) Calculate the theoretical probability of moving 10 or more spaces in 3 spins (i.e.
obtaining the sum of the spins to be 10 or greater). Use your probability distribution in
part (a) and note that spins are independent. (Hint: Recognize that getting a 1 on spin 1, a
2 on spin 2, and a 1 on spin 3 is a different result than getting a 1 on spin 1, a 1 on spin 2,
and a 2 on spin 3.) Show how you obtained this probability and provide the answer.
g) In a sentence, compare your empirical probability from part (e) to your theoretical
probability in part (f).
h) How would you expect the empirical probability in part (e) to change if it had been based on
a simulation of 1000 repetitions and why? Answer this question in one to two sentences.
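The simulation in part (d) and the enumeration in part (f) can be sketched outside StatCrunch as follows. This is only an illustration: the spinner labels in SPACES are hypothetical placeholders (the real values are in the “Spinner Options” data set), and the function names are made up for this sketch.

```python
import random
from fractions import Fraction
from itertools import product

# ASSUMPTION: hypothetical labels for the 8 equally likely sections.
# Substitute the values from the "Spinner Options" data set.
SPACES = [1, 2, 3, 4, 1, 2, 3, 4]


def empirical_prob(spaces, reps=100, spins=3, threshold=10, seed=0):
    """Share of simulated repetitions whose spin total is >= threshold."""
    rng = random.Random(seed)
    hits = sum(
        sum(rng.choice(spaces) for _ in range(spins)) >= threshold
        for _ in range(reps)
    )
    return hits / reps


def theoretical_prob(spaces, spins=3, threshold=10):
    """Exact probability from every ordered outcome of the spins."""
    # Ordered outcomes matter: (1, 2, 1) and (1, 1, 2) are counted separately.
    outcomes = list(product(spaces, repeat=spins))
    favorable = sum(1 for outcome in outcomes if sum(outcome) >= threshold)
    return Fraction(favorable, len(outcomes))


print("empirical (100 reps):  ", round(empirical_prob(SPACES, reps=100), 3))
print("empirical (1000 reps): ", round(empirical_prob(SPACES, reps=1000), 3))
print("theoretical:           ", float(theoretical_prob(SPACES)))
```

Because each section is equally likely, every ordered triple of spins has the same probability, so the theoretical value is simply favorable outcomes divided by total outcomes. Rerunning the empirical function with more repetitions shows the estimate settling closer to that value.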
Problem 2: Main Street Speed Limit
A portion of Main Street (Route 236) in Fairfax, VA has a posted speed limit of 35 miles per
hour. Fairfax police collected data on the actual speeds of a sample of 338 vehicles driving on
this portion of Main Street between 2:30 and 3:30 p.m. The data set “Main Street Speed Data”
contains this sample of vehicle speeds (in MPH) collected over the past six months.
a) Use StatCrunch to construct an appropriately titled and labeled relative frequency
histogram of the vehicle speeds stored in the “Speed” variable. Copy your histogram into
b) What is the shape of this distribution? Answer this question in one complete sentence.
c) Now overlay your histogram with a Normal curve and add a vertical line at the mean.
This can be done by going to Options → Edit in the top left corner of your graph. Inside
the histogram graph box, look for Display Options. Next to “Overlay distrib.:” click the
arrow next to the word –optional– and select Normal. Then, check the box next to mean
under the word “Markers.” Copy and paste this histogram into your document.
d) Do you think it is reasonable to use the normal model in this case? Answer this question
in one complete sentence.
e) Calculate the sample size, the mean, and the standard deviation of the “Speed” variable
using StatCrunch. (Select Stat → Summary Stats → Columns.) Copy and paste this
table into your document. Round the mean and standard deviation to two decimal places
inside this table.
For parts (f) – (h), assume that the distribution of all vehicle speeds in the population is Normal
with the mean and standard deviation found in Part (e) (again use the rounded mean and standard
deviation values). Note: you are using the Normal distribution for the next three calculations.
f) Calculate the probability that a randomly selected vehicle is driving above the posted
speed limit of 35 miles per hour. First, draw a picture with the mean labeled, shade the
area representing the desired probability, standardize, and use the Standard Normal Table
(Table 2 in your text) to obtain this probability. Please take a picture of your hand-drawn
sketch and upload it to your Word document (if you do not have this technology, you
may use any other method (e.g. Microsoft Paint) to sketch the image). You must type the
rest of your “by hand” work to earn full credit.
g) Verify your answer in part (f) using the StatCrunch Normal calculator (see instructions
below) and copy that image into your document. In addition, write one sentence to
explain what the probability means in context of the question.
h) Use StatCrunch only to calculate the probability that a randomly selected vehicle was
driving between 33 and 37 miles per hour. Copy the Normal distribution image from
to explain what the probability means in context of the question.
i) Suppose the police department decided that the top 20% of speeds would automatically
receive a speeding ticket. Determine the minimum speed for which a driver would
receive a speeding ticket. This speed or any speed above it will receive a ticket. Draw a
picture (or two), shade the area, and use Table 2 to solve this problem. Please take a picture
of your hand-drawn sketch and upload it to your Word document (if you do not have this
technology, you may use any other method (e.g. Microsoft Paint) to sketch the image).
You must type the rest of your “by hand” work to earn full credit.
j) Verify your answer in part (i) using the StatCrunch Normal calculator (see instructions
below) and copy that image into your document. In addition, write one sentence to
explain what the probability means in context of the question.
Steps to produce StatCrunch Normal graphs.
Step 1: Open the calculator by selecting Stat → Calculators → Normal as shown below.
Standard – shows area above or below a specified x value.
Between – shows area between two specified x values.
Enter the value of the mean and standard deviation.
Select to change the direction of the inequality sign to match question.
Enter either a value in first box to find probability OR a probability in the last box to find a value.
Step 2: Enter the values for the mean and standard deviation found in Problem 2, part (e) into their
respective boxes.
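The three Normal calculations in parts (f) through (j) follow the same pattern, which can be sketched with Python's standard library. The mean and standard deviation below are placeholders, not values from the data set; replace them with the rounded values from the part (e) summary table.

```python
from statistics import NormalDist

# ASSUMPTION: placeholder values only. Use the rounded mean and standard
# deviation from the part (e) summary table instead.
MEAN_MPH = 36.00
SD_MPH = 4.00

speeds = NormalDist(mu=MEAN_MPH, sigma=SD_MPH)

# Parts (f)/(g): standardize, then take the upper-tail area P(X > 35).
z = (35 - MEAN_MPH) / SD_MPH
p_above_35 = 1 - NormalDist().cdf(z)

# Part (h): P(33 < X < 37) is the difference of two cumulative areas.
p_between = speeds.cdf(37) - speeds.cdf(33)

# Parts (i)/(j): the top 20% starts at the 80th percentile of speeds.
ticket_cutoff = speeds.inv_cdf(0.80)

print(f"z = {z:.2f}, P(X > 35) = {p_above_35:.4f}")
print(f"P(33 < X < 37) = {p_between:.4f}")
print(f"Minimum ticketed speed = {ticket_cutoff:.2f} MPH")
```

A table lookup rounds z to two decimal places, so a by-hand answer may differ from the calculator value in the third or fourth decimal place.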
Problem 3: Celiac Disease
Celiac disease is an autoimmune disorder where the ingestion of gluten leads to damage in the
small intestine. Left untreated, celiac disease can lead to the development of other autoimmune
disorders like Type I diabetes, multiple sclerosis, anemia, and osteoporosis. Generally, the later
in life that celiac disease is diagnosed, the higher the chances of developing another autoimmune
condition. In fact, it is known that 34% of individuals with celiac disease that is first diagnosed
when they are 21 years of age or older will develop another autoimmune condition. Suppose we
are interested in the number of individuals that develop another autoimmune disorder in a
random sample of 9 people with celiac disease first diagnosed after they turn 21. Assume these
people are independent of each other.
a) Check if this situation fits the binomial setting. Write four complete sentences, one
addressing each requirement.
b) Assuming this situation is a binomial experiment, build the probability distribution in
table form in StatCrunch. There are two ways to do this. You may use Data → Compute
→ Expression and choose the function dbinom. This method relies on you entering the
values of the random variable in the first column of your data table. The other way to do
this is to use the binomial calculator and calculate the probability of each of the values of
the random variable from X = 0 to X = 9. You may present this table horizontally or
vertically and leave the probabilities unrounded.
c) Calculate the probability that exactly three people in the sample develop another
autoimmune disorder using the StatCrunch binomial calculator. Copy this image from
to explain what the probability means in context of the question.
d) Calculate the probability that no more than 6 people in the sample develop another
autoimmune disorder using the StatCrunch binomial calculator. Again, provide a
StatCrunch binomial calculator graph to display your answer. Then, once you obtain
your answer, write one sentence to explain what the probability means in context of the
question.
e) Calculate the probability that between 2 and 5 people in the sample (inclusive) develop
another autoimmune disorder. Show your work using the probability distribution you
built in part (b) to answer this question. Then, verify it with a StatCrunch binomial
calculator graph and include this image in your document as well. Finally, once you
obtain your answer, write one sentence to explain what the probability means in context
of the question.
f) Calculate the mean and standard deviation of this probability distribution. Show your
work using the binomial mean and standard deviation formulas and provide your answers
in your document. (No need to use StatCrunch for this part).
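As a cross-check on the StatCrunch output, the binomial calculations in parts (b) through (f) can be sketched in Python using n = 9 and p = 0.34 from the problem statement. The function and variable names are illustrative only.

```python
from math import comb, sqrt

N, P = 9, 0.34  # sample size and success probability from the problem


def binom_pmf(k, n=N, p=P):
    """P(X = k) for a binomial random variable with parameters n and p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part (b): probability distribution for X = 0, 1, ..., 9.
pmf = {k: binom_pmf(k) for k in range(N + 1)}

p_exactly_3 = pmf[3]                             # part (c): P(X = 3)
p_at_most_6 = sum(pmf[k] for k in range(0, 7))   # part (d): P(X <= 6)
p_2_to_5 = sum(pmf[k] for k in range(2, 6))      # part (e): P(2 <= X <= 5)

# Part (f): binomial mean and standard deviation formulas.
mean = N * P
sd = sqrt(N * P * (1 - P))

for k, prob in pmf.items():
    print(f"P(X = {k}) = {prob}")
print(f"P(X = 3) = {p_exactly_3:.4f}")
print(f"P(X <= 6) = {p_at_most_6:.4f}")
print(f"P(2 <= X <= 5) = {p_2_to_5:.4f}")
print(f"mean = {mean:.2f}, sd = {sd:.4f}")
```

Note that “no more than 6” includes 6, and “between 2 and 5 (inclusive)” includes both endpoints, which is why the ranges run to 7 and 6 respectively.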
Problem 4: Building a Sampling Distribution
We will use the Sampling Distribution applet in StatCrunch to investigate properties of the
sampling distribution of the proportion of students that find themselves distracted by their cell
phone during class. Historically, it is known that 72% of students get distracted by their cell
phone. Under Applets, open the Sampling distribution applet (box shown below). First, select
Binary for the population, then enter the value for p = 0.72, the proportion of students who are
distracted next to “p:” Then click on Compute. See image below.
a) Once the applet box is opened, enter 10 in the box to the right of the words “sample size”
in the right middle of the applet box window (see image below). Then, at the top of the
applet, click “1 time.” Watch the resulting animation. When the sample is completed,
copy and paste the entire applet box (using options → copy) into your document.
b) Click Reset at the top of the applet. Then, click the “1000 times” to take 1000 samples of
size 10. Copy and paste the applet image into your document.
c) Describe the shape of the Sample Proportions graph at the bottom of your image from
part (b) in one sentence.
d) Why do you think that this graph does not have an approximately Normal shape? Use the
Central Limit Theorem large sample size condition to answer this question in one
sentence. Explicitly show these calculations.
e) Click Reset at the top of the applet. Type 100 in the sample size box. Then, click the
“1000 times” to take 1000 samples of size 100. Copy and paste the applet image into
f) Describe the shape of the Sample Proportions graph at the bottom of your image from
part (e) in one sentence.
g) Why do you think that the graph from part (e) has the shape you described in part (f)? Use the
Central Limit Theorem large sample size condition to answer this question in one
sentence. Explicitly show these calculations.
h) Using the image in part (e), write the values you obtained for the mean (in green) and the
standard deviation (in blue). These values are found in the bottom right box labeled
“Sample Prop. of 1s.”
i) Compare the mean value (in green, found in part (h)) to the known population proportion
in one sentence.
j) Now calculate the standard error of the sample proportion using p = 0.72 and n = 100 by
hand. Show this calculation “by-hand” and round your answer to three decimal places.
k) Compare the value in part (j) to the standard deviation (in blue) you obtained in part (h)
in one sentence.
l) Finally, use the sampling distribution defined by the Central Limit Theorem to calculate
the probability that from a sample of 100 students at least 80% are distracted by their cell
phones (using p = 0.72 and the standard error found in part (j)). Show your work by
using the formula to calculate the z-value and using the standard Normal probability table
m) Interpret the resulting probability from part (l) in context.
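The large sample size check in parts (d) and (g), the standard error in part (j), and the probability in part (l) can be sketched as below. The cutoff of 10 for the expected counts is an assumption (textbooks differ on this threshold), so use the one stated in your course text; the function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P = 0.72        # known population proportion from the problem
THRESHOLD = 10  # ASSUMPTION: cutoff for expected counts; check your text


def expected_counts(n, p=P):
    """Expected successes n*p and expected failures n*(1 - p)."""
    return n * p, n * (1 - p)


def standard_error(n, p=P):
    """Standard error of the sample proportion, sqrt(p(1 - p) / n)."""
    return sqrt(p * (1 - p) / n)


for n in (10, 100):
    successes, failures = expected_counts(n)
    ok = min(successes, failures) >= THRESHOLD
    print(f"n = {n}: np = {successes:.1f}, n(1-p) = {failures:.1f}, met: {ok}")

# Parts (j) and (l): standard error, z-value, and upper-tail probability.
se = standard_error(100)
z = (0.80 - P) / se
p_at_least_80 = 1 - NormalDist().cdf(z)

print(f"SE = {se:.3f}, z = {z:.2f}, P(p-hat >= 0.80) = {p_at_least_80:.4f}")
```

The sketch keeps the unrounded standard error when computing z. Rounding the standard error to three decimal places first, as part (l) directs, can shift the result slightly.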
Sample Solution to Display Formatting
A random sample of 30 students was selected from a STAT 250 course taught during the
summer session and their first exam scores were recorded.
a) Create a histogram in StatCrunch. Be sure to title and label it correctly.
b) Interpret the histogram’s shape.
See sample solution and formatting on page 2.
Following the main points will help you submit a professionally completed assignment.
1) Right justify your name and provide your correct section and the due date.
2) Center the specific homework assignment title.
3) Bold each complete problem number.
4) The graph can be around the below size for readability (click on the graph once and only
adjust the size of the graph by using the bottom right dot).
5) Keep the assignment in problem and part order (present 1a, then 1b, and so on).
Kenneth Strazzeri