This assignment has two questions (Q1 and Q2), each broken down into a few sub-questions. I have provided the lecture notes document; only pp. 73-93 are needed for this. The assignment should take no longer than an hour. Thank you.
assignment_8.doc
lecture_notes_c.doc
Unit/Assignment 8 Research Results & Interpretation: Inferential Statistics (Monday March 25-Sunday March 31)
Reading:
Chapter 7: Generalizing From Research Results: Inferential Statistics
Lecture Notes: Under ‘Resources,’ you will find ‘Lecture Notes C,’ pp. 73-93.
Assignment 8:
From Book Chapter & Lecture Notes:
Q1. Suppose you wish to test the hypothesis that the number of print media (newspaper, magazine,
and so forth) that people subscribe to is related to subscribers’ education levels. The following
hypothetical data was gathered from five people.
Number of subscriptions, x     4    5    1    2    3
Education level, y (years)    18   20    5    9   15
(1) Draw a scatter diagram.
[Scatter diagram axes: y = education level (years), x = number of subscriptions]
(2) Formulate null and research hypotheses that can be tested adequately using a correlation
analysis.
H0: ______________________________________ (ρ = 0)
H1: ______________________________________ (ρ ≠ 0)
(3) If the computed t value is in the rejection region (r=.79, p<.05), can you reject the null
hypothesis at the p<.05 level of significance? Why? Show your work for t test.
Q2. Design a media effects study for which a correlation analysis is appropriate.
Then, provide the following information to conduct the study.
IV: ____________________________
DV:____________________________
H1: _______________________________________________________
Assignment Point: 3 points (3% of 100 semester points)*
*To receive the maximum 3 points, you should provide all relevant answers/information to the questions/exercises.
This is an individual assignment, not a group assignment. If you have any questions about the assignment,
please don’t hesitate to e-mail me between Monday and Friday (I am not available on Saturday & Sunday).
Assignment Due: All work for this unit is due by Sunday March 31, 2019 at 11:59 pm.
Please email your assignment directly to me: [email protected] (NOT via Reggienet).
Lecture Notes C
Contents:
1. Descriptive Statistics vs. Inferential (Inductive) Statistics (p. 2)
2. Hypothesis Test (t-test) (p. 39)
3. Hypothesis Test (z-test) (p. 52)
4. Hypothesis Test (T-test) (p. 65)
5. Correlation Analysis (p. 73)
1. Descriptive Statistics vs. Inferential Statistics
Descriptive Statistics: Methods used to describe the data
that have been collected.
Inferential Statistics: A decision, estimate, prediction, or
generalization about a population
based on a sample.
Important Statistical Symbols
                      Population    Sample
Mean                  µ (mu)        x̄
Variance              σ²            s²
Standard Deviation    σ (sigma)     s
Frequency             N             n
Descriptive Statistics
Average:
A single value that is representative of the set of
data considered as a whole. An average locates
the "center" of a set of data. The commonly
used averages are the arithmetic mean, the
median, and the mode.
Dispersion: The counterpart of the average.
It represents the deviation from the mean and
tells how values scatter about the average score.
1. Mean:
An average, computed by summing the values
of several observations and dividing by the
number of observations.
2. Median: Another average, representing the value of the
"middle" case in a rank-ordered set of
observations.
3. Mode: Still another average, representing the most
frequently observed value or attribute.
Average
1. Mean (x̄): Sum of the values divided by the number of values.
x̄ = ∑x / n   (∑x = sum of the values, n = number of values)
2. Median: The midpoint of the values after they have
been arranged from the smallest to the largest.
Midpoint (Median) Formula: (n + 1) / 2
3. Mode: The value of the observation that appears most often.
(Ex 1) Average
Midterm grades from 7 students: 80, 90, 90, 100, 85, 90, 95.
Student     Grade
Jason        80
Zac          90
Jennifer     90
Morgan      100
Matt         85
Emily        90
Jane         95
1. Mean (the arithmetic average):
(80 + 90 + 90 + 100 + 85 + 90 + 95) / 7 = 90
2. Median (the number in the middle):
In order to find the median, you have to put the
values in order from lowest to highest, then find the
number that is exactly in the middle:
80, 85, 90, [90], 90, 95, 100
Midpoint: (n + 1) / 2 = (7 + 1) / 2 = 4, so the median is the 4th value.
* n (sample size: 7 students)
Median is 90.
3. Mode (the value that occurs most often):
80, 85, 90, 90, 90, 95, 100
Since there are three 90's, the mode is 90.
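The three averages in Ex 1 can be checked with a short script. This is only a sketch using Python's standard `statistics` module; the variable names are illustrative and do not come from the notes.

```python
import statistics

# Midterm grades of the 7 students in Ex 1
grades = [80, 90, 90, 100, 85, 90, 95]

mean_grade = statistics.mean(grades)      # sum of the values / number of values
median_grade = statistics.median(grades)  # middle value after sorting
mode_grade = statistics.mode(grades)      # most frequently observed value

# Position of the median in the sorted list, from (n + 1) / 2 (n is odd here)
midpoint_position = (len(grades) + 1) // 2

print(mean_grade, median_grade, mode_grade, midpoint_position)
```

Sorting is handled inside `statistics.median`, so the grades can be entered in any order.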
(Ex 2) Average
Calculate average age of 17 students:
21, 19, 22, 20, 23, 22, 19, 21, 20, 21, 22, 20, 22, 21, 20, 21, 23
1. Mean Age:
__________
2. Median Age:
__________
3. Mode Age:
__________
(Ex. 2) Age from 17 students:
21, 19, 22, 20, 23, 22, 19, 21, 20, 21, 22, 20, 22, 21, 20, 21,23
Mean:
(21+19+22+20+23+22+19+21+20+21+22+20+22+21+20+21+23)/17= 21
Median: The number in the middle
19, 19, 20, 20, 20, 20, 21, 21, [21], 21, 21, 22, 22, 22, 22, 23, 23
Midpoint: (n* + 1) / 2 = (17 + 1) / 2 = 9th person
* n (sample size: 17 students)
Median is 21 (the 9th value).
Mode: The value that occurs most often
19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23
2 cases (19 years old)
4 cases (20 years old)
5 cases (21 years old)
4 cases (22 years old)
2 cases (23 years old)
Mode is 21 (5 cases).
Frequency Distribution Table
(Ex. 2) Age from 17 students:
21, 19, 22, 20, 23, 22, 19, 21, 20, 21, 22, 20, 22, 21, 20, 21, 23
Age (x)    Frequency (f)    fx
19          2                38
20          4                80
21          5               105
22          4                88
23          2                46
Total      17               ∑fx = 357
Mean (x̄): 357 ÷ 17 = 21
Median: 21 (9th value, from (n + 1) / 2)
Mode: 21
Frequency Distribution (Graph)
[Graph: frequency (y) plotted against age (x), with points at (19, 2), (20, 4), (21, 5), (22, 4), and (23, 2)]
Standard Normal Distribution
THE NORMAL DISTRIBUTION HAS THE FOLLOWING
CHARACTERISTICS
1. The normal curve has a single peak at the precise center of the
distribution. The mean, median, and mode, which in a normal
distribution are equal, are all located at the peak. Therefore,
exactly one-half, or 50%, of the area under the bell-shaped
curve is below the center of the distribution, and exactly one-half of the area is above it.
2. A normal distribution is symmetrical about its mean.
If you were to fold the distribution along its central value,
the two halves would be identical.
3. The normal curve falls off smoothly in a "bell shape" and
the two tails of the distribution extend indefinitely in either
direction.
Bell-shaped and symmetric
Not symmetric
- Long tail on the left.
- Mean is toward tail.
Not bell-shaped
- Long tail on the right.
- Mean is toward tail.
(Ex 1) Course Packet p.23
A researcher is studying the impact of television on the family. Of
particular interest is the number of hours of television school-aged
children watch each day. A random sample of 50 homes reveals the
following number of hours watched on an average weekday.
Number of Hours (x)    2    4    6    8    9
Frequency (f)          9   11   15   12    3
Q. Compute Mean, Median, and Mode TV viewing hours.
Frequency Table
x (TV Hour)    Freq (f)    fx
2               9           18
4              11           44
6              15           90
8              12           96
9               3           27
Total          n = 50      ∑fx = 275
Mean: 275 / 50 = 5.5
Median: 6
Mode: 6
Midpoint: (50+1)/2 = 25.5
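The frequency-table calculation above can be sketched in Python. The dictionary layout and the helper `value_at` are illustrative choices, not part of the notes; the median step averages the two values on either side of the 25.5 midpoint.

```python
# Frequency table from the TV-viewing example: hours watched -> number of homes
freq = {2: 9, 4: 11, 6: 15, 8: 12, 9: 3}

n = sum(freq.values())                        # total number of homes
sum_fx = sum(x * f for x, f in freq.items())  # sum of f * x
mean_hours = sum_fx / n

# Mode: the value with the largest frequency
mode_hours = max(freq, key=freq.get)


def value_at(position, table):
    """Return the value at a 1-based position in the ordered data."""
    cumulative = 0
    for x in sorted(table):
        cumulative += table[x]
        if position <= cumulative:
            return x
    raise ValueError("position is beyond the end of the data")


# Median: for odd n take the (n + 1) / 2-th value; for even n the midpoint
# falls between two positions, so average the values on either side.
if n % 2 == 1:
    median_hours = value_at((n + 1) // 2, freq)
else:
    median_hours = (value_at(n // 2, freq) + value_at(n // 2 + 1, freq)) / 2

print(n, sum_fx, mean_hours, median_hours, mode_hours)
```

Walking the cumulative frequencies avoids expanding the table into 50 separate values, which is the same reasoning used when reading the median off the table by hand.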
Dispersion
Dispersion: The degree of scatter of data about an
average value, usually the mean or median.
It represents the deviation from the mean and
tells how values scatter about the average score.
Measures of dispersion are:
- Range
- Variance
- Standard Deviation
1. Range: The difference between the highest and
lowest observations in a set of data.
2. Variance: The arithmetic mean of the squared
deviations from the mean.
S² = ∑d² / (n - 1)
3. Standard Deviation: The square root of the arithmetic
mean of the squared deviations from the mean.
S = √S² = √( ∑d² / (n - 1) )
(Ex 1) Dispersion (The amount of variability in a set of scores)
TV viewing hours from 9 students: 2, 3, 4, 5, 5, 5, 6, 7, 8
Range: ___________________
Variance: __________________
Standard Deviation: _________
1. Range: The difference between the highest and lowest data element.
Range: 8 - 2 = 6 hours
2. Variance (Four Steps to get Variance):
Step 1: Compute the sample mean (x̄) to get the deviation
from the mean.
Step 2: Compute the deviation (d = x - x̄).
Step 3: Square each value of d (d²).
Step 4: Use variance formula.
Values of x    Deviations of x from the mean (d = x - x̄)    Deviations squared (d²)
(Step 1)       (Step 2)                                      (Step 3)
2              2 - 5 = -3                                    9
3              3 - 5 = -2                                    4
4              4 - 5 = -1                                    1
5              5 - 5 = 0                                     0
5              5 - 5 = 0                                     0
5              5 - 5 = 0                                     0
6              6 - 5 = 1                                     1
7              7 - 5 = 2                                     4
8              8 - 5 = 3                                     9
x̄ = 5                                                        ∑d² = 28 (Sum of the Squared Deviations)
Step 4: S² = ∑d² / (n - 1) = 28 / (9 - 1) = 3.5
3. Standard Deviation:
The standard deviation is another way to calculate dispersion.
This is the most common and useful measure because it is the
average distance of each score from the mean.
The formula for sample standard deviation is as follows.
Variance (S²) = ∑d² / (n - 1) = 28 / (9 - 1) = 3.5
Standard Deviation (S) = √( ∑d² / (n - 1) ) = √( 28 / (9 - 1) ) = √3.5 = 1.87
Standard deviation (Reason for taking the square root of the variance)
- The variance alone can be misleading as a measure of dispersion.
- Squaring each deviation (d²) gives too much weight to extreme values,
and the result is expressed in squared units.
- The standard deviation is the square root of the variance, which returns the measure to the original units.
- The standard deviation is generally the most useful measure of dispersion.
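The four steps for the variance, followed by the square root for the standard deviation, can be written out directly. The sketch below uses the nine TV-viewing values from the notes; the variable names are illustrative, and the last line cross-checks the hand-built result against `statistics.variance`, which also divides by n - 1.

```python
import math
import statistics

# TV viewing hours from 9 students
hours = [2, 3, 4, 5, 5, 5, 6, 7, 8]

value_range = max(hours) - min(hours)          # range: highest - lowest

mean_hours = sum(hours) / len(hours)           # Step 1: sample mean
deviations = [x - mean_hours for x in hours]   # Step 2: d = x - mean
squared = [d ** 2 for d in deviations]         # Step 3: d squared
sum_sq = sum(squared)                          # sum of the squared deviations
variance = sum_sq / (len(hours) - 1)           # Step 4: divide by n - 1
std_dev = math.sqrt(variance)                  # square root of the variance

# Cross-check against the standard library's sample variance
assert math.isclose(variance, statistics.variance(hours))

print(value_range, sum_sq, variance, round(std_dev, 2))
```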
Interpretation of Standard Deviation
Interpretation (±1 SD):
About 68% of the values fall in between (x̄ - 1 SD) and (x̄ + 1 SD).
Interpretation (±2 SD):
About 95% of the values fall in between (x̄ - 2 SD) and (x̄ + 2 SD).
Interpretation (±3 SD):
About 99% of the values fall in between (x̄ - 3 SD) and (x̄ + 3 SD).
TV viewing hours from 9 students: 2, 3, 4, 5, 5, 5, 6, 7, 8
Mean (x̄): 5
Standard Deviation: 1.87 (1 SD); 1.87 + 1.87 = 3.74 (2 SD); 1.87 + 1.87 + 1.87 = 5.61 (3 SD)
Interpretation (±1 SD):
About 68% of the values fall in between (x̄ - 1 SD) and (x̄ + 1 SD).
About 68% of the sampled students watch television
between (5 - 1.87) and (5 + 1.87) hours per day.
About 68% of the sampled students watch television
between 3.13 and 6.87 hours per day.
Interpretation (±2 SD):
About 95% of the values fall in between (x̄ - 2 SD) and (x̄ + 2 SD).
About 95% of the sampled students watch television
between (5 - 3.74) and (5 + 3.74) hours per day.
About 95% of the sampled students watch television
between 1.26 and 8.74 hours per day.
Interpretation of Standard Deviation
[Figure: normal curve with bands marked 68% (±1 SD), 95% (±2 SD), and 99% (±3 SD)]
Interpretation (±1 Standard Deviation)
[Figure: x̄ = 5 hrs; -1 SD = 3.13 hrs (5 hrs - 1.87); +1 SD = 6.87 hrs (5 hrs + 1.87)]
Interpretation: About 68% of the values fall in between 3.13 and 6.87 hrs.
Interpretation: About 68% of the sampled students watch television
between 3.13 and 6.87 hours per day.
Interpretation (±2 Standard Deviation)
[Figure: x̄ = 5 hrs; -2 SD = 1.26 hrs (5 hrs - 1.87 - 1.87); +2 SD = 8.74 hrs (5 hrs + 1.87 + 1.87)]
Interpretation: About 95% of the values fall in between 1.26 and 8.74 hrs.
Interpretation: About 95% of the sampled students watch television
between 1.26 and 8.74 hours per day.
The greater the deviations from the mean, the larger the standard deviation.
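The ±1 SD and ±2 SD intervals for the TV-viewing data can be reproduced with a few lines of Python. This is a sketch: `sd_interval` and `share_within` are hypothetical helper names introduced here, and the second helper simply counts how many of the nine observed values land inside each interval.

```python
import statistics

# TV viewing hours from 9 students
hours = [2, 3, 4, 5, 5, 5, 6, 7, 8]

mean_hours = statistics.mean(hours)   # sample mean
sd_hours = statistics.stdev(hours)    # sample SD (n - 1 in the denominator)


def sd_interval(mean, sd, k):
    """Return (mean - k*SD, mean + k*SD), rounded to two decimals."""
    return (round(mean - k * sd, 2), round(mean + k * sd, 2))


def share_within(values, low, high):
    """Fraction of the observed values that lie between low and high."""
    inside = [v for v in values if low <= v <= high]
    return len(inside) / len(values)


one_sd = sd_interval(mean_hours, sd_hours, 1)
two_sd = sd_interval(mean_hours, sd_hours, 2)

print(one_sd, share_within(hours, *one_sd))
print(two_sd, share_within(hours, *two_sd))
```

In this small sample, 5 of the 9 values fall within ±1 SD and all 9 fall within ±2 SD. The 68% and 95% figures describe a normal distribution, so a sample this small only approximates them.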
(Ex 3) Twelve families live on Cherry Hill Circle.
The number of children in each family is:
1, 2, 3, 5, 4, 3, 3, 4, 5, 3, 5, and 4.
Q. Use the appropriate formula to compute
the variance and the standard deviation.
Variance (Four Steps to get Variance):
Step 1: Compute the sample mean (x̄) to get the deviation
from the mean.
Step 2: Compute the deviation (d = x - x̄).
Step 3: Square each value of d (d²).
Step 4: Use the Variance formula.
Then, use the Standard Deviation formula.
Values of x    Deviations of x from the mean (d = x - x̄)    Deviations squared (d²)
1              1 - 3.5 = -2.5                                6.25
2              2 - 3.5 = -1.5                                2.25
3              3 - 3.5 = -.5                                 .25
3              3 - 3.5 = -.5                                 .25
3              3 - 3.5 = -.5                                 .25
3              3 - 3.5 = -.5                                 .25
4              4 - 3.5 = +.5                                 .25
4              4 - 3.5 = +.5                                 .25
4              4 - 3.5 = +.5                                 .25
5              5 - 3.5 = +1.5                                2.25
5              5 - 3.5 = +1.5                                2.25
5              5 - 3.5 = +1.5                                2.25
x̄ = 42 / 12 = 3.5                                            ∑d² = 17
Variance: S² = ∑d² / (n - 1) = 17 / (12 - 1) = 1.55
Standard Deviation: S = √1.55 = 1.24
Interpretation (±1 SD):
1 SD = 1.24 (68%)
3.5 kids ± 1.24
(2.26 < x < 4.74)
Interpretation (±2 SD):
2 SD = 2.48 (95%)
3.5 kids ± 2.48
(1.02 < x < 5.98)
Interpretation (±1 Standard Deviation)
[Figure: x̄ = 3.5 kids; -1 SD = 2.26 kids (3.5 - 1.24); +1 SD = 4.74 kids (3.5 + 1.24)]
Interpretation: About 68% of the values fall in between 2.26 and 4.74.
Interpretation: ____________________________________________
Interpretation (±2 Standard Deviation)
[Figure: x̄ = 3.5 kids; -2 SD = 1.02 (3.5 - 1.24 - 1.24); +2 SD = 5.98 (3.5 + 1.24 + 1.24)]
Interpretation: About 95% of the values fall in between ____ and _____.
Interpretation: ______________________________________________.
EX 4. The test kitchen wants to compare the variation in the
weights of the blueberry cakes in the previous self-review with the peach cakes, using the variance and
the standard deviation. The weights of a sample of
blueberry cakes were 484, 503, 496, 510, 491, and
516 grams.
Q. Compute the variance and the standard deviation.
Values of x    Deviations of x from the mean (d = x - x̄)    Deviations squared (d²)
484            484 - 500 = -16                               256
491            491 - 500 = -9                                81
496            496 - 500 = -4                                16
503            503 - 500 = +3                                9
510            510 - 500 = +10                               100
516            516 - 500 = +16                               256
x̄ = 3,000 / 6 = 500                                          ∑d² = 718
Variance: S² = ∑d² / (n - 1) = 718 / (6 - 1) = 143.6
Standard Deviation: S = √143.6 = 11.98
___________________________________________________________
Interpretation of Standard Deviation
Interpretation (±1 SD):
About 68% of the values fall in between (x̄ - 1 SD) and (x̄ + 1 SD), that is, 500g - 11.98 and 500g + 11.98.
About 68% of the sampled blueberry cakes weigh between 488.02 (500 - 11.98) and 511.98 (500 + 11.98) grams.
Interpretation (±2 SD):
About 95% of the values fall in between (x̄ - 2 SD) and (x̄ + 2 SD), that is, 500g - (2 x 11.98) and 500g + (2 x 11.98).
About 95% of the sampled blueberry cakes weigh between 476.04 (500 - 23.96) and 523.96 (500 + 23.96) grams.
Interpretation (±3 SD):
About 99% of the values fall in between (x̄ - 3 SD) and (x̄ + 3 SD), that is, 500g - (3 x 11.98) and 500g + (3 x 11.98).
About 99% of the sampled blueberry cakes weigh between 464.06 (500 - 35.94) and 535.94 (500 + 35.94) grams.
2. Hypothesis Test (t-test)
Comparing Sample Mean with Population Mean (t-test)
- Known as Student’s t-test.
- A statistical procedure used to examine the difference
between the sample mean and the population mean
when the population standard deviation is unknown.
- Use t-test when sample size is small.
- One sample t-test (Ex. Comparing Sample Mean with
Population Mean).
t-test Formula
t = (x̄ - µ) / (s / √n)
x̄ = Sample Mean
µ = Population Mean
s = Sample Standard Deviation
n = Sample size
5 Step Hypothesis Test (t-test)
1. The null hypothesis and research hypothesis are stated.
2. A level of significance is selected.
3. A test statistic is chosen.
4. A decision rule is formulated.
5. One or more samples are selected, the test statistic is
computed, and a decision is made either to reject or to fail to
reject the null hypothesis.
t-test (Example):
Number of students in the School of Communication (SOC): 800 students (GPA = 2.8)
Number of Com Ed majors in the SOC: 100 students (GPA = 3.0)
x̄ = Sample Mean (3.0)
µ = Population Mean (2.8)
s = Sample Standard Deviation (.8)
n = Sample size (100)
Q: Is the Com Ed GPA (3.0) higher than the SOC GPA (2.8)?
Compare the sample mean (3.0) with the population mean (2.8).
SOC all majors’ GPA = 2.8 (N = 800)
Com Ed majors’ GPA = 3.0 (n = 100, s = .8)
5 Step Hypothesis Test (t-test)
1. The null hypothesis and alternative (research) hypothesis are stated.
Null Hypothesis (H0): µ = x̄
H0: There is no difference between the sample mean (3.0) and the population mean (2.8).
Research Hypothesis (H1): µ ≠ x̄
H1: There is a difference between the sample mean (3.0) and the population mean (2.8).
2. A level of significance is selected.
Probability 95%: P < .05 (Level of Significance of .05), with df = n - 1
[Figure: t distribution with a critical region in each tail beyond the critical value; the critical value depends on the degree of freedom]
Probability 99%: P < .01 (Level of Significance of .01), with df = n - 1
[Figure: t distribution with a critical region in each tail beyond the critical value]
3. A test statistic is chosen.
t-test: Comparing Sample Mean with Population Mean
t = (x̄ - µ) / (s / √n)
x̄ = Sample Mean (3.0)
µ = Population Mean (2.8)
s = Sample Standard Deviation (.8)
n = Sample size (100)
t = (3.0 - 2.8) / (.8 / √100) = .2 / (.8 / 10) = .2 / .08 = 2.5
t value is 2.5
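The test statistic from Step 3 can be computed with a small function. This is a minimal sketch of the one-sample formula t = (x̄ - µ) / (s / √n); the function name `one_sample_t` is made up for illustration.

```python
import math


def one_sample_t(sample_mean, population_mean, sample_sd, n):
    """Compute t = (sample mean - population mean) / (s / sqrt(n))."""
    standard_error = sample_sd / math.sqrt(n)  # s / sqrt(n)
    return (sample_mean - population_mean) / standard_error


# Values from the GPA example: x-bar = 3.0, mu = 2.8, s = .8, n = 100
t_value = one_sample_t(3.0, 2.8, 0.8, 100)
degrees_of_freedom = 100 - 1  # df = n - 1

print(round(t_value, 3), degrees_of_freedom)
```

The result is rounded only for display, because floating-point subtraction of 3.0 and 2.8 leaves a tiny residue beyond the third decimal.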
4. A decision rule is formulated.
Define critical value and region: df (Degree of freedom) = n - 1
Probability 95%: P < .05 (Level of Significance of .05), with df = n - 1
[Figure: t distribution with critical regions below the critical value -1.984 and above the critical value +1.984]
To get the critical value:
1. Go to p. 46 (Critical Values of Student's t distribution: Two-tailed Value)
2. Go to the first column, get the degree of freedom (df = n - 1): 99 (df = 100 students - 1)
Find df = 99 (p. 48). Then, move horizontally to the 0.05 column and get the critical value of 1.984.
Decision Rule
- If the t value (t = 2.5 from Step 3) falls in the critical region, reject the
null (H0) and accept the research hypothesis (H1).
If the t value is higher than +1.984, it falls in the critical region (red area).
If the t value is lower than -1.984, it also falls in the critical region (red area).
- If the t value falls outside the critical region,
fail to reject the null (H0).
If the t value is between -1.984 and +1.984, it falls in the white area (non-critical
region).
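The decision rule can be expressed as a comparison of the absolute t value with the critical value. The sketch below hard-codes the critical value 1.984 that the notes read from the table for df = 99 at the .05 level; the function name `decide` is illustrative.

```python
def decide(t_value, critical_value):
    """Two-tailed rule: reject H0 when |t| is beyond the critical value."""
    if abs(t_value) > critical_value:
        return "reject H0"
    return "fail to reject H0"


# t = 2.5 from Step 3; critical value 1.984 for df = 99 at the .05 level
decision = decide(2.5, 1.984)
print(decision)

# A t value inside the non-critical region, for comparison
print(decide(-1.2, 1.984))
```

Taking the absolute value covers both tails at once, which matches the two-tailed table used in the notes.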
Critical Values of Student's t distribution
(Two-tailed Value)
df      0.20     0.10     0.05     0.02     0.01
1      3.078    6.314   12.706   31.821   63.657
2      1.886    2.920    4.303    6.965    9.925
3      1.638    2.353    3.182    4.541    5.841
4      1.533    2.132    2.776    3.747    4.604
5      1.476    2.015    2.571    3.365    4.032
6      1.440    1.943    2.447    3.143    3.707
7      1.415    1.895    2.365    2.998    3.499
8      1.397    1.860    2.306    2.896    3.355
9      1.383    1.833    2.262    2.821    3.250
10     1.372    1.812    2.228    2.764    3.169
11     1.363    1.796    2.201    2.718    3.106
12     1.356    1.782    2.179    2.681    3.055
13     1.350    1.771    2.160    2.650    3.012
14     1.345    1.761    2.145    2.624    2.977
15     1.341    1.753    2.131    2.602    2.947
16     1.337    1.746    2.120    2.583    2.921
17     1.333    1.740    2.110    2.567    2.898
18     1.330    1.734    2.101    2.552    2.878
19     1.328    1.729    2.093    2.539    2.861
20     1.325    1.725    2.086    2.528    2.845
21     1.323    1.721    2.080    2.518    2.831
22     1.321    1.717    2.074    2.508    2.819
23     1.319    1.714    2.069    2.500    2.807
24     1.318    1.711    2.064    2.492    2.797
25     1.316    1.708    2.060    2.485    2.787
26     1.315    1.706    2.056    2.479    2.779
27     1.314    1.703    2.052    2.473    2.771
28     1.313    1.701    2.048    2.467    2.763
29     1.311    1.699    2.045    2.462    2.756
30     1.310    1.697    2.042    2.457    2.750
31     1.309    1.696    2.040    2.453    2.744
32     1.309    1.694    2.037    2.449    2.738
33     1.308    1.692    2.035    2.445    2.733
34     1.307    1.691    2.032    2.441    2.728
35     1.306    1.690    2.030    2.438    2.724
36     1.306    1.688    2.028    2.434    2.719
37     1.305    1.687    2.026    2.431    2.715
38     1.304    1.686    2.024    2.429    2.712
39     1.304    1.685    2.023    2.426    2.708
40     1.303    1.684    2.021    2.423    2.704
41     1.303    1.683    2.020    2.421    2.701
42     1.302    1.682    2.018    2.418    2.698
43     1.302    1.681    2.017    2.416    2.695
44     1.301    1.680    2.015    2.414    2.692
45     1.301    1.679    2.014    2.412    2.690
46     1.300    1.679    2.013    2.410    2.687
47     1.300    1.678    2.012    2.408    2.685
48     1.299    1.677    2.011    2.407    2.682
49     1.299    1.677    2.010    2.405    2.680
50     1.299    1.676    2.009    2.403    2.678
51     1.298    1.675    2.008    2.402    2.676
52     1.298    1.675    2.007    2.400    2.674
53     1.298    1.674    2.006    2.399    2.672
54     1.297    1.674    2.005    2.397    2.670
55     1.297    1.673    2.004    2.396    2.668
56     1.297    1.673    2.003    2.395    2.667
57     1.297    1.672    2.002    2.394    2.665
58     1.296    1.672    2.002    2.392    2.663
59     1.296    1.671    2.001    2.391    2.662
df
60.
61.
62.
63.
64.
65.
66.
67.
68.
69.
70.
71.
72.
73.
74.
75.
76.
77.
78.
79.
80.
81.
82.
83.
84.
85 ...