Continuation of Part 1.
Why are random samples better than convenience samples?
Random samples are less likely to be biased.
Random samples are more likely to be representative of the population from which they are drawn.
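The two points above can be illustrated with a small simulation. This is only a sketch on made-up data: the population of working hours, the function name compare_sampling, and the idea that the "convenient" people are the ones at the start of a sorted list are all assumptions made for illustration.

```python
import random
import statistics

def compare_sampling(seed=0, population_size=10_000, sample_size=100):
    """Compare a random sample with a convenience sample on made-up data."""
    rng = random.Random(seed)
    # Hypothetical population of daily working hours, sorted so that the
    # people who are easiest to reach (start of the list) work the fewest hours.
    population = sorted(rng.gauss(8, 2) for _ in range(population_size))

    random_sample = rng.sample(population, sample_size)   # every member has an equal chance
    convenience_sample = population[:sample_size]         # whoever is closest at hand

    return {
        "population_mean": statistics.mean(population),
        "random_sample_mean": statistics.mean(random_sample),
        "convenience_sample_mean": statistics.mean(convenience_sample),
    }

result = compare_sampling()
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this setup the random sample mean should land close to the population mean, while the convenience sample mean is pulled far from it because it only sees one end of the population.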
Applying what we learned in Part 1 to the example below:
Sample size = 100
From the sample, the average number of working hours per day is 10.
From the entire population, the average number of working hours per day is 8.
From this we understand that:
The sample statistic is 10 and the population parameter is 8.
The difference between 10 and 8 is called “sampling error”.
We should not be surprised that the sample average is different from population average.
The amount of sampling error in this example is 2 (10 - 8) hours per day.
The sample size is 100.
We might have found a sample average closer to the population average if we had used a larger sample (n = 2000).
x̄ = 10 and μ = 8.
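The example above can be sketched in code. The first part simply repeats the arithmetic from the notes (x̄ = 10, μ = 8). The second part is a simulation on made-up data: the population, the helper average_sampling_error, and the number of repeated samples are assumptions chosen for illustration. It averages the sampling error over many samples, because a single larger sample is only likely, not guaranteed, to be closer to the population mean.

```python
import random
import statistics

# Part 1: the numbers from the notes.
x_bar = 10   # sample statistic (sample mean)
mu = 8       # population parameter (population mean)
sampling_error = x_bar - mu
print("Sampling error:", sampling_error, "hours per day")

# Part 2: hypothetical simulation of how sample size affects sampling error.
def average_sampling_error(population, n, trials, rng):
    """Average absolute gap between sample mean and population mean."""
    population_mean = statistics.mean(population)
    gaps = []
    for _ in range(trials):
        sample = rng.sample(population, n)
        gaps.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(gaps)

rng = random.Random(42)
population = [rng.gauss(8, 3) for _ in range(30_000)]  # made-up working hours

small_n_error = average_sampling_error(population, 100, 200, rng)
large_n_error = average_sampling_error(population, 2000, 200, rng)
print(f"Average sampling error with n = 100:  {small_n_error:.3f}")
print(f"Average sampling error with n = 2000: {large_n_error:.3f}")
```

On average, the samples of 2000 should sit noticeably closer to the population mean than the samples of 100.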
In an experiment, the researcher:
Manipulates the independent variable,
Measures changes in the dependent variable,
And seeks to control lurking variables.
A survey was sent to a sample of 1000 students drawn from a population of 30000 students. Only 50 of the 1000 sampled students completed and returned the survey.
From the above we can conclude:
There is likely to be non-response bias because most students in the sample did not complete and return the survey.
The students who responded may not have reported accurately about their satisfaction.
The sample may not be representative of all 30000 students for various reasons.
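To see why non-response is a concern here, it helps to work out the response rate. The sketch below uses the numbers from the notes, reading the survey as 1000 sampled students out of a population of 30000; the variable names are just for illustration.

```python
population_size = 30_000   # all students
sample_size = 1_000        # students who were sent the survey
responses = 50             # students who completed and returned it

response_rate = responses / sample_size
nonresponse_rate = 1 - response_rate
share_of_population = responses / population_size

print(f"Response rate: {response_rate:.1%}")
print(f"Non-response rate: {nonresponse_rate:.1%}")
print(f"Respondents as a share of the population: {share_of_population:.2%}")
```

With 95% of the sample not responding, the 50 students who did respond may differ systematically from the rest, which is what non-response bias means.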
A number that describes a sample is called:
statistic
If the average weight was recorded for all freshmen at one university, then the average is an example of a “population parameter”.
If you consider the average value in the context of all freshmen at all universities, then the value is a sample statistic.
However, if the population of interest is the freshmen at that one university,
then it can instead be considered a population parameter since we have the information from every freshman.
If the average exam score of the population is different from that of the sample:
The difference that exists between a sample mean and the population mean is sampling error.
Statements about the relationship between variables are called:
Hypotheses
Deviation from mean
Deviation means how far a value differs from the mean. Calculate the mean and then subtract it from each input value (deviation = value - mean). This shows how far, and in which direction, each value lies from the mean.
The squared deviation is just the square of the deviation value.
For the population variance, we add up the squared deviations and divide the sum by n. If the values are from a sample, we divide the sum of squared deviations by n-1 instead.
Standard deviation is the square root of the variance. The population standard deviation is represented by the lowercase Greek letter sigma (σ); the sample standard deviation is usually written s.
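The steps above can be followed by hand in a few lines of code. This is a minimal sketch on a made-up list of values; it also checks the sample variance against Python's built-in statistics module.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data for illustration
n = len(values)

mean = sum(values) / n
deviations = [x - mean for x in values]          # value minus mean
squared_deviations = [d ** 2 for d in deviations]

# Population formulas divide by n; sample formulas divide by n - 1.
population_variance = sum(squared_deviations) / n
sample_variance = sum(squared_deviations) / (n - 1)

population_sd = math.sqrt(population_variance)   # sigma
sample_sd = math.sqrt(sample_variance)           # s

print("Mean:", mean)
print("Deviations:", deviations)
print("Population variance:", population_variance)
print("Population standard deviation:", population_sd)
print("Sample variance:", round(sample_variance, 3))
print("Sample standard deviation:", round(sample_sd, 3))

# Cross-check the sample variance against the standard library.
assert math.isclose(sample_variance, statistics.variance(values))
```

For this data the mean is 5, the population variance is 4, and the population standard deviation is 2. The deviations always add up to zero, which is why they are squared before averaging.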
We use the symbol ‘n’ to represent:
Sample size
If ages (in years) were measured for n = 50 people, this tells us that:
sample size is 50 people
50 ages were collected
“Age in years” is operational definition of “age”
Students who listen to classical music in elementary school tend to have higher grades in high school.
The researcher found a correlation between listening to classical music and performance in school,
then mistakenly concluded that listening to classical music caused better performance in school.
Correlation does not imply causation!
This is because lurking variables can result in a correlation between two variables that are not causally related.
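A small simulation can show how a lurking variable produces a correlation without any causal link. Everything here is an assumption for illustration: the notes do not name the lurking variable, so the code uses a hypothetical one (called home_support), and the data are randomly generated. Music exposure and grades each depend on the lurking variable, but neither affects the other.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance_sum = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance_sum / (spread_x * spread_y)

rng = random.Random(1)
n = 5000

# Hypothetical lurking variable that influences both measurements.
home_support = [rng.gauss(0, 1) for _ in range(n)]

# Neither variable is computed from the other; both depend on home_support.
music_exposure = [z + rng.gauss(0, 1) for z in home_support]
grades = [z + rng.gauss(0, 1) for z in home_support]

r_observed = pearson_r(music_exposure, grades)

# Remove the lurking variable's contribution and correlate what is left.
music_residual = [m - z for m, z in zip(music_exposure, home_support)]
grades_residual = [g - z for g, z in zip(grades, home_support)]
r_controlled = pearson_r(music_residual, grades_residual)

print(f"Correlation between music and grades: {r_observed:.2f}")
print(f"Correlation after removing the lurking variable: {r_controlled:.2f}")
```

The first correlation should be clearly positive even though music never enters the formula for grades. Once the lurking variable is accounted for, the correlation should drop to roughly zero.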