-- usually set at .05; the probability of rejecting the null when it is actually true
-- null: states that any difference in two means is attributable to chance (sampling error)
ANOVA
-- analysis of variance
-- mean squares between: variance among the group means
-- reject the null if there is at least one pair of means with a significant difference
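A minimal sketch of how mean squares between is computed, using made-up scores for three groups. The mean squares within and the F ratio (between divided by within) are not spelled out in these notes; they are included here as the usual companions and should be read as an assumption about how the test is completed.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ms_between, ms_within, f_ratio) for a list of score lists."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of scores
    grand_mean = mean(x for g in groups for x in g)

    # Between-groups sum of squares: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical data: three groups of three scores each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
ms_b, ms_w, f = one_way_anova(groups)
print("MS between:", ms_b, "MS within:", ms_w, "F:", f)
```

A large F means the group means vary more than chance (within-group variability) would suggest.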
Central Limit Theorem
-- the larger the sample, the more likely that the sample mean will be close to mu, and the more nearly normal the "sampling distribution of the means" becomes
-- states that the "sampling distribution of the mean" (a distribution of sample means, each based on a random sample of size N) has three properties:
1. normally distributed (approximately, when N is large)
2. mean equal to mu
3. standard deviation called the standard error of the mean
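The three properties can be checked with a small simulation. This is only a sketch: the flat (uniform) population, the sample size of 25, and the 2,000 repetitions are all arbitrary choices made for illustration.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: flat (non-normal) scores between 0 and 100.
population = [random.uniform(0, 100) for _ in range(20_000)]
mu = mean(population)
sigma = pstdev(population)

N = 25  # size of each random sample
sample_means = [mean(random.sample(population, N)) for _ in range(2_000)]

# Property 2: the mean of the sample means should sit close to mu.
print("mu:", round(mu, 2), "mean of sample means:", round(mean(sample_means), 2))
# Property 3: the SD of the sample means (standard error) should be close to sigma / sqrt(N).
print("sigma / sqrt(N):", round(sigma / N ** 0.5, 2),
      "SD of sample means:", round(pstdev(sample_means), 2))
```

Plotting `sample_means` as a histogram would show the roughly normal shape even though the population itself is flat.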
Confidence Intervals
-- critical values for a two-tailed test: ± ____
Correlation
-- the degree of relationship between two variables; reported numerically
-- positive or negative (e.g. high on both is positive)
-- Pearson C: requires data be interval or ratio
-- r or r(sub.xy)
-- r-squared: "coefficient of determination" = percent of variance in one variable that's predictable from the other
-- sign: + or -
-- size: 0 to ± 1.0
-- linear relationship between 2 variables
-- P-slanted (rho): symbol for population correlation
-- Spearman Correlation: easier to do by hand; "rank order correlation", ordinal measure (first convert data to ranks); not as precise as Pearson
-- IQ/Achievement: positive relationship (IQ test purports to predict achievement, but not perfectly)
-- high scores on var1 associated with low scores on var2 = negative relationship
-- can't interpret correlations as causal relationships: the establishment of a causal relationship between two variables must be determined by using experimental methods and logic, which goes beyond merely showing a statistical demonstration of association (correlation)
-- a lot of medical research is correlational; smoking & cancer, but can infer a causal link
-- scattergrams: directional
-- use correlations to make predictions and/or describe relationships
-- perfect relationship = perfect prediction (will never happen)
-- variables must have variability to correlate with each other
-- types of relationships: linear, nonlinear, curvilinear, no or little variability on any one of the variables
-- perfect correlation: r = ± 1.0 (r = 0 means no linear relationship)
-- Graphing: scatterplot describes the direction of correlation but does not quantify the amount of correlation
-- to determine the amt or degree of association between two vars it is necessary to compute correlation coefficients: descriptive stats that describe the degree (amount) of association (relationship) between the two vars
-- r(sub.p): Pearson Product-Moment Correlation Coefficient: both variables must have quantitative properties; assesses both the direction and the strength of the relationship; its square (r-squared) is the "coefficient of determination", which measures the % of variance in the x scores that is predictable from the y scores, and conversely
-- Venn diagrams: depict overlapping proportions of shared variance between two vars
-- Spearman Rank-Order Correlation Coefficient: measure between two quantitative vars that are both expressed in ranks
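As a rough illustration of the two coefficients, the sketch below computes Pearson r from deviation scores and Spearman's coefficient by converting scores to ranks first and then applying the same formula. The IQ and achievement numbers are invented, and the helper names (`pearson_r`, `ranks`, `spearman_rho`) are just labels chosen for this example.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson product-moment correlation from deviation scores."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank 1 = lowest score; tied scores share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    """Spearman rank-order correlation: Pearson r computed on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical scores for five students.
iq = [95, 100, 105, 110, 120]
achievement = [60, 72, 70, 85, 90]
r = pearson_r(iq, achievement)
print("r:", round(r, 3), "r-squared:", round(r ** 2, 3))
print("Spearman:", round(spearman_rho(iq, achievement), 3))
```

Here r-squared is read as the proportion of variance in one variable that is predictable from the other.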
Data Presentation
-- when var takes on many values, group to make table more succinct
-- a good description of data is to group the data by categories and report # of occurrences (i.e., frequency)
-- indicating the frequency of each category is frequency distribution
-- proportion: frequency of a particular category divided by total number of observations (N); all proportions sum to 1.0 (percentages to 100%)
-- multiply proportions by 100 to get percent (e.g. 1.0 x 100 = 100%)
-- on nominal graph, categories should be separated and distinct; rank order must touch
- nominal: order on the axis is arbitrary
-- cumulative frequency: to better summarize and describe the data
-- interval size: (high score - low score) + 1 (unit); then divide that number by the desired # of intervals wanted in the distribution; round to an odd unit so that the midpoint is a whole unit (apparent upper and lower limits, not real limits)
-- histogram: same as bar graph but bars touch
-- F curve: AKA polygon
-- single variable: univariate description
-- bivariate description: describes how changes in one variable are related to values of the other variable
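A frequency distribution with proportions, percents, and cumulative frequencies can be sketched as follows. The quiz scores are invented for the example.

```python
from collections import Counter

# Hypothetical quiz scores (1 to 5) for N = 10 students.
scores = [3, 5, 4, 3, 2, 4, 3, 5, 1, 3]

n = len(scores)
freq = Counter(scores)  # score -> number of occurrences

rows = []
cumulative = 0
for score in sorted(freq):
    f = freq[score]
    cumulative += f                 # running total of frequencies
    proportion = f / n              # frequency divided by N
    percent = proportion * 100      # proportion times 100
    rows.append((score, f, proportion, percent, cumulative))
    print(f"score={score}  f={f}  prop={proportion:.2f}  pct={percent:.0f}%  cum f={cumulative}")
```

The last cumulative frequency always equals N, and the proportions add up to 1.0.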
Dependent Grps T-test
--
Descriptive Statistics
-- techniques to describe a sample
Independent Grps T-test
-- populations are normally distributed
-- assumes that variances of the populations are equal (homogeneity)
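The equal-variance assumption is what allows the two sample variances to be pooled. The sketch below shows one common way the t statistic is computed under that assumption; the formula is standard textbook material rather than something stated in these notes, and the scores are made up.

```python
from statistics import mean, variance

def pooled_t(group1, group2):
    """Return (t, df) for an independent-groups t-test with pooled variance."""
    n1, n2 = len(group1), len(group2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    # Estimated standard error of the difference between the two means.
    se_diff = (pooled_var * (1 / n1 + 1 / n2)) ** 0.5
    t = (mean(group1) - mean(group2)) / se_diff
    return t, n1 + n2 - 2

# Hypothetical scores for two independent groups.
t, df = pooled_t([1, 2, 3], [4, 5, 6])
print("t:", round(t, 3), "df:", df)
```

The resulting t is compared against a critical value for the given degrees of freedom.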
Inferential Statistics
-- techniques for inferring from a sample to a larger population
Mean Squares Between
-- mean squares between: variance among the group means
-- reject the null if there is at least one pair of means with a significant difference
Measurement Scales
-- nominal: quality: same as "qualitative unordered"; kind
- use a number in place of a name
- not more or less assignment, just a coding classification
- not meaningful or permissible to get a mean
-- ordinal: quality; absence of property; rank order; same as "qualitative ordered"; arbitrary; differences have meaning but not necessarily twice as much
 - usually used to turn interval or ratio data into ranks
 - magnitude (how much); military rank
-- interval: quantitative; magnitude; zero is arbitrary; differences have meanings; property of equal intervals
-- ratio: quantitative; numbers have magnitude; equal intervals; zero is absence of the property (can't have negatives)
-- stats chart; what can be used for each kind of measurement
- nominal: freq. dist. histogram, mode
- ordinal: nominal plus median, range
 - interval: all plus mean, polygon, variance, stand. dev.
 - ratio: same as interval
- anything permissible with lower order var is permissible with higher one
Measures of Central Tendency
-- mode: most often-occurring score; found from a frequency distribution; tallest point of the distribution; most useful with nominal data
-- median: 50th percentile; divides the area under the curve into two equal parts
-- mean: arithmetic average (x-bar = sample mean, mu = population mean); sum of all scores divided by number of scores; other kinds of mean: geometric, harmonic, quadratic
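The three measures can be computed with Python's standard `statistics` module. The scores are invented and deliberately skewed to show the mean being pulled toward the long tail.

```python
from statistics import mean, median, mode

# Hypothetical scores, skewed toward the high end.
scores = [2, 3, 3, 4, 5, 7, 11]

score_mode = mode(scores)      # most often-occurring score
score_median = median(scores)  # middle score (50th percentile)
score_mean = mean(scores)      # sum of all scores divided by number of scores

print("mode:", score_mode, "median:", score_median, "mean:", score_mean)
```

In a symmetrical unimodal distribution all three values would coincide; here they do not, which is one sign of skew.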
Measures of Variability
-- standard deviation & mean are most common
-- variance: of a distribution is equal to the sum of the squared deviation scores divided by the total # of scores in the distribution (N)
-- the higher the variance, the more variability; you must first calculate the mean of the dist. and then subtract the mean from each score in the dist.
-- variability: the spread, dispersion, or scatter of the data; instead of measuring averages, this will measure how the scores in the distribution vary
-- to interpret an individual score correctly we must know both the central T and the variability
-- variation ratio: V = ...
-- inclusive range: (XH-XL)+1
-- exclusive range: (XH-XL)
-- variance: average amount of squared deviation from the mean
-- S-squared: sample variance
-- sigma-squared = population variance
-- standard deviation: the square root of the variance; indicates the average amount of deviation in scores around the mean; in a normal or near-normal distribution about 2/3 of the scores fall within one standard deviation of the mean
-- describe the skewness of a frequency polygon
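The variance and range definitions above can be followed step by step. This sketch uses invented scores and divides by N, as the notes do; many texts divide by N - 1 when estimating a population variance from a sample.

```python
from statistics import mean

# Hypothetical distribution of eight scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

m = mean(scores)                           # step 1: mean of the distribution
deviations = [x - m for x in scores]       # step 2: subtract the mean from each score
sum_sq = sum(d ** 2 for d in deviations)   # step 3: square the deviations and sum them
variance_n = sum_sq / n                    # step 4: divide by N
sd = variance_n ** 0.5                     # standard deviation: square root of the variance

exclusive_range = max(scores) - min(scores)        # XH - XL
inclusive_range = max(scores) - min(scores) + 1    # (XH - XL) + 1

print("variance:", variance_n, "SD:", sd)
print("exclusive range:", exclusive_range, "inclusive range:", inclusive_range)
```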
Normal Distribution
-- SEE ALSO: standard scores
-- symmetrical
-- not normal is "skewed"
-- standard deviation is how far on average scores are above or below the mean
-- other distributions: asymmetrical, skewed, unimodal, bimodal, peaked, flat, etc.
-- "normal curve": AKA "normal probability distribution" or "bell-shaped curve"; one mode; bilaterally symmetrical (not skewed); md, mo, mean all have same value; theoretically, the tails extend to infinity and do not touch the x axis
-- standardized normal curve: mean = 0 (see also, Z scores)
-- normal curve table: shows what proportion of scores in a dist. falls within certain areas under the normal curve; has four columns; col. 1: Z scores valued 0 - 3.70; col. 2: specifies proportion of the dist. that falls in the area from the mean to a particular Z score
Prediction & Regression
-- see also: correlation
-- the nature of relationships
-- prediction equation: stat procedure for predicting the value of one var from the value of another
-- bivariate prediction: uses one var to make predictions about another var
-- multivariate prediction: uses two or more vars to make predictions about another var
-- criterion var: value is estimated in a prediction
-- predictor var: var on whose values a prediction is based
-- regression line: AKA prediction line: a straight line on a graph that can be used to predict the value of the criterion var from the value of the predictor var
-- graph limitations: 1) predicted values are only approximations of the correct value, 2) may be only hypothetical perfect correlation
-- "line of best fit": a prediction line that minimizes the size of errors that are made when using it to make predictions. The "predicted" criterion value (Y-hat) is the value of the Y that is predicted by the regression line.
-- Y-hat: predicted Y value: distinguished from observed Y value
-- e: error of prediction
-- "least squares criterion": is satisfied when the prediction line is placed so that the sum of the squared prediction errors for all observations is as small as possible
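A small sketch of fitting a line of best fit under the least squares criterion. The slope and intercept formulas are the standard ones for one predictor; the data points and the function name `fit_line` are made up for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Return (slope, intercept) of the least squares regression line."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    return slope, intercept

def sum_squared_errors(x, y, slope, intercept):
    """Sum of squared prediction errors (e = Y - Y-hat) for a given line."""
    return sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))

# Hypothetical predictor (x) and criterion (y) values.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = fit_line(x, y)
y_hat = [slope * a + intercept for a in x]     # predicted Y values
errors = [b - p for b, p in zip(y, y_hat)]     # errors of prediction
sse = sum_squared_errors(x, y, slope, intercept)

print("Y-hat =", round(slope, 2), "* X +", round(intercept, 2))
print("sum of squared errors:", round(sse, 2))
```

Any other straight line through the same points, for example one with a slightly different slope, gives a larger sum of squared errors.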
Sampling Distributions
-- underlie hypothesis tests and confidence intervals in parametric, inferential statistics
Standard Deviation: Types
-- from Goodwin, REM 510
-- square root of the variance
-- Deviation Score: the measure of the difference between the mean and an individual score of a distribution; how far away on average a given score is from the mean
-- Symmetrical unimodal distribution: the mean, median, mode all have same value (skewed: measures of central T are not equal)
-- population standard deviation: variability of population raw scores
-- sample standard deviation: variability of sample raw scores
-- standard error of estimate: variability of ACTUAL scores on Y around a predicted score (Y-hat); used to make population statements about an individual's actual Y given her Y-hat to set confidence intervals around Y-hat; how much error there is in predicting Y from X
-- standard error of the mean: standard deviation of a sampling distribution of the mean; variability of sample means around mu; used to compute a confidence interval around a sample mean, or to conduct a one-group z-test; can also be estimated from s (estimated SEOM)
-- standard error of mean differences; used to conduct an independent-groups z-test (estimated for t-test)
-- standard error of the correlation: standard deviation of a sampling distribution of the correlation; variability of sample correlations (r) around population correlation; used to compute a confidence interval around a sample correlation (r) or to conduct a significance test on r
-- standard deviation of differences scores: variability of difference raw scores; used to calculate the standard error of differences
-- standard error of differences: standard deviation of a sampling distribution of difference scores; dep-grps t-test
-- standard error of differences in proportions: standard deviation of a sampling distribution of the difference in proportions; sig test on the dif between two sample proportions
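Two of the entries above can be sketched in a few lines. This assumes the estimated forms, where the sample standard deviation s (computed with N - 1) stands in for the unknown population value; the scores and function names are invented for the example.

```python
from statistics import stdev

def standard_error_of_mean(sample):
    """Estimated standard error of the mean: s divided by the square root of N."""
    return stdev(sample) / len(sample) ** 0.5

def standard_error_of_differences(before, after):
    """Estimated standard error of differences for paired (dependent) scores."""
    diffs = [a - b for b, a in zip(before, after)]   # difference raw scores
    return stdev(diffs) / len(diffs) ** 0.5          # SD of differences over sqrt(N)

# Hypothetical sample and hypothetical before/after scores for four people.
sem = standard_error_of_mean([2, 4, 4, 4, 5, 5, 7, 9])
sed = standard_error_of_differences([10, 12, 9, 11], [12, 15, 10, 14])
print("standard error of the mean:", round(sem, 3))
print("standard error of differences:", round(sed, 3))
```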
Standard Scores
-- Z-Score: to facilitate comparisons across distributions; how far from the mean in standard deviation amounts; not good for reporting because half the scores are negative; other standard scores build on z-scores; converting to z-scores does not change the shape of the distribution; mean = 0, S = 1
-- T-Score: mean = 50, S = 10 (no one would have a negative score, that's its purpose); T = 10z + 50 (converted from Z scores)
-- AGCT: Army General Classification Test: mean = 100, S = 20
-- IQ: 15z + 100
-- all scores multiply z by S and add desired mean
-- "new" scale score = "new" S(z) + "new" mean
-- Stanine scores: ordinal, all other standard scores are interval; order high to low or vice versa [stair step]; quick way to make normal distribution, convert highly skewed data to stanines to make distribution normal; de-emphasize making individual comparisons
-- scores used to construct a standardized normal curve
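The conversion rule ("new" scale score = "new" S(z) + "new" mean) can be sketched like this. The raw scores are invented, and S is computed by dividing by N.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Convert raw scores to z-scores: distance from the mean in SD units."""
    m, s = mean(scores), pstdev(scores)
    return [(x - m) / s for x in scores]

def rescale(z, new_mean, new_sd):
    """'New' scale score = 'new' S * z + 'new' mean."""
    return new_sd * z + new_mean

# Hypothetical raw scores (mean = 5, S = 2).
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
t = [rescale(v, 50, 10) for v in z]     # T-scores: mean 50, S 10
iq = [rescale(v, 100, 15) for v in z]   # IQ-style scores: mean 100, S 15

for x, zv, tv, iv in zip(raw, z, t, iq):
    print(f"raw={x}  z={zv:+.2f}  T={tv:.0f}  IQ={iv:.1f}")
```

Half the z-scores come out negative or zero, while the T-scores are all positive, which is the reason for rescaling.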
Statistics (definition)
-- techniques for organizing, tabulating, summarizing and interpreting data
Summation
-- PPRMDAS: Pretty Please Remember My Dear Aunt Sally
- P: parenthesis
- R: radical (e.g. square root)
- M: *
- D: /
- A: +
- S: -
-- sum of deviations from the mean will always = 0
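Order of operations matters most when squaring and summing. The sketch below, with invented scores, contrasts the sum of squared scores with the square of the summed scores and confirms that deviations from the mean sum to zero.

```python
# Hypothetical scores.
scores = [2, 4, 6, 8]
n = len(scores)
mean_x = sum(scores) / n

# Square each score first, then sum (powers before addition).
sum_of_squares = sum(x ** 2 for x in scores)
# Sum inside the parentheses first, then square the total.
square_of_sum = sum(scores) ** 2
# Deviations from the mean always sum to zero.
sum_of_deviations = sum(x - mean_x for x in scores)

print("sum of X squared:", sum_of_squares)
print("(sum of X) squared:", square_of_sum)
print("sum of deviations:", sum_of_deviations)
```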
Trivia
-- asymptotic: theoretically the normal distribution never touches the x axis
Type I Error
-- in hypothesis testing, the probability of a Type I error (rejecting a null that is true) equals the level of significance (alpha)
Variable
-- anything that can change; opposite is "constant" (e.g., we are human)
-- x for first var, y for second, z for third, n for # of cases (usually subjects)
-- sigma: summation sign; sum whatever follows it