Descriptive Statistics (Notes)

At present, this page is just a collection of notes on various topics related to descriptive statistics.

Page Contents

Alpha & the Null · ANOVA · Central Limit Theorem · Confidence Intervals · Correlation · Data Presentation · Dependent Grps T-test · Descriptive Statistics · Independent Grps T-test · Inferential Statistics · Mean Squares Between · Measurement Scales · Measures of Central Tendency · Measures of Variability · Normal Distribution · Prediction & Regression · Sampling Distributions · Standard Deviation: Types · Standard Scores · Statistics (definition) · Summation · Trivia · Type I Error · Variable


Alpha & the Null

-- alpha: usually set at .05; the probability of rejecting the null when it is actually true (a Type I error)

-- null: states that any difference in two means is attributable to chance (sampling error)

ANOVA

-- ANOVA: analysis of variance

-- mean squares between: variance among the group means

-- reject the null if there is at least one pair of means with a significant difference
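
A minimal sketch of the arithmetic behind "mean squares between", assuming three small made-up groups. The data and the names `groups`, `ms_between`, `ms_within` and `f_ratio` are illustrative and not from the notes; deciding significance would still require comparing the F ratio with a critical value, which is not shown here.

```python
from statistics import mean

# Hypothetical scores for three groups (illustrative data only).
groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]

all_scores = [x for g in groups for x in g]
grand_mean = mean(all_scores)
k = len(groups)             # number of groups
n_total = len(all_scores)   # total number of scores

# Sum of squares between: spread of the group means around the grand mean.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
# Sum of squares within: spread of scores around their own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

ms_between = ss_between / (k - 1)        # mean squares between
ms_within = ss_within / (n_total - k)    # mean squares within
f_ratio = ms_between / ms_within

print(ms_between, ms_within, f_ratio)
```

The larger the variance among the group means relative to the variance inside the groups, the larger the F ratio.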

Central Limit Theorem

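-- the larger the sample, the smaller sigma-sub-mean (the standard error of the mean), so a sample mean is more likely to be close to mu, and the distribution of sample means takes a more nearly normal shape; this distribution is the "sampling distribution of the means"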

-- states that the "sampling distribution of the mean" (a distribution of sample means, each based on a random sample of size N) has three properties:

1. normally distributed (approximately, when N is large)

2. mean equal to mu

3. standard deviation called the standard error of the mean (sigma divided by the square root of N)
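
The three properties can be checked with a small simulation. This is only a sketch under stated assumptions: the population (a fair six-sided die), the sample size `n`, the number of samples and the fixed seed are all hypothetical choices, not part of the notes.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical, clearly non-normal population: the faces of a fair die.
population = [1, 2, 3, 4, 5, 6]
mu = mean(population)
sigma = pstdev(population)

n = 30              # size of each random sample
num_samples = 5000  # how many sample means to collect

# Draw many random samples (with replacement) and keep each sample mean.
sample_means = [sum(random.choices(population, k=n)) / n
                for _ in range(num_samples)]

mean_of_means = mean(sample_means)   # should be close to mu
observed_se = pstdev(sample_means)   # spread of the sample means
expected_se = sigma / n ** 0.5       # standard error of the mean

print(mu, round(mean_of_means, 3))
print(round(expected_se, 3), round(observed_se, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the die population itself is flat.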

Confidence Intervals

-- critical values for a two-tailed test: ± ____
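
The critical value depends on the alpha chosen, so it can be computed rather than memorized. The sketch below assumes a z-based (normal curve) interval with a known population standard deviation; the function names and the example numbers (mean 100, sigma 15, N 25) are hypothetical.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha):
    """Positive critical z; the two-tailed cutoffs are plus and minus this value."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(sample_mean, sigma, n, alpha=0.05):
    """z-based interval around a sample mean, assuming sigma is known."""
    z = two_tailed_critical_z(alpha)
    standard_error = sigma / n ** 0.5
    return sample_mean - z * standard_error, sample_mean + z * standard_error

z_05 = two_tailed_critical_z(0.05)
low, high = confidence_interval(sample_mean=100, sigma=15, n=25)

print(round(z_05, 2), round(low, 2), round(high, 2))
```

With a small sample and an estimated standard deviation, a t-based critical value would normally replace the z value used here.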

Correlation

-- the degree of relationship between two variables; reported numerically

-- positive or negative (e.g. high on both is positive)

-- Pearson correlation: requires that data be interval or ratio

-- r or r(sub.xy)

-- r-squared: "coefficient of determination" = percent of variance in one variable that's predictable from the other

-- sign: + or -

-- size: 0 to ± 1.0

-- linear relationship between 2 variables

-- P-slanted (rho): symbol for population correlation

-- Spearman Correlation: easier to do by hand; "rank order correlation", ordinal measure (first convert data to ranks); not as precise as Pearson

-- IQ/Achievement: positive relationship (IQ test purports to predict achievement, but not perfectly)

-- high scores on var1 associated with low scores on var2 = negative relationship

-- can't interpret correlations as causal relationships: the establishment of a causal relationship between two variables must be determined by using experimental methods and logic, which goes beyond merely showing a statistical demonstration of association (correlation)

-- a lot of medical research is correlational (e.g. smoking & cancer); a causal link can still be inferred when evidence beyond the correlation supports it

-- scattergrams: directional

-- use correlations to make predictions and/or describe relationships

-- perfect relationship = perfect prediction (will never happen)

-- variables must have variability to correlate with each other

-- types of relationships: linear, nonlinear (curvilinear), and none; little or no variability on any one of the variables also prevents a correlation

-- perfect correlation: r = ± 1.0 (r = 0 means no linear relationship)

-- Graphing: scatterplot describes the direction of correlation but does not quantify the amount of correlation

-- to determine the amount or degree of association between two vars it is necessary to compute correlation coefficients: descriptive statistics that describe the degree (amount) of association (relationship) between the two vars

-- r(sub.p): Pearson Product-Moment Correlation Coefficient: both variables must have quantitative properties; assesses both the direction and the strength of the relationship; its square (r-squared) is the "coefficient of determination", which measures the % of variance in the x scores that is predictable from variation in the y scores, and conversely

-- Venn diagrams: depict overlapping proportions of shared variance between two vars

-- Spearman Rank-Order Correlation Coefficient: measure of association between two quantitative vars that are both expressed in ranks
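
A minimal sketch of the two coefficients described above, written with plain Python so each step is visible. The helper names (`pearson_r`, `ranks`, `spearman_rho`) and the five pairs of scores are hypothetical; Spearman is computed here as Pearson's r applied to ranks, with tied values sharing the average rank.

```python
from statistics import mean

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two lists of scores."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ranks(values):
    """Rank from 1 (lowest); tied values share the average of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def spearman_rho(xs, ys):
    """Rank-order correlation: Pearson's r computed on the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

# Hypothetical paired scores (illustrative data only).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2          # coefficient of determination
rho = spearman_rho(x, y)

print(round(r, 3), round(r_squared, 3), round(rho, 3))
```

For these made-up scores r is positive (high x goes with high y), and r-squared says what share of the variance in one variable is predictable from the other.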

Data Presentation

-- when var takes on many values, group to make table more succinct

-- a good description of data is to group the data by categories and report the # of occurrences (i.e. frequency)

-- indicating the frequency of each category gives a frequency distribution

-- proportion: ƒ of a particular category divided by the total number of observations (N); proportions sum to 1.0 and percentages to 100%

-- multiply proportions by 100 to get percent (e.g. 1.0 x 100 = 100%)

-- on nominal graph, categories should be separated and distinct; rank order must touch

- nominal: order on the axis is arbitrary

-- cumulative frequency: to better summarize and describe the data

-- interval size: take (high score - low score) + 1 (unit), then divide that number by the desired # of intervals in the distribution; round to an odd unit so that the midpoint is a whole unit (these are apparent upper and lower limits, not real limits)

-- histogram: same as bar graph but bars touch

-- frequency curve: AKA frequency polygon

-- single variable: univariate description

-- bivariate description: describes how changes in one variable are related to values of the other variable
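
The frequency, proportion, percent and cumulative frequency columns described above can be sketched as follows. The ten scores and all variable names are hypothetical, chosen only to show the arithmetic.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical scores for a single variable (univariate description).
scores = [3, 1, 2, 3, 3, 2, 1, 3, 2, 3]
n = len(scores)

freq = Counter(scores)                        # category -> frequency
categories = sorted(freq)
frequencies = [freq[c] for c in categories]
proportions = [f / n for f in frequencies]    # f divided by N
percents = [100 * f / n for f in frequencies] # proportion x 100
cumulative = list(accumulate(frequencies))    # running total of f

print("score  f  prop  pct  cum f")
for c, f, p, pct, cf in zip(categories, frequencies, proportions,
                            percents, cumulative):
    print(f"{c:>5} {f:>2} {p:>5.2f} {pct:>4.0f} {cf:>6}")
```

The last cumulative frequency always equals N, which is a quick check on the table.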

Dependent Grps T-test

--

Descriptive Statistics

-- techniques to describe a sample

Independent Grps T-test

-- populations are normally distributed

-- assumes that variances of the populations are equal (homogeneity)
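
Because the test assumes equal population variances, the two sample variances are pooled into one estimate. A minimal sketch of that calculation follows; the function name `independent_t` and the two groups of scores are hypothetical, and looking up the critical t for the resulting degrees of freedom is left out.

```python
from statistics import mean, variance

def independent_t(group1, group2):
    """t statistic and degrees of freedom, pooling the two sample variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    # Standard error of the difference between the two means.
    se_diff = (pooled * (1 / n1 + 1 / n2)) ** 0.5
    t = (mean(group1) - mean(group2)) / se_diff
    return t, df

# Hypothetical scores for two independent groups.
treatment = [4, 5, 6, 7, 8]
control = [1, 2, 3, 4, 5]

t, df = independent_t(treatment, control)
print(round(t, 3), df)
```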

Inferential Statistics

-- techniques for inferring from a sample to a larger population

Mean Squares Between

-- mean squares between: variance among the group means

-- reject the null if there is at least one pair of means with a significant difference

Measurement Scales

-- nominal: quality: same as "qualitative unordered"; kind

- use a number in place of a name

- not more or less assignment, just a coding classification

- not meaningful or permissible to get a mean

-- ordinal: quality; absence of property; rank order; same as "qualitative ordered"; arbitrary; differences have meaning but not necessarily twice as much

- usually used to turn interval or ratio data into ranks

- magnitude (how much); military rank

-- interval: quantitative; magnitude; zero is arbitrary; differences have meanings; property of equal intervals

-- ratio: quantitative; numbers show magnitude; equal intervals; zero is absence of the property (can't have negatives)

-- stats chart; what can be used for each kind of measurement

- nominal: freq. dist. histogram, mode

- ordinal: nominal plus median, range

- interval: all plus mean, polygon, variance, stand. dev.

- ratio: same as interval

- anything permissible with lower order var is permissible with higher one

Measures of Central Tendency

-- mode: most often-occurring score; used to construct a frequency distribution; tallest point of the normal distribution; most useful with nominal data

-- median: 50th percentile; divides the area under the curve into two equal parts

-- mean: arithmetic average: sum of all scores divided by number of scores (x-bar = sample mean, mu = population mean); other kinds of mean: geometric, harmonic, quadratic
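
A short sketch of these measures using Python's standard `statistics` module. The four scores are hypothetical; the quadratic mean is computed by hand because the module has no function for it.

```python
from statistics import mean, median, mode, geometric_mean, harmonic_mean

# Hypothetical scores (illustrative data only).
scores = [2, 4, 4, 8]

mo = mode(scores)      # most often-occurring score
md = median(scores)    # 50th percentile
m = mean(scores)       # sum of scores divided by number of scores

# Other kinds of mean mentioned in the notes.
gm = geometric_mean(scores)
hm = harmonic_mean(scores)
qm = (sum(x ** 2 for x in scores) / len(scores)) ** 0.5  # quadratic mean

print(mo, md, m)
print(round(gm, 3), round(hm, 3), round(qm, 3))
```

For positive scores that are not all equal, the harmonic mean is the smallest of the four means and the quadratic mean is the largest.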

Measures of Variability

-- standard deviation is the most common measure of variability, usually reported along with the mean

-- variance: of a distribution is equal to the sum of the squared deviation scores divided by the total # of scores in the distribution (N)

-- the higher the variance, the more variability; to compute it you must first calculate the mean of the dist. and then subtract the mean from each score in the dist. to get the deviation scores

-- variability: the spread, dispersion, or scatter of the data; instead of measuring averages, this will measure how the scores in the distribution vary

-- to interpret an individual score correctly we must know both the central tendency and the variability

-- variation ratio: V = ...

-- inclusive range: (XH-XL)+1

-- exclusive range: (XH-XL)

-- variance: average amount of squared deviation from the mean

-- S-squared: sample variance

-- sigma-squared = population variance

-- standard deviation: the square root of the variance; indicates the average amount of deviation in scores around the mean; in a normal or near-normal distribution about 2/3 of the scores fall within one standard deviation of the mean

-- describe the skewness of a frequency polygon
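
The variance and range formulas above can be followed step by step in a short sketch. The eight scores are hypothetical. Dividing by N follows the definition in these notes; dividing by N - 1 for the sample variance is the usual textbook convention and is an assumption here, since the notes do not spell it out.

```python
from statistics import mean

# Hypothetical scores (illustrative data only).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)

m = mean(scores)
deviations = [x - m for x in scores]             # score minus mean
sum_squared_dev = sum(d ** 2 for d in deviations)

population_variance = sum_squared_dev / n        # sigma-squared
sample_variance = sum_squared_dev / (n - 1)      # S-squared (assumed N - 1 form)
population_sd = population_variance ** 0.5       # square root of the variance

exclusive_range = max(scores) - min(scores)      # XH - XL
inclusive_range = exclusive_range + 1            # (XH - XL) + 1

print(population_variance, population_sd)
print(round(sample_variance, 3), exclusive_range, inclusive_range)
```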

Normal Distribution

-- SEE ALSO: standard scores

-- symmetrical

-- not normal is "skewed"

-- standard deviation is how far on average scores are above or below the mean

-- other distributions: asymmetrical, skewed, unimodal, bimodal, peaked, flat, etc.

-- "normal curve": AKA "normal probability distribution" or "bell-shaped curve"; one mode; bilaterally symmetrical (not skewed); median, mode, and mean all have the same value; theoretically, the tails extend to infinity and never touch the x axis

-- standardized normal curve: mean = 0, standard deviation = 1 (see also, Z scores)

-- normal curve table: shows what proportion of scores in a dist. falls within certain areas under the normal curve; has four columns; col. 1: Z scores valued 0 - 3.70; col. 2: specifies the proportion of the dist. that falls in the area from the mean to a particular Z score
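
Column 2 of such a table can be reproduced with the standard library, which may help when a printed table is not at hand. This is a sketch only: the function name `area_mean_to_z` is hypothetical, and the remaining table columns are not reproduced because the notes do not say what they contain.

```python
from statistics import NormalDist

# Standardized normal curve: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0, sigma=1)

def area_mean_to_z(z):
    """Proportion of the distribution between the mean and a given Z score."""
    return std_normal.cdf(abs(z)) - 0.5

one_sd = area_mean_to_z(1.0)
within_one_sd = 2 * one_sd   # the curve is symmetrical

for z in (0.5, 1.0, 1.96, 3.7):
    print(z, round(area_mean_to_z(z), 4))
print(round(within_one_sd, 4))
```

The area within one standard deviation on both sides of the mean comes out at a little over two thirds.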

Prediction & Regression

-- see also: correlation

-- the nature of relationships

-- prediction equation: stat procedure for predicting the value of one var from the value of another

-- bivariate prediction: uses one var to make predictions about another var

-- multivariate prediction: uses two or more vars to make predictions about another var

-- criterion var: value is estimated in a prediction

-- predictor var: var on whose values a prediction is based

-- regression line: AKA prediction line: a straight line on a graph that can be used to predict the value of the criterion var from the value of the predictor var

-- graph limitations: 1) predicted values are only approximations of the correct value, 2) the line may assume a perfect correlation that is only hypothetical

-- "line of best fit": a prediction line that minimizes the size of errors that are made when using it to make predictions. The "predicted" criterion value (Y-hat) is the value of the Y that is predicted by the regression line.

-- Y-hat: predicted Y value: distinguished from observed Y value

-- e: error of prediction

-- "least squares criterion": is satisfied when the prediction line is placed so that the sum of the squared prediction errors for all observations is as small as possible
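
A minimal sketch of a bivariate prediction line fitted by the least squares criterion. The function name `least_squares_line`, the variable names and the five pairs of scores are hypothetical; X is the predictor var and Y is the criterion var.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Slope and intercept of the line that minimizes squared prediction errors."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return slope, intercept

def sum_squared_errors(xs, ys, slope, intercept):
    """Sum of squared errors of prediction, e = Y - Y-hat."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical predictor (x) and criterion (y) scores.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(x, y)
y_hat = [intercept + slope * xi for xi in x]        # predicted Y values
errors = [yi - yh for yi, yh in zip(y, y_hat)]      # e for each observation
sse = sum_squared_errors(x, y, slope, intercept)

# Any other line gives a larger sum of squared errors.
sse_other = sum_squared_errors(x, y, slope + 0.1, intercept)

print(round(slope, 3), round(intercept, 3), round(sse, 3), round(sse_other, 3))
```

Nudging the slope or the intercept away from the fitted values always increases the sum of squared errors, which is what "line of best fit" means.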

Sampling Distributions

-- underlie hypotheses tests and confidence intervals in parametric, inferential statistics

Standard Deviation: Types

-- from Goodwin, REM 510

-- square root of the variance

-- Deviation Score: the measure of the difference between the mean and an individual score of a distribution; how far a given score is from the mean

-- Symmetrical unimodal distribution: the mean, median, mode all have the same value (skewed: measures of central tendency are not equal)

-- population standard deviation: variability of population raw scores

-- sample standard deviation: variability of sample raw scores

-- standard error of estimate: variability of ACTUAL scores on Y around a predicted score (Y-hat); used to make probability statements about an individual's actual Y given her Y-hat and to set confidence intervals around Y-hat; how much error there is in predicting Y from X

-- standard error of the mean: standard deviation of a sampling distribution of the mean; variability of sample means around mu; used to compute a confidence interval around a sample mean, or to conduct a one-group z test; can also be estimated from the sample standard deviation s

-- standard error of mean differences; used to conduct an independent-groups z-test (estimated for t-test)

-- standard error of the correlation: standard deviation of a sampling distribution of the correlation; variability of sample correlations (r) around population correlation; used to compute a confidence interval around a sample correlation (r) or to conduct a significance test on r

-- standard deviation of differences scores: variability of difference raw scores; used to calculate the standard error of differences

-- standard error of differences: standard deviation of a sampling distribution of difference scores; used for the dep-grps t-test

-- standard error of differences in proportions: standard deviation of a sampling distribution of differences in proportions; used for a sig test on the difference between two sample proportions

Standard Scores

-- Z-Score: to facilitate comparisons across distributions; how far from the mean in standard deviation amounts; not good for reporting because half the scores are negative; other standard scores build on z-scores; converting to z-scores does not change the shape of the distribution; mean = 0, S = 1

-- T-Score: mean = 50, S = 10 (no one would have a negative score, that's its purpose); T = 10z + 50 (converted from Z scores)

-- CEEB: College Entrance Exam Boards: CEEB = 100z + 500

-- AGCT: Army General Classification Test: mean = 100, S = 20

-- IQ: 15z + 100

-- all scores multiply z by S and add desired mean

-- "new" scale score = "new" S(z) + "new" mean

-- Stanine scores: ordinal, all other standard scores are interval; order high to low or vice versa [stair step]; quick way to make normal distribution, convert highly skewed data to stanines to make distribution normal; de-emphasize making individual comparisons

-- scores used to construct a standardized normal curve
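
The conversions above all follow the same pattern, new score = new S times z plus new mean, so one small helper covers them. The raw scores and the function names are hypothetical; the means and standard deviations for each scale are the ones listed in these notes.

```python
from statistics import mean, pstdev

def z_score(x, scores):
    """How far x is from the mean, in standard deviation units."""
    return (x - mean(scores)) / pstdev(scores)

def rescale(z, new_mean, new_sd):
    """Convert a z-score to another standard-score scale."""
    return new_sd * z + new_mean

# Hypothetical raw scores (mean 5, standard deviation 2).
raw = [2, 4, 4, 4, 5, 5, 7, 9]

z = z_score(9, raw)
t_score = rescale(z, new_mean=50, new_sd=10)   # T = 10z + 50
ceeb = rescale(z, new_mean=500, new_sd=100)    # CEEB = 100z + 500
agct = rescale(z, new_mean=100, new_sd=20)     # AGCT: mean 100, S 20
iq = rescale(z, new_mean=100, new_sd=15)       # IQ = 15z + 100

print(z, t_score, ceeb, agct, iq)
```

A raw score two standard deviations above the mean keeps that same relative position on every scale.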

Statistics (definition)

-- techniques for organizing, tabulating, summarizing and interpreting data

Summation

-- PPRMDAS: Pretty Please Remember My Dear Aunt Sally

- P: parenthesis

- R: radical (e.g. square root)

- M: *

- D: /

- A: +

- S: -

-- sum of deviations from the mean will always = 0
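
Both points in this section can be checked with a few lines: the order of operations changes the answer when summing, and deviations from the mean always sum to zero. The five scores and the variable names are hypothetical.

```python
# Hypothetical scores (illustrative data only).
scores = [3, 7, 8, 10, 12]
n = len(scores)

mean_score = sum(scores) / n

# Order of operations matters: square each score and then sum,
# versus sum the scores and then square the total.
sum_of_squares = sum(x ** 2 for x in scores)
square_of_sum = sum(scores) ** 2

# Deviations from the mean cancel out exactly for these numbers.
deviations = [x - mean_score for x in scores]
sum_of_deviations = sum(deviations)

print(mean_score, sum_of_squares, square_of_sum, sum_of_deviations)
```

With other data, floating-point rounding can leave a tiny nonzero remainder in the sum of deviations, so a comparison with a small tolerance is safer than testing for exactly zero.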

Trivia

-- asymptotic: theoretically the normal distribution never touches the x axis

Type I Error

-- in hypothesis testing, the probability of a Type I error (rejecting a null that is true) is the level of significance (alpha)

Variable

-- anything that can change; opposite is "constant" (e.g. we are human)

-- x for first var, y for second, z for third, n for # of cases (usually subjects)

-- sigma: summation sign; sum whatever follows it

Update: 2006-04-18T9:57:45-07:00