Glossary of Terms & Statistical Properties
Related to Assessing Young Children
Statistical Properties:
- Reliability: refers to the consistency of measurements taken by a
given test. Test results need to be reproducible, stable, and
meaningful. Reliability is usually reported as a reliability
coefficient, a number ranging from 0.00 to 1.00. The closer a test's
reliability coefficient is to 1.00, the more stable the test is over
time. There are several different types of reliability, including:
  - Test-Retest Reliability: this value gives an index of stability over
    time. For a preschool test, you would want a test-retest reliability
    of at least 0.90 over a 2 to 6 week period.
  - Alternate Form Reliability: this value, also called equivalent or
    parallel form reliability, indicates the degree to which two forms
    of a test are equivalent.
  - Internal Consistency Reliability: this value indicates the degree to
    which the items on a test consistently measure the same underlying
    idea.
- Validity: refers to the extent to which a test measures what it is
supposed to measure. It is important to acknowledge that tests are only
valid for the specific purpose for which they are designed; validity is
a matter of degree and depends on the context in which the test is used.
Like reliability, validity takes several forms, including:
  - Content Validity: refers to whether items on a test are
    representative of the domain or attribute the test is supposed to
    measure.
  - Criterion-Related Validity: refers to the relationship between the
    test scores and some other criterion or outcome. There are two
    further types of criterion-related validity:
    - Concurrent Validity: the extent to which the scores on a given
      test are related to some other available measure.
    - Predictive Validity: refers to whether the scores obtained on the
      test are an accurate predictor of future performance on the
      criterion. Ideally, preschool tests should measure skills that
      best predict future intellectual or academic abilities.
  - Construct Validity: refers to whether a test actually measures the
    domain or trait it is intended to measure.
Sources: Sattler, J. M. (1992). Assessment
of children: Revised and updated third edition [3rd ed.]. San
Diego: Jerome M. Sattler Publisher.
Bracken (1988), adapted by Caven S. Mcloughlin.
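
As a rough illustration of how two of the reliability coefficients described above are commonly computed, the sketch below uses a Pearson correlation between two testings for test-retest reliability and Cronbach's alpha for internal consistency. These are standard choices, not formulas given in the sources cited above. The function names and all scores are hypothetical, made up for this example.

```python
from statistics import mean, pvariance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (n >= 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of item scores per child."""
    k = len(item_scores[0])                  # number of items on the test
    columns = list(zip(*item_scores))        # regroup the scores item by item
    sum_item_var = sum(pvariance(col) for col in columns)
    total_var = pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum_item_var / total_var)


# Hypothetical raw scores for six children tested twice, a few weeks apart.
first_testing = [12, 15, 9, 20, 17, 11]
second_testing = [13, 14, 10, 19, 18, 11]
print("test-retest r:", round(pearson_r(first_testing, second_testing), 2))

# Hypothetical pass/fail (1/0) results for four children on a 3-item test.
items = [[1, 1, 1], [0, 0, 0], [1, 1, 0], [1, 0, 0]]
print("Cronbach's alpha:", round(cronbach_alpha(items), 2))
```

The same pearson_r function could be applied to scores from two forms of a test (alternate form reliability) or to test scores and a criterion measure (criterion-related validity), since each is described above as a relationship between two sets of scores.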
Other Technical Terms:
- Ability Test (IQ test): a type of test used to measure
ability in a given domain, typically intellectual ability as in an IQ test.
- Achievement Test: a type of test used to measure knowledge or
skills in one or more academic domains.
- Age Equivalent: a type of derived score that represents the
chronological age corresponding to a given raw score value. This does
not mean a child is performing at that age level; instead, it only means
the child earned the same number of points as the average child of that
age.
- Ceiling: the highest level on a given test; preschool tests
would ideally have a ceiling that is at least 2 standard deviations above
the mean, with 3 or 4 standard deviations preferred.
- Correlation: the relationship between two or more
variables. Correlations range from -1.00 to +1.00. The closer a value
is to -1.00 or +1.00, the stronger the relationship between the
variables; a value near 0.00 indicates little or no linear relationship.
Correlations are either positive, meaning that a high score on one
variable predicts a high score on another variable, or negative, meaning
that a low score on one variable predicts a high score on another
variable.
- Floor: the lowest level of a given test; preschool tests
would ideally have a floor that is at least 2 standard deviations below
the mean, with 3 or 4 standard deviations preferred.
- Grade Equivalent: a type of derived score that represents the
school grade corresponding to a given raw score value. This value does
not mean a child is performing at that grade level; instead, it only
signifies that the child earned the same raw score as the average child
in that grade.
- Mean: the average score in a distribution of scores.
- Normal Curve: a symmetrical bell-shaped distribution of
scores where the highest frequency of scores cluster around the mean and
more infrequent scores lie at the outer tails. Most test scores are
assumed to fall along a normal curve distribution.
- Norm Tables: tables, organized by chronological age or school
grade, that list the standard scores corresponding to a given raw score on
a test. Ideally, age-based tables for preschool tests would use 1 to 2
month divisions rather than 3 or 4 month divisions.
- Percentile Rank: a type of derived score that allows one to
know a child's relative position on a given distribution of scores,
typically a normal distribution.
- Standard Deviation: a measure of the degree to which scores on a
test deviate from the mean score.
- Standard Scores: a raw score that has been transformed to
have a given mean and standard deviation. Such a transformation is
helpful in order to compare scores across different tests. Most tests
report their results in one or more types of standard score:
  - Stanford-Binet IQ Scores: Mean of 100 and Standard Deviation of 16.
  - Scaled Score: Mean of 10 and Standard Deviation of 3.
  - T-Scores: Mean of 50 and Standard Deviation of 10.
  - Wechsler IQ Scores: Mean of 100 and Standard Deviation of 15.
  - Z-Score: Mean of 0 and Standard Deviation of 1.
- Stanine: a contraction of "standard nine"; expresses a score as
a whole number ranging from 1 to 9, with a mean of 5 and a standard
deviation of 2.
- Raw Score: a value corresponding to the number of items a
child answered correctly on a test. Because raw scores are meaningless
by themselves, they are typically transformed into one or more standard
scores.
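
The relationships among raw scores, standard scores, percentile ranks, and stanines defined in this glossary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the small norm sample is made up, the percentile rank assumes a normal distribution, and stanines are approximated by rounding rather than by fixed percentage bands. Real tests use published norm tables instead of formulas like these.

```python
import math
from statistics import NormalDist, mean, pstdev

# (mean, standard deviation) of each score type, as listed in the glossary.
SCALES = {
    "z": (0, 1),
    "T": (50, 10),
    "scaled": (10, 3),
    "wechsler_iq": (100, 15),
    "stanford_binet_iq": (100, 16),
}


def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the norm mean, in standard deviations."""
    return (raw - norm_mean) / norm_sd


def to_scale(z, scale):
    """Re-express a z-score on another standard-score scale."""
    scale_mean, scale_sd = SCALES[scale]
    return scale_mean + z * scale_sd


def percentile_rank(z):
    """Percent of a normal distribution falling at or below z (assumption)."""
    return 100 * NormalDist().cdf(z)


def stanine(z):
    """Approximate stanine: mean 5, SD 2, rounded half up, clipped to 1..9."""
    return max(1, min(9, math.floor(5 + 2 * z + 0.5)))


# Hypothetical norm sample of raw scores (made up for this example).
norm_sample = [2, 4, 4, 4, 5, 5, 7, 9]
norm_mean, norm_sd = mean(norm_sample), pstdev(norm_sample)

child_raw = 7
z = z_score(child_raw, norm_mean, norm_sd)
for name in SCALES:
    print(name, to_scale(z, name))
print("percentile rank:", round(percentile_rank(z), 1))
print("stanine:", stanine(z))
```

In this made-up sample the mean is 5 and the standard deviation is 2, so a raw score of 7 lies one standard deviation above the mean. On the scales listed above that corresponds to a T-score of 60, a scaled score of 13, a Wechsler IQ of 115, and a Stanford-Binet IQ of 116.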