Statistics Glossary


Accidental Sample A sampling technique that makes no attempt to achieve representativeness, but chooses subjects based on convenience and accessibility. FOR EXAMPLE, "person-on-the-street" interviews.

Activities Services or functions carried out by a program (i.e., what the program does). FOR EXAMPLE, treatment programs may screen clients at intake, complete placement assessments, provide counseling to clients, etc.

After-Only Designs One-shot studies; evaluation designs involving only measures taken after the program has been completed.

Analysis A systematic approach to problem solving. Complex problems are made simpler by separating them into more understandable elements. This involves the identification of purposes and facts, the statement of defensible assumptions, and the formulation of conclusions.

Analysis of Covariance A method for analyzing the differences in the means of two or more groups of cases while taking account of variation in one interval-ratio variable.

Analysis of Variance A method for analyzing the differences in the means of two or more groups of cases.
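
As a rough illustration, the sketch below computes a one-way F statistic by hand: the between-group mean square divided by the within-group mean square. The function name `one_way_f` and the three groups of scores are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of cases
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the cases around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three groups of cases
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_f(groups)
print(round(f_stat, 2))
```

A large F suggests that the group means differ by more than the variation within groups would lead one to expect; judging significance additionally requires the F distribution, which this sketch does not cover.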

Anchors Anchors are items that serve as reference points from which other items in the series or other points in the scale are judged or compared. FOR EXAMPLE, the opposite ends or poles of a scale identify the extremes so all values within the scale are either greater or less than one of these extremes. Also, the scale midpoint serves as an anchor in that it either divides the scale into categories or quantifies the half value.

Applied Research Research designed for the purpose of producing results that may be applied to real world situations.

Association General term for the relationship among variables.

Asymmetric Measure of Association A measure of association that makes a distinction between independent and dependent variables.

Attitude Surveys Data collection techniques designed to collect standard information from a large number of subjects concerning their attitudes or feelings. These typically refer to questionnaires or interviews. FOR EXAMPLE, a questionnaire may be mailed to residents in a community to assess how 'safe' they feel in their neighborhoods.

Attribute A characteristic that describes a person, thing, or event. FOR EXAMPLE, being female and male are attributes of persons.

Attrition The loss of subjects during the course of a study. This may be a threat to the validity of conclusions if participants of study and comparison/control groups drop out at different rates or for different reasons. FOR EXAMPLE, if treatment participants fail to appear for treatment and are subsequently excluded from the follow-up, the treatment and control subjects remaining may not be "comparable" due to attrition.

Audit The systematic examination of records and the investigation of other evidence to determine the propriety, compliance, and adequacy of programs, systems, and operations. The auditing process may include tools and techniques available from such diverse areas as engineering, economics, statistics, and accounting. The U.S. General Accounting Office auditing standards are applicable to all levels of government and not only relate to auditing of financial operations, but also are concerned with whether governmental organizations are: (1) achieving the purposes for which programs are authorized and funds made available, (2) operating economically and efficiently, and (3) complying with applicable laws and regulations.


Backfill Techniques Techniques used in cumulative case studies to collect information needed if the study is to be usable for aggregation; these techniques include, for example, obtaining missing information from the authors on how instances studied were identified and selected.

Baseline Data Initial information on a program or program components collected prior to receipt of services or participation activities. Baseline data are often gathered through intake interviews and observations and are used later for comparing measures that determine changes in a program.

Batch A group of cases for which no assumptions are made about how the cases are selected. A batch may be a population, a probability sample, or a nonprobability sample, but the data are analyzed as if the origin of the data is not known.

Before-After Designs The elementary quasi-experimental design known as the before-after design involves the measurement of "outcome" indicators (e.g., arrest rates, attitudes) prior to implementation of the treatment, and subsequent re-measurement after implementation. Any change in the measure is attributed to the treatment. This design provides a significant improvement over the one-shot study because it measures change in the factor(s) to be impacted. However, this design suffers from threats of history - the possibility that some alternate factor (besides the treatment) has actually caused the change.

Bell-Shaped Curve A distribution with roughly the shape of a bell; often used in reference to the normal distribution but others, such as the t distribution, are also bell-shaped.

Benchmarking Measuring progress toward a goal at intervals prior to the anticipated attainment of the goal. FOR EXAMPLE, measuring and tracking grade-level performance of students in a remedial program at intervals prior to completion of the program.

Benchmarks Measures of progress toward a goal, taken at intervals prior to the program's completion or the anticipated attainment of the final goal. FOR EXAMPLE, semi-annual measures of grade-level performance taken prior to completion of a remedial education program.

Between-Group Variances Indications of how the mean and variances of each group differ from the other groups.

Bias The extent to which a measurement, sampling, or analytic method systematically underestimates or overestimates the true value of an attribute. FOR EXAMPLE, an interviewer's attitudes and mannerisms may unfairly influence a respondent's answer to a question. Bias in questionnaire data can also stem from the choice of words, sentence structure, and the sequence of questions.

Biased Sample A sample that is not representative of the population to which generalizations are to be made. FOR EXAMPLE, a group of band students would not be representative of all students at the middle school, and thus would constitute a biased sample if the intent was to generalize to all middle school students.

Binary Variable A variable that identifies the presence or absence of a trait, characteristic, opinion, etc.; a "yes/no" variable. FOR EXAMPLE, Male - 0=No, 1=Yes.

Bivariate Analysis An analysis of the relationship between two variables. FOR EXAMPLE, an analysis of the relationship between sex (male/female) and delinquent activity, taking no other factors into account.

Bivariate Data Information about two variables.

Box-and-Whisker Plot A graphic way of depicting the shape of a distribution.
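
A box-and-whisker plot is commonly drawn from a five-number summary (minimum, lower quartile, median, upper quartile, maximum). The sketch below computes that summary with Python's standard library. The data are hypothetical, and the "inclusive" quartile method is only one of several conventions in use.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))

# Hypothetical observations
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary = five_number_summary(data)
print(summary)
```

The box spans the two quartiles, the line inside it marks the median, and the whiskers extend toward the minimum and maximum.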


Case A single person, thing, or event for which attributes have been or will be observed. FOR EXAMPLE, a case would be one student if the sample to be studied were 250 high school students.

Case Study A method for learning about a complex instance, based on a comprehensive understanding of that instance, obtained by extensive description and analysis of the instance, taken as a whole and in its context.

Categorical Measure A measure that places data into a limited number of groups or categories. FOR EXAMPLE, Current Marital Status - Married, Never Married, Divorced, Widowed.

Causal Analysis A method for analyzing the possible causal associations among a set of variables.

Causal Association A relationship between two variables in which a change in one brings about a change in the other. FOR EXAMPLE, caffeine intake and sleeplessness are causally related if ingesting greater amounts of caffeine results in longer times taken to fall asleep.

Causal Model A model or portrayal of the theorized causal relationships between concepts or variables.

Causal Relationship The relationship of cause and effect. The cause is the act or event that produces the effect. The cause is necessary to produce the effect. FOR EXAMPLE, increasing the number of police on patrol causes crime to decrease.

Central Tendency General term for the midpoint or typical value of a distribution. FOR EXAMPLE, one measure of central tendency of a group of high school students is the average (mean) age of the students.
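
The three most common measures of central tendency (mean, median, and mode) can be computed as in the sketch below. The student ages are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical ages of seven high school students
ages = [14, 15, 15, 16, 17, 15, 18]

average_age = mean(ages)    # arithmetic average
middle_age = median(ages)   # middle value once the ages are sorted
typical_age = mode(ages)    # most frequently occurring age

print(average_age, middle_age, typical_age)
```

The three measures need not agree; the mean is pulled toward extreme values, while the median and mode are not.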

Closed Question A question with more than one possible answer from which one or more answers must be selected.

FOR EXAMPLE, the following is a closed question:
Sex: (1) Male (2) Female.

The following is not a closed question:
What is your political affiliation? _____________________.

Closed-Ended Questions Questions that limit responses to predetermined categories. FOR EXAMPLE, multiple choice and yes/no questions.

Cluster Sample A probability sample in which groups of cases (clusters), such as jurisdictions, are randomly selected rather than individual cases.

Clustering Identifying similar characteristics and grouping cases with similar characteristics together.

Codebook A document which lists the variables in a dataset, possible values for each variable, and the definitions of codes that have been assigned to these values.

Coding The process of converting information obtained on a subject or unit into coded values (typically numeric) for the purpose of data storage, management, and analysis. FOR EXAMPLE, the sex of the respondent may be coded "1" for a female and "2" for a male.
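
A minimal sketch of coding, using the sex codes from the example above. The missing-value code of 9 and the function name `code_sex` are assumptions made for illustration; an actual study would take its codes from the codebook.

```python
# Codes from the example above: 1 = female, 2 = male
SEX_CODES = {"female": 1, "male": 2}
MISSING = 9  # hypothetical code for unusable or missing responses

def code_sex(raw_response):
    """Convert a raw text response into its numeric code."""
    return SEX_CODES.get(raw_response.strip().lower(), MISSING)

responses = ["Female", "male ", " FEMALE", "unknown"]
coded = [code_sex(r) for r in responses]
print(coded)
```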

Comparative Change Design The quasi-experimental design known as the comparative change design allows for the measurement of change in relevant outcome factors (using a pre- and post-test) and provides for comparison of this change between a treatment group and a non-random comparison group. Because comparison and treatment groups are not randomly selected, alternate explanations due to prior differences between groups continue to be a threat.

Comparative Post-test Design The elementary quasi-experimental design known as the comparative post-test design involves the measurement of outcomes for both the treatment group as well as a comparison group. However, unlike more sophisticated designs, selection of participants into the treatment and comparison groups is not done randomly. While such a design to some extent overcomes the issues of a one-shot study by allowing comparisons of success, this design is typically plagued by threats due to selection bias. That is, an alternate explanation for differences between group outcomes is that some alternate factor, which was related to the selection process, has actually caused the differences in outcomes.

Comparative Time-Series Design The quasi-experimental design known as the comparative time series tracks some outcome of interest for periods before and after program implementation for both the treatment group as well as a non-randomly selected comparison group. Because comparison and treatment groups are not randomly selected, alternate explanations due to prior differences between groups continue to be a threat.

Comparison Group A group of individuals whose characteristics are similar to those of a program's participants. These individuals may not receive any services, or they may receive a different set of services, activities, or products; in no instance do they receive the same services as those being evaluated. As part of the evaluation process, the experimental group (those receiving program services) and the comparison group are assessed to determine which types of services, activities, or products provided by the program produced the expected changes.

Composite Measure A measure constructed using several alternate measures of the same phenomenon. FOR EXAMPLE, a measure of class standing may be constructed using grade point average, standardized test scores, and teacher rankings.

Concept An abstract or symbolic tag that attempts to capture the essence of reality. The "concept" is later converted into variables to be measured. FOR EXAMPLE, "crime" or "recidivism."

Conditional Distribution The distribution of one or more variables given that one or more other variables have specified values. FOR EXAMPLE, the distribution of time in jail only for female (sex=female) inmates.

Confidence Interval An estimate of a population parameter that consists of a range of values bounded by statistics called upper and lower confidence limits, within which the value of the parameter is expected to be located.

Confidence Level The level of certainty to which an estimate can be trusted. The degree of certainty is expressed as the chance that a true value will be included within a specified range, called a confidence interval.

Confidence Limits Two statistics that form the upper and lower bounds of a confidence interval.
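
The relationship among confidence level, confidence limits, and confidence interval can be sketched for a sample mean as follows. This is a simplified illustration that assumes a simple random sample and uses the normal approximation; for small samples the t distribution is usually preferred. The sample values and the function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_limits(sample, level=0.95):
    """Return (lower, upper) confidence limits for the population mean.

    Assumes a simple random sample and a normal approximation.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% level
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return center - z * standard_error, center + z * standard_error

# Hypothetical sample of ten measurements
sample = [12, 15, 11, 14, 13, 16, 12, 15, 14, 13]
lower, upper = mean_confidence_limits(sample, level=0.95)
print(round(lower, 2), round(upper, 2))
```

Raising the confidence level (for example, from 95% to 99%) widens the interval between the two limits.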

Confidentiality Secrecy. In research this involves not revealing the identity of research subjects, or factors which may lead to the identification of individual research subjects.

Confidentiality Form A written form that assures evaluation participants that information they provide will not be openly disclosed nor associated with them by name. Since an evaluation may entail exchanging or gathering privileged or sensitive information about residents or other individuals, a confidentiality form ensures that the participants' privacy will be maintained.

Confounding An inability to distinguish the separate impacts of two or more individual variables on a single outcome. FOR EXAMPLE, there has over time been an inability to adequately distinguish the separate impacts of genetics and environmental factors on IQ.

Consensus Building Outcome The production of a common understanding among participants about issues and programs.

Constraint A limitation of any kind to be considered in planning, programming, scheduling, implementing, or evaluating programs. FOR EXAMPLE, a major constraint to the development of many programs is the amount of funds available.

Construct A concept that describes and includes a number of characteristics or attributes. The concepts are often unobservable ideas or abstractions. FOR EXAMPLE, "community" or "peer pressure."

Construct Validity The extent to which a measurement method accurately represents a construct and produces an observation distinct from that produced by a measure of another construct.

Consultant An individual who provides expert or professional advice or services, often in a paid capacity.

Contamination The tainting of members of the comparison or control group with elements from the program. Contamination threatens the validity of the study because the group is no longer untreated for purposes of comparison.

Content Analysis A set of procedures for collecting and organizing nonstructured information into a standardized format that allows one to make inferences about the characteristics and meaning of written and otherwise recorded material.

Content Validity The ability of the items in a measuring instrument or test to adequately measure or represent the content of the property that the investigator wishes to measure.

Continuous Variable A quantitative variable with an infinite number of attributes. FOR EXAMPLE, distance or length.

Control Group A group of individuals whose characteristics are similar to those of the program participants but who do not receive the program services, products, or activities being evaluated. Participants are randomly assigned to either the experimental group (those receiving program services) or the control group. A control group is used to assess the effect of program activities on participants who are receiving the services, products, or activities being evaluated. The same information is collected for people in the control group and those in the experimental group.

Control Variable A variable that is held constant or whose impact is removed in order to analyze the relationship between other variables without interference, or within subgroups of the control variable. FOR EXAMPLE, if the relationship between age and frequency of delinquent activity is first investigated for male students, then separately investigated for female students, then sex has been used as a control variable.

Convenience Sample A sample for which cases are selected only on the basis of feasibility or ease of data collection. This type of sample is rarely useful in evaluation and is usually hazardous.

Correlation A synonym for association or the relationship between variables.

Correlation coefficient A numerical value that identifies the strength of relationship between variables.
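
One widely used correlation coefficient for interval-ratio variables is Pearson's r, which ranges from -1 (perfect negative relationship) through 0 (no linear relationship) to +1 (perfect positive relationship). The sketch below computes it by hand; the data are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    mean_x, mean_y = mean(x), mean(y)
    # Sum of cross-products of deviations from the means
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
print(round(r, 3))
```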

Cost-Benefit A criterion for comparing programs and alternatives when benefits can be valued in dollars. Cost-benefit is the ratio of dollar value of benefit divided by cost. It allows comparison between programs and alternative methods.

Cost-Benefit Analysis An analysis that compares present values of all benefits less those of related costs when benefits can be valued in dollars the same way as costs. A cost-benefit analysis is performed in order to select the alternative that maximizes the benefits of a program.
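
The arithmetic behind a cost-benefit comparison can be sketched as below: yearly benefits and costs are discounted to present value, then compared as a difference (net benefit) and as a ratio. The dollar figures, the 5% discount rate, and the function name are hypothetical.

```python
def present_value(yearly_amounts, rate):
    """Discount a list of yearly amounts (year 0 first) to present value."""
    return sum(amount / (1 + rate) ** year
               for year, amount in enumerate(yearly_amounts))

# Hypothetical program: start-up cost in year 0, benefits in years 1 and 2
benefits = [0, 60_000, 60_000]
costs = [100_000, 5_000, 5_000]
rate = 0.05  # assumed discount rate

pv_benefits = present_value(benefits, rate)
pv_costs = present_value(costs, rate)

net_benefit = pv_benefits - pv_costs   # benefits less costs
ratio = pv_benefits / pv_costs         # benefit divided by cost

print(round(net_benefit, 2), round(ratio, 3))
```

A ratio above 1 (equivalently, a positive net benefit) indicates that discounted benefits exceed discounted costs under the stated assumptions.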

Cost-Effectiveness A criterion for comparing alternatives when benefits or outputs cannot be valued in dollars. This relates costs of programs to performance by measuring outcomes in nonmonetary form. It is useful in comparing methods of attaining an explicit objective on the basis of least cost or greatest effectiveness for a given level of cost. FOR EXAMPLE, a treatment program may be more cost-effective than an alternative program if it produces a lower rate of recidivism for the same or lower costs, or the same rate of recidivism for a lower cost.

Covariation The degree to which two measures vary together.

Cross-Sectional Data Observations collected on subjects or events at a single point in time.

Criminal History Files Offender-based files which capture prior arrest records for each offender arrested in that jurisdiction.

Cues The alternative responses to questions that increase or decrease in intensity in an ordered fashion. The interviewee is asked to select one answer to the question.

Culture The shared values, traditions, norms, customs, arts, history, institutions, and experience of a group of people. The group may be identified by race, age, ethnicity, language, national origin, religion, or other social categories or groupings.

Cultural Relevance Demonstration that evaluation methods, procedures, and/or instruments are appropriate for the cultures to which they are applied. FOR EXAMPLE, having questionnaires available in multiple languages may make them more culturally relevant.


Data Documented information or evidence of any kind.

Data Analysis The process of systematically applying statistical and logical techniques to describe, summarize, and compare data.

Data Collection Instrument A form or set of forms used to collect information for an evaluation. Forms may include interview instruments, intake forms, case logs, and attendance records. They may be developed specifically for an evaluation or modified from existing instruments.

Data Collection Plan A written document describing the specific procedures to be used to gather the evaluation information or data. The document describes who collects the information, when and where it is collected, and how it is obtained.

Database A collection of information that has been systematically organized for easy access and analysis. Databases typically are computerized.

Demographic Question A question used in compiling vital background and social statistics. FOR EXAMPLE, age, marital status, or size of household.

Dependent Variable A variable that may, it is believed, be predicted by or caused by one or more other variables called independent variables. FOR EXAMPLE, if it is hypothesized that the treatment will reduce rearrest for drug use, then "rearrest for drug use" is the dependent variable, which is impacted by the independent variable or treatment.

Descriptive Statistic A statistic used to describe a set of cases upon which observations were made. FOR EXAMPLE, the average age of a class in high school calculated by using all members of that class.

Design The overall plan for a particular evaluation. The design describes how program performance will be measured and includes performance indicators.

Dichotomous Variable A variable with only two possible values. FOR EXAMPLE, "sex."

Direct Benefit Result that is closely related with the program by cause and effect. FOR EXAMPLE, increased adherence to probation restrictions is a result of participation in a compliance and sanctions program.

Direct Costs Resources that must be committed to implement a program. FOR EXAMPLE, program staff salaries.

Direct Impact An effect of a program that addresses a stated goal or objective of that program.

Discrete Variable A quantitative variable with a finite number of attributes. FOR EXAMPLE, day of the month.

Dispersion The extent of variation among cases.
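
Common measures of dispersion include the range, the variance, and the standard deviation. The sketch below computes all three for a hypothetical set of cases, treating the cases as a complete population rather than a sample.

```python
from statistics import pstdev, pvariance

# Hypothetical values for eight cases
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)  # distance between the extremes
variance = pvariance(values)             # average squared deviation from the mean
std_dev = pstdev(values)                 # square root of the variance

print(value_range, variance, std_dev)
```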

Distribution of a Variable Variation of characteristics across cases.

Document Review A technique of data collection involving the examination of existing records or documents. FOR EXAMPLE, the examination of court documents to collect offender sentences.

Dummy Variable A dichotomous variable, typically used in regression analysis, which indicates the existence (or lack of existence) of a characteristic or group of characteristics in a case. FOR EXAMPLE, "white/male - 0=No, 1=Yes."
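
A minimal sketch of constructing the "white/male" dummy variable from the example above. The case records and field names are hypothetical.

```python
# Hypothetical cases with two recorded attributes
cases = [
    {"race": "white", "sex": "male"},
    {"race": "white", "sex": "female"},
    {"race": "black", "sex": "male"},
]

def white_male_dummy(case):
    """Return 1 if the case is both white and male, otherwise 0."""
    return 1 if case["race"] == "white" and case["sex"] == "male" else 0

dummies = [white_male_dummy(c) for c in cases]
print(dummies)
```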


Ecological Fallacy False conclusions made by assuming relationships found through research with groups can be attributed to individuals.

Effect Size The size of the relationship between two variables (particularly between program variables and outcomes).

Effectiveness Ability to achieve stated goals or objectives, judged in terms of both output and impact.

Efficiency The degree to which outputs are achieved in terms of productivity and input (resources allocated). Efficiency is a measure of performance in terms of which management may set objectives and plan schedules and for which staff members may be held accountable.

Empirical Relying upon or derived from observation or experiment.

Empirical Research Research that uses data drawn from observation or experience.

Empirical Validity Empirical evidence that an instrument measures what it has been designed to measure.

Estimation Error The amount by which an estimate differs from a true value. This error includes the error from all sources (for example, sampling error and measurement error).

Evaluability Assessment An evaluability assessment is a systematic process used to determine the feasibility of a program evaluation. It also helps determine whether conducting a program evaluation will provide useful information that will help improve the management of a program and its overall performance.

Evaluation Evaluation has several distinguishing characteristics relating to focus, methodology, and function. Evaluation (1) assesses the effectiveness of an ongoing program in achieving its objectives, (2) relies on the standards of project design to distinguish a program's effects from those of other forces, and (3) aims at program improvement through a modification of current operations.

Evaluation Plan A written document describing the overall approach or design that will be used to guide an evaluation. It includes what will be done, how it will be done, who will do it, when it will be done, and why the evaluation is being conducted.

Evaluation Practice A practice or set of practices that consist mainly of management information and data incorporated into regular program management information systems to allow managers to monitor and assess the progress being made in each program toward its goals and objectives. Ideally, a program is self-evaluating, continuously monitoring its own activities.

Evaluation Team The individuals, such as the evaluation consultant and staff, who participate in planning and conducting the evaluation. Team members assist in developing the evaluation design, developing data collection instruments, collecting data, analyzing data, and writing the report.

Ex-post Facto Design A research design in which group selection is made, and pretest and posttest data are collected, after completion of the treatment. The evaluator is thus not involved in the selection or placement of individuals into comparison or control groups. All evaluation decisions are made retrospectively.

Experimental Data Data produced by an experimental or quasi-experimental design.

Experimental Design A research design in which the researcher has control over the selection of participants in the study, and these participants are randomly assigned to treatment and control groups.

Experimental Group A group of individuals participating in the program activities or receiving the program services being evaluated or studied. Experimental groups (also known as treatment groups) are usually compared to a control or comparison group.

Experimental Mortality The loss of subjects from an experiment due to such factors as illness, lack of interest, or refusal to participate. This loss may affect the comparability of results between the experimental and control groups.

External Validity The extent to which a finding applies (or can be generalized) to persons, objects, settings, or times other than those that were the subject of study.

External Validity Threats Factors that may reduce the transferability of a program's findings to other groups or jurisdictions. FOR EXAMPLE, a program may appear successful using a group of specially selected clients (e.g., first time offenders). However, it would not be a fair test of how this program would work on the general offender population.


Feasibility Study A study of the applicability or practicability of a proposed action or plan.

Field Notes A written record of observations, interactions, conversations, situational details, and thoughts during the study period.

Flow Chart A graphic presentation using symbols to show the step-by-step sequence of operations, activities, or procedures. Used in computer system analysis, activity analysis, and in general program sequence representations.

Focus Group A group of 7 to 10 people convened for the purpose of obtaining perceptions or opinions, suggesting ideas, or recommending actions. A focus group is a method of collecting information for the evaluation process.

Focused Interview An interview organized around several predetermined questions or topics but providing some flexibility in the sequencing of the questions and without a predetermined set of response categories or specific data elements to be obtained.

Forced-Choice Question A question that requires respondents to choose between available options. Options such as "other" or "none of the above" are not available alternatives.

Forecasting Estimating the likelihood of an event taking place in the future, based on available data from the past.

Formative Evaluation A type of process evaluation of new programs or services that focuses on collecting data on program operations so that needed changes or modifications can be made to the program in the early stages. Formative evaluations are used to provide feedback to staff about the program components that are working and those that need to be changed.

Frequency Distribution A distribution of the count of cases corresponding to the attributes of an observed variable. FOR EXAMPLE, a frequency distribution of a class of 45 students may indicate that 25 were male and 20 were female.
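
The class of 45 students in the example above can be tallied into a frequency distribution as in this sketch. The list of individual cases is constructed here purely for illustration.

```python
from collections import Counter

# One entry per student, matching the example: 25 male and 20 female
students = ["male"] * 25 + ["female"] * 20

frequency = Counter(students)  # count of cases for each attribute
for attribute, count in frequency.items():
    share = 100 * count / len(students)
    print(f"{attribute}: {count} ({share:.1f}%)")
```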

Function A group of related activities and/or projects for which an organizational unit is responsible. This is also the principal purpose a program is intended to serve.


Gamma A measure of association; a statistic used with ordinal variables.
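
Assuming the statistic meant here is Goodman and Kruskal's gamma, it compares pairs of cases: a pair is concordant if the case ranked higher on one variable is also higher on the other, and discordant if the order is reversed. Gamma is the difference between the two counts divided by their sum, with tied pairs ignored. The sketch below uses hypothetical ordinal ratings.

```python
from itertools import combinations

def gamma(x, y):
    """Return Goodman and Kruskal's gamma for two ordinal variables.

    Pairs tied on either variable are ignored. Raises ZeroDivisionError
    if every pair is tied.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

# Hypothetical ordinal ratings (1 = low, 3 = high) for five cases
x = [1, 1, 2, 2, 3]
y = [1, 2, 1, 3, 3]
g = gamma(x, y)
print(round(g, 3))
```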

Generalizability The extent to which the findings of a study can be applied to other populations, settings, or times. FOR EXAMPLE, the findings that a treatment program for adult females reduced recidivism may not be subsequently generalized to include adult males or juveniles.

Goal A desired state of affairs that outlines the ultimate purpose of a program. This is the end toward which program efforts are directed. FOR EXAMPLE, the goal of many criminal justice programs is a reduction in criminal activity.


Halo Effect Bias created by an observer's tendency to rate, perhaps unintentionally, certain objects or persons in a manner that reflects what was previously anticipated.

Hawthorne Effect A tendency of research subjects to act atypically as a result of their awareness of being studied, as opposed to any actual treatment that has occurred. FOR EXAMPLE, if a school principal observes a classroom of students reacting politely and enthusiastically to a new student teacher, such behavior could be a result of the principal's presence in the classroom, as opposed to the quality of the student teacher.

Histogram A graphic depiction of the distribution of a variable.

History Effect This threat to internal validity refers to specific events, other than the program, which may have taken place during the course of study which may have produced the results. FOR EXAMPLE, a highly publicized trial involving local law enforcement which occurs during the time of program operation may impact community attitudes.

Hypothesis A specific statement regarding the relationship between two variables. In evaluation research, this typically involves a prediction that the program or treatment will cause a specified outcome. Hypotheses are supported or rejected based on empirical analysis.


Immediate Outcome The changes in program participants' knowledge, attitudes, and behavior that occur at certain times during program activities. FOR EXAMPLE, acknowledging gang involvement is an immediate outcome.

Impact The ultimate effect of the program on the problem or condition that the program or activity was supposed to do something about. FOR EXAMPLE, a 10% reduction in drug activity as a result of increased drug enforcement and investigation. (There also may be unexpected or unintended impacts.)

Impact Evaluation A type of outcome evaluation that focuses on the broad, long-term impacts or results of program activities. FOR EXAMPLE, an impact evaluation could show that a decrease in a community's crime rate is the direct result of a program designed to provide community policing.

Implementation Development of a program. The process of putting all program functions and activities into place.

Implementation Strategy The plan for development of a program and procedure for ensuring the fulfillment of intended functions or services.

Implemented Developed or put into place.

Incident-based Crime Files Databases or files which maintain information on each offense or incident of crime occurring in a jurisdiction.

Independent Variable A variable that may, it is believed, predict or cause fluctuation in a dependent variable. FOR EXAMPLE, if it is believed that age influences the frequency of delinquent behavior, age is the independent variable and frequency of delinquent behavior is the dependent variable. In evaluation research, the treatment (or lack of) is typically treated as an independent variable since it is hypothesized that the treatment will influence some subsequent behavior or state.

Index A set of related measures combined to characterize a more abstract concept.

Index Crimes Part 1 crimes under the Uniform Crime Reporting System. These include murder and non-negligent manslaughter, forcible rape, robbery, aggravated assault, burglary, larceny-theft, motor vehicle theft, and arson.

Index of Dispersion A measure of spread; a statistic used especially with nominal variables.
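
One common formulation of such an index for nominal data (sometimes called the index of qualitative variation) is k(N^2 - sum of f^2) / (N^2 (k - 1)), where k is the number of categories, N the number of cases, and f the count in each category. Whether this is the exact formula intended by the definition above is an assumption. The index is 0 when all cases fall in one category and 1 when cases are spread evenly across categories.

```python
def index_of_dispersion(counts):
    """Return the index of dispersion for a list of category counts.

    Uses k * (N^2 - sum(f^2)) / (N^2 * (k - 1)); requires at least two
    categories and at least one case.
    """
    k = len(counts)  # number of categories
    n = sum(counts)  # number of cases
    return k * (n ** 2 - sum(f ** 2 for f in counts)) / (n ** 2 * (k - 1))

# Hypothetical counts across three categories
even_spread = index_of_dispersion([10, 10, 10])
no_spread = index_of_dispersion([30, 0, 0])
print(even_spread, no_spread)
```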

Indicator A measure that consists of ordered categories arranged in ascending or descending order of desirability.

Indirect Benefit Results that are related to a program, but not its intended objectives or goals. FOR EXAMPLE, an increase in acceptable caseload per probation officer is due to an increased adherence to probation restrictions arising from a compliance program.

Indirect Costs The costs associated with impacts or consequences of a program. FOR EXAMPLE, the costs due to reincarceration.

Indirect Impact An effect of a program that is not associated with one of its stated objectives.

Inferential Statistic A statistic used to describe a population using information from observations on only a probability sample of cases from the population. FOR EXAMPLE, the average age of a class in high school calculated using a random sample of members of that class.

Informed Consent A written agreement by the program participants to voluntarily participate in an evaluation or study after having been advised of the purpose of the study, the type of the information being collected, and how information will be used.

Information System An organized collection, storage, and presentation system of data and other knowledge for decision making, progress reporting, and for planning and evaluation of programs. It can be either manual or computerized, or a combination of both.

In-Person Interviewing Face-to-face interviewing. The interviewer meets personally with the respondent to conduct the interview.

Input Organizational units, people, dollars, and other resources actually devoted to the particular program or activity.

Instrument A tool used to collect and organize information. FOR EXAMPLE, questionnaires, scales, tests.

Instrumental Outcome A measure or measures of phenomena directly related to program goals and objectives.

Instrumentation Bias Bias introduced in a study by a change in the measurement instrument during the course of the study. FOR EXAMPLE, the scale loses its calibration over time or a stopwatch slows.

Intermediate Outcome Results or outcomes of program activities that must occur prior to the final outcome in order to produce the final outcome. FOR EXAMPLE, a prison vocation program must first result in increased employment (intermediate outcome) before it may expect to reduce recidivism (final outcome).

Internal Consistency The extent to which all items in a scale or test measure the same concept.
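
Internal consistency is often summarized with Cronbach's alpha, a statistic this glossary does not itself name. The sketch below is a minimal version under that assumption: it takes one list of scores per item, all over the same respondents, and uses sample variances. The function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of per-item score lists, each covering the same respondents."""
    k = len(items)
    # Total scale score for each respondent (sum across items).
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(scores) for scores in items)
    return (k / (k - 1)) * (1 - item_variance / variance(totals))

# Two hypothetical items answered by four respondents.
print(cronbach_alpha([[1, 2, 3, 4], [1, 3, 2, 4]]))
```

Values near 1 suggest the items move together; the calculation is undefined when total scores do not vary at all.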

Internal Validity The extent to which the causes of an effect are established by an inquiry.

Internal Resource An agency's or organization's resources, including staff skills and experience and any information already available through current program activities.

Internal Validity Threat Factors other than program participation that may affect the results or findings. FOR EXAMPLE, changes in the data collection instrument may influence the findings or a pre-test may influence responses to a post-test.

Interquartile Range A measure of spread; a statistic used with ordinal, interval, and ratio variables.
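
The interquartile range is the distance between the first and third quartiles. Several quartile conventions exist; the sketch below assumes the "median of each half" convention (the middle case is excluded when the number of cases is odd) and needs at least two cases. The function name is hypothetical.

```python
from statistics import median

def interquartile_range(values):
    """Third quartile minus first quartile, using medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]              # cases below the median position
    upper = data[len(data) - half:]  # cases above the median position
    return median(upper) - median(lower)

# Hypothetical sentence lengths in years.
print(interquartile_range([1, 2, 3, 4, 5, 6, 7, 8]))
```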

Interrater Reliability The extent to which two different researchers obtain the same result when using the same instrument to measure a concept.
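
One widely used summary of interrater reliability for categorical ratings is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The glossary does not prescribe this statistic; the sketch below assumes it, and the function name and ratings are hypothetical.

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters coding the same cases."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected if each rater assigned codes independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0
    return (observed - expected) / (1 - expected)

# Two hypothetical researchers coding four case files as "yes" or "no".
print(cohen_kappa(["yes", "yes", "no", "no"], ["yes", "yes", "no", "yes"]))
```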

Interrupted Time Series Design The interrupted time series design involves repeated measurement of an indicator (e.g., reported crime) over time, encompassing periods both prior to and after implementation of a program. The goal of such an analysis is to assess whether the treatment (or program) has "interrupted" or changed a pattern established prior to the program's implementation. However, the impact of alternate historical events may threaten the interpretation of the findings. FOR EXAMPLE, an interrupted time series study may collect quarterly arrest rates for drug related offenses in a given community for two years prior to and two years following the implementation of a drug enforcement task force. The analysis focuses on changes in patterns before and after the introduction of the program.
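
A rough way to look for an "interruption" is to fit a straight-line trend to the pre-program measurements, project it forward, and compare the projection with what was actually observed afterward. The sketch below assumes that simplified approach; it ignores seasonality, autocorrelation, and competing historical events, and the function names and figures are hypothetical.

```python
def fit_line(y):
    """Least-squares line through (0, y[0]), (1, y[1]), ...; returns (slope, intercept)."""
    n = len(y)
    x_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxy = sum((x - x_mean) * (v - y_mean) for x, v in enumerate(y))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

def interruption_effect(pre, post):
    """Average gap between observed post-program values and the projected pre-program trend."""
    slope, intercept = fit_line(pre)
    projected = [intercept + slope * (len(pre) + i) for i in range(len(post))]
    return sum(o - p for o, p in zip(post, projected)) / len(post)

# Hypothetical quarterly arrest rates before and after a task force begins.
print(interruption_effect([10, 12, 14, 16], [12, 12, 12, 12]))
```

A negative value means the observed series fell below where the earlier trend was heading.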

Interval Estimate General term for an estimate of a population parameter that is a range of numerical values.
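
A familiar interval estimate is a confidence interval for a population mean. The sketch below assumes a normal approximation (sample mean plus or minus a z multiple of the standard error); with small samples a t-based interval is usually preferred. The function name and data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval estimate for the population mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(sample) / len(sample) ** 0.5
    centre = mean(sample)
    return centre - half_width, centre + half_width

# Hypothetical number of prior arrests for a sample of eight clients.
low, high = mean_confidence_interval([2, 4, 4, 4, 5, 5, 7, 9])
print(low, high)
```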

Interval Measure A quantitative measure with equal intervals between categories, but with no absolute zero. FOR EXAMPLE, IQ scores.

Interval Scale A measurement scale that measures quantitative differences between values of a variable, with equal distances between the values.

Interval Variable A quantitative variable the attributes of which are ordered and for which the numerical differences between adjacent attributes are interpreted as equal. FOR EXAMPLE, intelligence scores.

Intervening Variable A variable that causally links other variables to each other. In a causal model, this intermediate variable must be influenced by one variable in order for a subsequent variable to be influenced. FOR EXAMPLE, it may be expected that a vocational program will change an offender's employment status which will subsequently reduce recidivism. Participation in the vocational program would be the independent variable, employment status - the intervening variable, and rearrest - the dependent variable.

Interviews Interviews involve face-to-face situations or telephone contacts in which the researcher orally solicits responses.


Judgment Sample A sample selected by using discretionary criteria rather than criteria based on the laws of probability.

Judgmental Forecasting Judgmental forecasting attempts to elicit and synthesize informed judgments and is often based on arguments from insight.


Kaldor-Hicks Criterion A criterion of equity which states that one social state is better than another if there is a net gain in efficiency and if those that gain can compensate the losers.

Kendall's tau A measure of association used to correlate two ordinal scales.
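
Kendall's tau compares every pair of cases and asks whether the two rankings order the pair the same way (concordant) or the opposite way (discordant). The sketch below assumes the simplest variant, tau-a, which makes no adjustment for ties; the function name is hypothetical.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    pairs = len(x) * (len(x) - 1) / 2
    return (concordant - discordant) / pairs

# Hypothetical rankings of three programs by two reviewers.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```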

Kenneth Arrow's Impossibility Theorem A theorem to demonstrate that it is impossible to aggregate individual preferences through majority voting without violating one or more of five reasonable conditions of democratic decision-making.

Known-group Validation A procedure for validating an instrument which involves testing on a group for which the results are already known. FOR EXAMPLE, testing a self-report instrument on a group of known offenders.

Kurtosis A term used to describe how peaked or flat a curve is relative to the normal curve.
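
Peakedness relative to the normal curve is commonly quantified as excess kurtosis: the fourth central moment divided by the squared variance, minus 3 (the value for a normal curve). The sketch below assumes that population-moment formula; the function name is hypothetical.

```python
def excess_kurtosis(values):
    """Positive: more peaked / heavier tails than normal; negative: flatter."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((v - centre) ** 2 for v in values) / n  # second central moment (variance)
    m4 = sum((v - centre) ** 4 for v in values) / n  # fourth central moment
    return m4 / (m2 * m2) - 3

print(excess_kurtosis([1, 2, 3, 4, 5]))
```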


Lambda A measure of association; a statistic used with nominal variables.
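
Lambda is usually computed as a proportional reduction in error: how much better the dependent variable's category can be guessed once the independent variable's category is known. The sketch below assumes the asymmetric Goodman and Kruskal form; the table layout, function name, and counts are hypothetical.

```python
def goodman_kruskal_lambda(table):
    """table[independent_category][dependent_category] = number of cases."""
    totals = {}
    for row in table.values():
        for category, count in row.items():
            totals[category] = totals.get(category, 0) + count
    n = sum(totals.values())
    # Errors when always guessing the overall modal category.
    errors_without = n - max(totals.values())
    # Errors when guessing the modal category within each independent category.
    errors_with = sum(sum(row.values()) - max(row.values()) for row in table.values())
    if errors_without == 0:
        return 0.0
    return (errors_without - errors_with) / errors_without

# Hypothetical counts: program type (rows) by outcome (columns).
table = {"a": {"x": 8, "y": 2}, "b": {"x": 3, "y": 7}}
print(goodman_kruskal_lambda(table))
```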

Level of Measurement Refers to the four levels of variables and their empirical attributes - nominal, ordinal, interval, and ratio.

Likert Scale A type of composite measure using standardized response categories in survey questionnaires. Typically a range of questions using response categories such as strongly agree, agree, disagree, and strongly disagree are utilized to construct a composite measure.

Logic Model Describes how a program should work, presents the planned activities for the program, and focuses on anticipated outcomes. While logic models present a theory about the expected program outcome, they do not demonstrate whether the program caused the observed outcome. Diagrams or pictures that illustrate the logical relationship among key program elements through a sequence of "if-then" statements are often used when presenting logic models.

Longitudinal Data Sometimes called "time series data," observations collected over a period of time; the sample (instances or cases) may or may not be the same each time but the population remains constant. FOR EXAMPLE, quarterly arrest rates for drug-related offenses in a given city for a period of two years.

Longitudinal Study The study of the same group over a period of time. These generally are used in studies of change.


Management The guidance and control of action required to execute a program. Also, the individuals charged with the responsibility of conducting a program.

Management Information System An information collection and analysis system, usually computerized, that facilitates access to program and participant information. It is usually designed and used for administrative purposes.

Marginal Distribution The distribution of a single variable based upon an underlying distribution of two or more variables.

Matching A method utilized to create comparison groups, in which groups or individuals are matched to those in the treatment group based on characteristics felt to be relevant to program outcomes.

Matrix of Categories A method of displaying relationships among themes in analyzing case study data that shows whether changes in categories or degrees along one dimension are associated with changes in the categories of another dimension.

Maturation Effect A threat to the internal validity of an evaluation in which observed outcomes are a result of natural changes of the program participants over time rather than because of program impact. FOR EXAMPLE, age cohorts generally mature and/or change crime commission tendencies over time. This may naturally alter crime commission patterns independent of program participation.

Mean A measure of central tendency, the arithmetic average; a statistic used primarily with interval-ratio variables following symmetrical distributions. FOR EXAMPLE, the average age or average height of a group of middle school students.

Measurement A procedure for assigning a number to an object or an event.

Measurement Error The difference between a measured value and a true value.

Measures of Association Statistics that indicate the strength and nature of a relationship between variables.

Measures of Central Tendency Measures that indicate the midpoint or central value of a distribution. These measures include the mean, median, and mode. FOR EXAMPLE, one measure of central tendency of a group of high school students is the average (mean) age of the students.
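
The three measures named above can be computed with Python's standard `statistics` module. The ages below are hypothetical.

```python
from statistics import mean, median, mode

ages = [14, 15, 15, 16, 18]  # hypothetical ages of five students

average_age = mean(ages)      # arithmetic average
middle_age = median(ages)     # value of the middle case in the ordered list
most_common_age = mode(ages)  # most frequently occurring value

print(average_age, middle_age, most_common_age)
```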

Median A measure of central tendency, the value of the case marking the midpoint of an ordered list of values of all cases; a statistic used primarily with ordinal variables and asymmetrically distributed interval-ratio variables.

Meta-analysis The systematic analysis of a set of existing evaluations of similar programs in order to draw general conclusions, develop support for hypotheses, and/or produce an estimate of overall program effects.

Methodology The way in which information is found or something is done. The methodology includes the methods, procedures, and techniques used to collect and analyze information. FOR EXAMPLE, questionnaires or use of secondary information.

Mission The part of a goal or endeavor assigned as a specific responsibility of a particular organizational unit. It includes the task, together with the purpose, which clearly indicates the action to be taken and the reasons.

Mode A measure of central tendency, the value of a variable that occurs most frequently; a statistic used primarily with nominal variables.

Monitoring An on-going process of reviewing a program's activities to determine whether set standards or requirements are being met.

Monitoring System An on-going system to collect data on a program's activities and outputs, designed to provide feedback on whether the program is fulfilling its functions, addressing the targeted population, and/or producing those services intended. FOR EXAMPLE, a computerized intake system may be utilized which captures client characteristics, and subsequently provides monthly reports on the numbers of clients processed and receiving services.

Mortality Threat A threat to the internal validity of an evaluation caused by participants withdrawing or dropping out prior to program conclusion. FOR EXAMPLE, the impact of the success of a drug awareness program is difficult to assess in a school with a high attrition rate.

Multivariate Analysis An analysis of the relationships between more than two variables.


N Number of cases. Uppercase "N" refers to the number of cases in the population. Lower case "n" refers to the number of cases in the sample.

Nominal Variable A categorical variable whose attributes have no inherent order. FOR EXAMPLE, "sex" or "race."

Nonequivalent Comparison Group Design Evaluation designs that use nonrandomized comparison groups to evaluate program effects. Also referred to as quasi-experimental designs.

Non-experimental Data Data not produced by an experiment or quasi-experiment. FOR EXAMPLE, the data may be administrative records or the results of a survey.

Nonindex Crimes Part 2 crimes under the Uniform Crime Reporting System. Twenty-two crimes are included, such as simple assault, vandalism, gambling, drunkenness, and the like. (See Index Crimes.)

Non-probability Sample A sample not produced by a random process. FOR EXAMPLE, it may be a sample based upon an evaluator's judgment about which cases to select.

Non-respondent A person who fails to answer either a questionnaire or a question.

Non-response Bias The bias created by the failure of part of a sample to respond to a survey or answer a question. If those responding and those not responding have different characteristics, the responding cases may not be representative of the population from which they were sampled.

Normal Distribution (Curve) A theoretical distribution that is closely approximated by many actual distributions of variables.

Normative Question A type of evaluation question requiring comparison between what is happening (the condition) to norms and expectations or standards for what should be happening (the criterion).

Null Hypothesis A hypothesis stating that two variables are not related. Research attempts to disprove the null hypothesis by finding evidence of a relationship.


Objective Specific results or effects of a program's activities that must be achieved in pursuing the program's ultimate goals. FOR EXAMPLE, a treatment program may expect to change offender attitudes (objective) in order to ultimately reduce recidivism (goal).

Observation A data collection strategy in which the activities of subjects are visually examined. The observer attempts to keep his/her presence from interfering in or influencing any behaviors. FOR EXAMPLE, watching an interrogation through a one-way mirror or collecting information on arrest techniques by "riding along" involve observation.

Observational Techniques Data collection strategies which use observation of subjects as a means to collect data. These techniques generally involve attempts by the observer to not alter or change the behavior being observed. FOR EXAMPLE, collecting data on cases or courtroom procedures by watching, and recording, courtroom activity is an observational technique.

One-group Designs Research designs which study a single program with no comparison or control group.

One-shot Case Study The one-shot case study involves the measurement of an identified "outcome" after a treatment or program has been implemented. However, there are no measures taken or available for comparison (i.e., status before the program, or outcome of a comparison or control group). Without a comparison measure, there is no means for inferring that the "outcome" was actually influenced by the treatment or program.

Open-ended Interview An interview in which, after an initial or lead question, subsequent questions are determined by topics brought up by the person being interviewed; the concerns discussed, their sequence, and specific information obtained are not predetermined and the discussion is unconstrained, able to move in unexpected directions.

Open-ended Question A question that does not have a set of possible answers from which to make a selection but permits the respondent to answer in essay form. On a questionnaire, the respondent would write an essay or short answer or fill in a blank. During an interview, the respondent would give the interviewer an unstructured, narrative answer. The interviewer would record the response verbatim or select salient features. If a structured interview were used, a question might appear to be open-ended to the interviewee but could be "closed down" by the interviewer, who would have a set of alternative answers to check.

Operational Definition Detailed description of how a concept or variable will be measured and how values will be assigned. FOR EXAMPLE, one operational definition of prior criminal behavior may involve reported arrests for felony offenses based on an FBI fingerprint search, while another operational definition may involve self-reported criminal history obtained by response to a short list of questions on a standardized questionnaire.

Operationalization A process of describing constructs or variables in concrete terms so that measurements can be made. FOR EXAMPLE, one process for operationalizing prior criminal behavior may involve reported arrests for felony offenses based on an FBI fingerprint search, while another process may involve self-reported criminal history obtained by response to a short list of questions on a standardized questionnaire.

Operationalize To define a concept in a way that can be measured. In evaluation research, to translate program inputs, outputs, objectives, and goals into specific measurable variables. FOR EXAMPLE, one way to operationalize prior criminal behavior may involve only reported arrests for felony offenses based on an FBI fingerprint search, while another means to operationalize may involve self-reported criminal history obtained by response to a short list of questions on a standardized questionnaire.

Operational Plan A tactical statement of when and what critical milestones must be passed to attain objectives programmed for a specific period.

Ordinal Scale Data Data classified into exhaustive, mutually exclusive, and ordered or ranked categories. FOR EXAMPLE, a typical ordinal scale may involve responses of "very good," "good," "satisfactory," "poor," and "very poor."

Ordinal Variable A quantitative variable whose attributes are ordered but for which the numerical differences between adjacent attributes are not necessarily interpreted as equal. FOR EXAMPLE, amount of school completed - (1)elementary school, (2)middle school, (3)high school, (4)college.

Outcome Evaluation An evaluation used by management to identify the results of a program's effort. It seeks to answer management's question, "What difference did the program make?" It provides management with a statement about the net effects of a program after a specified period of operation. This type of evaluation provides management with knowledge about: (1) the extent to which the problems and needs that gave rise to the program still exist, (2) ways to ameliorate adverse impacts and enhance desirable impacts, and (3) program design adjustments that may be indicated for the future.

Outcome The results of program operations or activities. FOR EXAMPLE, anticipated outcomes of DARE programs may include increased knowledge about drugs and alcohol, changed attitudes about drugs and alcohol, and reduced involvement with drugs and alcohol.

Outlier Instances that are aberrant or do not fit with other instances: instances that, compared to other members of a population, are at the extremes on relevant dimensions. FOR EXAMPLE, while sentences for most criminal offenders may involve between one and twenty years, extreme cases may involve sentences (multiple consecutive sentences) of 300 years or more.

Output Immediate measures of what the program did. FOR EXAMPLE, the output of a drug enforcement team may include the amount of marijuana shipments seized, the number of drug rings investigated, and the number of drug arrests made.

Outside Evaluator An evaluator not affiliated with the agency prior to the program evaluation. Also known as third-party evaluator.


Panel Data A special form of longitudinal data in which observations are collected on the same sample of respondents over a period of time.

Panel Interviewing Conducting repeated interviews with the same group of respondents over time.

Parameter A number that describes a population. FOR EXAMPLE, percent of males in the population.

Participant A resident, family, complex, neighborhood, or community receiving or participating in services provided by the program. Also known as client or target population group.

Participant Observation A research method involving direct participation of the researcher in the events being studied. The researcher may either reveal or hide the true reason for involvement.

Pearson Product-Moment Correlation Coefficient A measure of association; a statistic used with interval-ratio variables.
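
The coefficient is the covariation of two variables divided by the product of their spreads, giving a value between -1 and 1. A minimal sketch, with a hypothetical function name and data:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two interval-ratio variables."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
    sxx = sum((a - x_mean) ** 2 for a in x)
    syy = sum((b - y_mean) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired observations.
print(pearson_r([1, 2, 3, 4], [1, 3, 2, 4]))
```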

Peer Review An assessment of a product conducted by a person or persons of similar expertise to the author.

Performance Evaluation An evaluation that compares actual performance with that planned in terms of both resource utilization and production. It is used by management to redirect program efforts and resources and to redesign the program structure.

Performance Measurement Involves ongoing data collection to determine if a program is implementing activities and achieving objectives. It measures inputs, outputs, and outcomes over time. In general, pre-post comparisons are used to assess change.

Performance Measures Ways to objectively measure the degree of success a program has had in achieving its stated objectives, goals, and planned program activities. FOR EXAMPLE, number of clients served, attitude change, and rates of rearrest may all be performance measures.

Pilot A pretest or trial run of a program, evaluation instrument, or sampling procedure for the purpose of correcting any problems before it is implemented or used on a larger scale.

Pilot Test Preliminary test or study of the program or evaluation activities to try out procedures and make any needed changes or adjustments. FOR EXAMPLE, an agency may pilot test new data collection instruments that were developed for the evaluation.

Planning The process of anticipating future occurrences and problems, exploring their probable impact, and detailing policies, goals, objectives, and strategies to solve the problems. This often includes preparing options documents, considering alternatives, and issuing final plans.

Point Biserial Coefficient A measure of association between an interval-ratio variable and a nominal variable with two attributes.
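
One standard way to compute the coefficient is the difference between the two group means, divided by the population standard deviation of all scores, multiplied by the square root of p(1 - p), where p is the share of cases in one group. The sketch below assumes that formula and a 0/1 coding of the nominal variable; the function name is hypothetical.

```python
from statistics import mean, pstdev

def point_biserial(groups, scores):
    """groups: 0/1 codes for the two-attribute variable; scores: interval-ratio values."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)
    return (mean(ones) - mean(zeros)) / pstdev(scores) * (p * (1 - p)) ** 0.5

# Hypothetical data: program participation (1 = yes) and a test score.
print(point_biserial([0, 0, 1, 1], [1, 2, 3, 4]))
```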

Point Estimate An estimate of a population parameter that is a single numerical value. FOR EXAMPLE, the percent of males in the population.

Policy A governing principle pertaining to goals, objectives, and/or activities. It is a decision on an issue not resolved on the basis of facts and logic only. FOR EXAMPLE, the policy of expediting drug cases in the courts might be adopted as a basis for reducing the average number of days from arraignment to disposition.

Policy Analysis An analysis used to help managers understand the extent of the problem or need that exists and to set realistic goals and objectives in response to such problem or need. It may be used to compare actual program activities with the program's legally established purposes in order to ensure legal compliance.

Population The total number of individuals or objects being analyzed or evaluated.

Posttest A test or measurement taken after services or activities have ended. It is compared with the results of a pretest to show evidence of the effects or changes resulting from the services or activities being evaluated.

Precision The exactness of a question's wording or the amount of random error in an estimate.

Pretest A test or measurement taken before services or activities begin. It is compared with the results of a posttest to show evidence of the effects of the services or activities being evaluated. A pretest can be used to obtain baseline data.

Primary Data Data collected by the researcher specifically for the research project. FOR EXAMPLE, a survey of program participants undertaken by the researcher involves the collection of primary data, while a subsequent review of the program's case files involves the use of secondary data.

Probability Distribution A distribution of a variable that expresses the probability that particular attributes or ranges of attributes will be, or have been observed.

Probability Sample A group of cases selected from a population by a random process. Every member of the population has a known, nonzero probability of being selected.

Probability Sampling A method for drawing a sample from a population such that all possible samples have a known and specified probability of being drawn.

Probe To examine a subject in an interview in depth, using several questions.

Problem Statement A problem statement should describe the problem, describe its causes, and identify potential approaches or solutions to the problem through the use of literature reviews. In program evaluation, inclusion of a problem statement as part of the model provides an opportunity for the importance of a program to be conveyed. A detailed description of the problem and who is affected can provide a baseline for comparison purposes and a greater understanding of who has benefited from program services.

Process The programmed, sequenced set of things actually done to carry out a program mission.

Process Evaluation Process evaluation focuses on how a program was implemented and operates. It identifies the procedures undertaken and the decisions made in developing the program. It describes how the program operates, the services it delivers, and the functions it carries out. Like monitoring evaluation, process evaluation addresses whether the program was implemented and is providing services as intended. However, by additionally documenting the program's development and operation, it allows an assessment of the reasons for successful or unsuccessful performance, and provides information for potential replication.

Productivity The relationship between production of an output and one, some, or all of the resource inputs used in accomplishing the assigned task. It is measured as a ratio of output per unit of input over time. It is a measure of efficiency and is usually considered as output per person-hour.

Program A major endeavor authorized and funded to achieve a significant purpose, defined in terms of the principal actions/activities required. It may cross organizational lines.

Program Activities Activities, services, or functions carried out by the program (i.e., what the program does). FOR EXAMPLE, treatment programs may screen clients at intake, complete placement assessments, provide counseling to clients, etc.

Program Analysis The analysis of options in relation to goals and objectives, strategies, procedures, and resources by comparing alternatives for proposed and ongoing programs. It embraces the processes involved in program planning and program evaluation.

Program Effectiveness Evaluation The application of scientific research methods to estimate how much observed results, intended or not, are caused by program activities. Effect is linked to cause by design and analyses that compare observed results with estimates of what might have been observed in the absence of the program.

Program Failure A program shortcoming in which the outcome criteria are not affected by participation of the subjects in the program (i.e., the program does not accomplish its objective). FOR EXAMPLE, a prison alternative which has no impact on recidivism rates.

Program Implementation Objective What is planned to be done in the program, components, or services. FOR EXAMPLE, providing security patrols in five buildings three times each evening is a program implementation objective.

Program Justification The narrative and related analyses and statistical presentations supporting a program budget request. It includes: (1) definitions of program objectives, including a rationale for how the proposed program is expected to help solve the problem and the magnitude of the need, (2) plans for achieving the objectives, and (3) the derivation of the requested appropriation in terms of outputs or workloads showing productivity trends and the distribution of funds among organizational units.

Program Model A flowchart or model which identifies the objectives and goals of a program, as well as their relationship to program activities intended to achieve these outcomes.

Public Program Program conducted by a federal, state, or local governmental agency.

Purposive Sample Instances appropriately selected to answer different evaluation questions, on various systematic bases, such as best or worst practices; a judgmental sample. If conducted systematically, can be widely useful in evaluation.


Qualitative Analysis An analysis that ascertains the nature of the attributes, behavior, or opinions of the entity being measured. FOR EXAMPLE, in describing a person, a qualitative analysis might conclude that the person is tall, thin, and middle-aged.

Qualitative Data Information that is difficult to measure, count, or express in numerical terms. FOR EXAMPLE, how safe a resident feels in his or her apartment is qualitative data.

Qualitative Research Research involving detailed, verbal descriptions of characteristics, cases, and settings. Qualitative research typically uses observation, interviewing, and document review to collect data.

Quantify To attach numbers to an observation.

Quality Control A procedure for keeping quality of inputs or outputs to specifications.

Quantitative Data Information that can be expressed in numerical terms, counted, or compared on a scale. FOR EXAMPLE, the number of 911 calls received in a month.

Quantitative Analysis An analysis that ascertains the magnitude, amount, or size, for example, of the attributes, behavior, or opinions of the entity being measured. FOR EXAMPLE, in describing a population, a quantitative analysis might conclude that the average person is 5 feet 11 inches tall, weighs 180 pounds, and is 45 years old.

Quantitative Research Research that examines phenomena through the numerical representation of observations and statistical analysis.

Quasi-Experimental Design A research design with some, but not all, of the characteristics of an experimental design. While comparison groups are available and maximum controls are used to minimize threats to validity, random selection is typically not possible or practical.

Questionnaire A printed form containing a set of questions for gathering information.

Quota Sampling A nonprobability stratified sampling procedure in which units are selected for the sample to adhere to certain proportions of characteristics desired. FOR EXAMPLE, an interviewer is instructed to interview 100 individuals in a mall (half male and half female). If the interviewer obtains 50 female interviews first, only males will be interviewed until that quota is also met.
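
The mall example can be sketched as a loop that accepts each passer-by only while the quota for that person's group is still open. The function name, the data layout, and the names below are hypothetical.

```python
def quota_sample(stream, quotas, key):
    """Accept cases from stream in arrival order until every quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in stream:
        group = key(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas met; stop interviewing
    return selected

# Hypothetical shoppers in the order they walk past the interviewer.
shoppers = [("Ann", "female"), ("Bea", "female"), ("Cat", "female"),
            ("Dan", "male"), ("Eve", "female"), ("Fay", "male"), ("Gus", "male")]
sample = quota_sample(shoppers, {"female": 2, "male": 2}, key=lambda p: p[1])
print([name for name, _ in sample])
```

Because selection depends on who happens to walk by first, the result is not a probability sample even though the proportions match the quotas.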


Random Assignment The assignment of individuals in the pool of all potential participants to either the experimental (treatment) group or the control group in such a manner that their assignment to a group is determined entirely by chance.
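
A minimal sketch of chance-based assignment: shuffle the pool of potential participants and split it in half. The function name is hypothetical, and the optional seed is included only so the illustration can be repeated.

```python
import random

def random_assignment(participants, seed=None):
    """Return (treatment_group, control_group), split entirely by chance."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Ten hypothetical participant IDs.
treatment, control = random_assignment(range(10), seed=1)
print(treatment, control)
```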

Random Comparison Group Design In this research design, the comparison group is randomly selected from the population of interest, even though the treatment group is not selected randomly.

Randomized Comparative Change Design In the experimental design known as the randomized comparative change design, a treatment group and a control group are randomly selected for study. Both groups are administered a pretest. The treatment group is given the treatment, while the control group is not. Both groups are tested or measured after the treatment. The test results of the two groups are compared. The pretest allows a check on the randomization process, and allows for control of any differences found.

Randomized Comparative Post-Test Design In the experimental design known as the randomized comparative post-test design, a treatment group and a control group are randomly selected for study. The treatment group is given the treatment, while the control group is not. Both groups are tested or measured after the treatment. The test results of the two groups are compared.

Randomized Controlled Trial In a randomized controlled trial, the impact of a program is determined by randomly assigning individuals to an intervention group or control group.

Random Digit Dialing Rather than selecting names and numbers of individuals to be called, computers are used to generate random sets of seven-digit numbers, which are then called as the survey sample.
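
The generation step might look like the Python sketch below. The function name `random_digit_numbers` is hypothetical, and the sketch draws all seven digits uniformly; survey organizations typically restrict the leading digits to working exchanges, a refinement that is omitted here.

```python
import random


def random_digit_numbers(count, seed=None):
    """Generate `count` random seven-digit strings, keeping leading zeros."""
    rng = random.Random(seed)  # seed only for a reproducible illustration
    return [f"{rng.randrange(10**7):07d}" for _ in range(count)]


numbers = random_digit_numbers(5, seed=42)
```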

Random Sampling A procedure for sampling from a population that gives each unit in the population a known probability of being selected into the sample.

Range A measure of spread which gives the distance between the lowest and the highest values in a distribution; a statistic used primarily with interval-ratio variables. FOR EXAMPLE, a study may report that ages in the sample ranged from 21 to 65 years.

Ratio Measure A level of measurement which has all the attributes of nominal, ordinal, and interval measures, and is based on a "true zero" point. As a result, the difference between two values or cases may be expressed as a ratio. FOR EXAMPLE, it may be reported that person A weighed twice as much as person B, because weight is typically measured using a ratio measure (i.e., pounds).

Recidivism The repetition of criminal or delinquent behavior.

Regression Analysis A method for determining the association between a dependent variable and one or more independent variables.

Regression Coefficient An asymmetric measure of association; a statistic computed as part of a regression analysis.
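
As an illustration, the Python sketch below fits a least-squares line with one independent variable and returns the regression coefficient (slope) and the intercept. The function name and the data (counseling sessions and test scores) are invented for the example.

```python
def simple_regression(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


sessions = [1, 2, 3, 4, 5]    # hypothetical independent variable
scores = [3, 5, 7, 9, 11]     # hypothetical dependent variable

slope, intercept = simple_regression(sessions, scores)   # slope 2.0, intercept 1.0
reverse_slope, _ = simple_regression(scores, sessions)   # slope 0.5
```

Swapping the roles of the two variables changes the coefficient from 2.0 to 0.5, which is what makes the measure asymmetric.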

Regression Discontinuity Design In this research design, subjects are placed into treatment and control groups based on a score obtained on some variable. Those scoring higher on the assignment variable are placed into one group, while those scoring lower are placed in the other group.

Regression Effects The tendency of subjects, who are initially selected due to extreme scores, to have subsequent scores move inward toward the mean. Also known as statistical regression/regression to the mean/regression fallacy. FOR EXAMPLE, students with the highest grades in a midterm exam are more likely to have scores closer to the mean at the final. This effect may be misinterpreted in evaluation research as being a result of the program.
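
The midterm example can be simulated in Python. The model is an assumption made for illustration: each exam score is a stable ability plus independent random noise, and no program or treatment is applied at any point. All names and parameter values are hypothetical.

```python
import random

rng = random.Random(0)  # fixed seed so the illustration is reproducible

# Assumed model: exam score = stable ability + independent noise.
ability = [rng.gauss(70, 10) for _ in range(10_000)]
midterm = [a + rng.gauss(0, 10) for a in ability]
final = [a + rng.gauss(0, 10) for a in ability]

# Select the students with the 1,000 highest midterm scores.
cutoff = sorted(midterm)[-1000]
top = [i for i, score in enumerate(midterm) if score >= cutoff]

top_midterm_mean = sum(midterm[i] for i in top) / len(top)
top_final_mean = sum(final[i] for i in top) / len(top)
overall_final_mean = sum(final) / len(final)

print(top_midterm_mean, top_final_mean, overall_final_mean)
```

The selected group's final-exam mean falls between its midterm mean and the overall mean even though nothing was done to the students, which is the pattern an evaluation could mistake for a program effect.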

Regression Fallacy The tendency of subjects, who are initially selected due to extreme scores, to have subsequent scores move inward toward the mean. Also known as statistical regression/regression to the mean/regression effect. FOR EXAMPLE, students with the highest grades in a midterm exam are more likely to have scores closer to the mean at the final. This effect may be misinterpreted in evaluation research as being a result of the program.

Reliability The extent to which a measurement instrument yields consistent, stable, and uniform results over repeated observations or measurements under the same conditions each time. FOR EXAMPLE, a scale is unreliable if it weighs a child three times in three minutes and gets three different weights.

Reliability Assessment An effort required to demonstrate the repeatability of a measurement, or how likely a question is to yield consistently similar results. It differs from verification (checking accuracy) and from validity assessment.

Replication The duplication of an experiment or program.

Representative Reflecting the characteristics or nature of the larger population to which one wants to generalize.

Representative Sample A sample that has approximately the same distribution of characteristics as the population from which it was drawn.

Request For Proposal An open solicitation to potential grantees or contractors inviting them to compete for money available to evaluate programs.

Research Design A plan of what data to gather, from whom, how and when to collect the data, and how to analyze the data obtained.

Resistant Statistic A statistic that is not much influenced by changes in a few observations.
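
A short Python illustration with made-up data: the median is a resistant statistic, while the mean is not.

```python
from statistics import mean, median

values = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 500]   # one observation changed

print(mean(values), median(values))               # 3 3
print(mean(with_outlier), median(with_outlier))   # 102 3
```

Changing a single observation moves the mean from 3 to 102 but leaves the median at 3.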

Resources Assets available and anticipated for operations. They include people, equipment, facilities and other things used to plan, implement, and evaluate public programs whether or not paid for directly by public funds.

Response Rate The percentage of persons in a sample who respond to a survey.

Response Style The tendency of a respondent to answer in a specific way regardless of how a question is asked. FOR EXAMPLE, some persons may be more likely to use extreme categories, such as "very good" or "excellent", while others may shy away from use of such extremes.

Response Variable A variable on which information is collected and in which there is an interest because of its direct policy relevance. FOR EXAMPLE, in studying policies for retraining displaced workers, employment rate might be the response variable.


Sample A subset of the population. Elements are selected intentionally as a representation of the population being studied.

Sample Design The sampling procedure used to produce any type of sample.

Sampling Distribution The distribution of a statistic across all possible samples of a given size drawn from the same population.

Sampling Error The maximum expected difference between a probability sample value and the true value.

Sampling Frame The complete list of the universe or population of interest in the study. FOR EXAMPLE, all persons living in a given area, or all offenders eligible for a given treatment.

Scale An aggregate measure that assigns a value to a case based on a pattern obtained from a group of related measures.

Scientific Sample Synonymous with Probability Sample. A group of cases selected from a population by a random process. Every member of the population has a known, nonzero probability of being selected.

Scoping Analyzing alternative ways for conducting an evaluation. It is clarifying the validity of issues, the complexity of the assignment, the users of final reports, and the selection of team members to meet the needs of an evaluation. Scoping ends when a major go/no-go decision is made about whether to do the evaluation.

Secondary Data Data that has been collected for another purpose, but may be reanalyzed in a subsequent study. FOR EXAMPLE, state criminal history files may be searched both to analyze prior criminal history of offenders in treatment programs and to identify subsequent recidivism. However, such data was not originally collected for such purposes.

Selection Bias Potential biases introduced into a study by the selection of different types of people into treatment and comparison groups. As a result, the outcome differences may potentially be explained as a result of pre-existing differences between the groups, as opposed to the treatment itself.

Selection Effects 1) Selection bias is a threat to the internal validity of an evaluation when the researcher chooses non-equivalent groups for comparison. FOR EXAMPLE, when the recidivism rate of a program tested on first time offenders is compared to the recidivism rate of the general prison population.

2) Selection bias is a threat to the external validity of an evaluation if the study group is not representative of the larger population to which results are intended to be inferred. FOR EXAMPLE, a program may appear successful using a group of specially selected clients (e.g., first time offenders). However, it would not be a fair test of how this program would work on the general offender population.

Self-evaluation The evaluation of a program by those conducting the program.

Self-Reported Data Information that program participants generate themselves that is used to assess program processes or outcomes.

Significance Level The probability of rejecting a set of assumptions when they are in fact true.

Simple Random Sample A method for drawing a sample from a population such that all samples of a given size have equal probability of being drawn.

Sleeper Effect An impact of a study that does not appear immediately, but may manifest at a later time.

Spread General term for the extent of variation among cases.

Spuriousness A condition in which two variables vary together, but are not in fact causally related. Both may be influenced independently by a third variable. FOR EXAMPLE, it may be found that children who eat more ice cream are less likely to be involved in delinquent behavior. Rather than concluding that "ice cream consumption" reduces "delinquent behavior," it may be found that both behaviors are a function of a third variable, "income."

Staffing Personnel required for a program or a project.

Standard A criterion for evaluating performance and results. It may be a quantity or quality of output to be produced, a rule of conduct to be observed, a model of operation to be adhered to, or a degree of progress toward a goal.

Standard Deviation A measure of the spread, the square root of the variance; a statistic used with interval-ratio variables.
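
The relationship between variance and standard deviation can be sketched in Python. The example assumes the population form, which divides by N (sample estimates commonly divide by N - 1 instead). The function names and data are hypothetical.

```python
import math


def population_variance(values):
    """Mean squared distance of the values from their mean."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def population_sd(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(population_variance(values))


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(data))  # 4.0
print(population_sd(data))        # 2.0
```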

Standard Instruments An assessment, inventory, questionnaire, or interview that has been tested with a large number of individuals and is designed to be administered to program participants in a consistent manner. Results of tests with program participants can be compared to reported results of the tests used with other populations.

Standardized Question A question that is designed to be asked or read and interpreted in the same way regardless of the number and variety of interviewers and respondents.

Statistic A number computed from data on one or more variables.

Statistical Analysis Analyzing collected data for the purposes of summarizing information to make it more usable and/or making generalizations about a population based on a sample drawn from that population.

Statistical Conclusion Validity The extent to which the observed statistical significance (or the lack of statistical significance) of the covariation between two or more variables is based on a valid statistical test of that covariation.

Statistical Control A statistical technique used to eliminate variance in dependent variables caused by extraneous sources. In evaluation research, statistical controls are often used to control for possible variation due to selection bias by adjusting data for program and control group on relevant characteristics.

Statistical Procedure A set of standards and rules based in statistical theory by which one can describe and evaluate what has occurred.

Statistical Sample Synonymous with probability sample; a group of cases selected from a population by a random process in which every member of the population has a known, nonzero probability of being selected.

Statistical Significance The degree to which a value is greater or smaller than would be expected by chance. Typically, a relationship is considered statistically significant when the probability of obtaining that result by chance is less than 5% if there were, in fact, no relationship in the population.

Statistical Test Type of statistical procedure that is applied to data to determine whether the results are statistically significant (that is, the outcome is not likely to have resulted by chance alone).

Statistical Weighting A technique used to assure representation of certain groups in the sample. Data for underrepresented cases are weighted to compensate for their small numbers, making the sample a better representation of the underlying population.
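
One common way to compute such weights is to divide each group's share of the population by its share of the sample. The Python sketch below assumes the population shares are known. The function name, group labels, and counts are invented for the example.

```python
def group_weights(sample_counts, population_shares):
    """Weight for each group = population share / sample share."""
    n = sum(sample_counts.values())
    return {
        group: population_shares[group] / (sample_counts[group] / n)
        for group in sample_counts
    }


sample_counts = {"male": 30, "female": 70}          # hypothetical sample
population_shares = {"male": 0.5, "female": 0.5}    # assumed known

weights = group_weights(sample_counts, population_shares)
weighted_counts = {g: sample_counts[g] * weights[g] for g in sample_counts}
```

The underrepresented group (males) receives a weight above 1 and the overrepresented group a weight below 1, so the weighted counts match the population proportions.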

Stem The statement portion of a question.

Stem-and-Leaf Plot A graphic or numerical display of the distribution of a variable.
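
A minimal Python sketch of a stem-and-leaf display, assuming non-negative whole numbers with the tens digit as the stem and the units digit as the leaf. The function name and the ages are hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return one text line per stem: tens digit | units digits."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(stems.items())
    ]


ages = [21, 23, 34, 35, 35, 47, 52, 65]
for line in stem_and_leaf(ages):
    print(line)
# 2 | 1 3
# 3 | 4 5 5
# 4 | 7
# 5 | 2
# 6 | 5
```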

Strategic Evaluation An evaluation used by managers as an aid to decide which strategy a program should adopt in order to accomplish its goals and objectives at a minimum cost. In addition, strategic evaluation might include alternative specifications of the program design itself, detailing milestone and flow networks, manpower specifications, progress objectives, and budget allocations.

Strategic Plan The process of comprehensive, integrative program planning that considers, at a minimum, the future of current decisions, overall policy, organizational development, and links to operational plans.

Stratified Random Sampling A sampling procedure for which the population is first divided into strata or subgroups based on designated criteria and then the sample is drawn, either proportionately or disproportionately, from each subgroup.
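
The proportionate case can be sketched in Python: the same sampling fraction is applied within every stratum. The function name, the strata, and the 10% fraction are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, seed=None):
    """Draw the same fraction at random from each stratum (proportionate)."""
    rng = random.Random(seed)  # seed only for a reproducible illustration
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        k = round(len(members) * fraction)
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical population: 80 adult and 20 juvenile cases.
units = [("adult", i) for i in range(80)] + [("juvenile", i) for i in range(20)]
sample = stratified_sample(units, stratum_of=lambda u: u[0], fraction=0.1, seed=7)
```

A disproportionate design would use a different fraction for each stratum, for example to oversample a small subgroup.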

Structural Equation Modeling A method for determining the extent to which data on a set of variables are consistent with hypotheses about causal association among the variables.

Structured Interview An interview in which questions to be asked, their sequence, and detailed information to be gathered are all predetermined; used where maximum consistency across interviews and interviewees is needed.

Summative Evaluation A type of outcome evaluation that assesses the results or outcomes of a program. This type of evaluation is concerned with a program's overall effectiveness.

Supplementary Variable A variable upon which information is collected because of its potential relationship to a response variable.

Survey The collection of information from a common group through interviews or the application of questionnaires to a representative sample of that group.

Surveys Data collection techniques designed to collect standard information from a large number of subjects. Surveys may include polls, mailed questionnaires, telephone interviews, or face-to-face interviews.

Symmetric Measure of Association A measure of association that does not make a distinction between independent and dependent variables.

Systematic Sample A sample drawn by taking every nth case from a list, after starting with a randomly selected case among the first n individuals.
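
In Python, the procedure reduces to a random start followed by a fixed step. The function name `systematic_sample` and the case list are hypothetical.

```python
import random


def systematic_sample(cases, n, seed=None):
    """Take every nth case after a random start among the first n cases."""
    rng = random.Random(seed)  # seed only for a reproducible illustration
    start = rng.randrange(n)   # index 0 .. n-1
    return cases[start::n]


case_list = list(range(1, 101))   # 100 hypothetical case numbers
picked = systematic_sample(case_list, n=10, seed=3)
```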


Target An objective (constraint or expected result) set by management to communicate program purpose to operating personnel (for example, maintaining a monthly output level).

Target Population The population, clients, or subjects intended to be identified and served by the program. FOR EXAMPLE, a boot camp program may identify, as its target population, 18-20 year old first-time violent offenders.

Telescoping The tendency of respondents (particularly in victim surveys) to move events forward in time, reporting as having occurred within the reference period events that actually took place before it. FOR EXAMPLE, a respondent who is asked whether she was the victim of a robbery in the last year recalls and reports an incident that actually occurred 18 months prior.

Testing Bias Bias and foreknowledge introduced to participants as a result of a pretest. The experience of the first test may impact subsequent reactions to the treatment or to retesting.

Test-retest Administration of the same test instrument twice to the same population for the purpose of assuring consistency of measurement.

Theory Failure A program shortcoming in which the intermediate program effects succeed as planned but the outcome criteria remain unchanged.

Time-series Designs Research designs that collect data over long time intervals - before, during, and after program implementation. This allows for the analysis of change in key factors over time.

Transformed Variable A variable for which the attribute values have been systematically changed for the sake of data analysis.

Treatment Group The subjects of the intervention being studied.

Treatment Variable An independent variable in program evaluation that is of particular interest because it corresponds to a program's intent to change some dependent variable. FOR EXAMPLE, number of sessions with the case counselor or participation in training programs.

Trend The change in a series of data over a period of years that remains after the data have been adjusted to remove seasonal and cyclical fluctuations.

Triangulation The combination of methodologies in the study of the same phenomenon or construct; a method of establishing the accuracy of information by comparing three or more types of independent points of view on data sources (for example, interviews, observation, and documentation; different times) bearing on the same findings. Akin to corroboration and an essential methodological feature of case studies.


Uniform Crime Reports Standard information maintained by the U.S. Department of Justice on crime statistics as reported by participating police departments. The UCR includes the number of offenses reported and arrests made for major categories of crime.

Unit of Analysis The class of elemental units that constitute the population and the units selected for measurement; also, the class of elemental units to which the measurements are generalized.

Univariate Analysis An analysis of a single variable.

Unobtrusive Measures Any method of data collection in which the subjects are not aware that they are being studied. FOR EXAMPLE, physical traces, observation, analysis of existing data, and archives.

Usability Evaluation Assesses the degree to which a product or item can be operated by its users, the efficiency of the product/item and/or satisfaction with the product or item by the users.


Validity The extent to which a measurement instrument or test accurately measures what it is supposed to measure. FOR EXAMPLE, a reading test is a valid measure of reading skills, but it is not a valid measure of total language competency.

Validity Assessment The procedures necessary to demonstrate that a question or questions are measuring the concepts that they were designed to measure.

Variable Variables can be classified into three categories:

  1. Independent (input, manipulated, treatment, or stimulus) variables, so called because they are "independent" of the outcome; instead, they are presumed to cause, affect, or influence the outcome.
  2. Dependent (output, outcome, response) variables, so called because they are "dependent" on the independent variable; the outcome presumably depends on how these input variables are managed or manipulated.
  3. Control (background, classificatory, or organismic) variables, so called because they need to be controlled, held constant, or randomized so that their effects are neutralized, canceled out, or equated for all conditions. Typically included are such factors as age, sex, IQ, SES (socioeconomic status), educational level, and motivational level; it is often possible to redefine these particular examples as either independent or dependent variables, according to the intent of the research.
A fourth category having to do with conceptual states within the organism is often cited: intervening variables (higher order constructs). These cannot be directly observed or measured and are hypothetical conceptions intended to explain processes between the stimulus and the response. Such concepts as learning, intelligence, perception, motivation, need, self, personality trait, and feeling illustrate this category.

Variance A measure of the spread of the values in a distribution. The larger the variance, the larger the distance of the individual cases from the group mean.

Verification An effort to test the accuracy of the questionnaire response data. The concern is uniquely with data accuracy and deals with neither the reliability nor the validity of measures.

Victimization Information Data collected from the victims of crime concerning offenses of which they were the victims, offender characteristics, and/or victim characteristics.


Weighting The assignment of different adjustment factors to data in order to take into account the relative importance of that data.


χ² (Chi-Square) Chi-square measures the statistical significance of a relationship between variables, if one exists.
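
As an illustration, the Python sketch below computes the Pearson chi-square statistic for a table of counts by comparing each observed count with the count expected if the two variables were unrelated. The function name and the counts are invented. The critical value 3.841 applies to one degree of freedom at the 5% level.

```python
def chi_square(table):
    """Pearson chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2 x 2 table: rows = program / comparison,
# columns = reoffended / did not reoffend.
table = [[10, 20], [20, 10]]
stat = chi_square(table)

CRITICAL_05_DF1 = 3.841   # 5% critical value, 1 degree of freedom
significant = stat > CRITICAL_05_DF1
```

For this table every expected count is 15, so the statistic is 4 x (25 / 15), roughly 6.67, which exceeds the critical value.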


Yoked Concurrent with. FOR EXAMPLE, data collection and analyses in case studies are iterative and concurrent - that is, are yoked.

Yule's Q (gamma) A special case of a measure of association (gamma) used with ordinal variables that can only be used in 2 x 2 tables.
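
For a 2 x 2 table with cells a and b in the first row and c and d in the second, Q = (ad - bc) / (ad + bc). A short Python sketch with invented counts:

```python
def yules_q(a, b, c, d):
    """Yule's Q for a 2 x 2 table with rows (a, b) and (c, d)."""
    return (a * d - b * c) / (a * d + b * c)


q = yules_q(10, 20, 20, 10)
print(q)  # -0.6
```

Q ranges from -1 to +1, with 0 indicating no association.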


Z Scores Standard deviation units measuring the deviation from the mean relative to the standard deviation.
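
A minimal Python sketch of the conversion, z = (value - mean) / standard deviation, using the population standard deviation. The function name and data are hypothetical.

```python
import math


def z_scores(values):
    """Express each value as its distance from the mean in standard deviation units."""
    m = sum(values) / len(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / len(values))
    return [(v - m) / sd for v in values]


data = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, standard deviation 2
print(z_scores(data))
# [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```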


Copyright ©2000-2019 Geigle Safety Group, Inc. All rights reserved. Federal copyright prohibits unauthorized reproduction by any means without permission. Disclaimer: This material is for training purposes only to inform the reader of occupational safety and health best practices and general compliance requirement and is not a substitute for provisions of the OSH Act of 1970 or any governmental regulatory agency. CertiSafety is a division of Geigle Safety Group, Inc., and is not connected or affiliated with the U.S. Department of Labor (DOL), or the Occupational Safety and Health Administration (OSHA).