American Sociological Association

Social Psychology Quarterly - Add Health

Design Features of Add Health

Kathleen Mullan Harris*
Carolina Population Center 
University of North Carolina at Chapel Hill 

October 2007 

Abstract:  This document was prepared as a ready reference for the design features of the National Longitudinal Study of Adolescent Health (Add Health).  It provides a summary of features incorporated in the first ten years of completed work on the project.  It also provides a description of the field work planned for a fourth wave of home interviews to be conducted in the years 2007-08 on the same sample.  The reader is referred to our web site for additional information, and for availability of data.  Parts of this document may be incorporated into other documents for grant applications, papers for publication, or public presentations, without further permission.  This document will be useful for those planning to use existing Add Health data.

*Kathleen Mullan Harris, PhD, is Principal Investigator and Director of Add Health.  Address correspondence to Add Health, Carolina Population Center, CB# 8120, University Square, 123 W. Franklin St., Chapel Hill, NC 27516.



The National Longitudinal Study of Adolescent Health (Add Health) is a longitudinal study of a nationally representative sample of adolescents in grades 7-12 in the United States in 1994-95 who have been followed through adolescence and the transition to adulthood with three in-home interviews.  Add Health was developed in response to a mandate from the U.S. Congress to fund a study of adolescent health and was designed by a nation-wide team of multidisciplinary investigators from the social, behavioral, and health sciences.  The original purpose of the research program was to help explain the causes of adolescent health and health behavior with special emphasis on the effects of multiple contexts of adolescent life.  Add Health has become a national data resource for over 3,500 Add Health researchers who have obtained more than 300 independently funded research grants and have produced over 1,000 research articles published in multiple disciplinary journals and research outlets. 

This document describes the design features of Add Health, for data already collected in Waves I, II, and III, and to be collected in the Wave IV follow up.  We encourage researchers and students of public health, human development, biomedical sciences, and related fields to explore the possibilities in this rich dataset. 


Schools as Primary Sampling Units

Add Health used a school-based design.  The primary sampling frame was derived from the Quality Education Database (QED).  From this frame we selected a stratified sample of 80 high schools (defined as schools with an 11th grade and more than 30 students) with probability proportional to size.  Schools were stratified by region, urbanicity, school type (public, private, parochial), ethnic mix, and size.  For each high school selected, we identified and recruited one of its feeder schools (typically a middle school) with probability proportional to its student contribution to the high school, yielding one school pair in each of 80 different communities.  More than 70 percent of the originally selected schools agreed to participate in the study.  Replacement schools were selected within each stratum until an eligible school or school-pair was found.  Overall, 79 percent of the schools that we contacted agreed to participate in the study.  Because some schools spanned grades 7 to 12, we have 132 schools in our sample, each associated with one of 80 communities.  School size varied from fewer than 100 students to more than 3,000 students.  Our communities were located in urban, suburban, and rural areas of the country. 
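
Selection with probability proportional to size can be illustrated with a minimal sketch.  The code below uses systematic PPS selection, which is one common way to implement it; the text above does not say which PPS procedure Add Health used, so the function name, the enrollment figures, and the procedure itself are assumptions for illustration only.  Sorting the frame by stratum before selection is one way to add implicit stratification, which this sketch omits.

```python
import random

def pps_systematic_sample(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    Systematic PPS: lay the units end to end on a line of length sum(sizes),
    pick a random start in [0, step), then take every step-th point.
    A unit larger than the step can be hit more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step
    points = [start + k * step for k in range(n)]

    selected = []
    cumulative = 0.0
    idx = 0
    for point in points:
        # Advance to the unit whose size interval contains this point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

# Hypothetical enrollments for eight schools in a sampling frame.
enrollments = [100, 500, 1500, 3000, 250, 800, 1200, 650]
picks = pps_systematic_sample(enrollments, 3, random.Random(1))
```

Larger schools occupy a longer stretch of the line and are therefore more likely to contain a selection point, which is the defining property of PPS selection.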

From September 1994 until April 1995, in-school questionnaires were administered to students in these schools.  Each school administration occurred on a single day within one 45- to 60-minute class period.  Add Health completed in-school questionnaires from over 90,000 students.  The in-school questionnaire provided measurement on the school context, friendship networks, school activities, future expectations, and a variety of health conditions.  An additional purpose of the school questionnaire was to identify and select special supplementary samples of individuals in rare but theoretically crucial categories.  School administrators also completed a 30-minute questionnaire in the first and second waves of the study.


Core and Special Supplemental Samples

Add Health obtained rosters of all enrolled students in each school.  From the union of students on school rosters and students not on rosters who completed in-school questionnaires, we chose a sample of adolescents for a 90-minute in-home interview constituting the Wave I in-home sample.  To form a core sample, we stratified students in each school by grade and sex and randomly chose about 17 students from each stratum to yield a total of approximately 200 adolescents from each pair of schools.  (Students who did not participate in the in-school survey were eligible to be selected for participation in this sample.)  The core in-home sample is essentially self-weighting, and provides a nationally representative sample of 12,105 American adolescents in grades 7 to 12.  From answers provided on the in-school survey, we drew supplemental samples based on ethnicity (Cuban, Puerto Rican, and Chinese), genetic relatedness to siblings (twins, full sibs, half sibs, and unrelated adolescents living in the same household), adoption status, and disability.  We also oversampled black adolescents with highly educated parents.  The numbers of adolescents in each of these special samples who were interviewed at Wave I are shown in Table 1 below.  Note that individuals can be assigned to more than one group.  For example, a Cuban twin pair would be counted as two of the 538 Cubans and one of the 767 twin pairs in Wave I. 

Table 1.  Case Counts in Add Health Wave I Data
(Notes: overlaps not removed; * numbers represent pairs of adolescents.)

    Sample                      Count
    Core Sample                12,105
    Cuban                         538
    Puerto Rican                  633
    Chinese                       406
    High-Education Black        1,547
    Disabled                      957
    Full Sibling*               1,251
    Half Sibling*                 442
    Non-related*                  662
    Adopted                       560
    Twin*                         784
    Saturated Sample            3,702
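
The arithmetic behind the core sample can be checked with a short sketch: six grades by two sexes gives 12 strata, and about 17 students per stratum gives 12 x 17 = 204, in line with the approximately 200 adolescents per school pair.  The roster layout, field names, and identifiers below are hypothetical and are not Add Health variables.

```python
import random
from collections import defaultdict

def draw_core_sample(roster, per_stratum, rng):
    """Randomly draw up to per_stratum students from each grade-by-sex stratum.

    roster: list of dicts with hypothetical keys "id", "grade", "sex".
    Returns a dict mapping (grade, sex) -> list of selected ids.
    """
    strata = defaultdict(list)
    for student in roster:
        strata[(student["grade"], student["sex"])].append(student["id"])

    sample = {}
    for key, ids in sorted(strata.items()):
        # Small strata contribute everyone they have.
        k = min(per_stratum, len(ids))
        sample[key] = rng.sample(ids, k)
    return sample

# Synthetic school pair: 6 grades x 2 sexes, 50 students per stratum.
roster = [
    {"id": f"{grade}{sex}{i:03d}", "grade": grade, "sex": sex}
    for grade in range(7, 13)
    for sex in ("F", "M")
    for i in range(50)
]
core = draw_core_sample(roster, per_stratum=17, rng=random.Random(42))
total_selected = sum(len(ids) for ids in core.values())
```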

For two large schools and 14 small schools, interviews with all enrolled students were also attempted in Wave I.  The two large schools were selected purposefully.  One is predominantly white and is located in a small town; the other is characterized by marked ethnic heterogeneity and is located in a major metropolitan area.  The 14 smaller schools are located in rural and urban areas, and both public and private schools are represented.  We collected complete social network data in the saturated field settings, providing unbiased and complete coverage of the social networks and romantic partnerships in which adolescents are embedded by generating a large number of romantic and friendship pairs for which both members of the pair have in-home interviews.  The core sample plus the special samples produced a sample size of 20,745 adolescents in Wave I.  Below we show the Add Health study design for selecting the in-school and in-home Wave I samples.

[Figure: Add Health study design for selecting the in-school and in-home Wave I samples]

The Wave I in-home sample is the basis for all subsequent longitudinal follow-up interviews, and thus this innovative design remains a major strength of the longitudinal data as well.  

Seventy-nine percent of all sampled students in all of the groups participated in Wave I of the in-home phase of the survey (20,745).  A parent, usually the resident mother, also completed a 30-minute op-scan interviewer-assisted interview.  Over 85 percent of the parents of participating adolescents completed the parental interview in the first wave.  The parent questionnaire gathered data on such topics as heritable health conditions, marriage and marriage-like relationships, involvement in volunteer, civic, or school activities, health-related behaviors, education, employment, household income and economic assistance, parent-adolescent communication and interaction, the parent's familiarity with the adolescent's friends and friends' parents, and neighborhood characteristics. 

In 1996, all adolescents in grades 7 through 11 in Wave I (plus 12th graders who were part of the genetic sample and the adopted sample) were followed up one year later for the Wave II in-home interview (N=14,738).  We conducted the adolescent in-home interviews using audio-CASI technology (audio-computer assisted self interview) on laptop computers for sensitive health status and health-risk behavior questions. Add Health was the first national study to use ACASI technology in an adolescent population.  The use of ACASI and CASI techniques has been found to enhance the quality of self-reporting of sensitive and illegal information (Turner et al. 1998). 

The school and Waves I and II in-home interviews constitute the adolescent period in Add Health and contain unique data about family context, school context, peer networks, spatial networks, and genetic pairs.  The social context data, in particular, are unusual because we did not rely on self-reports to generate an image of an adolescent's world.  Family context data come from parent questionnaires, from adolescent in-school and in-home questionnaires, and from interviews with additional adolescents living in the same household.  For certain measures, such as parenting behaviors and parent-child relations, we have reports from both the child's and the parent's perspective (and in some cases from a sibling as well). 

Contextual Data

School context data come from questionnaires completed by school administrators (usually principals), who reported on school policies, the provision of health services, and other school characteristics.  In addition, school context variables can be constructed by aggregating student responses from the in-school and in-home questionnaires, enabling researchers to describe schools with respect to their social demographic composition, the behaviors of their students, the health status of their students, and the attitudes of their students towards school. 
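
Aggregating student responses into a school-level variable can be sketched as follows.  This is a minimal illustration under assumed data: the field names school_id and likes_school are hypothetical and do not correspond to actual Add Health variable names.

```python
from collections import defaultdict

def school_means(students, item):
    """Average an item over the students in each school, skipping missing values."""
    totals = defaultdict(lambda: [0.0, 0])  # school -> [running sum, count]
    for s in students:
        value = s.get(item)
        if value is None:
            continue
        totals[s["school_id"]][0] += value
        totals[s["school_id"]][1] += 1
    return {school: total / count for school, (total, count) in totals.items()}

# Hypothetical student records with one attitude item.
students = [
    {"school_id": "S1", "likes_school": 4},
    {"school_id": "S1", "likes_school": 2},
    {"school_id": "S2", "likes_school": 5},
    {"school_id": "S2", "likes_school": None},
]
context = school_means(students, "likes_school")
```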

Peer network data were obtained in the in-school questionnaire.  Adolescents nominated their five best male and five best female friends from the school roster (using a unique id).  Because nominated school friends also took the in-school questionnaire, characteristics of respondents' peer networks can be constructed by linking friends' data from the in-school questionnaire and constructing variables based on friends' actual responses.  In the in-home Wave I and Wave II interviews, respondents nominated their best friend, as well as their romantic and sexual partners.  If their friend or partner is also a member of the in-home sample, their data can be linked to construct friendship and partner contexts.  In the 16 schools that were part of the “saturated” sample, all students in the school were also interviewed in the home.  Complete friendship and sexual networks can therefore be constructed with these data. 
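
The linking step described here, building a peer variable from friends' own answers rather than from the respondent's report about them, can be sketched as below.  The ids, the item name smokes, and the data layout are hypothetical; they illustrate the idea only and are not the Add Health file structure.

```python
def friend_mean(nominations, responses, respondent_id, item):
    """Mean of an item across a respondent's nominated friends who have data.

    nominations: dict mapping respondent id -> list of nominated friend ids.
    responses:   dict mapping id -> dict of questionnaire items.
    Returns None when no nominated friend answered the item.
    """
    values = [
        responses[fid][item]
        for fid in nominations.get(respondent_id, [])
        if fid in responses and responses[fid].get(item) is not None
    ]
    if not values:
        return None
    return sum(values) / len(values)

# "Z" was nominated but did not complete a questionnaire, so it is skipped.
nominations = {"A": ["B", "C", "Z"], "B": ["A"], "C": []}
responses = {
    "A": {"smokes": 0},
    "B": {"smokes": 1},
    "C": {"smokes": 0},
}
share_of_friends_smoking = friend_mean(nominations, responses, "A", "smokes")
```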

Spatial data indicating the exact location of all households in the survey were collected using hand-held Global Positioning System (GPS) devices or recording actual addresses.  These data make possible the interweaving of spatial and social networks, and the construction of community contexts.  More than 2,500 attributes for community and neighborhood contexts at multiple spatial units of observation have been obtained and merged with the Wave I and Wave II survey data to describe the neighborhood and community contexts in which adolescents are embedded.  Neighborhood and community data were gathered from a variety of sources, such as the U.S. Census, the Centers for Disease Control and Prevention, the National Center for Health Statistics, the Federal Bureau of Investigation, and the National Council of Churches. 

Finally, the “genetic pairs data,” based on more than 3,000 pairs of adolescents who have varying degrees of genetic relatedness (see Table 1), represent a fully articulated behavioral genetic design, and are unprecedented for a national study of this magnitude.  These data represent pairs of adolescents who took the exact same questionnaires, share the same home environment, and share, in most cases, the same school and neighborhood environment.  Thus, from the outset, Add Health was designed to address biological contributions to health by permitting researchers to explore gene-environment interactions in relation to health and behavioral outcomes.  In addition, the embedded genetic design created new opportunities for research on adolescents living in relatively rare family structures, such as blended step-families and surrogate-parent families, and research on adopted children with a remarkable sample of 560 adopted children in Wave I of Add Health.  In all follow-up interviews, high priority has been placed on locating and reinterviewing pairs in the genetic sample to maintain the integrity of this sample for longitudinal research purposes.  

The Transition to Adulthood: Wave III

With NICHD funding for a continuation of the program project, Add Health conducted a Wave III follow-up interview with original Wave I respondents as they entered the transition to adulthood.  When adolescents finish high school, they enjoy greater independence and begin to explore new lifestyles.  As a result, their social contexts change and their experiences broaden. Wave III data capture these expanding experiences by focusing on the multiple domains of young adult life that individuals enter during the transition to adulthood, and their well-being in these domains: labor market, higher education, relationships, parenting, civic participation, and community involvement. With the longitudinal data from adolescence, this third wave of in-home interviews allows researchers to map early trajectories out of adolescence in health, achievement, social relationships, and economic status and to document how adolescent experiences and behaviors are related to decisions, behavior, and health outcomes in the transition to adulthood.  The fundamental purpose of this third follow-up was to understand how what happens in adolescence is linked to what happens in the transition to adulthood when adolescents begin to negotiate the social world on their own and develop their expectations and goals for their future adult roles.

Wave III data collection was conducted nationwide (including Hawaii and Alaska) between August 2001 and April 2002.  Respondents were now aged 18-26 and in the midst of the transition to adulthood.  Add Health completed interviews with 15,170 respondents at Wave III, resulting in a 76% response rate.  (For more details, see the Wave III nonresponse report.)  In the interest of confidentiality, no paper questionnaires were used.  As in earlier waves, data were recorded on laptop computers.  For less sensitive material, the interviewer read the questions and entered the respondent's answers.  For more sensitive material, the respondent entered his or her own answers in privacy.  The average length of a complete interview was 134 minutes.  The laptop interview took approximately 90 minutes and was immediately followed by the collection of biological specimens.  Most interviews were conducted in respondents' homes. 

In Wave III we continued to collect data on health and health-related behavior that were measured at earlier waves, including repeated measures of diet, physical activity, access and use of health services, sexual behavior, contraception, sexually transmitted infections, pregnancy and childbearing, suicidal intentions and thoughts, mental health and depression, substance use and abuse, injury, delinquency, and violence.  We again obtained physical measurements of height and weight, and collected data on pubertal development, chronic and disabling conditions, and other forms of morbidity.  Wave III contains new data specific to the late adolescent and young adult life stage on parent-child and sibling relations, contact with friends from high school, the role of mentors and mentoring relationships, personal income, wealth and debt, civic and political participation, children and parenting, involvement with the criminal justice system, and religion and spirituality.  Extensive data were collected on relationships, including a complete history since Wave I and measures of relationship intimacy, quality, commitment, shared activities, length, exclusivity, and sexual, union, and fertility behaviors.  Codebooks for all three waves of Add Health instruments can be downloaded from the Add Health website.

Wave III Design Features

We incorporated several new design features into Wave III that tapped research topics particularly salient in the late adolescent-early adulthood life stage. 

  • Binge Sample:  A special sub-sample of freshmen and sophomores in 2- and 4-year colleges was chosen, along with a control group of non-college same-age peers, and administered additional questions about binge drinking for an independently funded R01 grant on that topic.
  • Biological Specimens:  New data collection of biological specimens was included in Wave III.  At the end of the Wave III interview, urine and saliva samples were collected for tests of HIV and curable STDs.  Additional saliva was also collected from a subsample of the genetic sample (full sibs and twins) for DNA extraction.  (For more information, see the report on Wave III Biomarker Data Collection.)  Several publications using these data have reported STD and HIV prevalence rates found in our national sample at Wave III (Miller et al. 2004, 2005; Morris et al. 2006).
  • Couples Sample:  Wave III contains a new “couples sample” in which Add Health respondents recruited their romantic partners to take the same Wave III interview as their Add Health partner.  The couples sample is made up of slightly over 1,500 pairs of partners, with roughly 500 married couples, 500 cohabiting couples, and 500 dating couples.

For more details on the Add Health research design, design documents, available data sets, codebooks, and publications, please see Harris et al., 2003. 

Additional Add Health Wave III Data

There are two independently funded projects that will contribute additional data to Add Health.  Researchers at the University of Texas have collected the high school transcripts of Add Health Wave III sample members in coordination with the Wave III fieldwork.  This study, Adolescent Health and Academic Achievement (AHAA), collected high school transcripts and other data from all Add Health high schools except two special education schools that did not maintain students' academic transcripts, and from approximately 1,400 additional schools where Add Health respondents last attended high school.  Approximately 91% of Wave III respondents signed a valid transcript release form, and high school transcripts were collected for most respondents (N = approximately 12,000).  The transcripts were coded using procedures designed for the National Educational Longitudinal Study (NELS) and the National Assessment of Educational Progress (NAEP).  The transcript data provide detailed information on students' grades, courses taken, the overarching academic structure of each student's school, and information on the last school each student attended.  Add Health has merged these education data and releases them as part of Add Health.  For more information on the construction of the education data, please see

Add Health researchers Barry Popkin and Penny Gordon-Larsen at UNC have received funding to develop a database of time-varying modifiable physical activity-related environmental factors for Waves I, II, and III.  Using a multidisciplinary approach that blends spatial analysis methodologies with traditional epidemiological methods, they are linking area-level data to the individual data in Add Health, adding community-level measures such as recreation facilities (public, private), transportation options, crime, land use, air pollution, walkability, climate, and cost of living.  The environmental data come from the US Geological Survey, US Census, US Department of Labor Statistics, and other extant sources.  Add Health is in the process of making these additional environmental data publicly available on a flow basis beginning in 2008. 


A fourth in-home interview will be fielded with original Wave I respondents in 2007-08 when they are aged 24-32, representing a Wave IV follow-up of a nationally representative sample of over 20,000 adolescents in 1994-95.  The scientific purpose of Wave IV is to study developmental and health trajectories across the life course of adolescence into young adulthood using an integrative approach that combines social, behavioral, and biomedical sciences in its research objectives, design, data collection, and analysis. We have designed Add Health Wave IV in the vision of the NIH Roadmap by integrating biological information into our models of health and human development and by stimulating interdisciplinary research teams that aim to reshape biomedical research to improve people's health (Zerhouni 2003).  Add Health participants will be ages 24-32 and settling into young adulthood at Wave IV.  At the same time that the Add Health cohort assumes adult roles and responsibilities, they develop crucial health habits and lifestyle choices that set pathways for their future adult health and well-being.   

We will collect longitudinal survey data on the social, economic, psychological, and health circumstances of our respondents, longitudinal geographic data, and new biological data to capture the prevailing health concerns of our Add Health cohort as well as biological markers of future chronic health conditions and disease. When Wave IV data are combined with existing longitudinal Add Health data over 10 years of our respondents' lives beginning in adolescence and extending through their transition to adulthood, Add Health will provide unique opportunities to study linkages in social, behavioral, and biological processes that lead to health and achievement outcomes in young adulthood.   

Several features of the Wave IV data collection represent new directions in Add Health, including methods to obtain more objective measures of health status and exploring alternative modes of reinterviewing our sample in the future.  Wave IV will employ innovations in the collection of biological measures in a field setting on a large national sample that are both practical and ground-breaking.  For example, we plan to collect DNA on everyone in our national sample and obtain indicators of metabolic syndrome and immune functioning using noninvasive procedures.  The combination of longitudinal social, behavioral, and environmental data with new biological data will expand the breadth of research questions that can be addressed in Add Health regarding predisease pathways, gene-environment interactions, the relationship between personal ties and health, factors that contribute to resilience and wellness, and environmental sources of health disparities. 

Once Wave IV data collection is complete at the end of 2008, information on available data will be posted on the Add Health website.  Data will be released in late summer 2009, though biological data may be released later than that to allow the labs to conclude their work.  Data from Wave IV will contain an AID that will enable researchers to link Wave IV data with that collected in Waves I, II, and III.  A public-use dataset will be created containing survey data for the same people who are in the current Add Health public-use dataset.  The contractual dataset will contain survey data for all respondents and will be available under contractual agreement to qualified researchers.  The Add Health safeguards--(1) providing only half the core cases for public use to limit the risk of deductive disclosure, and (2) requiring researchers and their institutions to enter into a contract with CPC--will be continued for the Wave IV data. 
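
Linking waves through the respondent identifier can be sketched as a simple keyed join.  The AID values, wave labels, and variable names below are hypothetical placeholders and not the contents of the released files; respondents missing from a wave are kept and marked with None so that attrition stays visible.

```python
def link_waves(waves):
    """Join per-wave records on the respondent identifier (AID).

    waves: dict mapping wave label -> {AID: dict of that wave's variables}.
    Returns a dict mapping AID -> {wave label: variables, or None if absent}.
    """
    all_aids = set()
    for records in waves.values():
        all_aids.update(records)
    return {
        aid: {label: records.get(aid) for label, records in waves.items()}
        for aid in sorted(all_aids)
    }

# Hypothetical records: respondent 1002 was not interviewed in the later wave.
waves = {
    "wave1": {"1001": {"grade": 9}, "1002": {"grade": 11}},
    "wave4": {"1001": {"age": 27}},
}
linked = link_waves(waves)
```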


Miller, W.C., C.A. Ford, M. Morris, M.S. Handcock, J.L. Schmitz, M.M. Hobbs, M.S. Cohen, K.M. Harris, and J.R. Udry.  2004.  “The Prevalence of Chlamydial and Gonococcal Infection Among Young Adults in the United States.”  Journal of the American Medical Association 291(18):2229-2236.

Miller, W.C., H. Swygard, M.M. Hobbs, C.A. Ford, M. Morris, M.S. Handcock, J.L. Schmitz, M.S. Cohen, K.M. Harris, and J.R. Udry.  2005.  “The Prevalence of Trichomoniasis in Young Adults in the United States.”  Sexually Transmitted Diseases 32(10):593-598.

Morris, M., M.S. Handcock, W.C. Miller, C.A. Ford, J.L. Schmitz, M.M. Hobbs, M.S. Cohen, K.M. Harris, and J.R. Udry.  2006.  “Prevalence of HIV Infection Among Young Adults in the U.S.: Results from the Add Health Study.”  The American Journal of Public Health 96(6):1091-1097.

Zerhouni, E.  2003.  “The NIH Roadmap.”  Science 302:63, 64, 72.
