Modern Approaches to Longitudinal Data Analysis
Brent J. Small, PhD
University of South Florida, Tampa, FL
Moffitt Cancer Center, Tampa, FL

Wefel JS, Vardy J, Ahles T, et al: International Cognition and Cancer Task Force recommendations to harmonise studies of cognitive function in patients with cancer. The Lancet Oncology 12:703-708, 2011

These are Classic Issues
1. Identification of intraindividual change
2. Identification of interindividual differences in intraindividual change
3. Interrelationships in behavioral change
4. Causes of intraindividual change
5. Causes of interindividual differences in intraindividual change
Baltes PB, Nesselroade JR: History and rationale of longitudinal research, in Nesselroade JR, Baltes PB (eds): Longitudinal research in the study of behavior and development. New York, NY, Academic Press, 1979, pp 1-39

Tailored to Cancer and Cognition
1. Does cognitive performance change among persons with cancer?
2. Is there variability in rate of change in cognition across persons?
3. Are changes in different cognitive abilities related to one another?
4. Do certain events increase or decrease the rate of change?
5. Are there factors that predict why some people change more rapidly than others?
Baltes PB, Nesselroade JR: History and rationale of longitudinal research, in Nesselroade JR, Baltes PB (eds): Longitudinal research in the study of behavior and development. New York, NY, Academic Press, 1979, pp 1-39

Challenges of Longitudinal Data
• Attrition
• Practice effects
• Unbalanced research designs
• Heterogeneity of change

[Figure: Z-scores for Attention, Memory, Executive Functioning, Motor, and Total Score by time of measurement]
Jacobs SR, Small BJ, Booth-Jones M, et al: Changes in cognitive functioning in the year after hematopoietic stem cell transplantation. Cancer 110:1560-7, 2007

What is Your Model of Change?
• People change at the same rate
  – Repeated measures ANOVA is sufficient
• People change at different rates
  – Random effects models are necessary
• Many investigators use statistical analyses that do not match up with their view of changes in behavior

Outline
• Repeated Measures ANOVA (rANOVA)
  – Classic method, but may not be optimal
• Random Effects Models of Change
  – Dealing with attrition and practice effects
  – Growth Mixture Models
  – Latent Difference Score
• Data Harmonization Approaches
  – How can we use the data we have already collected?

rANOVA – Benefits
• Readily available and widely respected
• Of the 7 papers that examined differential change between groups, all used rANOVA

Traditional Roadblocks
Missing Data
• Classical rANOVA requires complete data on every occasion; cases with a missing wave are dropped listwise
• Mixed-model fits (e.g., lmer in R, tested with Anova) can use incomplete cases, at the cost of extra model assumptions
Sphericity Assumption
• Difficult to achieve with unbalanced designs
• This issue has largely been settled through the use of correction factors (e.g., ε)

• Methods
  – Studies on behavioral, systems and cognitive neuroscience from Nature, Science, Nature Neuroscience, Neuron, and The Journal of Neuroscience were examined.
  – Use of appropriate methods for longitudinal comparisons was examined
    • Group
    • Time
    • Group X Time
Nieuwenhuis S, Forstmann BU, Wagenmakers E-J: Erroneous analyses of interactions in neuroscience: a problem of significance. Nature Neuroscience 14:1105-1107, 2011

Results

Common Error
• Researchers contrast the significance levels of the two effects, rather than testing the difference between them directly (the Group X Time interaction).
• The claim of differential change may not be statistically valid
• May increase Type I error rate

rANOVA – Summary
• Readily available and easy to learn
• Reporting the group X time interaction is critical for establishing differential change
• But does it allow us to answer the questions that we are most interested in?

Random Effects Models (REM)
• Can flexibly model time effects
• Less affected by randomly missing data
• Use is increasing in the psychological and medical literature, but can still pose difficulties

Modeling Change over Time: An Overview
Building models at each of two levels of a hierarchy
• At level-1 (within person): model the individual change trajectory, which describes how each person's status depends on time
• At level-2 (between persons): model inter-individual differences in change, which describe how features of the change trajectories vary across people

[Figure: Processing Speed plotted against Time and against Age, with groups CA+ and CA-]

Level-1: $Y_{ij} = \pi_{0i} + \pi_{1i}(\mathrm{Time})_{ij} + \varepsilon_{ij}$
  – $\pi_{0i}$: intercept for person i ("initial status")
  – $\pi_{1i}$: slope for person i ("growth rate")
  – $\varepsilon_{ij}$: residuals for person i, one per occasion j
Level-2, for intercepts: $\pi_{0i} = \gamma_{00} + \gamma_{01}\mathrm{Group}_i + \zeta_{0i}$
Level-2, for slopes: $\pi_{1i} = \gamma_{10} + \gamma_{11}\mathrm{Group}_i + \zeta_{1i}$
© Singer & Willett, page 21

Practice Effects and REM
• Repeated test exposure may bias estimates of true longitudinal changes
  – Generalized practice effects
  – Material-specific practice effects
• Solutions
  – Alternate forms
  – Sequential research designs
  – Including practice as a predictor in REM

Changes in Cognition after HSCT
• Examined whether cognitive functioning improved after HSCT
• Evaluated whether changes were influenced by practice effects or participant attrition
Jacobs SR, Small BJ, Booth-Jones M, et al: Changes in cognitive functioning in the year after hematopoietic stem cell transplantation. Cancer 110:1560-7, 2007

Research Design
A sequential design was employed
[Table: assessments (X) for Groups A, B, and C at pre-HSCT, 6 months post, and 12 months post]

Results
[Figure: Memory Z-score by time of measurement (0, 6, 12 months) under four models: No Controls, Attrition, Practice, Attrition & Practice]

Conclusions
• Cognitive performance generally improved after HSCT
• Sequential research designs are effective at addressing practice effects
• However, these designs are very costly in terms of the number of participants

Growth Mixture Modeling

Objectives
• REM allow us to specify within-person change processes.
• Can we identify subgroups of patients based upon a pattern of fatigue scores following treatment?
• Can these subgroups be distinguished by certain demographic, clinical, and psychosocial variables?
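As a concrete picture of the within-person change processes that REM specify, the level-1 and level-2 equations from the overview can be simulated directly. The sketch below is illustrative only: every parameter value, the group coding, and the helper names (`simulate_person`, `ols_line`) are assumptions, not values from any study cited here. It also does not fit a random effects model; it recovers each person's slope by ordinary least squares, a simple two-stage stand-in, to show that the slopes vary across people and differ by group.

```python
import random
import statistics


def simulate_person(rng, group, times, g00=0.0, g01=-0.5, g10=0.2, g11=-0.1,
                    sd_int=0.5, sd_slope=0.15, sd_resid=0.2):
    """Generate one person's scores from the two-level growth model.

    All gamma values and standard deviations are made-up illustration values.
    """
    # Level-2: person-specific intercept and slope depend on group plus noise
    pi0 = g00 + g01 * group + rng.gauss(0, sd_int)
    pi1 = g10 + g11 * group + rng.gauss(0, sd_slope)
    # Level-1: status at each occasion is intercept + slope * time + residual
    return [pi0 + pi1 * t + rng.gauss(0, sd_resid) for t in times]


def ols_line(times, ys):
    """Return (intercept, slope) of the least-squares line for one person."""
    t_bar = statistics.fmean(times)
    y_bar = statistics.fmean(ys)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, ys))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


rng = random.Random(2011)   # fixed seed so the sketch is reproducible
times = [0, 1, 2, 3]        # four equally spaced occasions (assumed)
people = []
for i in range(400):
    group = i % 2           # hypothetical 0/1 group indicator
    ys = simulate_person(rng, group, times)
    intercept, slope = ols_line(times, ys)
    people.append((group, intercept, slope))

mean_slope = {}
sd_slope = {}
for g in (0, 1):
    slopes = [s for grp, _, s in people if grp == g]
    mean_slope[g] = statistics.fmean(slopes)
    sd_slope[g] = statistics.stdev(slopes)
    print(f"group {g}: mean slope {mean_slope[g]:.3f}, SD of slopes {sd_slope[g]:.3f}")
```

The spread of the per-person slopes within each group is what "people change at different rates" means in practice. A fitted random effects model would estimate that spread as a variance component rather than from separate per-person regressions.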
Growth Mixture Models
• Variant of random effects modeling
• A categorical latent variable is incorporated to specify sub-populations
• Used traditionally when the sub-populations are unknown

Traditional Latent Modeling
[Figure: individual subject trajectories around a single overall latent curve; a second panel marks Group A and Group B subjects around the same overall latent curve]

Growth Mixture Modeling
[Figure: Group A and Group B subject trajectories with the overall latent curve and separate Group A and Group B latent curves]

Participant Characteristics
Age (M ± SD)    55.3 ± 9.9
% CT+RT         42.5

Measurement Point    n      Fatigue Score
Baseline             245    3.1 ± 2.3
2 months             230    2.3 ± 2.1
4 months             214    2.2 ± 2.1
6 months             194    2.1 ± 2.1

Measures
• Composite score of 4 FSI items
  – Most fatigue
  – Least fatigue
  – Average level of fatigue
  – Current level of fatigue
• Scale 0 to 10
• Alpha: .92

Statistical Analyses
• Linear Decline
  – 1 Class
• Quadratic Decline
  – 1 Class
  – 2 Class
  – 3 Class

Fit of Statistical Models
Model               Free Parameters    -2LL       Δ-2LL     BIC        ΔBIC
Linear, 1 class     6                  1746.20    ---       3524.43    ---
L + Q, 1 class      10                 1730.94    15.26     3516.94    -8.49
L + Q, 2 classes    20                 1599.72    131.22    3309.55    -207.39
L + Q, 3 classes    30                 1586.62    13.10     3338.39    +28.84

Model Estimates
                   Growth Model
n                  246
Intercept          2.26 (.13)**
Linear Slope       -.14 (.02)**
Quadratic Slope    .05 (.01)*
*, p < .01; ** p < .001

Quadratic Mixture Models
[Figure: overall predicted fatigue score (axis 0 to 4.5) at End of Treatment, 2 Months, 4 Months, and 6 Months]

Model Estimates
                   Growth Model    Growth Mixture Model
                                   Class 1         Class 2
n                  246             80              166
Intercept          2.26 (.13)**    .58 (.08)**     2.95 (.16)**
Linear Slope       -.14 (.02)**    -.05 (.02)**    -.17 (.04)**
Quadratic Slope    .05 (.01)*      -.001 (.007)    .07 (.02)**
*, p < .01; ** p < .001

Quadratic Mixture Models
[Figure: predicted fatigue score (axis 0 to 4.5) at End of Treatment, 2 Months, 4 Months, and 6 Months for Overall-Predicted, Class 1 (n = 80), and Class 2 (n = 166)]

Predictors of Class Membership
• Demographic
  – Age, race, education, marital status
• Clinical
  – Treatment, disease stage, surgery type, menopausal status, hormone therapy, BMI, Charlson comorbidity
• Psychosocial
  – Fatigue catastrophizing, exercise

Multivariate LR

Summary
• Growth Mixture Models allow us to identify underlying homogeneity in heterogeneity of change
• May be useful to identify those whose cognition is most affected by the diagnosis of and treatment for cancer.

Latent Change Score Models
• Quite often we are interested in relating changes in one variable with changes in another

[Figure: T-scores over 0 to 12 years of follow-up. Panel a, Cognitive Performance: Verbal Speed, Episodic Memory, Semantic Memory. Panel b, Lifestyle Activities: Physical Activities, Social Activities, Cognitive Activities]

Participants
                      Total         Sample 1      Sample 2
Baseline n            952           446           506
Age (M ± SD)          68.6 ± 6.7    68.9 ± 5.8    68.3 ± 7.5
Gender (% Female)     63.4          59.6          66.8*
Education (M ± SD)    14.2 ± 3.1    13.4 ± 3.1    14.8 ± 3.1

N & Average Follow-up
          Total           Sample 1        Sample 2
          n     M Yrs     n     M Yrs     n     M Yrs
Wave 2    714   3.1       320   2.9       394   3.2
Wave 3    569   6.3       241   5.9       328   6.6
Wave 4    171   8.9       171   8.9       --    --
Wave 5    126   12.28     126   12.28     --    --

Cognitive Performance Measures
• Processing Speed
  – Lexical decision time
  – Semantic decision time
• Episodic Memory
  – Word recall
  – Story recall
• Semantic Memory
  – Fact recall
  – Vocabulary

Bivariate Latent Change Score Models
[Path diagram: bivariate latent change score model linking Lifestyle Activities (X) and Cognitive Performance (Y) across occasions T1, T2, T3, ...]
McArdle JJ, Hamagami F: Latent difference score structural models for linear dynamic analyses with incomplete longitudinal data, in Collins LM, Sayer AG (eds): New methods for the analysis of change. Washington, DC, American Psychological Association, 2001, pp 139-175

Testable Models
• No coupling
• Activities predicting cognitive performance
• Cognitive performance predicting activities
• Dual coupling
• Models control for age, gender, years of education, and self-reported health at baseline

No Coupling
[Path diagram: bivariate latent change score model, Lifestyle Activities and Cognitive Performance]

Lifestyle Activities Predicting Cognition
[Path diagram: bivariate latent change score model, Lifestyle Activities and Cognitive Performance]

Cognition Predicting Lifestyle Activities
[Path diagram: bivariate latent change score model, Lifestyle Activities and Cognitive Performance]

Dual Coupling
[Path diagram: bivariate latent change score model, Lifestyle Activities and Cognitive Performance]

Results

Summary
• Models suggested that cognitive activities buffer cognitive decline, but reverse causation was present
• Latent Change Score models allow us to examine how processes change together
• Leading and lagging relationships can be posed, where experimental manipulation may be difficult

Integrative Data Analysis

What is Integrative Data Analysis?
• Similar to a meta-analysis, but with raw data
• Information from multiple samples is combined in one data analysis
• There are numerous advantages, but also challenges to this method of data analysis

Advantages
• Increased frequencies of low base-rate outcomes
• Increased statistical power
• Replication
• Broader psychometric assessment of constructs
• Increased sample heterogeneity
Curran PJ, Hussong AM: Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods 14:81-100, 2009

Challenges
• Heterogeneity due to sampling
• Heterogeneity due to measurement
• Heterogeneity due to geographic region
• Heterogeneity due to study design
• Heterogeneity due to history
Curran PJ, Hussong AM: Integrative data analysis: The simultaneous analysis of multiple data sets. Psychological Methods 14:81-100, 2009

Latent Variables as an Approach
• "Structural equation models do not require all variables to be measured on all individuals under all conditions" (McArdle, 1994)
• Absent data is treated as incomplete or missing
• Sensitivity analyses can be conducted that evaluate assumptions of incompleteness

Latent Variables as Measures
[Diagram: Verbal Ability as a latent variable in Study 1 and in Study 2, each with indicators COWA, Boston Naming, and Vocabulary]

Illustration
McArdle JJ, Grimm KJ, Hamagami F, et al: Modeling Life-Span Growth Curves of Cognition Using Longitudinal Data With Multiple Samples and Changing Scales of Measurement. Psychological Methods 14:126-149, 2009

Summary
• Integrative data analysis is receiving considerable attention from scientists and funding agencies
• Groups like the ICCTF may be in a strong position to advocate for these projects and bring together interested scientists

Conclusions
• Test the models that reflect the underlying processes
• Random effects models allow longitudinal data to be treated flexibly and innovatively
• Make friends with your local statistician

Acknowledgements
• Paul Jacobsen, Moffitt Cancer Center
• Heather Jim, Moffitt Cancer Center
• Roger Dixon, University of Alberta
• Christopher Hertzog, Georgia Institute of Technology
• Jack McArdle, University of Southern California
• Cathy McEvoy, University of South Florida
• Funding
  – ACS RCS 01-070-01 (Booth-Jones, PI)
  – R01 CA82822 (Jacobsen, PI)
  – R03 AG024082 (Small, PI)
  – R37 AG008235 (Victoria Longitudinal Study: Dixon, PI)
• Contact Information:
  – [email protected]
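
Supplementary sketch: the BIC column in the Fit of Statistical Models table from the growth mixture section can be cross-checked with the standard definition BIC = -2·LL + k·ln(n). The sketch rests on two assumptions the slides do not confirm: that the column labelled -2LL holds the absolute value of the log-likelihood, and that n = 246 (the n shown with the model estimates). Under those assumptions the recomputed values fall within about 0.01 of the reported BIC for the three quadratic models, and about 1.0 above the reported value for the linear model.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2*LL + k*ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


N_OBS = 246  # assumption: sample size taken from the Model Estimates slide

# (model, free parameters, tabled likelihood value, reported BIC)
# Assumption: the tabled value is |LL|, so LL = -value.
FIT_TABLE = [
    ("Linear, 1 class", 6, 1746.20, 3524.43),
    ("L + Q, 1 class", 10, 1730.94, 3516.94),
    ("L + Q, 2 classes", 20, 1599.72, 3309.55),
    ("L + Q, 3 classes", 30, 1586.62, 3338.39),
]

recomputed = {}
for name, k, abs_ll, reported in FIT_TABLE:
    recomputed[name] = bic(-abs_ll, k, N_OBS)
    diff = recomputed[name] - reported
    print(f"{name:18s} recomputed {recomputed[name]:8.2f}  "
          f"reported {reported:8.2f}  difference {diff:+.2f}")

# Lower BIC is preferred; pick the best model among the recomputed values.
best = min(recomputed, key=recomputed.get)
print("lowest recomputed BIC:", best)
```

On either the reported or the recomputed values, the two-class quadratic model has the lowest BIC, which agrees with the two-class solution the slides go on to report.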