Agreement Between Prospective and Retrospective Measures of Childhood Maltreatment

Key Points
Question: What is the agreement between prospective and retrospective measures of childhood maltreatment?
Findings: This systematic review and meta-analysis of 16 unique studies and 25 471 unique participants found poor agreement between prospective and retrospective measures of childhood maltreatment, with Cohen κ = 0.19. On average, 52% of individuals with prospective observations of childhood maltreatment did not retrospectively report it, and likewise, 56% of individuals retrospectively reporting childhood maltreatment did not have concordant prospective observations.
Meaning: Because findings from this meta-analysis demonstrated that prospective and retrospective measures of childhood maltreatment identify largely different groups of individuals, the 2 measures cannot be used interchangeably to study the associated health outcomes and risk mechanisms.

identified from the electronic searches and included in the analyses will be used to identify additional studies. We will identify all studies on this topic written in English and published before 1 January 2018.

Data extraction
Three authors will independently extract data from eligible articles. Inconsistencies will be resolved in consensus meetings and confirmed with the authors of the primary studies when necessary. Relevant missing information will be requested from authors.
We will also collect and code information about the following variables from all studies identified with prospective assessment of childhood maltreatment: Study quality will be assessed with an adapted version of the Newcastle-Ottawa Scale, which has been recommended by the Cochrane Collaboration. This will include whether: the sample was representative, non-maltreated participants were drawn from the same sample as the maltreated participants, sample retention was >70% between prospective and retrospective assessments, the prospective measure was validated (e.g., based on official records or instruments that have been tested for psychometric validity and reliability), the retrospective measure was validated, the prospective and retrospective measures assessed the same time period, and the prospective and retrospective measures were based on the same source or reporter.

Data synthesis
We will use the extracted data to build contingency tables in order to compute the following outcomes: (1) the prevalence of childhood maltreatment based on prospective or retrospective measures; (2) the conditional probability of retrospective reports among those with prospective observations; (3) the conditional probability of prospective observations among those with retrospective reports; (4) the raw percent agreement between measures; and (5) Cohen's kappa.
If the identified studies report multiple effect sizes for different childhood maltreatment types, we will average the Cohen's kappas across maltreatment types to generate one overall effect size. We will also undertake a sensitivity analysis selecting the largest kappa from each study to assess the upper limit of agreement.
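The outcomes listed above can all be derived from a single 2x2 contingency table. The following is a minimal sketch of that calculation, not the authors' analysis code. The class name, function names, and the counts in the usage example are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class AgreementTable:
    """Counts from a 2x2 table of prospective by retrospective measures."""
    both: int                 # prospective yes, retrospective yes
    prospective_only: int     # prospective yes, retrospective no
    retrospective_only: int   # prospective no, retrospective yes
    neither: int              # prospective no, retrospective no


def agreement_outcomes(t: AgreementTable) -> dict:
    """Return prevalence, conditional probabilities, raw agreement, and kappa."""
    n = t.both + t.prospective_only + t.retrospective_only + t.neither
    pro_yes = t.both + t.prospective_only
    retro_yes = t.both + t.retrospective_only
    if n == 0 or pro_yes == 0 or retro_yes == 0:
        raise ValueError("table needs cases identified by each measure")

    p_observed = (t.both + t.neither) / n
    # Agreement expected by chance, from the marginal totals.
    p_expected = (pro_yes * retro_yes + (n - pro_yes) * (n - retro_yes)) / n**2
    if p_expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")

    return {
        "prevalence_prospective": pro_yes / n,
        "prevalence_retrospective": retro_yes / n,
        "p_retro_given_pro": t.both / pro_yes,
        "p_pro_given_retro": t.both / retro_yes,
        "raw_agreement": p_observed,
        "kappa": (p_observed - p_expected) / (1 - p_expected),
    }


def study_kappa(tables: list, use_max: bool = False) -> float:
    """Average kappas across maltreatment types, or take the largest one."""
    kappas = [agreement_outcomes(t)["kappa"] for t in tables]
    return max(kappas) if use_max else mean(kappas)


if __name__ == "__main__":
    # Hypothetical counts for two maltreatment types in one study.
    physical = AgreementTable(20, 30, 40, 110)
    sexual = AgreementTable(30, 20, 20, 130)
    print(agreement_outcomes(physical))
    print(study_kappa([physical, sexual]))
    print(study_kappa([physical, sexual], use_max=True))
```

Setting `use_max=True` corresponds to the sensitivity analysis that selects the largest kappa from each study, while the default corresponds to averaging across maltreatment types.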
We will perform random-effects meta-analyses to summarize the outcomes listed above. In the presence of significant heterogeneity in effect sizes, we will perform subgroup analyses and meta-regression analyses to test the role of selected predictors.
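As an illustration of how a random-effects summary is formed, the sketch below pools study-level effect sizes with the DerSimonian-Laird estimator. The protocol does not state which between-study variance estimator was used, so the choice of DerSimonian-Laird is an assumption, as are the function name and the example values.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with a DerSimonian-Laird random-effects model.

    effects: study-level estimates (for example, Cohen's kappas)
    variances: their sampling variances
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, used to compute Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I-squared: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "se": se,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical kappas and sampling variances from four studies.
    result = random_effects_pool([0.10, 0.22, 0.31, 0.18],
                                 [0.002, 0.004, 0.003, 0.001])
    print(result)
```

A large `i2` value in such output is the kind of heterogeneity that would prompt the subgroup and meta-regression analyses described above.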

Study selection
As shown in Figure 1, we identified k=7,279 articles through a search in MEDLINE, PsycINFO, Embase, and Sociological Abstracts. We reviewed the abstracts of these articles and removed those that did not assess childhood maltreatment prospectively or that were literature reviews, case studies, conference proceedings, or duplicate articles (k=6,071). We then reviewed the full texts of the remaining k=1,208 articles and excluded articles in which childhood maltreatment was not assessed prospectively or which were duplicates (k=157).
We also identified additional studies from citations of identified articles (k=2). Next, we extracted data from the remaining k=1,053 articles, pooling data from k=603 articles based on overlapping samples so that each sample was represented once. This resulted in k=450 independent samples with prospective measures of childhood maltreatment (shown in eAppendix 1 [description on page 18]). Finally, we excluded samples that did not have corresponding retrospective measures of childhood maltreatment (k=428) or that did not have data on agreement between prospective and retrospective measures of childhood maltreatment that we could obtain (k=2). This led to a final total of k=20 studies with data on agreement between prospective and retrospective measures of childhood maltreatment, and k=16 with paired data (e.g., comparing prospectively identified childhood maltreatment [yes/no] with retrospectively reported childhood maltreatment [yes/no]) to allow us to compute Cohen's kappa.
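The counts reported at each stage of the selection process can be checked with simple arithmetic. The sketch below reproduces the flow using only the numbers stated in the text. The function and key names are hypothetical, and treating k=603 as the number of articles collapsed into overlapping samples is an interpretation of the wording above.

```python
def selection_flow() -> dict:
    """Recompute the study-selection counts reported in the text."""
    identified = 7279
    excluded_at_abstract = 6071
    full_text_reviewed = identified - excluded_at_abstract

    excluded_at_full_text = 157
    added_from_citations = 2
    articles_extracted = (full_text_reviewed - excluded_at_full_text
                          + added_from_citations)

    # Articles collapsed because they drew on overlapping samples.
    collapsed_overlapping = 603
    independent_samples = articles_extracted - collapsed_overlapping

    no_retrospective_measure = 428
    agreement_data_unobtainable = 2
    with_agreement_data = (independent_samples - no_retrospective_measure
                           - agreement_data_unobtainable)

    return {
        "full_text_reviewed": full_text_reviewed,
        "articles_extracted": articles_extracted,
        "independent_samples": independent_samples,
        "with_agreement_data": with_agreement_data,
    }


if __name__ == "__main__":
    for stage, count in selection_flow().items():
        print(f"{stage}: k={count}")
```

Each computed value matches the corresponding count in the text (k=1,208, k=1,053, k=450, and k=20).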

Sensitivity analyses
We ran sensitivity analyses selecting the highest effect size for the studies reporting multiple effect sizes for different childhood maltreatment types (instead of averaging them).
We tested publication bias visually through a funnel plot and formally through funnel-plot-based tests, such as Begg's test and Egger's test. The effect sizes and the corresponding sampling variances were not correlated (Begg's test: tau=0.18, p=0.344), but there was some asymmetry in the funnel plot (Egger's test: z=3.2718, p=0.001), suggesting possible publication bias. To identify and correct for funnel-plot asymmetry arising from publication bias, we used a trim-and-fill procedure. The trim-and-fill results were similar to the results of our original meta-analyses (kappa=0.23, 95%CI=0.17-0.30; p<0.001; I²=95%; k=17), suggesting no substantial role of publication bias in the meta-analysis results.
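To show what a funnel-plot asymmetry test measures, the sketch below implements the classical Egger regression, in which each standardized effect is regressed on its precision and a non-zero intercept indicates asymmetry. This is an illustrative assumption: the z statistic reported above suggests the authors used a model-based variant, which is not reproduced here. The function name and the example values are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Classical Egger test: regress effect/SE on 1/SE by ordinary least squares.

    Returns the intercept (the asymmetry term), its standard error, the slope,
    and the t statistic for the intercept (k - 2 degrees of freedom).
    """
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least three studies with matching SEs")

    x = [1.0 / se for se in std_errors]                  # precision
    y = [e / se for e, se in zip(effects, std_errors)]   # standardized effect

    mean_x = sum(x) / k
    mean_y = sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must differ across studies")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the standard error of the intercept.
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    s2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1.0 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "se_intercept": se_intercept,
            "slope": slope, "t": t_stat, "df": k - 2}


if __name__ == "__main__":
    # Hypothetical kappas and standard errors from five studies.
    print(egger_regression([0.12, 0.20, 0.25, 0.31, 0.40],
                           [0.02, 0.04, 0.05, 0.08, 0.10]))
```

In this formulation the slope estimates the underlying effect and the intercept captures the tendency of less precise studies to report systematically different effects.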
Jack-knife sensitivity analyses showed little overall evidence of undue influence of individual studies on the meta-analyses: across 16 automated permutations in which each study was omitted in turn, the Cohen's kappa estimates were similar and had overlapping confidence intervals (kappa range=0.23-0.26).
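The jack-knife procedure described above can be sketched as a leave-one-out loop around a pooling function. The example below is a minimal illustration, assuming a DerSimonian-Laird random-effects estimate for each permutation. The estimator, the function names, and the example values are assumptions rather than details taken from the source.

```python
def pool_random_effects(effects, variances):
    """Return a DerSimonian-Laird random-effects pooled estimate."""
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, effects)) / sum_w
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in variances]
    return sum(wi * y for wi, y in zip(w_re, effects)) / sum(w_re)


def leave_one_out(effects, variances):
    """Re-pool the estimate once per study, omitting that study each time."""
    effects = list(effects)
    variances = list(variances)
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need at least two studies with matching variances")
    estimates = []
    for i in range(len(effects)):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_variances = variances[:i] + variances[i + 1:]
        estimates.append(pool_random_effects(kept_effects, kept_variances))
    return estimates


if __name__ == "__main__":
    # Hypothetical kappas and sampling variances from four studies.
    estimates = leave_one_out([0.10, 0.22, 0.31, 0.18],
                              [0.002, 0.004, 0.003, 0.001])
    print(min(estimates), max(estimates))
```

A narrow range between the smallest and largest leave-one-out estimate is what indicates that no single study drives the pooled result.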
Finally, we tested putative predictors of heterogeneity across studies with subgroup and meta-regression analyses. First, we considered if the measure used for prospective assessment of maltreatment could explain heterogeneity in effect sizes. Agreement with retrospective reports was similar regardless of whether prospective assessment was based on records (e.g., child protection records or medical reports; kappa=0.19, 95%CI=0. 10

Reporting methods should include
Description of relevance or appropriateness of studies assembled for assessing the hypothesis to be tested: X
Rationale for the selection and coding of data: X
Documentation of how data were classified and coded (eg, multiple raters, blinding, and interrater reliability): X
Assessment of confounding: X
Assessment of study quality, including blinding of quality assessors; stratification or regression on possible predictors of study results: X
Assessment of heterogeneity: X
Description of statistical methods (eg, complete description of fixed or random effects models, justification of whether the chosen models account for predictors of study results, dose-response models, or cumulative meta-analysis) in sufficient detail to be replicated: X
Provision of appropriate tables and graphics: X

Reporting of results should include
Graphic summarizing individual study estimates and overall estimate: X

6
Data collection process (item 10): Describe method of data extraction from reports (e.g., piloted forms, independently, in duplicate) and any processes for obtaining and confirming data from investigators. Reported on page: 6-7
Data items (item 11): List and define all variables for which data were sought (e.g., PICOS, funding sources) and any assumptions and simplifications made. Reported on page: 8-9
Risk of bias across studies (item 15): Specify any assessment of risk of bias that may affect the cumulative evidence (e.g., publication bias, selective reporting within studies). Reported on page: 8
Additional analyses (item 16): Describe methods of additional analyses (e.g., sensitivity or subgroup analyses, meta-regression), if done, indicating which were pre-specified.

RESULTS
Study selection (item 17): Give numbers of studies screened, assessed for eligibility, and included in the review, with reasons for exclusions at each stage, ideally with a flow diagram. Reported on page: Figure 1
Study characteristics (item 18): For each study, present characteristics for which data were extracted (e.g., study size, PICOS, follow-up period) and provide the citations. Reported on page: 22-23
Risk of bias within studies (item 19): Present data on risk of bias of each study and, if available, any outcome level assessment (see item 12). Reported on page: eTable 4; 13
Results of individual studies (item 20): For all outcomes considered (benefits or harms), present, for each study: (a) simple summary data for each intervention group (b) effect estimates and confidence intervals, ideally with a forest plot.

Section/topic # Checklist item Reported on page #

DISCUSSION
Summary of evidence (item 24): Summarize the main findings including the strength of evidence for each main outcome; consider their relevance to key groups (e.g., healthcare providers, users, and policy makers).
Limitations (item 25): Discuss limitations at study and outcome level (e.g., risk of bias), and at review-level (e.g., incomplete retrieval of identified research, reporting bias). Reported on page: 16-17
Conclusions (item 26): Provide a general interpretation of the results in the context of other evidence, and implications for future research. Reported on page: 16-17
FUNDING
Funding (item 27): Describe sources of funding for the systematic review and other support (e.g., supply of data); role of funders for the systematic review.

Variable name: Variable description. Coding
Whether the sample is officially named. Coding: "y" = yes; "n" = no
study_name: The name of the sample. If the sample does not have an official name, this is a brief description of the sample. Coding: -
example_author: The first author's surname from an example study using that sample. Coding: -
example_year: The year of publication of the example study using that sample. Coding: -
example_link: The link to the example study using that sample. Coding: -
example_samplesize: The sample size of the example study using that sample. Coding: -
example_ageatlatestassessment: The age at latest assessment of the example study using that sample. Coding: -
example_genderfemale: The proportion of females in the example study using that sample.
sampledescriptionnotes: A brief description of the sample.
location: The location of the sample. Coding: -
v_type: The type of childhood adversity that was assessed. Coding: "ace" = a range of childhood adversities including maltreatment; "victimisation" = maltreatment + bullying; "maltreatment" = multiple maltreatment subtypes (e.g., physical, sexual, and emotional abuse, neglect); "bullying" = bullying only; "physical abuse" = physical abuse only; "sexual abuse" = sexual abuse only; "emotional abuse" = emotional abuse only; "neglect" = physical or emotional neglect only; "domestic violence" = domestic violence only; "institutionalisation" = institutionalisation only
© 2019 Baldwin J et al. JAMA Psychiatry.
v_assessment: The method of prospective assessment of childhood maltreatment. Coding: "welfare" = Child Protection Services records; "hospital" = hospital records; "interview" = interview; "questionnaire" = questionnaire; "mixed" = welfare/hospital records & interview/questionnaire; "multiple" = interview & questionnaire
v_reporter: The reporter/source of the prospective assessment of childhood maltreatment. Coding: "mixed" = welfare/hospital records & another reporter (e.g., parent, child, teacher); "multiple" = multiple reporters (e.g., parent, child, teacher); "records" = welfare/hospital records; "parent" = parent; "self" = child; "teacher" = teacher
retro_data: Whether corresponding retrospective measures of childhood maltreatment are available. Coding: "y" = yes; "n" = no
link_pro_retro_study: Link to a study including both prospective and retrospective measures of childhood maltreatment. Coding: -

Note. Many different studies using prospective measures of childhood maltreatment were based on the same sample, and we pooled information across studies so that each sample was represented once in the dataset. However, for each sample we provide an example study. These example studies may not reflect the full breadth of all of the prospective measures of childhood maltreatment available for that sample.