Centre for Affective Disorders, Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom; National Service for Affective Disorders, South London and Maudsley NHS Foundation Trust, London, United Kingdom
Centre for Affective Disorders, Department of Psychological Medicine, Institute of Psychiatry, Psychology & Neuroscience, King’s College London, London, United Kingdom; National Service for Affective Disorders, South London and Maudsley NHS Foundation Trust, London, United Kingdom; Cognitive and Behavioral Neuroscience Unit, D’Or Institute for Research and Education, Rio de Janeiro, Brazil
Overgeneralized self-blaming emotions, such as self-disgust, are core symptoms of major depressive disorder and prompt specific actions (i.e., action tendencies), which are more functionally relevant than the emotions themselves. We have recently shown, using a novel cognitive task, that when feeling self-blaming emotions, maladaptive action tendencies (feeling like hiding and feeling like creating a distance from oneself) and an overgeneralized perception of control are characteristic of major depressive disorder, even after remission of symptoms. Here, we probed the potential of this cognitive signature, and its combination with previously employed functional magnetic resonance imaging (fMRI) measures, to predict individual recurrence risk. For this purpose, we developed a user-friendly hybrid machine/statistical learning tool, which we make freely available.
A total of 52 medication-free patients with remitted major depressive disorder, who had completed the action tendencies task and our self-blame fMRI task at baseline, were followed up clinically over 14 months to determine recurrence. Prospective prediction models included baseline maladaptive self-blame–related action tendencies and anterior temporal fMRI connectivity patterns across a set of frontolimbic a priori regions of interest, as well as established clinical and standard psychological predictors. Prediction models used elastic net regularized logistic regression with nested 10-fold cross-validation.
Cross-validated discrimination was highly promising (area under the receiver-operating characteristic curve ≥ 0.86), and positive predictive values over 80% were achieved when including fMRI in multimodal models, but only up to 71% (area under the receiver-operating characteristic curve ≤ 0.74) when solely relying on cognitive and clinical measures.
This study shows the high potential of multimodal signatures of self-blaming biases to predict recurrence risk at an individual level and calls for external validation in an independent sample.
). These models have pointed to overgeneralized self-blame as causing worthlessness and hopelessness, core symptoms of MDD that are distinctive compared with other recurrent emotional disorders such as panic disorder (
). Indeed, using a novel action tendencies task, we have observed that self-blame–related action tendencies, in particular feeling like creating a distance from oneself and hiding, as well as an overgeneralized perception of control for other people’s wrongdoing, were more frequent in patients with remitted MDD compared with control participants cross-sectionally (
). In this comparison, feeling like hiding was also more common in patients with MDD when their friend was described to act badly toward them, which could be interpreted in different ways but may reflect an overgeneralized sense of responsibility (
). If measures of maladaptive self-blame–related action tendencies prospectively predicted future recurrence risk in remitted MDD, this would provide critical evidence for their role in MDD vulnerability. Apart from its pathophysiological importance, the identification of novel predictors of recurrence is needed for developing accurate risk prediction tools, because clinical variables are poor predictors at an individual level (
) has raised great promise for developing predictive markers in MDD. Machine learning models are powerful at making personalized predictions because they condense multiple variables, such as MRI signal in different brain regions, or different modalities (e.g., cognitive, structural MRI and functional MRI [fMRI]) (
). Successful development of novel treatments and stratification algorithms to improve long-term outcomes of recurrent MDD, however, requires predictors that capture the neurocognitive underpinnings of asymptomatic precursors of recurrent symptoms. Using a standard general linear model, we previously identified an fMRI signature of patients with MDD who will develop a recurring episode after recovering from previous depression over 1 year of follow-up (
), but the machine learning approach we used to show its predictive value at the individual level with a positive predictive value of 74% had two main weaknesses. First, it used a high-variance leave-one-out cross-validation method (
). Second, it used the extracted clusters showing the most significant association with recurrence risk in our SPM general linear model. This technique overestimates model performance owing to selection of voxels and variables on the basis of another model comparing recurring episode and stable remission patients in the same sample (
), suggesting its central importance. The SCC, however, is only part of a network of regions involved in self-blaming feelings. Using fMRI, we demonstrated that proneness to self-blaming feelings in healthy people was associated with increased functional connectivity between the SCC and the right superior anterior temporal lobe (ATL) (
), which we had previously demonstrated to enable differentiated interpretations of the meaning of social behavior (e.g., differentiating actions as impolite or absent-minded, rather than just overgeneralizing them as “bad”) (
In an independent previous sample of patients with remitted MDD, we demonstrated abnormal functional connectivity between the right superior ATL and frontolimbic networks when experiencing self-blaming feelings relative to feelings related to blaming others (compared with never-depressed control participants independent of medication status) (
) neuroanatomical model of mood disorders, which includes the superior ATL because of its close anatomical connection with the medial prefrontal network including the subgenual cortex and the hypothalamus, thereby providing a crucial and often forgotten link between the limbic forebrain (including the hippocampus) and the frontal cortex. The ATL is probably underreported because fMRI in this region requires optimized imaging parameters owing to the abundance of artifacts (
The aim of the current study was to develop a user-friendly prediction tool and use it to provide robust estimates for the potential to predict MDD recurrence risk at the individual level, when using cognitive and fMRI-based signatures of self-blaming biases. Further, as one could argue that fMRI is relatively expensive for widespread use, we also probed the potential to replace it with our novel cognitive task for future clinical decision support systems.
Methods and Materials
Participants and Standard Measures
Participants were recruited as a part of a prospective cohort study (from 2011 to 2014) in a clinical research facility (ethics reference: 07/H1003/194) (
), remitted for ≥6 months, psychotropic medication–free, right-handed, native English speaking, and normal or corrected-to-normal vision. Main exclusion criteria were current Axis I disorders including a history of substance or alcohol abuse, and past comorbid Axis I disorders being the likely cause of depressive symptoms. Patients were subsequently followed up clinically over 14 months using the well-validated Longitudinal Interval Follow-up Evaluation for DSM-IV (
For our multimodal prediction model, we used a complete-cases analysis (see Supplemental Methods) including 52 patients with MDD with complete follow-up data that we were able to categorize into recurring episode over 14 months (n = 18) versus stable remission with no recurring episode (n = 34).
The following standard clinical and psychological measures were selected for use in the prediction model based on previous reports establishing them as reproducible predictors of MDD recurrence at the group level: the number of previous episodes (categorized into nonrecurrent [i.e., 1 previous episode], recurrent [2–4 episodes], and highly recurrent [>5 episodes]) (
Stimuli for both fMRI and behavioral tasks were written sentences each presenting an abstract hypothetical social behavior contrary to social and moral values. Participants were asked to imagine the situation in each stimulus for two conditions differing by the agency of the participant. In the self-agency condition, the participant was described to act toward their friend, and in the other-agency condition, their friend acted toward them. Stimuli were based on the value-related moral sentiment task (VMST) used previously (
). The same social behaviors were used in both conditions (90 trials per condition, 50% per se negative social behaviors [e.g., “does act stingily”] and 50% negated positive behaviors [e.g., “does not act generously”]). Participants were asked to provide the name of their best friend of the same gender, with whom they were not related and not romantically involved.
After the fMRI and behavioral tasks, participants were asked to rate all items in the self- and other-agency conditions for unpleasantness ("How strongly would you feel unpleasant feelings?": scale of 1 [not unpleasant] to 7 [extremely unpleasant]). As in our previous article (
), we restricted our analyses to the items deemed most emotionally relevant to each participant (defined as items rated equal to or higher than the individual’s median unpleasantness rating for each condition: self- and other-agency).
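The per-participant median split described above can be illustrated with a minimal sketch. The function name and the example ratings are hypothetical and are not taken from the study's materials; the only rule taken from the text is that an item is kept when its rating is at or above that participant's own median for the condition.

```python
from statistics import median


def select_relevant_items(ratings):
    """Return indices of items rated at or above this participant's median.

    `ratings` holds one participant's unpleasantness ratings (1 to 7)
    for a single condition (self-agency or other-agency).
    """
    cutoff = median(ratings)
    return [i for i, rating in enumerate(ratings) if rating >= cutoff]


# Hypothetical ratings for six self-agency items from one participant.
example_ratings = [2, 5, 7, 4, 6, 3]
print(select_relevant_items(example_ratings))
```

Because the cutoff is computed separately for each participant and condition, the number of retained items can differ between participants, for example when many items share the median rating.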
). Participants were presented with hypothetical social actions in self-agency and other-agency conditions (stimuli described above). Stimuli were presented for 5 seconds in three runs in pseudorandom order; runs were counterbalanced across participants and were interspersed with a baseline visual fixation of pattern condition (n = 90). In the scanner, participants were asked to decide whether each situation would feel mildly or very unpleasant to ensure that they paid attention to the task and that they made an emotional decision about the stimuli. We used a jittered (range = 500 ms) intertrial interval with mean duration of 4 seconds.
); 6-mm-radius sphere; Montreal Neurological Institute coordinates: x = 58, y = 0, z = −12] and the creation of an interaction term with the psychological variable (the contrast between the most highly unpleasant items in the self-agency condition vs. the visual fixation condition and highly unpleasant items in the other-agency condition vs. the visual fixation condition).
Regression coefficient averages over a priori regions of interest (ROIs) (depicted in Figure 1B and further described below) as defined previously using an independent sample (
) and used to capture self-blame–related fMRI connectivity with our ATL seed region, as well as to capture ATL connectivity with these ROIs irrespective of psychological condition, while covarying for root mean square movement parameters during the scan (obtained from the realignment process). In addition, we modeled standard blood oxygen level–dependent effects following the modeling approach as previously described (
), but in order to reduce the number of ROIs, we chose only right-sided ROIs reflecting our right-sided ATL seed region, except for the midline regions, in which we used bilateral ROIs (see Supplemental Methods and Table S2).
), for each described social behavior, the participant was asked, “What would you feel like doing in response to this behavior?” “Please select the option that you feel that you would most strongly feel like doing” (see Figure S1): “verbally or physically attacking/punishing your best friend,” “verbally or physically attacking/punishing yourself,” “apologising/fixing what you have done,” “hiding,” “creating a distance from your best friend,” “creating a distance from yourself,” “no action,” or “other action.” They then rated the hypothetical situation regarding how much control they feel they would have (responses ranging from 1 [none at all] to 7 [complete]).
We have previously reported that self-agency–related hiding and self-distancing, as well as an overgeneralized perception of control for others’ wrongdoing, were characteristic of MDD at baseline compared with control participants (
), we separately calculated the proportion of trials in which hiding or self-distancing were selected in the self-agency condition for the highly unpleasant items as defined using individual median splits as described above. In order to reduce the number of variables, we z-transformed these proportions across the whole sample to combine hiding and self-distancing into a single average of z scores of maladaptive self-blaming action tendencies. We measured overgeneralized perception of control as done previously by subtracting each subject’s average perceived control rating in the highly unpleasant other-agency condition trials from the same quantity in their highly unpleasant self-agency trials. Perceived control is expected to be higher during self-agency, so a smaller value in the calculated difference measure indicates abnormally higher perception of control in the other-agency condition, which is conceptually related to the notion of omnipotent responsibility guilt (i.e., caused by an exaggerated sense of responsibility for others’ well-being) as shown to be associated with MDD (
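As a rough illustration of the two derived behavioral predictors, the sketch below computes the averaged z-score composite and the perceived-control difference. All names and numbers are hypothetical, and the use of the sample standard deviation for the z transformation is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values across the whole sample (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def maladaptive_composite(hiding_props, distancing_props):
    """Average the z-scored hiding and self-distancing proportions per participant."""
    z_hide = z_scores(hiding_props)
    z_dist = z_scores(distancing_props)
    return [(h + d) / 2 for h, d in zip(z_hide, z_dist)]


def control_difference(self_agency_ratings, other_agency_ratings):
    """Mean perceived control in self-agency minus other-agency trials.

    Smaller values indicate relatively higher perceived control
    for the other person's wrongdoing.
    """
    return mean(self_agency_ratings) - mean(other_agency_ratings)


# Hypothetical proportions for three participants (highly unpleasant self-agency items).
hiding = [0.1, 0.2, 0.3]
distancing = [0.0, 0.5, 1.0]
print(maladaptive_composite(hiding, distancing))

# Hypothetical control ratings (1 to 7) for one participant.
print(control_difference([5, 6, 7], [4, 4, 4]))
```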
We developed a software tool (AL) to implement the proposed methods and make this tool available for use by the scientific and clinical community as an R package (https://www.github.com/AndrewLawrence/dCVnet). We adopted a hybrid approach combining statistical and machine learning to balance their benefits and limitations in this type of prediction problem. In contrast to traditional statistical methods, we reduced overfitting and allowed correlated predictors through regularization. We also carefully cross-validated performance by including all modeling decisions within the cross-validation. This hybrid approach can likewise be contrasted to more complex (and data-hungry) (
) machine learning methods: each model uses relatively few of the available predictors, which are chosen a priori based on previously published results and clinical utility. To further reduce complexity, we used a model that assumes that the effects of the predictors are linear and additive, and any interaction terms or predictor transformations must be prespecified. The software tool is called dCVnet, for “double cross-validation for the elastic net.” It employs elastic net–regularized logistic regression with double (also termed “nested”) cross-validation, consisting of an outer cross-validation of model performance measures with an inner cross-validation to independently tune elastic net hyperparameters.
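The structure of double (nested) cross-validation can be sketched in a few lines. This is not the dCVnet implementation, which is written in R and uses repeated k-fold cross-validation. It is a single-repetition illustration in which `fit_and_score` is a hypothetical callback that fits a model on the training indices with a given hyperparameter and returns a loss on the held-out indices.

```python
import random


def k_fold_indices(n, k, rng):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nested_cv(n, k_outer, k_inner, candidates, fit_and_score, seed=0):
    """Return one held-out loss per outer fold.

    Hyperparameters are tuned only on the outer-training data, so the
    outer test fold never influences which candidate is selected.
    """
    rng = random.Random(seed)
    outer_losses = []
    for test_idx in k_fold_indices(n, k_outer, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        inner_folds = k_fold_indices(len(train_idx), k_inner, rng)

        def inner_loss(hyper):
            # Mean validation loss of one candidate across the inner folds.
            total = 0.0
            for fold in inner_folds:
                val_idx = [train_idx[j] for j in fold]
                val_set = set(val_idx)
                fit_idx = [i for i in train_idx if i not in val_set]
                total += fit_and_score(fit_idx, val_idx, hyper)
            return total / k_inner

        best = min(candidates, key=inner_loss)
        outer_losses.append(fit_and_score(train_idx, test_idx, best))
    return outer_losses


# Toy callback: the loss depends only on the hyperparameter, so 0.5 is always selected.
def toy_fit_and_score(train_idx, test_idx, hyper):
    return abs(hyper - 0.5)


print(nested_cv(20, 5, 4, [0.1, 0.5, 0.9], toy_fit_and_score))
```

The key design point is that every modeling decision, here the choice among `candidates`, happens inside the outer training data, which is what prevents an optimistic bias in the outer performance estimate.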
Elastic Net Model
Elastic net–regularized binary logistic regression extends logistic regression with regularization. Regularization penalizes model complexity and reduces it by shrinking coefficients toward zero (
). The elastic net regularization penalty comprises two types of penalty with different roles: 1) the L2 (Ridge) penalty allows correlated predictors to jointly enter the model and stabilizes solutions and 2) the L1 (LASSO) penalty encourages variable selection (
). The amount of regularization (lambda) and the balance between the two penalty types (alpha, the fraction of the penalty that is of LASSO type) are hyperparameters of the elastic net procedure: they determine the learning performance of the algorithm, and tuning them adapts the algorithm to different problem settings.
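For reference, the penalized objective can be written out explicitly. The form below follows the parameterization documented for the glmnet package; the symbols $N$ (number of patients), $x_i$ (predictor vector), $y_i \in \{0, 1\}$ (recurrence status), $\beta_0$ (intercept), and $\beta$ (coefficients) are introduced here for illustration and do not appear in the text above.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}
\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
\;+\;
\lambda\Bigl[\, \alpha\,\lVert\beta\rVert_1
      + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2} \Bigr]
```

With $\alpha = 1$ the penalty is purely LASSO, and as $\alpha$ approaches 0 it approaches a pure Ridge penalty, while $\lambda$ scales the total amount of shrinkage.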
Cross-Validation and Tuning
We implemented double cross-validation in dCVnet to obtain performance estimates without optimistic bias at the same time as tuning hyperparameters based on cross-validated performance (
). For both inner and outer cross-validation loops, our tool employed repeated k-fold cross-validation (Figure S4), an unbiased estimate of expected (out-of-sample) prediction error with lower variance than bootstrap, hold-out, and leave-one-out cross-validation methods (
For this particular application of dCVnet, stable hyperparameter selection could be obtained (for all models) with 30 repetitions of 10-fold cross-validation, and stable cross-validation results with 100 repetitions of 10-fold cross-validation. For tuning, six logarithmically spaced values of alpha were considered between 0.01 (mostly Ridge) and 1.0 (a LASSO model). For each alpha, 100 lambda values were determined from the data by glmnet (
). The tuned alpha and lambda were selected based on the minimum mean square error.
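One way to generate six logarithmically spaced alpha values between 0.01 and 1.0 is sketched below. The exact grid used in the study is not listed in the text, so the helper `log_spaced` and its output are an illustration of the spacing rule rather than a reproduction of the authors' values.

```python
import math


def log_spaced(low, high, n):
    """Return n values from low to high, evenly spaced on a log10 scale."""
    lo, hi = math.log10(low), math.log10(high)
    return [10 ** (lo + (hi - lo) * i / (n - 1)) for i in range(n)]


alphas = log_spaced(0.01, 1.0, 6)
print([round(a, 3) for a in alphas])
```

Logarithmic spacing places more candidates near the Ridge end of the range, where small changes in alpha alter the penalty mix proportionally more.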
dCVnet was programmed in R software version 3.6.2 (R Foundation for Statistical Computing) and makes repeated calls to the glmnet 3.0.2 package’s binomial elastic net fitting function to produce the multiple models required for nested k-fold cross-validation (
). The novelty of dCVnet lies in providing a documented and standardized implementation of this particular machine learning pipeline, making it accessible to researchers lacking programming experience required for more general machine learning software environments (e.g., sklearn, caret, tidymodels, or mlr). We make this software including documentation available (https://www.github.com/andrewlawrence/dCVnet) to improve reproducibility and future open source development of clinical decision support systems. We ran permutation tests to rule out biases and information leakage within dCVnet, which was confirmed by a distribution of the balanced accuracy measures from the permuted datasets around the expected mean of 50% (i.e., chance) (Figure S2, Table S1, and Supplemental Methods).
Predicted probabilities of depression recurrence were compared with the known recurrence status to calculate a variety of cross-validated prediction performance measures (
). These included common classification metrics (such as positive predictive value and negative predictive value) based on a 50% classification cutoff. For positive predictive value and negative predictive value, we adjusted our estimates for the 42% prevalence of recurrence in the wider study sample. We further considered threshold-independent performance measures: the Brier score (mean squared difference between predicted probability and binary outcome) and the area under the receiver-operating characteristic curve (AUC) (also termed the concordance statistic; see Supplemental Methods). Finally, we considered model calibration (i.e., the fidelity of the predicted probability to the observed proportion with the outcome) by calculating the intercept and slope of the calibration graph (
). Consistent with our aims, at this point in model development, the measure of prime interest is the AUC, as this reflects the discriminative potential of a model independent of the classification threshold, or model calibration.
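The performance measures named above can be made concrete with a short sketch. The Brier score and AUC follow their standard definitions. The prevalence adjustment of the predictive values uses Bayes' rule with sensitivity and specificity, which is an assumption here, since the text does not state the exact adjustment formula. All function names and example values are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and binary outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def auc(probs, outcomes):
    """Probability that a random positive case is ranked above a random negative case."""
    pos = [p for p, y in zip(probs, outcomes) if y == 1]
    neg = [p for p, y in zip(probs, outcomes) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def adjusted_predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV at a given prevalence via Bayes' rule (assumed adjustment)."""
    ppv = (sensitivity * prevalence) / (
        sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    )
    npv = (specificity * (1 - prevalence)) / (
        specificity * (1 - prevalence) + (1 - sensitivity) * prevalence
    )
    return ppv, npv


# Hypothetical predicted probabilities and observed recurrence (1 = recurrence).
probs = [0.9, 0.8, 0.3, 0.2]
outcomes = [1, 0, 1, 0]
print(brier_score(probs, outcomes), auc(probs, outcomes))

# Hypothetical sensitivity and specificity, evaluated at the 42% prevalence from the text.
print(adjusted_predictive_values(0.8, 0.9, 0.42))
```

Unlike the predictive values, the Brier score and AUC need no classification cutoff, which is why the text treats them as threshold-independent.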
As shown in Table 1, there were no demographic differences between groups. As expected, the recurring episode group showed a higher Beck Depression Inventory score at baseline. We obtained an AUC of at least 0.86, ≥80% cross-validated accuracy, and moderate calibration for classifying patients into recurring episode versus stable remission when combining multimodal information from fMRI and standard clinical and psychological measures, with or without our novel action tendencies task (Table 2, Figure 1). We further probed the contributions of the different modalities to predictive performance. Relying solely on standard clinical and psychological measures achieved an AUC of 0.73 and a positive predictive value of 68% (Table 2). Our results showed, however, that the novel cognitive measure, our action tendencies task, did not improve prediction performance to a relevant degree (AUC = 0.74; positive predictive value of 71%) (Table 2). We therefore also ran a prediction model that only contained the action tendencies task variables to demonstrate that they were associated with recurrence risk (AUC = 0.67) (Table 2), but this model performed slightly worse compared with known clinical and standard measures. Likewise, when using fMRI without standard measures, there was an association with recurrence risk (AUC = 0.69), but this was not as strong as in the multimodal model.
Table 1 (surviving row labels only): MADRS Score, Mean (SD); GAF Score, Mean (SD)
A total of 52 patients with fully remitted MDD had available baseline data for all predictors included in the multimodal prediction model and were followed up clinically over 14 months to determine recurrence. As expected, there were slightly higher residual depressive symptoms on the more sensitive BDI in the group with a subsequent recurrence.
BDI, Beck Depression Inventory; GAF, Global Assessment of Functioning; MADRS, Montgomery–Åsberg Depression Rating Scale; MDD, major depressive disorder.
a Significant group difference at an uncorrected two-sided threshold of p ≤ .05.
When comparing unregularized odds ratios from single-predictor logistic regressions with the regularized odds ratios, there was relatively little shrinkage of coefficients (i.e., a small lambda was chosen), reflecting the relatively good cross-validated performance of the model (Table 3). Interestingly, there were predictors that on their own had no relevant association with recurrence risk but contributed to the multivariate prediction. Maladaptive action tendencies were associated with recurrence risk by themselves and in the full regularized model (Table 3), whereas overgeneralized perception of control was not (Table 3).
Table 3. Model Coefficients Predicting Recurrence
All Modalities β
All Modalities OR
Global Assessment of Functioning score
Number of episodes categorized
Rosenberg Self-Esteem Scale
PANAS negative affect
PANAS positive affect
Beck Depression Inventory
SCC–ATL connectivity (self-blame)
Basal ganglia–ATL connectivity (self-blame)
Frontopolar–ATL connectivity (self-blame)
Hypothalamus–ATL connectivity (self-blame)
Insula–ATL connectivity (self-blame)
Amygdala–ATL connectivity (self-blame)
Hippocampus–ATL connectivity (self-blame)
SCC–signal time course
Basal ganglia–signal time course
Frontopolar–signal time course
Hypothalamus–signal time course
Insula–signal time course
Amygdala–signal time course
Hippocampus–signal time course
ATL BOLD effect (self-blame)
SCC BOLD effect (self-blame)
Perceived control for self-blame vs. other blame
Maladaptive action tendencies
Displayed are the multivariate semi-standardized regularized logistic regression coefficient (β) and OR for each predictor variable from the all modalities model (n = 52), which is generated by dCVnet. Recurrence was coded as the positive outcome, so that positive regression coefficients for any predictor can be interpreted as a positive association with recurrence risk. For reference, univariate unregularized logistic regression ORs are provided. For fMRI measures, self-blame indicates that the measurement is derived from a contrast between self-blame trials and fixation cross trials. Motion parameters were derived from the SPM realignment and used root mean squares of the three translation and three rotation parameters, averaged after z transformation. Figure S3 details the a priori rationale for each predictor.
We confirmed that combinations of clinical, psychological, and self-blame fMRI measures were highly promising candidates for predicting individual recurrence risk in MDD. Using a principled hybrid statistical/machine learning approach, we observed excellent cross-validated discriminative performance, with an AUC of 0.9 and positive predictive values above the suggested benchmark of 80% for clinically useful markers (
). In contrast, our novel cognitive task to capture self-blame–related action tendencies was not sufficient to replace fMRI. We nevertheless obtained AUC values above 0.7, which is considered fair discriminatory value (
), and there was an association of maladaptive action tendencies with subsequent recurrence risk (odds ratio above 2.5). Yet, there was no relevant improvement on known clinical and standard measures regarding its discriminative value. These findings of modest predictive utility are in keeping with the previous literature showing that clinical and standard measures in themselves are unable to make accurate predictions of individual recurrence risk (
Despite the lack of evidence for the predictive utility of the action tendencies task in this dataset, there are several strategies that could be pursued to improve the value of experimental cognitive tasks in future prediction models. First, our task was purely text based and lacked immersive features. Immersion could be improved by adopting a virtual reality paradigm (
). Second, any single task is unlikely to capture all relevant aspects of vulnerability; thus, future work could integrate a wider range of cognitive measures: for example, a recently developed cognitive task measuring effort and reward decisions showed promising but modest effect sizes for predicting recurrence in MDD (
) resulting in similar AUC values of around 0.70 as our action tendencies task, and its combination with action tendencies may outperform the separate models.
Although a wide variety of clinical and standard measures with known predictive value were assessed, performance of the model using just these features was modest (AUC = 0.73), and one might ask whether important predictors were omitted. Recently, promising results were shown for recurrence risk prediction using a machine learning model that included childhood trauma (
), our model did not include stressful life events during the follow-up period because our aim was prospective prediction from baseline data. We note that although 60% of our 93 patients with completed follow-up data in the overall study reported stressful life events during the 14-month follow-up period (as determined by clinical interview and standardized questionnaires), there was no difference in the rate of occurrence of these events between the recurring episode (61%) and stable remission (59%) groups. This is consistent with the hypothesis that life events trigger recurrence not directly, but rather through their interaction with other factors, such as self-blaming biases (
). Yet, these methods do not allow integrating cognitive and clinical variables in a straightforward way. Hence, we decided to employ a simpler approach to the imaging analysis, relying on ROIs, and this approach may be more feasible for future clinical applications. By employing the elastic net as a machine learning extension of logistic regression, we alleviated overfitting by shrinking the regression coefficients toward zero, which allows for automatic variable selection by omitting some predictors. The nested cross-validation is vital to provide realistic estimates of out-of-sample prediction accuracy and thereby estimate internal validity (
), a shortcoming of many studies of clinical predictors that commonly employ shared cross-validation for both model tuning and performance estimation. Although this prognostic model is at an early stage, our approach follows the Medical Research Council Prognosis Research Strategy Partnership (
), we must nevertheless replicate these findings in a larger independent sample before drawing clinical conclusions. Further, we included patients who were fully remitted from symptoms and had no relevant comorbidity; thus, generalization to patients with partial remission and comorbidity will need to be investigated.
Maladaptive self-blame–related action tendencies and fMRI measures predicted subsequent recurrence risk in our MDD sample. When including fMRI, our statistical learning–based risk prediction tool showed promising potential to predict recurrence risk at an individual level. This calls for external validation in an independent sample.
Acknowledgments and Disclosures
This work was supported by a Medical Research Council Clinician Scientist Fellowship (Grant No. G0902304 [to RZ]), the LABS-D'Or Hospital Network (Rio de Janeiro, Brazil) (to JM), and a Medical Research Council Doctoral Training Partnership (Grant No. 2064430 [to DF]). RZ and DS were partly funded and AJL was fully funded by the National Institute for Health Research Biomedical Research Centre at South London and Maudsley NHS Foundation Trust and King’s College London and by NARSAD Independent Investigator Grant No. 24715 from the Brain & Behavior Research Foundation. The views expressed are those of the authors and not necessarily those of the National Health Service, the National Institute for Health Research, or the Department of Health.
We are grateful to Dr. Karen Lythe for collecting the primary data.
RZ has collaborations with e-health companies Depsee Ltd, EMIS PLC, and Alloc Modulo Ltd. All other authors report no biomedical financial interests or potential conflicts of interest.