Pre-arrest Prediction of Survival Following In-hospital Cardiac Arrest: A Systematic Review of Diagnostic Test Accuracy Studies

Aim: To evaluate the test accuracy of pre-arrest clinical decision tools for in-hospital cardiac arrest survival outcomes. Methods: We searched Medline, Embase, and Cochrane Library from inception through January 2022 for randomized and non-randomized studies. We used the Quality Assessment of Diagnostic Accuracy Studies framework to evaluate risk of bias, and Grading of Recommendations Assessment, Development and Evaluation methodology to evaluate certainty of evidence. We report sensitivity, specificity, positive predictive value, and negative predictive value for prediction of survival outcomes. PROSPERO


Introduction
In-hospital cardiac arrest (IHCA) occurs with an incidence of 1-10 per 1,000 hospital admissions. 1,2 Currently, only 20-30% of adult IHCA patients survive to hospital discharge. [3][4][5][6] Some of these patients survive with an unfavourable neurological outcome that may not be valued by the patient. [7][8][9] Several factors including older age and comorbidities are associated with potential futility of cardiopulmonary resuscitation (CPR). 6,10,11 Therefore, it is necessary for healthcare providers to discuss the appropriateness of attempting CPR with patients at risk of cardiac arrest. 12 Do-not-attempt-CPR (DNACPR) decisions provide a process to document a clinical or patient decision that an individual should not receive resuscitation in the event of cardiac arrest. However, previous studies have identified variability in decision-making 13,14 and found DNACPR status to be inappropriately associated with demographic factors such as gender, ethnicity, and language. [15][16][17] A key barrier to making DNACPR decisions is that the prediction of outcome following IHCA can be challenging. 13,14 Pre-arrest prediction rules may serve as an important decision aid to facilitate DNACPR discussions and reduce variability in decision-making. Accordingly, several pre-arrest prediction rules have been developed over the years. 18,19 However, no systematic reviews have assessed the test accuracy of current pre-arrest prediction rules for IHCA. The International Liaison Committee on Resuscitation (ILCOR) task force on Education, Implementation, and Teams ranked the topic as a high priority and initiated this systematic review in collaboration with the Pediatric Life Support and Advanced Life Support task forces. The aim of this systematic review was to assess whether any pre-arrest prediction rule can predict survival outcomes following IHCA with sufficient precision to support its implementation in clinical practice.

Methods
This systematic review is reported in accordance with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies. 20 The review was completed as part of the evidence evaluation process of ILCOR's Education, Implementation, and Teams task force and was registered at the International Prospective Register of Systematic Reviews (PROSPERO CRD42021268005). No ethical approval was required to conduct this study.
In accordance with the review process of ILCOR, we used the PICOST format (Population, Intervention, Comparison, Outcome, Study Design, Timeframe) to frame this research question: For hospitalized adults and children experiencing an in-hospital cardiac arrest (P), does use of any pre-arrest clinical prediction rule (I), compared to no clinical prediction rule (C), predict return of spontaneous circulation, survival to hospital discharge/30 days, or survival with favorable neurological outcome (O)? We included randomized controlled trials and non-randomized studies (non-randomized controlled trials, interrupted time series, controlled before-and-after studies, cohort studies, and case series where n ≥ 5) in all languages. We excluded editorials, commentaries, opinion papers, and conference abstracts (S). We searched Medline, Embase, and Cochrane databases for all years. The search strategy was created and performed by an information specialist on January 8th, 2021, and an updated search was conducted on January 13th, 2022 (T). The search strategy is described in Appendix 1.

Definitions
We included studies on pre-arrest clinical prediction rules aiming to predict the chance of surviving (or not surviving) an IHCA, with or without favorable neurological outcome. We defined IHCA as any cardiac arrest with clinical indication for cardiopulmonary resuscitation (CPR) occurring inside the hospital regardless of the underlying cause of the arrest. 21 Studies on patients with out-of-hospital cardiac arrest being transported to the hospital with ongoing CPR were excluded. We defined a pre-arrest clinical prediction rule as a set of clinical variables available before a cardiac arrest that is used to predict the chance of surviving a cardiac arrest (with or without favourable neurological outcome). Studies utilizing termination of resuscitation rules and post-arrest prediction rules were excluded.
We chose to report predicted survival, as opposed to predicted death, as this is the outcome most commonly used by ILCOR. Thus, we characterized a true positive as a patient who was predicted to survive and did survive, and we valued perfect negative predictive values (i.e. no missed survivors). We included the following outcomes: return of spontaneous circulation (ROSC), survival to hospital discharge/30-day survival, and survival with favorable neurological outcome. As studies may use different instruments (e.g. Cerebral Performance Category or modified Rankin Scale) with different cut-offs to define favourable neurological outcomes, we did not pre-specify any strict criteria for this outcome. We prospectively defined the following subgroup analyses of interest: paediatric patients vs. adult patients, studies before vs. after 2010, and historical cohorts vs. prospective clinical studies. We chose the cut-off of 2010 as studies have found stagnating survival rates after 2010 and lower survival rates before 2010. 6,[22][23][24]

Data extraction and quality assessment
Following completion of database searches, we reviewed study titles and abstracts, and excluded obviously irrelevant papers. We subsequently reviewed full-text papers against study inclusion criteria. At both the title/abstract and full-text screening stage, each paper was independently reviewed by two reviewers using Covidence software (Covidence®, Melbourne, Australia). Disagreements were resolved by discussion with a third reviewer. In the event that key data were not reported, we contacted the corresponding author by email and sent a reminder two weeks later if there was no response. Data were extracted into a spreadsheet created by the authors to identify study and patient characteristics and test accuracy outcomes.

Bias assessment
Bias assessment was conducted independently by two reviewers using the revised framework for Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2 tool). 25 This framework comprises 4 domains for risk of bias (patient selection, index test, reference standard, and flow and timing) and 3 domains for applicability (patient selection, index test, and reference standard). In case of disagreement regarding risk of bias for any domain, consensus was reached by discussion with a third reviewer. One reviewer (TD) was excluded from bias assessment to mitigate conflicts of interest as she had published studies that were part of the review.

Data analysis and synthesis
We report positive predictive values (PPV), specificity, sensitivity, negative predictive values (NPV), positive likelihood ratios, negative likelihood ratios, and area under receiver operating characteristic curves (AUC) with 95% confidence intervals for patient outcomes when possible. We calculated each diagnostic outcome with 95% confidence intervals (CI) using Stata version 16.0 (StataCorp LP, College Station, TX, USA). We report the AUCs presented in the studies. In case no AUC was presented in the study, we calculated an AUC based on the sensitivity and specificity for the cut-off used. In accordance with the Cochrane handbook, we decided not to conduct any meta-analysis of studies investigating the Good Outcome Following Attempted Resuscitation (GO-FAR) score due to high risk of bias. Due to serious clinical heterogeneity and high risk of bias, no meta-analysis was conducted for the other scores.
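The measures listed above can all be derived from a single 2x2 table per cut-off. The following is a minimal sketch in Python rather than the Stata code used for the review. It assumes the convention stated under Definitions (a positive test means predicted to survive), and it assumes that an AUC for a single cut-off is taken as the area under the two-segment ROC curve through that point, i.e. (sensitivity + specificity) / 2; the review does not state the exact formula used. The class and function names and the counts are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts for one cut-off, with 'positive' meaning predicted to survive."""
    tp: int  # predicted to survive, survived
    fp: int  # predicted to survive, died
    fn: int  # predicted not to survive, survived (a missed survivor)
    tn: int  # predicted not to survive, died


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def accuracy_measures(t: TwoByTwo) -> dict[str, float]:
    """Point estimates only; confidence intervals are not computed here."""
    sens = _ratio(t.tp, t.tp + t.fn)
    spec = _ratio(t.tn, t.tn + t.fp)
    return {
        "sensitivity": sens,
        "specificity": spec,
        "ppv": _ratio(t.tp, t.tp + t.fp),
        "npv": _ratio(t.tn, t.tn + t.fn),
        "lr_pos": _ratio(sens, 1 - spec),
        "lr_neg": _ratio(1 - sens, spec),
        # Assumption: area under the ROC curve joining (0,0), the single
        # operating point, and (1,1).
        "auc_single_cutoff": (sens + spec) / 2,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from any included study.
    for name, value in accuracy_measures(TwoByTwo(tp=90, fp=300, fn=10, tn=600)).items():
        print(f"{name}: {value:.3f}")
```

Under this convention, every false negative is a survivor whom the rule would have missed, which is why the NPV is the measure of primary interest.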
We assessed the overall certainty of evidence using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) methodology. 26 We used the bias domains of the QUADAS-2 tool to inform the risk of bias domain of GRADE, and the applicability domains to inform the assessment of directness. We used GRADEpro software (McMaster University, 2014) to synthesize the overall risk of bias across studies.

Results
We identified 2521 studies, of which 23 studies were eligible for inclusion (Table 1 and Table 2). Characteristics of each score are presented in Supplement 2 and likelihood ratios are presented in Supplement 3.

Risk of bias assessment and certainty of evidence
Bias assessments performed using the QUADAS-2 framework are presented in Table 3. We rated flow and timing as a concern for all studies because factors that contribute to calculation of the clinical prediction (e.g. age, co-morbid state) may have informed the decision to terminate resuscitation efforts, thus creating a self-fulfilling prophecy. 27 Moreover, there was concern about patient selection (including applicability) in several studies due to missing data, patient exclusions, single-centre designs, and patient cohorts pre-dating 2010, which had lower survival rates than more recent cohorts. There was concern about the index test (including applicability) in several studies due to lack of pre-specified cut-offs and physiological parameters that may change frequently, making the score challenging to apply in the clinical setting. Finally, there was risk of bias for the reference standard in studies assessing neurological outcomes as these are at risk of subjectivity and inconsistency in reporting.
The overall certainty of evidence was rated as very low across all scores. Certainty of evidence was downgraded for risk of bias, indirectness, imprecision, and inconsistency for all scores (Appendix 4).
The prognosis after resuscitation (PAR) score to predict survival to hospital discharge was used in 5 smaller cohort studies: 3 older studies published 1994-1999 28,29,33 and 2 more recent studies from 2014 and 2018. 35,38 The studies evaluated different cut-offs to avoid missing survivors and found NPVs of 95.4-100% with 95% confidence intervals ranging from 79.6-100% (Table 1).
Moreover, the following scores were investigated: the modified early warning score (MEWS), 31 the National Early Warning Score (NEWS), 37,39 the Clinical Frailty Scale, 36 the APACHE III score, 28 a neural network model, 30 and the modified pre-arrest morbidity (MPI) score (Table 1). 29,38 In addition, several studies were found that did not report data to calculate predictive values with confidence intervals. Limpawattana et al. 38 reported the following predictive values without confidence intervals (predicted death as opposed to 20.0-28.6) for prediction of survival to hospital discharge. 42 One study derived 5 classification and regression trees to predict survival to hospital discharge with a Cerebral Performance Category (CPC) of 1, and two of these models were externally validated in a second study (Table 2). 46,47 Finally, one study used the GO-FAR 2 score and one study used the Prediction of outcome for In-Hospital Cardiac Arrest (PIHCA) score to predict survival to hospital discharge with a CPC ≤ 2. 41,48

Sub-group analyses
We did not identify sufficient data to undertake our pre-planned sub-group analyses.

Discussion
This is the first systematic review of diagnostic test accuracy studies for pre-arrest prediction of survival following IHCA. We identified 23 studies using 13 different pre-arrest prediction rules. We identified no prospective implementation of any score and the level of evidence was rated as very low certainty. The most extensively validated score was the GO-FAR score, which predicted chance of survival to hospital discharge with a CPC of 1 and resulted in NPVs of 95-99%, albeit with significant statistical uncertainty around the estimate of NPV in some studies. We found no studies on clinical prediction rules for paediatric patients.
Pre-arrest prediction rules may be used to facilitate DNACPR discussions, empower patients to express their wishes based on objective information, and to make DNACPR decisions. It is widely considered that CPR is sometimes initiated even though a DNACPR decision should have been in place due to futility. 15,50,51 This exposes the patient and their family to the harms of a non-beneficial resuscitation attempt and diverts the hospital resuscitation team from their other clinical duties.
If a pre-arrest prediction tool is reliable, clinical implementation could potentially support DNACPR decision-making and contribute to fewer futile CPR attempts. At the same time, it could reduce variability and contribute to equity in decision-making. However, reliance on a pre-arrest prediction tool whose test accuracy is inadequate might lead to more patients not receiving CPR where it might have been beneficial and in line with the patient's values and preferences.
A widely used definition for futility within medical research is a survival chance of <1%. [52][53][54] Accordingly, a lower boundary of the 95% confidence interval >0.99 for the NPV may be considered acceptable in some instances for a pre-arrest prediction rule. 55 However, use of a pre-arrest prediction rule resulting in 1% of potential survivors not being resuscitated may not be universally accepted. Notably, none of the included studies had a 95% confidence interval for the NPV or sensitivity lying entirely above 0.99, but the GO-FAR score performed well with an NPV >99% in 3 studies, ranging from 96.2-100% for the point estimate in all studies. 18,40,[42][43][44][45]49 An issue for all of the identified studies is that no prospective implementation was used, and several studies were based on patient cohorts from the 1980s, 1990s, and 2000s, when survival rates were lower than in contemporary cohorts after 2010. 6,22-24 As clinicians may have inaccurate expectations about survival outcomes and may terminate resuscitative efforts prematurely based on patient characteristics, 56,57 the use of historical cohorts may induce a self-fulfilling prophecy and lead to a critical risk of bias in the studies.
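The futility criterion described above (lower boundary of the 95% confidence interval for the NPV above 0.99) can be illustrated with a short calculation. The sketch below is an illustration only: it uses the Wilson score interval, which is an assumption made here because the review does not state which interval method was applied, and the function names and example counts are hypothetical.

```python
import math
from statistics import NormalDist


def wilson_interval(successes: int, n: int, level: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion successes / n."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def npv_meets_threshold(tn: int, fn: int, threshold: float = 0.99) -> bool:
    """True if the lower 95% Wilson limit of the NPV exceeds the threshold.

    tn: predicted not to survive and died.
    fn: predicted not to survive but survived (missed survivors).
    """
    lower, _ = wilson_interval(tn, tn + fn)
    return lower > threshold


if __name__ == "__main__":
    # Hypothetical counts, not taken from any included study.
    for tn, fn in [(100, 0), (381, 0), (990, 10)]:
        lower, upper = wilson_interval(tn, tn + fn)
        print(tn, fn, f"{lower:.4f}", f"{upper:.4f}", npv_meets_threshold(tn, fn))
```

Under this particular interval method, even a rule with no missed survivors needs at least 381 patients predicted not to survive before the lower limit exceeds 0.99, which helps explain why small cohorts with point estimates of 100% still fall short of the criterion.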
Overall, comorbidity scores such as the PAM score and the PAR score performed differently in patient cohorts from the 1980s, 1990s, and 2000s. 28,29,32,33,35,38 These findings suggest that the scores may not be applicable to contemporary patient cohorts. Moreover, early warning scores such as the NEWS and MEWS were overall highly inaccurate for prediction of patient survival; they may be used to measure patient deterioration but should not be used to predict survival outcomes following cardiac arrest. 31,37,39 With 7 studies investigating the GO-FAR score, this is the most extensively validated tool. However, several issues should be mentioned in relation to its clinical applicability. First, the score utilizes physiologic measures of hypotension and respiratory insufficiency captured within 4 hours before the cardiac arrest. As these components may fluctuate over time, the overall score may also change, making it challenging to use to inform clinical decision-making. Second, the GO-FAR score measures survival to hospital discharge with a CPC of 1. Although this outcome may be considered relevant by patients and clinicians, the generally used definition of favorable neurological outcome includes survival with a CPC of 2, which may be highly valued by patients and relatives. 21,58 The GO-FAR 2 score and the PIHCA score aim to predict survival with CPC ≤ 2, which resembles the generally used definition of favorable neurological outcome. 41,48 Both scores performed reasonably well in derivation and internal validation, but they have not yet been externally validated and, like the GO-FAR score, utilize physiologic measures.
Our review identified several important knowledge gaps. First, there are no prospective implementation studies on any pre-arrest prediction model. As such, it would be premature to consider use of any of these scores in clinical practice even though a perfect prediction may not be needed to initiate a DNACPR discussion. Second, no studies included paediatric patients or were conducted in low-resource settings. Third, we found no evidence for return of spontaneous circulation and long-term survival outcomes. Fourth, there is a knowledge gap linking pre-arrest prediction scores to patient/relative perspectives. We do not know which information patients would prefer to know regarding the predicted outcomes or which cerebral performance categories would be valued by different patient groups. Finally, scores that utilize physiologic measures within short time intervals before a cardiac arrest are difficult to use in clinical practice as they might only indicate the pre-arrest deterioration. A reliable score is needed that combines elements such as acute and chronic comorbidities, admission blood samples, age, and frailty without using vital signs measured in the hours before the arrest.

Limitations
This systematic review included only historical cohort studies and one case-control study. In addition, there were issues with indirectness and clinical application of the clinical prediction rules, resulting in very low certainty evidence. There was large clinical heterogeneity among the included studies, and this heterogeneity combined with high risk of bias prevented us from conducting meta-analyses.

Conclusions:
This systematic review identified very low certainty evidence for 13 different scores to predict survival to hospital discharge/ 30 days and favorable neurological outcome. None of these were able to reliably predict no chance of survival or favorable neurological outcome. We identified no evidence for children.

Conflicts of interest: KGL, JB, and RG are members of the ILCOR EIT Task Force (RG as chair). TD is vice chair of the ILCOR first aid task force, JT is member of the ILCOR pediatric life support task force, and KC is member of the ILCOR advanced life support task force. RG is ERC Director of Guidelines and ILCOR.

Acknowledgement:
The following non-task force members are acknowledged for their contributions: Information specialist Jenny Ring.