Derivation and validation of a prognostic model for postoperative risk stratification of critically ill patients with faecal peritonitis
Annals of Intensive Care, volume 7, Article number: 96 (2017)
Prognostic scores and models of illness severity are useful both clinically and for research. The aim of this study was to develop two prognostic models for the prediction of long-term (6 months) and 28-day mortality of postoperative critically ill patients with faecal peritonitis (FP).
Patients admitted to intensive care units with faecal peritonitis and recruited to the European GenOSept study were divided into a derivation and a geographical validation subset; patients subsequently recruited to the UK GAinS study were used for temporal validation. All 50 clinical and laboratory variables available on day 1 of critical care admission were considered; Cox proportional hazards regression, with stepwise selection and nonparametric bootstrap sampling, was used to select variables for inclusion in two prognostic models. The performance of the models was compared to that of SOFA and APACHE II using area under the receiver operating characteristic curve (AuROC) analysis.
Five variables (age, SOFA score, lowest temperature, highest heart rate, haematocrit) were entered into the prognostic models. The discriminatory performance of the 6-month prognostic model yielded an AuROC of 0.81 (95% CI 0.76–0.86), 0.73 (95% CI 0.69–0.78) and 0.76 (95% CI 0.69–0.83) for the derivation, geographic and temporal external validation cohorts, respectively. The 28-day prognostic tool yielded an AuROC of 0.82 (95% CI 0.77–0.88), 0.75 (95% CI 0.69–0.80) and 0.79 (95% CI 0.71–0.87) for the same cohorts. These AuROCs appeared consistently superior to those obtained with the SOFA and APACHE II scores alone.
The two prognostic models developed for 6-month and 28-day mortality prediction in critically ill septic patients with FP, in the postoperative phase, enhanced the day one SOFA score’s predictive utility by adding a few key variables: age, lowest recorded temperature, highest recorded heart rate and haematocrit. External validation of their predictive capability in larger cohorts is needed, before introduction of the proposed scores into clinical practice to inform decision making and the design of clinical trials.
Prognostic scores and models of illness severity are useful both clinically and for research. They support critical care physicians in decision making through more accurate prognostication; they describe and summarise case mix, and inform health economic evaluations of cost-effectiveness. Many types of models exist, and their roles are not mutually exclusive, as their combined use may afford better prognostic reliability. These tools are usually insufficiently accurate to be useful for predicting individual survival and are generally reserved for benchmarking quality of care and for research studies [2,3,4], for example when examining heterogeneity of treatment effect in clinical trials.
When considering prognostication in the context of the wide ranging spectrum of intra-abdominal infections, complexity is increased by the heterogeneity of aetiology, clinical manifestations and pathophysiological mechanisms. The International Sepsis Forum Consensus Conference on Definitions of Infection in the Intensive Care Unit describes intra-abdominal infections as a “very heterogeneous group of infectious processes that share an anatomical site between the diaphragm and the pelvis”. The anatomical, clinical and pathophysiological heterogeneity of these infections, together with their varied aetiology and prognosis, has given rise to a range of prognostic instruments tailored to specific populations.
Generic “peritonitis” prognostic tools (aimed at peritonitis of any origin), such as the Mannheim Peritonitis Index (MPI) or the Peritonitis Index of Altona II (PIA II), rely on factors such as age, degree of organ failure, origin of sepsis and intra-operative findings to risk-stratify different types of peritonitis, but, given the considerable heterogeneity of intra-abdominal infections, these scoring systems may not be sufficiently specific in terms of aetiology [7, 8]. Other scoring systems have been devised to explicitly address the issue of prognostication in selected forms of peritonitis, such as the left colonic Peritonitis Severity Score (PSS), developed for patients with distal large bowel peritonitis of various origins. The physiological and operative severity score for the enumeration of mortality and morbidity (POSSUM) is another risk adjustment model, developed in 1991 for use in surgical patients. A modification of this prognostic model, obtained by excluding some of the physiological factors of the original POSSUM, was developed for use specifically in patients undergoing surgery for colorectal cancer (CR-POSSUM). Importantly, all of these scores incorporate intra-operative findings and are either designed to cater for, and include, the whole heterogeneous spectrum of peritoneal infections (such as the MPI and PIA II), or to focus on a very narrow subset of peritonitis, identified by location (left colonic, in the case of PSS) or aetiology (colorectal malignancy, as in CR-POSSUM).
To date no prognostic score has been developed for the critically ill patient with faecal peritonitis (FP) in the postoperative phase. We therefore aimed to specifically study critically ill patients suffering from FP, in the postoperative phase, and quantify their mortality risk at 28 days and 6 months. International multicentre prospectively collected patient datasets, such as The GenOSept and GAinS cohorts, provided an opportunity to develop and evaluate such prognostic systems.
Aim, design and setting
The Genetics of Sepsis and Septic Shock in Europe (GenOSept) and Genomic Advances in Sepsis (GAinS) are prospectively gathered cohorts of critically ill septic patients with FP recruited from multiple centres in Europe. They include data from patients with various degrees of illness severity, including potential risk modifiers and confounding factors (such as comorbidities, indices of acute physiological derangement, organ support, radiological and laboratory findings, origin of FP) [12, 13]. These diagnostically homogeneous cohorts of FP patients, gathered primarily for the purposes of studying genetic epidemiology in sepsis, also provide high-quality data well suited to the development and testing of a prognostic model specific to this postoperative patient population.
The primary aim of this study was to develop and validate a prognostic modelling tool able to stratify postsurgical critically ill patients with FP, by quantifying their mortality risk in the short- (28 day) and long-term (6 month), independently from intra-operative surgical findings, using prospectively collected data from the GenOSept and GAinS cohorts.
The same inclusion and exclusion criteria were used for both cohorts. Inclusion criteria: adult patients (>18 years) admitted to a High Dependency Unit (HDU) or Intensive Care Unit (ICU) with FP, defined as visible inflammation of the serosal membrane that lines the abdominal cavity, secondary to contamination by faeces, as diagnosed by the operating surgeon at laparotomy. All critically ill patients in this cohort, therefore, were recruited after the diagnosis was established during surgical source control. Exclusion criteria: peritonitis due to gastric or upper GI-tract perforation (e.g. gastric or duodenal ulcer perforation, small bowel perforation); patient or legal representative unwilling or unable to give consent; patient pregnant; advance directive to withhold or withdraw life-sustaining treatment or admitted for palliative care only; patient already enrolled in an interventional research study of a novel/unlicensed therapy (patients enrolled in interventional studies examining the clinical application or therapeutic effects of widely accepted, “standard” treatments, were not excluded); patient immunocompromised (known regular systemic corticosteroid therapy, exceeding 7 mg/kg/day of hydrocortisone or equivalent, within 3 months of admission and prior to acute episode; known regular therapy with other immunosuppressive agents, e.g. azathioprine; known to be HIV positive or have acquired immunodeficiency syndrome as defined by the Centre for Disease Control; neutrophil count less than 1000 mm−3 due to any cause, including metastatic disease and haematological malignancies or chemotherapy, but excluding severe sepsis; organ or bone marrow transplant receiving immunosuppressive therapy).
The definition of sepsis was based on the International Consensus Criteria: “the clinical syndrome defined by the presence of both infection and a systemic inflammatory response”. Patients were followed for up to 6 months from enrolment or until death.
Database and quality assurance
The case report form (CRF) was developed and tested by CH, CG, AG, JDC and Dr J. Millo, together with other members of the GenOSept Consortium. Variables recorded included demographic, clinical and outcome data. A specific electronic case report form (eCRF) was developed by Lincoln, Paris, France, using software developed in collaboration with JDC. The database was password-protected, allowing investigators to enter data into the eCRF online, and included audit trail capability for data entry and subsequent modifications. To minimise errors, logical range checks were in place so that the investigators would be alerted if an attempt was made to enter data values outside the expected ranges.
Quality assurance (QA) was performed by P.H., C.G., A.W., A.G. and C.H., who systematically reviewed all data. Data queries (DQs) were generated within the eCRF for missing or erroneous data and sent electronically to the relevant investigators for action, where necessary. Up to the end of January 2011, an estimated 3986 valid DQs had been generated, with a response rate by the investigators of approximately 92%. Common reasons for DQs were missing information, particularly the Charlson Index, antimicrobial use, estimated day of onset of FP before ICU admission, information about the circumstances of GCS assessment and outcome data.
All patients’ eCRFs were reviewed by experienced critical care physicians. Where the patient’s eligibility for inclusion in the relevant cohort was unclear, clarification was sought from the investigators. Regular QA reports were provided to the relevant Management Committee for review; the National Investigators were contacted regarding quality issues if necessary.
In order to build the prognostic model, patients recruited up to January 2011 (included in the GenOSept cohort) were divided into two subsets of patients: one for derivation and the other for external geographic validation. To limit the effect of potentially unmeasured and unaccounted confounding factors, related to possible differences in national systems of healthcare provision among participating countries across Europe, these patients were divided into UK (derivation) and non-UK (geographic validation) sub-cohorts, with the aim of optimising homogeneity in the datasets and decreasing potential background noise. Subsequent patients recruited in the UK between January 2011 and March 2015 (included in the GAinS cohort) were included in the temporal validation cohort.
We evaluated all 50 clinical and laboratory variables available on admission to critical care (day 1) (for a full list, see Additional file 1). The primary outcome was 6-month mortality risk with the secondary outcome being 28-day mortality risk. To select the variables to include in the model, Cox proportional hazards regression analysis for 6-month mortality was fitted, using stepwise backwards selection, to determine the predictors to be included in the models from 50 bootstrapped samples derived from the derivation subset (nonparametric bootstrap procedure). Increasing the number of bootstrap replications did not alter the model significantly. The p value cut-off used was 0.05. The same predictor variables were employed to construct a prognostic tool for the secondary outcome, 28-day mortality.
The procedure of bootstrapping is a re-sampling method which relies on random sampling with replacement of the available observations. This procedure allows evaluation of the characteristics of an estimator (such as its variance) by measuring those properties when obtaining multiple samples from the original dataset (and of size equal to the observed dataset) [15, 16].
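The resampling idea described above can be illustrated with a minimal sketch. It is not the authors' Stata code: the function name `bootstrap_standard_error`, the seed and the example ages are all assumptions made for illustration, and the data are invented rather than taken from the study.

```python
import random
import statistics


def bootstrap_standard_error(values, estimator, n_replications=50, seed=0):
    """Approximate the standard error of `estimator` by nonparametric bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    estimates = []
    for _ in range(n_replications):
        # Resample with replacement; each resample has the size of the observed data.
        resample = [values[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    # The spread of the replicated estimates approximates the estimator's variability.
    return statistics.stdev(estimates)


# Invented ages for illustration only (not study data).
ages = [52, 61, 69, 70, 74, 77, 80, 58, 66, 83]
se_median = bootstrap_standard_error(ages, statistics.median)
```

The default of 50 replications mirrors the number reported in the text; any other estimator that accepts a list (for example `statistics.mean`) can be passed in the same way.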
A final Cox proportional hazards regression analysis for both 6-month and 28-day mortalities was fitted using the set of variables found to be significant in the majority of bootstrap replications.
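The "majority of bootstrap replications" rule can be sketched as follows. This is a hedged illustration only: `select_variables` is a hypothetical callback standing in for the stepwise backwards Cox selection (which is not reimplemented here), and `toy_selector` with its threshold is an invented stand-in used purely to make the example run.

```python
import random
from collections import Counter


def majority_selected(rows, select_variables, n_replications=50, seed=0):
    """Return the variables selected in more than half of the bootstrap resamples."""
    rng = random.Random(seed)
    n = len(rows)
    counts = Counter()
    for _ in range(n_replications):
        # Resample patients with replacement, keeping the original sample size.
        resample = [rows[rng.randrange(n)] for _ in range(n)]
        # Count each variable at most once per replication.
        counts.update(set(select_variables(resample)))
    return sorted(v for v, c in counts.items() if c > n_replications / 2)


def toy_selector(resample):
    """Invented stand-in for a selection procedure: keep variables whose
    mean is far from zero in the resample (threshold chosen arbitrarily)."""
    kept = []
    for name in resample[0]:
        mean = sum(row[name] for row in resample) / len(resample)
        if abs(mean) > 1:
            kept.append(name)
    return kept


# Invented data: "age" is always far from zero, "noise" never is.
rows = [{"age": 60 + i, "noise": 0.1 if i % 2 == 0 else -0.1} for i in range(20)]
kept = majority_selected(rows, toy_selector)
```

In a real analysis the callback would fit the regression on each resample and return the predictors retained at the chosen p value cut-off.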
We confirmed that the proportional hazards assumption was met by drawing Kaplan–Meier curves and Nelson–Aalen plots for the covariates after categorisation. Predictors which satisfy the proportional hazards assumption show very similar curves, with the separation between them remaining proportional across analysis time. We also tested this assumption on the basis of Schoenfeld residuals.
In order to assess for the presence of collinearity (which happens when two variables are almost perfect linear combinations of one another), we calculated the variance inflation factors (VIFs). It is generally accepted that variables with VIFs greater than 10 merit further investigation.
The two models obtained were evaluated using area under the receiver operating characteristic curve (AuROC) analysis, which plots sensitivity against 1-specificity to describe the accuracy of a diagnostic test [20, 21] and to compare the performance of different tests.
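As a minimal sketch of the collinearity check, the VIF of each predictor can be read off the diagonal of the inverse of the predictors' correlation matrix, which equals 1/(1 − R²) from regressing that predictor on the others. The helper names and the example columns below are assumptions for illustration; the study itself used Stata.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    n = len(columns[0])
    centred = [[x - sum(c) / n for x in c] for c in columns]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centred]
    return [[sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
             for cj, nj in zip(centred, norms)]
            for ci, ni in zip(centred, norms)]


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfect collinearity: correlation matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]


def variance_inflation_factors(columns):
    """VIF of each predictor: diagonal of the inverse correlation matrix."""
    inverse = invert(correlation_matrix(columns))
    return [inverse[j][j] for j in range(len(columns))]


# Invented predictors with correlation 0.8, so each VIF is 1 / (1 - 0.64).
vifs = variance_inflation_factors([[1, 2, 3, 4, 5], [2, 1, 4, 3, 5]])
```

A VIF of 1 indicates no linear association with the other predictors; values above the threshold of 10 mentioned in the text would flag a variable for closer inspection.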
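For illustration, the area under the ROC curve can be computed without drawing the curve, using its rank-based interpretation: the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below assumes that higher scores indicate higher risk; the function name and the example values are invented.

```python
def auroc(scores, died):
    """Area under the ROC curve via pairwise comparison of scores.

    `scores` are prognostic scores (higher = higher predicted risk, an assumption)
    and `died` are the matching outcomes (truthy = died).
    """
    positives = [s for s, d in zip(scores, died) if d]
    negatives = [s for s, d in zip(scores, died) if not d]
    if not positives or not negatives:
        raise ValueError("both outcome classes are needed to compute an AuROC")
    concordant = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                concordant += 1.0   # non-survivor correctly ranked higher
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Invented example: 3 of the 4 non-survivor/survivor pairs are correctly ordered.
example_auc = auroc([1, 2, 3, 4], [0, 1, 0, 1])
```

This pairwise form is quadratic in the number of patients, which is acceptable for cohorts of a few hundred; it gives the same value as integrating the empirical curve of sensitivity against 1-specificity.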
Nonparametric bootstrapping and prognostic model derivation for 6-month mortality
The bootstrapping procedure was performed using 50 repetitions based on the UK derivation cohort. A final Cox proportional hazards regression analysis for 6-month mortality was fitted using the set of variables found to be significant in the majority of bootstrap replications. Saturation was reached after 50 bootstrap replications, with additional replications not yielding significantly different results.
A set of 5 variables assessed on day 1 met this criterion (age, SOFA score, lowest temperature, highest heart rate, haematocrit). The Cox proportional hazards model estimates for those risk variables are presented in Table 1.
The same five variables were employed to formulate the 6-month mortality prognostic tool by entering the estimates obtained from the Cox proportional hazards model in the following equation:
where A = age at admission to critical care, S = SOFA score day 1, T = lowest recorded temperature (as °C) on day 1, HR = highest recorded heart rate on day 1, H = haematocrit (as percentage points) on day 1.
The model coefficients used for prediction of 6-month mortality were adjusted for the 28-day mortality outcome. To achieve this, a separate Cox proportional hazards regression analysis was fitted for 28-day mortality, utilising the same set of five variables. The resulting model estimates are presented in Table 1. The estimates were utilised to construct the 28-day mortality prognostic tool as described in the following equation:
While haematocrit and high heart rate did not offer independent predictive power in the 28-day mortality model, they were useful in explaining variability when retained in the model.
Comparison of the prognostic models with preexisting scores
Comparison of the prognostic models with SOFA and APACHE II was performed graphically by drawing the superimposed ROC curves and testing the underlying AuROC obtained, taking into account that the data are correlated, using a nonparametric approach as suggested by DeLong et al.
For all statistical analyses, Stata version 10.0 was used (StataCorp, Texas, USA; http://www.stata.com).
Baseline and outcome data
The derivation cohort included 462 patients with FP recruited in the UK. Their median (inter-quartile range, IQR) age was 69.4 (58.6–77.2) years. The geographic validation (non-UK) cohort included 515 FP patients recruited to the GenOSept study from the other European countries. Their median (IQR) age was 69.1 (58–77) years. The temporal validation cohort included 323 FP patients recruited in the UK between January 2011 and March 2015. Their median (IQR) age was 68.3 (57.6–77.2) years. For details of the recruiting centres, please see Additional file 1.
The age distribution was not significantly different across the cohorts, although the derivation cohort had a higher proportion of patients aged over 75. Males predominated in all cohorts. The racial distribution was more heterogeneous in the geographic validation cohort, while the derivation and the temporal validation cohorts were almost entirely Caucasian. Among the comorbidities, diabetes, previous serious infections and other illnesses were more prevalent in the geographic validation cohort than in the other cohorts. The underlying causes for FP varied across cohorts, with anastomotic breakdown being particularly common in the geographic validation cohort. Baseline Sequential Organ Failure Assessment (SOFA) and Acute Physiology and Chronic Health Evaluation II (APACHE II) scores and prevalence of mechanical ventilation on day one were comparable across the cohorts. The occurrence of acute renal failure on day one differed between cohorts and was most frequent in the geographic validation cohort (32.7, 42.8 and 23.3% for the derivation, geographic and temporal validation cohorts, respectively), accompanied by a difference in the utilisation of renal replacement therapy (21, 21.3 and 7.5% for the derivation, geographic and temporal validation cohorts, respectively) on day one. The geographic validation cohort was characterised by higher mortality rates (at all time points) and longer ICU stay, compared to the other two cohorts; this latter feature was also reflected, although to a lesser extent, in the length of hospital stay.
Performance of the prognostic tools
When evaluated using a receiver operating characteristics (ROC) curve, the discriminatory performance of the 6-month prognostic model in the UK derivation sub-cohort yielded an AuROC of 0.81 (95% CI 0.76–0.86) as indicated in Fig. 1a. At geographic validation in the non-UK sub-cohort, the 6-month prognostic model produced an AuROC of 0.73 (95% CI 0.69–0.78; Fig. 1b). At temporal validation, the 6-month model yielded an AuROC of 0.76 (95% CI 0.69–0.83; Fig. 1c).
The 28-day prognostic tool performed similarly, yielding an AuROC of 0.82 (95% CI 0.77–0.88; Fig. 2a) for the UK derivation sub-cohort. At geographic validation in the non-UK sub-cohort, the 28-day prognostic model produced an AuROC of 0.75 (95% CI 0.69–0.80; Fig. 2b). In the temporal validation cohort, the 28-day model yielded an AuROC of 0.79 (95% CI 0.71–0.87; Fig. 2c).
The 6-month FP prognostic score produces numerical values which can be stratified within 5 intervals (0–2; above 2–4; above 4–6; above 6–12; above 12) corresponding to five levels of 6-month mortality risk. The 28-day mortality FP score produces values classified within 5 intervals, corresponding to different risk categories for the outcome (0–2; above 2–4; above 4–8; above 8–16; above 16). The observed mortality rates corresponding to each class of risk for the two scoring systems are presented in Table 4 for all three cohorts (Additional file 1: Figs. S1 and S2 display the corresponding histograms of mortality). A 6-month FP score above 12 is consistently associated with a greater than 50% mortality risk at 6 months across all cohorts. A 28-day FP score above 16 is associated with a greater than 40% mortality risk for the 28-day outcome for the derivation and geographic validation cohorts, but not for the temporal validation cohort, in which the highest observed mortality risk was around 22%.
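The stratification into five risk classes can be sketched as a simple lookup on the interval boundaries quoted above. The boundaries are taken from the text; the function and constant names, the class numbering from 1 to 5 and the rejection of negative scores are assumptions for illustration. The sketch classifies an already computed score and does not reproduce the score equations themselves.

```python
from bisect import bisect_left

# Upper bounds of the first four intervals, as listed in the text;
# anything above the last bound falls into the fifth class.
SIX_MONTH_UPPER_BOUNDS = [2, 4, 6, 12]   # 0-2; above 2-4; above 4-6; above 6-12; above 12
DAY_28_UPPER_BOUNDS = [2, 4, 8, 16]      # 0-2; above 2-4; above 4-8; above 8-16; above 16


def risk_class(score, upper_bounds):
    """Return the risk class (1 = lowest, 5 = highest) for an FP score."""
    if score < 0:
        raise ValueError("FP scores are expected to be non-negative")
    # bisect_left keeps a score equal to a bound in the lower class,
    # matching intervals of the form "above 2-4".
    return bisect_left(upper_bounds, score) + 1


# Example: a 6-month score of 12.5 lies in the highest class.
highest = risk_class(12.5, SIX_MONTH_UPPER_BOUNDS)
```

Observed mortality for each class would then be looked up in the table of class-specific rates for the relevant cohort.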
The discriminatory capabilities of the FP prognostic tools versus the SOFA and APACHE II scores in the FP cohorts
To assess how the FP models compare, as prognostic tools, to the routinely used SOFA and APACHE II scores, we calculated AuROCs for these scoring systems, to predict 6-month and 28-day mortality, in order to compare each tool across all cohorts and for both outcomes. For 6-month mortality, the SOFA score produced AuROCs of 0.73 (95% CI 0.68–0.78), 0.68 (95% CI 0.63–0.72) and 0.62 (95% CI 0.54–0.7) in the derivation, geographic and temporal external validation cohorts, respectively, while the APACHE II score yielded AuROCs of 0.74 (95% CI 0.7–0.79), 0.71 (95% CI 0.66–0.75) and 0.69 (95% CI 0.62–0.77) for those cohorts, respectively. For the 28-day mortality outcome, the SOFA score produced AuROCs of 0.76 (95% CI 0.7–0.82), 0.66 (95% CI 0.6–0.73) and 0.67 (95% CI 0.58–0.77) in the derivation, geographic and temporal external validation cohorts, respectively, while the same AuROCs for the APACHE II score were 0.71 (95% CI 0.64–0.77), 0.69 (95% CI 0.63–0.75) and 0.75 (95% CI 0.67–0.83), respectively.
The AuROCs obtained using the FP scores were consistently superior to those obtained with the SOFA score, with statistical significance across all cohorts (derivation, geographic and temporal external validation) and for both 6-month and 28-day mortality outcomes (Additional file 1: Figs. S3 and S4, respectively).
The AuROCs obtained using the FP scores were also superior to those derived using the APACHE II score for both outcomes, although statistical significance was not consistently achieved across all cohorts (Additional file 1: Figs. S5 and S6, for 6-month and 28-day mortality, respectively).
Faecal peritonitis continues to be associated with a high mortality. Approximately one out of five critically unwell patients with FP in Europe will die in the intensive care unit; this mortality rate increases to over 30% at 6 months.
As we previously reported, and perhaps unexpectedly, the presence of co-morbidities, the time from presumed onset of symptoms to surgery, the underlying cause of FP and the degree of organ support needed in critical care did not appear to influence survival significantly in these postoperative critically ill patients [24, 25]. We are not aware of any prognostic tool designed to assess the risk of long-term mortality specifically in the critically ill postsurgical FP patient. The risk prediction models described in our study aim to improve the SOFA score’s predictive power for mortality at 6 months and 28 days, by adding just a few key variables: age, lowest recorded temperature, highest recorded heart rate and haematocrit on admission to intensive care.
The 6-month mortality model demonstrated AuROCs of 0.81 (0.76–0.86) and 0.73 (0.69–0.78) in the derivation and geographic validation cohorts, respectively, while the 28-day prognostic tool yielded AuROCs of 0.82 (0.77–0.88) and 0.75 (0.69–0.80) for the same cohorts. An area under the ROC curve over 0.8 is generally regarded as indicating a good discriminatory capacity. In the temporal validation cohort, the 6-month and 28-day mortality models yielded AuROCs of 0.76 (0.69–0.83) and 0.79 (0.71–0.87), respectively. The models, therefore, retained reasonable discriminatory capability, and systematically outperformed the other scoring systems tested (SOFA and APACHE II), in these cohorts.
This FP prognostic tool may, therefore, be useful to complement the currently used risk scores and bedside clinical assessment, enhancing the critical care clinician’s capacity to predict long-term outcome, thereby supporting the clinical decision making process in the postoperative phase.
The prognostic models presented here have some strengths, particularly as they have been derived and internally validated using large, homogeneous and recently gathered cohorts of FP patients (hence reflecting current practices and therapies).
Biondo and colleagues have recently evaluated the performance of the MPI as a predictor of immediate postoperative mortality, demonstrating an AuROC of 0.72 (95% CI 0.65–0.79), while, for the more specific left colonic Peritonitis Severity Score (PSS), the AuROC was 0.79 (95% CI 0.72–0.85) for this outcome.
We have previously reported that factors such as age, acute renal dysfunction, hypothermia, lower haematocrit and thrombocytopaenia are associated with an increased risk of death from FP [24, 25], and a number of other studies have evaluated the prognostic relevance of the individual components of our proposed prognostic models.
The SOFA score was developed in a mixed (medical and surgical) ICU population and has been subsequently externally validated in various populations, such as cardiac surgical patients and critically ill burn patients.
While the SOFA score was originally developed for the purpose of describing the evolution of organ dysfunction, rather than for prognostic purposes, we previously found that both admission SOFA and trends in the global SOFA scores were closely associated with mortality. Many studies have reported the use of the SOFA score both in isolation [31,32,33,34,35] and in combination with other variables [36, 37], for the purpose of outcome prediction. In our study, neither the SOFA nor the APACHE II scores, when used in isolation, performed as well as the tools developed here. Furthermore, day one SOFA performed particularly poorly in the temporal validation group, while the APACHE II risk model (which was developed for the purpose of outcome prediction) performed more consistently across the three cohorts, both for the 6-month and the 28-day outcome. This finding suggests that the value of SOFA lies primarily in describing temporal changes in organ function. Nevertheless, a single SOFA score can be successfully integrated with other parameters, to provide a prognostic tool with improved accuracy [36, 37], as we have done for day one SOFA in these analyses. While the confidence intervals for the AuROCs were relatively wide, when the FP models were compared to SOFA, statistically significant differences were found across all cohorts. This was not always the case for comparisons with APACHE II, further highlighting the superior prognostic accuracy of this severity score compared to an isolated, day one SOFA score.
The adverse effect of hypothermia on the outcome of critically ill patients has been described by other authors, although data on the relevance of hypothermia to outcomes remain conflicting [38, 39]. Laupland and co-authors studied 10,962 medical, non-scheduled and scheduled surgical patients admitted to critical care with varying degrees of hypothermia and fever. Hypothermia was, after controlling for confounding factors, significantly and independently associated with mortality in medical patients. Tiruvoipati et al. reported data from 175 elderly ICU patients, identifying lower temperatures and the Simplified Acute Physiology Score II (SAPS II) during the first day of ICU admission as being independently associated with higher hospital mortality [39, 40]. An association between severe hypothermia and the risk of ICU acquired infections has also been reported among medical patients.
Highest recorded heart rate
An increased heart rate is a physiological response to infection and sepsis, and part of the systemic inflammatory response syndrome (SIRS). Sprung and colleagues found that the presence of SIRS predicts infection, severity of illness, organ failure and outcome, with the two most common SIRS criteria met during ICU stay being respiratory rate (82%) and heart rate (80%). Morelli and co-workers randomised a total of 154 septic shock patients to receive a continuous infusion of esmolol (targeting a heart rate of 80–94 bpm) or standard treatment in an open label trial. The patients in the esmolol arm achieved lower heart rates, without an increase of adverse events. Interestingly, an improvement in survival and other secondary outcomes was also reported. Others have found that a high daily mean heart rate was a significant predictor of ICU mortality.
Anaemia in surgical patients undergoing both cardiac and non-cardiac procedures has previously been reported to be associated with worse outcomes [45,46,47,48,49]. Beattie and co-workers performed a retrospective observational study of 7759 non-cardiac surgical patients to establish the relationship between preoperative anaemia and postoperative mortality and found that preoperative anaemia was common and strongly linked with postoperative mortality, even after adjustment for major confounders.
All of the patients with FP included in the analyses reported here underwent laparotomy (the diagnosis of FP was based on the intra-operative finding of faecal soiling of the peritoneal cavity). In addition, a significant proportion of patients (40%) were documented to have cardiovascular co-morbidity, a group in which anaemia has been shown to be associated with worse survival and major adverse cardiovascular events. Although anaemia may be associated with a poor outcome, data on the effects of blood transfusion are conflicting, with most reports not demonstrating benefit from transfusion aimed at achieving a higher haemoglobin threshold [50, 51].
One limitation of the current study is that we were unable to test the performance of other scoring systems such as the colorectal POSSUM, the MPI, PIA II or the PSS in our dataset, as these systems all require some intra-operative or preoperative findings, which were not available to us. On the other hand, the fact that our scores do not require any intra-operative findings could be viewed as an advantage.
A further limitation is the lack of comparison with alternative and more recent versions of severity scores, such as the Simplified Acute Physiology Score (SAPS) 3, the APACHE III or IV or the Mortality Prediction Model (MPM) III. We consider this unlikely to have a significant impact on the validity of our results, as multiple studies have shown that the performance of such tools, even in their more recent versions, is not significantly improved. A pragmatic decision was made to rely on the APACHE II (rather than more recent versions of APACHE) in view of its practicality, the fact that it is the only available non-proprietary version in widespread clinical use [1, 2, 4] and the comparator of choice in multiple other recently published studies [53, 54].
The SOFA score may be a less than ideal comparator, as the SOFA was not originally developed for prognostication. Multiple previous studies have, however, reported using the SOFA score, both in isolation [31,32,33,34,35] and in combination with other parameters [36, 37], for outcome prediction.
Another limitation is that our study was not designed to evaluate the influence on outcome of the timing and adequacy of source control or antibiotic treatment. All patients included in the study reported here received source control via surgical laparotomy prior to recruitment and the overwhelming majority of the patients (91.8%) received antimicrobial therapy deemed to be adequate.
Although the homogeneity of the patient population within our cohorts represents a methodological strength of the study, it may also be considered a potential weakness, as some real-world critically ill patients with FP would not have been included in our analyses.
Mortality differed markedly between the cohorts, even though they were recruited using the same inclusion and exclusion criteria. Whilst it is impossible to identify with certainty which factors explain these differences, multiple potential reasons can be postulated. Firstly, the variation in mortality rates strongly correlates with the occurrence of acute renal failure on day one. Acute renal dysfunction and deteriorating renal function have both been consistently associated with poor outcome in this specific subset of patients [24, 25]. The effects of random variability and the fact that in the UK the centres recruiting to GenOSept and those recruiting to GAinS were not always the same may have also contributed. Finally, improvements in the management of sepsis over the years may have influenced the incidence of renal failure and outcomes.
The present study describes the development of two prognostic models for the risk of 6-month and 28-day mortality in critically ill septic patients with FP, following laparotomy for source control. The tools incorporate five of the major independent risk factors identified in previous studies (SOFA score, age, heart rate, temperature and haematocrit) and combine them to produce a numerical value associated with mortality risk over 6 months or 28 days. Although, in the setting of postoperative FP patients admitted to critical care, the tools appeared to be superior to other existing scoring systems, such as SOFA and APACHE II, these findings should not be considered definitive. External validation of their predictive capability in larger cohorts, such as the NELA (National Emergency Laparotomy Audit) or other databases, is needed before introduction of the scores into clinical practice to inform decision making and the design of clinical trials.
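As an illustration of how a Cox-based tool of this kind can turn five day-one variables into a single number, the following minimal sketch computes a linear prognostic index and converts it into a predicted mortality risk at a fixed time horizon. It is not the published model: the coefficient values (including their signs), the reference patient and the baseline survival figure are hypothetical placeholders chosen only to make the arithmetic concrete, and the function and variable names are likewise assumptions.

```python
import math

# HYPOTHETICAL log hazard ratios: signs and magnitudes are placeholders,
# NOT the coefficients estimated in the study.
HYPOTHETICAL_COEFFS = {
    "age_years": 0.03,
    "sofa_score": 0.15,
    "lowest_temp_c": -0.20,
    "highest_heart_rate_bpm": 0.01,
    "haematocrit_pct": -0.02,
}


def prognostic_index(patient, coeffs=HYPOTHETICAL_COEFFS):
    """Linear predictor of a Cox model: sum of coefficient * variable value."""
    return sum(beta * patient[name] for name, beta in coeffs.items())


def mortality_risk(patient, reference, baseline_survival,
                   coeffs=HYPOTHETICAL_COEFFS):
    """Predicted risk at the horizon for which baseline_survival is given.

    Uses the standard Cox relation S(t | x) = S0(t) ** exp(PI(x) - PI(ref)),
    where S0(t) is the survival probability of the reference patient.
    """
    if not 0.0 < baseline_survival < 1.0:
        raise ValueError("baseline_survival must lie strictly between 0 and 1")
    centred = prognostic_index(patient, coeffs) - prognostic_index(reference, coeffs)
    return 1.0 - baseline_survival ** math.exp(centred)


# Hypothetical reference patient and a second patient with a higher SOFA score.
reference = {
    "age_years": 65,
    "sofa_score": 6,
    "lowest_temp_c": 36.0,
    "highest_heart_rate_bpm": 100,
    "haematocrit_pct": 30,
}
sicker = dict(reference, sofa_score=reference["sofa_score"] + 4)

# Assumed (not reported) survival of the reference patient at the horizon.
ASSUMED_BASELINE_SURVIVAL = 0.80

print(f"reference risk: {mortality_risk(reference, reference, ASSUMED_BASELINE_SURVIVAL):.2f}")
print(f"higher-SOFA risk: {mortality_risk(sicker, reference, ASSUMED_BASELINE_SURVIVAL):.2f}")
```

Centring the index on a reference patient means the baseline survival can be read directly as that patient's survival probability; any real application would need the fitted coefficients and baseline survival from the derivation cohort.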
Acute Physiology and Chronic Health Evaluation
acute renal failure
beats per minute
continuous positive airways pressure
electronic case report form
Glasgow Coma Scale
Intensive Care Unit
mean arterial pressure
multiple organ system failure
number of non-missing observations
arterial partial pressure of oxygen
arterial partial pressure of carbon dioxide
ratio of partial pressure arterial oxygen and fraction of inspired oxygen
renal replacement therapy
systolic blood pressure
Sequential Organ Failure Assessment
white cell count
Vincent J-L, Moreno R. Clinical review: scoring systems in the critically ill. Crit Care [Internet]. 2010 [cited 2015 Mar 16];14:207. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2887099&tool=pmcentrez&rendertype=abstract.
Eachempati SR. Critical care scoring systems [Internet]. Merck Man. 2014 [cited 2016 Aug 1]. Available from: http://www.merckmanuals.com/professional/critical-care-medicine/approach-to-the-critically-ill-patient/critical-care-scoring-systems#.
Breslow MJ, Badawi O. Severity scoring in the critically ill: Part 1—interpretation and accuracy of outcome prediction scoring systems. Chest. 2012;141:245–52.
Bouch DC, Thompson JP. Severity scoring systems in the critically ill. Contin Educ Anaesth Crit Care Pain [Internet]. Oxford University Press; 2008 [cited 2016 Aug 1];8:181–5. Available from: http://bjarev.oxfordjournals.org/lookup/doi/10.1093/bjaceaccp/mkn033.
Iwashyna TJ, Burke JF, Sussman JB, Prescott HC, Hayward RA, Angus DC. Implications of heterogeneity of treatment effect for reporting and analysis of randomized trials in critical care. Am J Respir Crit Care Med [Internet]. 2015 [cited 2016 Sep 12];192:1045–51. Available from: http://www.ncbi.nlm.nih.gov/pubmed/26177009.
Calandra T, Cohen J. The international sepsis forum consensus conference on definitions of infection in the intensive care unit. Crit Care Med. [Internet]. 2005 [cited 2014 Apr 28];33:1538–48. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16003060.
Wacha H, Linder M, Feldman U, Wesch G, Gundlach E, Steifensand R. Mannheim peritonitis index—prediction of risk of death from peritonitis: construction of a statistical and validation of an empirically based index. Theor Surg. 1987;1:169–77.
Wittmann DH, Teichmann W, Muller M. 176. Entwicklung und Validierung des Peritonitis-Index-Altona (PIA II). Langenbecks Arch Chir Chir [Internet]. 1987 [cited 2015 May 26];372:834–5. Available from: http://link.springer.com/10.1007/BF01297960.
Biondo S, Ramos E, Deiros M, Ragué JM, De Oca J, Moreno P, et al. Prognostic factors for mortality in left colonic peritonitis: a new scoring system. J Am Coll Surg [Internet]. 2000 [cited 2015 Dec 13];191:635–42. Available from: http://www.ncbi.nlm.nih.gov/pubmed/11129812.
Copeland GP, Jones D, Walters M. POSSUM: a scoring system for surgical audit. Br J Surg [Internet]. 1991 [cited 2015 Dec 23];78:355–60. Available from: http://www.ncbi.nlm.nih.gov/pubmed/2021856.
Tekkis PP, Prytherch DR, Kocher HM, Senapati A, Poloniecki JD, Stamatakis JD, et al. Development of a dedicated risk-adjustment scoring system for colorectal surgery (colorectal POSSUM). Br J Surg [Internet]. 2004 [cited 2015 Dec 23];91:1174–82. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15449270.
European Society of Intensive Care Medicine—GenOSept study [Internet]. Available from: http://www.esicm.org/research/other-studies/genosept.
UK Critical Care Genomics group—GAinS study [Internet]. Available from: http://www.ukccg-gains.org/index.htm.
Bone RC, Balk RA, Cerra FB, Dellinger RP, Fein AM, Knaus WA, et al. Definitions for sepsis and organ failure and guidelines for the use of innovative therapies in sepsis. The ACCP/SCCM Consensus Conference Committee. American College of Chest Physicians/Society of Critical Care Medicine. Chest [Internet]. 1992 [cited 2014 Apr 28];101:1644–55. Available from: http://www.ncbi.nlm.nih.gov/pubmed/1303622.
Chen CH, George SL. The bootstrap and identification of prognostic factors via Cox’s proportional hazards regression model. Stat Med [Internet]. 1985 [cited 2015 May 2];4:39–46. Available from: http://www.ncbi.nlm.nih.gov/pubmed/3857702.
Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med [Internet]. 1996 [cited 2015 Jul 26];15:361–87. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8668867.
Hess KR. Graphical methods for assessing violations of the proportional hazards assumption in Cox regression. Stat Med [Internet]. 1995 [cited 2016 Aug 5];14:1707–23. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7481205.
Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model [Internet]. Berlin: Springer; 2000 [cited 2015 May 26]. Available from: https://books.google.com.my/books/about/Modeling_Survival_Data_Extending_the_Cox.html?id=9kY4XRuUMUsC&pgis=1.
Slinker BK, Glantz SA. Multiple regression for physiological data analysis: the problem of multicollinearity. Am J Physiol [Internet]. 1985 [cited 2016 Aug 5];249:R1–12. Available from: http://www.ncbi.nlm.nih.gov/pubmed/4014489.
Metz CE. Basic principles of ROC analysis. Semin Nucl Med [Internet]. 1978 [cited 2014 Apr 28];8:283–98. Available from: http://www.ncbi.nlm.nih.gov/pubmed/112681.
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology [Internet]. 1982 [cited 2014 Dec 5];143:29–36. Available from: http://www.ncbi.nlm.nih.gov/pubmed/7063747.
Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine. Clin Chem [Internet]. 1993 [cited 2014 Apr 28];39:561–77. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8472349.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics [Internet]. 1988 [cited 2016 Jul 18];44:837–45. Available from: http://www.ncbi.nlm.nih.gov/pubmed/3203132.
Tridente A, Clarke GM, Walden A, McKechnie S, Hutton P, Mills GH, et al. Patients with faecal peritonitis admitted to European intensive care units: an epidemiological survey of the GenOSept cohort. Intensive Care Med. 2014;40:202–10.
Tridente A, Clarke GM, Walden A, Gordon AC, Hutton P, Chiche J-D, et al. Association between trends in clinical variables and outcome in intensive care patients with faecal peritonitis: analysis of the GenOSept cohort. Crit Care [Internet]. 2015 [cited 2015 Nov 2];19:210. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4432819&tool=pmcentrez&rendertype=abstract.
Tape TG. University of Nebraska Medical Center: interpreting diagnostic tests—the area under an ROC curve [Internet]. Univ. Nebraska Med. Cent. webpage. 2016. Available from: http://gim.unmc.edu/dxtests/roc3.htm.
Biondo S, Ramos E, Fraccalvieri D, Kreisler E, Ragué JM, Jaurrieta E. Comparative study of left colonic Peritonitis Severity Score and Mannheim Peritonitis Index. Br J Surg [Internet]. 2006 [cited 2015 Dec 13];93:616–22. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16607684.
Vincent JL, Moreno R, Takala J, Willatts S, De Mendonça A, Bruining H, et al. The SOFA (Sepsis-related Organ Failure Assessment) score to describe organ dysfunction/failure. On behalf of the Working Group on Sepsis-Related Problems of the European Society of Intensive Care Medicine. Intensive Care Med [Internet]. 1996 [cited 2014 Apr 28];22:707–10. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8844239.
Ceriani R, Mazzoni M, Bortone F, Gandini S, Solinas C, Susini G, et al. Application of the sequential organ failure assessment score to cardiac surgical patients. Chest [Internet]. 2003 [cited 2015 May 18];123:1229–39. Available from: http://www.ncbi.nlm.nih.gov/pubmed/12684316.
Lorente JA, Vallejo A, Galeiras R, Tómicic V, Zamora J, Cerdá E, et al. Organ dysfunction as estimated by the sequential organ failure assessment score is related to outcome in critically ill burn patients. Shock [Internet]. 2009 [cited 2015 Apr 4];31:125–31. Available from: http://www.ncbi.nlm.nih.gov/pubmed/18650779.
Hynninen M, Wennervirta J, Leppäniemi A, Pettilä V. Organ dysfunction and long term outcome in secondary peritonitis. Langenbecks Arch Surg [Internet]. 2008 [cited 2014 Jun 11];393:81–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17372753.
van Ruler O, Kiewiet JJS, Boer KR, Lamme B, Gouma DJ, Boermeester MA, et al. Failure of available scoring systems to predict ongoing infection in patients with abdominal sepsis after their initial emergency laparotomy. BMC Surg [Internet]. 2011 [cited 2014 Apr 28];11:38. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3268736&tool=pmcentrez&rendertype=abstract.
van Ruler O, Lamme B, Gouma DJ, Reitsma JB, Boermeester MA. Variables associated with positive findings at relaparotomy in patients with secondary peritonitis. Crit Care Med [Internet]. 2007 [cited 2014 Apr 28];35:468–76. Available from: http://www.ncbi.nlm.nih.gov/pubmed/17205025.
Sumi T, Katsumata K, Katayanagi S, Nakamura Y, Nomura T, Takano K, et al. Examination of prognostic factors in patients undergoing surgery for colorectal perforation: a case controlled study. Int J Surg [Internet]. 2014 [cited 2014 Jun 11];12:566–71. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24709571.
Jones AE, Trzeciak S, Kline JA. The Sequential Organ Failure Assessment score for predicting outcome in patients with severe sepsis and evidence of hypoperfusion at the time of emergency department presentation. Crit Care Med [Internet]. 2009 [cited 2015 Mar 2];37:1649–54. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2703722&tool=pmcentrez&rendertype=abstract.
Zügel NP, Kox M, Lichtwark-Aschoff M, Gippner-Steppert C, Jochum M. Predictive relevance of clinical scores and inflammatory parameters in secondary peritonitis. Bull Soc Sci Med Grand Duche Luxemb [Internet]. 2011 [cited 2014 Jun 11];41–71. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21634221.
Matsumura Y, Nakada T, Abe R, Oshima T, Oda S. Serum procalcitonin level and SOFA score at discharge from the intensive care unit predict post-intensive care unit mortality: a prospective study. PLoS One [Internet]. 2014 [cited 2015 Mar 2];9:e114007. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=4252062&tool=pmcentrez&rendertype=abstract.
Laupland KB, Zahar J-R, Adrie C, Minet C, Vésin A, Goldgran-Toledano D, et al. Severe hypothermia increases the risk for intensive care unit-acquired infection. Clin Infect Dis [Internet]. 2012 [cited 2014 Apr 28];54:1064–70. Available from: http://www.ncbi.nlm.nih.gov/pubmed/22291110.
Tiruvoipati R, Ong K, Gangopadhyay H, Arora S, Carney I, Botha J. Hypothermia predicts mortality in critically ill elderly patients with sepsis. BMC Geriatr [Internet]. 2010 [cited 2014 Apr 28];10:70. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2955035&tool=pmcentrez&rendertype=abstract.
Le Gall JR, Lemeshow S, Saulnier F. A new Simplified Acute Physiology Score (SAPS II) based on a European/North American multicenter study. JAMA [Internet]. 1994 [cited 2015 May 24];270:2957–63. Available from: http://www.ncbi.nlm.nih.gov/pubmed/8254858.
Laupland KB, Zahar J-R, Adrie C, Schwebel C, Goldgran-Toledano D, Azoulay E, et al. Determinants of temperature abnormalities and influence on outcome of critical illness. Crit Care Med [Internet]. 2012 [cited 2014 Apr 28];40:145–51. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21926588.
Sprung CL, Sakr Y, Vincent J-L, Le Gall J-R, Reinhart K, Ranieri VM, et al. An evaluation of systemic inflammatory response syndrome signs in the Sepsis Occurrence in Acutely Ill Patients (SOAP) study. Intensive Care Med [Internet]. 2006 [cited 2016 Jan 24];32:421–7. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16479382.
Morelli A, Ertmer C, Westphal M, Rehberg S, Kampmeier T, Ligges S, et al. Effect of heart rate control with esmolol on hemodynamic and clinical outcomes in patients with septic shock: a randomized clinical trial. JAMA [Internet]. 2013 [cited 2016 Sep 11];310:1683–91. Available from: http://www.ncbi.nlm.nih.gov/pubmed/24108526.
Park S, Kim D-G, Suh GY, Park WJ, Jang SH, Hwang Y Il, et al. Significance of new-onset prolonged sinus tachycardia in a medical intensive care unit: a prospective observational study. J Crit Care [Internet]. 2011 [cited 2016 Jan 24];26:534.e1–8. Available from: http://www.ncbi.nlm.nih.gov/pubmed/21376521.
Shander A, Knight K, Thurer R, Adamson J, Spence R. Prevalence and outcomes of anemia in surgery: a systematic review of the literature. Am J Med [Internet]. 2004 [cited 2014 Apr 28];116 Suppl:58S–69S. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15050887.
Qiu M, Yuan Z, Luo H, Ruan D, Wang Z, Wang F, et al. Impact of pretreatment hematologic profile on survival of colorectal cancer patients. Tumour Biol [Internet]. 2010 [cited 2014 Apr 28];31:255–60. Available from: http://www.ncbi.nlm.nih.gov/pubmed/20336401.
Vignot S, Spano J-P. [Anemia and colorectal cancer]. Bull Cancer [Internet]. 2005 [cited 2014 Apr 28];92:432–8. Available from: http://www.ncbi.nlm.nih.gov/pubmed/15932806.
Halm EA, Wang JJ, Boockvar K, Penrod J, Silberzweig SB, Magaziner J, et al. The effect of perioperative anemia on clinical and functional outcomes in patients with hip fracture. J Orthop Trauma [Internet]. 2004 [cited 2014 Apr 28];18:369–74. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1454739&tool=pmcentrez&rendertype=abstract.
Beattie WS, Karkouti K, Wijeysundera DN, Tait G. Risk associated with preoperative anemia in noncardiac surgery: a single-center cohort study. Anesthesiology [Internet]. 2009 [cited 2014 Apr 28];110:574–81. Available from: http://www.ncbi.nlm.nih.gov/pubmed/19212255.
Hébert PC, Wells G, Blajchman MA, Marshall J, Martin C, Pagliarello G, et al. A multicenter, randomized, controlled clinical trial of transfusion requirements in critical care. N Engl J Med [Internet]. 1999 [cited 2016 Sep 11];340:409–17. Available from: http://www.nejm.org/doi/abs/10.1056/NEJM199902113400601.
Holst LB, Petersen MW, Haase N, Perner A, Wetterslev J. Restrictive versus liberal transfusion strategy for red blood cell transfusion: systematic review of randomised trials with meta-analysis and trial sequential analysis. BMJ [Internet]. Br Med J Publ Group; 2015 [cited 2016 Sep 11];350:h1354. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25805204.
Lee H, Shon Y-J, Kim H, Paik H, Park H-P. Validation of the APACHE IV model and its comparison with the APACHE II, SAPS 3, and Korean SAPS 3 models for the prediction of hospital mortality in a Korean surgical intensive care unit. Korean J Anesthesiol [Internet]. 2014 [cited 2016 Jul 18];67:115–22. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25237448.
Donnino MW, Salciccioli JD, Dejam A, Giberson T, Giberson B, Cristia C, et al. APACHE II scoring to predict outcome in post-cardiac arrest. Resuscitation [Internet]. 2013 [cited 2016 Aug 1];84:651–6. Available from: http://www.ncbi.nlm.nih.gov/pubmed/23178739.
Naeini AE, Abbasi S, Haghighipour S, Shirani K. Comparing the APACHE II score and IBM-10 score for predicting mortality in patients with ventilator-associated pneumonia. Adv Biomed Res [Internet]. Medknow Publications; 2015 [cited 2016 Aug 1];4:47. Available from: http://www.ncbi.nlm.nih.gov/pubmed/25789273.
Odor PM, Grocott MPW. From NELA to EPOCH and beyond: enhancing the evidence base for emergency laparotomy. Perioper. Med. (London, England) [Internet]. BioMed Central; 2016 [cited 2016 Nov 27];5:23. Available from: http://www.ncbi.nlm.nih.gov/pubmed/27594991.
AT conducted statistical analyses on the database, appraised the background literature, prepared the first draft of the manuscript and coordinated subsequent revisions; GMC prepared and quality-assured the database for analysis and contributed to revising the manuscript; AW contributed to drafting and reviewing the manuscript; ACG contributed to reviewing the manuscript; PH prepared and quality-assured the database for analysis and contributed to revising the manuscript; J-DC contributed to revising the manuscript; PAHH contributed to revising the manuscript; GHM contributed to revising the manuscript; JB conceived the study and contributed to drafting and reviewing the manuscript; FS conceived the study and contributed to reviewing the manuscript; CG conceived the study, contributed to quality assurance of the database and contributed to drafting and reviewing the manuscript; CH conceived the study and contributed to drafting and reviewing the manuscript; all authors read and approved the final manuscript.
Mr. Graham Paul Copeland, of the Department of Surgery, Warrington Hospital, Warrington, UK, provided us with very valuable insights into the development and evaluation of a scoring system.
The authors of this manuscript wish to thank all GenOSept and GAinS Investigators, as listed in Additional file 1.
The authors declare that they have no competing interests.
Availability of supporting data and materials
Reasonable requests to access the datasets analysed will be adjudicated by the GenOSept and GAinS management committees.
Consent for publication
Ethical approval and consent to participate
Ethics approval was obtained nationally and/or locally. Written, informed consent for inclusion in the GenOSept or GAinS studies was obtained from all patients or a legal representative. Patients were recruited to GenOSept, GAinS or both studies. The studies were performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki and its later amendments. Patients included in the GenOSept FP cohort were recruited from 102 centres across 16 European countries, and those in the GAinS FP cohort were recruited from 51 UK centres between September 2005 and March 2015 (for ethical approval bodies, individual recruitment centres, chief and principal investigators, national coordinators and contributors see relevant lists in Additional file 1).
GenOSept (Genetics Of Sepsis and Septic Shock in Europe) is a pan-European part-FP6-funded study conceived by the European Critical Care Research Network of the European Society for Intensive Care Medicine to investigate the potential impact of genetic variation on the host response and outcomes in sepsis (https://www.genosept.eu/).
CIBERES is a Spanish research network which was used to identify investigators and contributed to funding through supporting logistics. A grant in partial support of FP6 projects was provided by the Spanish minister of Health.