The SORT is a risk prediction tool that was developed and validated in a UK population of patients undergoing inpatient non-cardiac, non-neurological surgery. It has several potential advantages over existing risk prediction tools. It is parsimonious (it uses few variables). All of the data it requires should be available preoperatively (unlike, for example, the P-POSSUM predictor). And because no blood or radiological results are required, family doctors and preoperative assessment teams should be able to calculate it before the results of preoperative investigations are available.
However, the strengths and limitations of risk prediction tools in general should be kept in mind when using the SORT or any other surgical risk prediction tool.
First, one should consider the accuracy of risk prediction tools. Accuracy is usually described in terms of discrimination and calibration. Discrimination (usually summarised using the Area Under the Receiver Operating Characteristic curve, or AUROC) is a measure of the ability of the tool to "discriminate" between patients who do and do not develop the outcome of interest (e.g. death within 30 days of surgery). It summarises the sensitivity and specificity of a predictive tool across all possible risk thresholds. An AUROC of 0.5 indicates that the discriminatory accuracy of the tool is no better than chance (e.g. a coin flip). An AUROC above 0.5 but below 0.7 is generally considered to indicate poor accuracy, 0.7 to 0.89 moderate accuracy, and 0.9 or above good accuracy [1]. Calibration is a measure of how well the tool's predictions "fit" the sample data on which it is being tested: it compares the observed number of events with the number expected by the model (the O:E ratio), typically across deciles of predicted risk in the sample. If a model is well calibrated, a test of observed against expected events (using the Hosmer-Lemeshow chi-squared statistic, for example) will return a non-statistically significant result.
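To make the two measures concrete, the following is a minimal sketch in Python, not the method used to validate the SORT. The function names `auroc` and `hosmer_lemeshow` and the five-patient dataset are illustrative assumptions. The AUROC is computed directly from its pairwise definition, and the Hosmer-Lemeshow statistic is computed across equal-sized groups ordered by predicted risk.

```python
from itertools import product


def auroc(risks, outcomes):
    """Share of (event, non-event) patient pairs in which the patient who
    had the outcome was given the higher predicted risk; ties count half."""
    pos = [r for r, y in zip(risks, outcomes) if y == 1]
    neg = [r for r, y in zip(risks, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need patients both with and without the outcome")
    score = 0.0
    for p, n in product(pos, neg):
        if p > n:
            score += 1.0
        elif p == n:
            score += 0.5
    return score / (len(pos) * len(neg))


def hosmer_lemeshow(risks, outcomes, groups=10):
    """Hosmer-Lemeshow chi-squared statistic over risk-ordered groups."""
    ranked = sorted(zip(risks, outcomes), key=lambda pair: pair[0])
    n = len(ranked)
    statistic = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        size = len(chunk)
        observed = sum(y for _, y in chunk)   # events that occurred
        expected = sum(r for r, _ in chunk)   # events the model predicted
        if size == 0 or expected <= 0 or expected >= size:
            continue  # skip empty or degenerate groups
        # Contributions from events and from non-events in this group.
        statistic += (observed - expected) ** 2 / expected
        statistic += (observed - expected) ** 2 / (size - expected)
    return statistic


# Synthetic toy data (assumed for illustration only, not patient data).
risks = [0.1, 0.2, 0.3, 0.4, 0.8]
outcomes = [0, 0, 1, 0, 1]
print(round(auroc(risks, outcomes), 3))  # 5 of 6 pairs ranked correctly: 0.833
```

The Hosmer-Lemeshow statistic is conventionally compared with a chi-squared distribution with (number of groups minus 2) degrees of freedom, and a large value suggests poor calibration. In practice a real validation would use a much larger cohort than this toy example.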
Second, clinicians and patients should consider that it may never be possible to define all the factors which may affect a patient's outcome from surgery, no matter how good or complex the risk prediction model. There are a number of reasons for this. For example, the healthcare profession is likely to be unaware of, or unable to measure in large enough validation cohorts, all the potential physiological risk factors which may influence patient outcome. One way of trying to circumvent this problem is to use surrogate or alternative markers of general patient health and well-being in risk prediction systems: for example, the American Society of Anesthesiologists' Physical Status score or ASA-PS (a simple score which reflects disease-related functional capacity), or other systems which estimate functional capacity, such as the Duke Activity Status Index or cardiopulmonary exercise testing [2]. However, even the best risk prediction models usually have an AUROC of around 0.9 or even 0.95 when tested in validation cohorts. This implies that in roughly 5 to 10% of comparisons between a patient who goes on to develop the outcome and one who does not, the model assigns the higher predicted risk to the wrong patient.
This may be partly because the type of surgery, and technical issues relating to it, may have a significant impact on patient outcome, and these sorts of risk factors may not be fully captured in risk prediction modelling. For example, abdominal surgery of any type is likely to be more technically challenging (and therefore may take longer, and carry a higher risk of technical complications) in a patient who has had multiple previous abdominal procedures than in someone who has never previously had surgery.
Furthermore, physiological and surgical risk factors are not the only contributors to patient outcome. The quality of care delivered to patients also affects outcome: one only has to look at studies which highlight variation in standards of care (and therefore in patient outcomes) between and within institutions to appreciate this [3-7]. Put simply, one would not be surprised to find that patients undergoing surgery in resource-limited settings, without access to the most modern or technologically advanced facilities, have worse outcomes than patients of similar fitness undergoing surgery in Europe or the US. Thus, if the same risk prediction model is tested in these two environments, its calibration (and quite possibly its discrimination) is likely to be better in one than in the other.
The traditional model for evaluating quality of healthcare was developed by Donabedian in the 1960s [8]. He proposed that structure (the healthcare system) and process (the method of healthcare delivery, or "the treatment") would both influence patient outcome, and that clinicians, managers and policy makers should seek to establish which aspects of structure and process were key contributors to outcome in different settings. The recent National Emergency Laparotomy Audit (NELA) Organisational Report [9] highlighted the variation between institutions in structures which are considered to be markers of high-quality care for patients undergoing emergency general surgery in England and Wales: for example, surgical and anaesthetic staffing, provision of radiology and pathology services, and having pathways and protocols in place to guide the management of high-risk surgical patients. Such variations have been shown to exist in many other healthcare settings, and reducing unplanned or unwanted variation is a key objective of quality improvement initiatives.
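The point made above about testing one model in two different settings can be illustrated with the observed-to-expected (O:E) ratio. The sketch below is an assumption-laden toy example: the function name `observed_to_expected`, the predicted risks and both sets of outcomes are synthetic and chosen only to show the arithmetic. The same predictions are well calibrated in one hypothetical setting and underestimate mortality by half in the other.

```python
def observed_to_expected(risks, outcomes):
    """Ratio of observed events to the number of events the model expected."""
    expected = sum(risks)
    if expected <= 0:
        raise ValueError("expected number of events must be positive")
    return sum(outcomes) / expected


# Synthetic predicted risks for ten patients (they sum to 2.0 expected deaths).
predicted = [0.05, 0.05, 0.1, 0.1, 0.1, 0.2, 0.2, 0.3, 0.4, 0.5]

# Hypothetical outcomes for patients of similar fitness in two settings.
deaths_setting_a = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # 2 deaths observed
deaths_setting_b = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]  # 4 deaths observed

print(round(observed_to_expected(predicted, deaths_setting_a), 2))  # 1.0
print(round(observed_to_expected(predicted, deaths_setting_b), 2))  # 2.0
```

An O:E ratio close to 1 indicates that the model's predictions match what was observed; a ratio of 2 means that twice as many patients died as the model predicted, so the model would need recalibrating before use in that setting.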
Both NCEPOD and SOuRCe, along with many other research groups and organisations, are committed to investigating and addressing unwarranted variation which may adversely affect patient outcome, and to finding solutions to improve it. In particular, developing strategies for the early identification and optimal management of postoperative complications is an important goal. Significant or prolonged complications, which may be associated with reduced long-term survival [10, 11, 12] and health-related quality of life, occur in over 10 times as many patients as short-term postoperative mortality [12]. When one considers the burden of surgery in the UK (over 3 million operations per annum) and worldwide (over 230 million operations per annum), it is easy to see the adverse impact that this level of morbidity may have on individual patients, their families and society in general.
Despite all of the above caveats, attempting to predict the risk of adverse outcomes for individual patients is an important part of delivering high-quality perioperative healthcare. Understanding patient risk before surgery enables clinicians and patients to work together to develop strategies to mitigate that risk: for example, through preoperative optimisation and through modifying treatment pathways to meet the individual patient's needs. It is also an important part of the preoperative consent process. The 2011 NCEPOD study, which subsequently led to the development of the SORT, found during peer review of case notes that only 37/496 (7.5%) patients who were viewed as high-risk by their anaesthetist had an estimate of the risk of death after surgery documented in their hospital notes [13]. Helping patients to fully understand the risks of the planned procedure is an important goal, and risk prediction tools such as the SORT may assist in this endeavour.
Director, UCL/UCLH Surgical Outcomes Research Centre