Predicting long-term mortality of patients with postoperative acute kidney injury following noncardiac general anesthesia surgery using machine learning
Abstract
Background
This study addresses the gap in knowledge regarding the long-term mortality implications of postoperative acute kidney injury (PO-AKI) by using machine learning techniques to predict outcomes more accurately than traditional statistical models.
Methods
A retrospective cohort study was conducted using data from seven institutions between March 2009 and December 2019. Machine learning models were developed to predict all-cause mortality of PO-AKI patients using 23 preoperative variables and one postoperative variable. Model performance was compared to a traditional statistical approach with Cox regression analysis. The concordance index was used as a predictive performance metric to compare prediction capabilities among different models.
Results
Among 199,403 patients, 2,105 developed PO-AKI. During a median follow-up of 144 months (interquartile range, 99.61–170.71 months), 472 in-hospital deaths occurred. Subjects with PO-AKI had a significantly lower survival rate than those without PO-AKI (p < 0.001). For predicting mortality, the XGBoost with an accelerated failure time model had the highest concordance index (0.7521), followed by random survival forest (0.7371), multivariable Cox regression model (0.7318), survival support vector machine (0.7304), and gradient boosting (0.7277).
Conclusion
XGBoost with an accelerated failure time model was developed in this study to predict long-term mortality associated with PO-AKI. Its performance was superior to conventional models. The application of machine learning techniques may offer a promising approach to predict mortality following PO-AKI more accurately, providing a basis for developing targeted interventions and clinical guidelines to improve patient outcomes.
Introduction
Acute kidney injury (AKI) following surgery under general anesthesia, known as postoperative AKI (PO-AKI), is a common complication, accounting for 30% to 40% of all hospital-acquired AKIs [1,2]. PO-AKI is not just a temporary condition. It can increase morbidity and in-hospital mortality by 3- to 9-fold, both immediately and in the long term [3–6]. Even patients whose kidney function has completely recovered after PO-AKI still have a higher risk of death than those without AKI [7], which highlights the profound and lasting consequences of PO-AKI.
Most studies on PO-AKI have focused predominantly on short-term outcomes, leaving a significant gap in our understanding of long-term mortality implications of PO-AKI. This oversight in research highlights an urgent need for comprehensive studies that evaluate long-term effects of PO-AKI, providing a holistic view of patient outcomes and facilitating the development of effective strategies to improve long-term survival [8]. In Korea, epidemiological studies exploring the relationship between PO-AKI and increased mortality rate are limited. Related international studies are often limited to single-institution studies or those with a large number of patients having no long-term outcomes [3,4,6–8]. This limitation exposes a critical gap in our knowledge, as data from a single institution may not accurately represent the broader patient population. Therefore, external validation is essential to ensure the applicability of research findings across various healthcare settings. Furthermore, integration of advanced analytical methods, such as machine learning-based survival analysis, has been notably absent in PO-AKI mortality research. Current studies typically rely on traditional statistical approaches, which might not adequately capture complex interactions of variables influencing mortality of PO-AKI. To address this gap, this study aimed to develop more accurate predictive models for mortality following PO-AKI by applying machine learning techniques utilizing a more extensive and varied dataset from seven university hospitals. By focusing on noncardiac general anesthesia surgeries, we aim to provide a clearer understanding of the long-term mortality risks specific to this patient population.
Methods
Study design and cohort
This retrospective cohort study included adult patients aged 18 years and over who underwent noncardiac general anesthesia surgery at seven institutions between March 2009 and December 2019. Exclusion criteria were: patients under the age of 18 years at the time of surgery, those with non-anesthetic surgeries, those with surgeries lasting less than 1 hour or without a specified duration, those who underwent cardiac surgery, nephrectomy, kidney transplant surgery, or preoperative dialysis, those without a preoperative or postoperative serum creatinine (sCr) value, those with a preoperative estimated glomerular filtration rate (eGFR) below 15 mL/min/1.73 m², and those with an increase in sCr level exceeding 0.3 mg/dL or 1.5 times within 2 weeks prior to surgery (Fig. 1). If subjects underwent multiple surgeries during the study period, only the first eligible surgery was included in the analysis. A total of 199,403 patients were included in the analysis. The PO-AKI group consisted of patients who developed PO-AKI, and the No-PO-AKI group consisted of those who did not develop PO-AKI.
This study was approved by the Institutional Review Board of College of Medicine, The Catholic University of Korea (No. XC22WIDI0022). The requirement of informed consent was waived due to the retrospective nature of this study.
Definition of postoperative acute kidney injury
PO-AKI was defined as the presence of either an increase in sCr by 0.3 mg/dL or more (26.5 μmol/L or more) within 48 hours, or an increase in sCr to 1.5 times or more of the baseline within the prior 7 days, according to the Kidney Disease: Improving Global Outcomes (KDIGO) guideline [9]. PO-AKI was classified according to the severity of AKI. Stage 1 AKI was identified as an elevation in sCr of 0.3 mg/dL or more, or an increase of 1.5 to 1.9 times from the baseline sCr level. Stage 2 was characterized by a 2.0 to 2.9 times rise in sCr level from baseline, and stage 3 was defined as either a tripling or more of the baseline sCr level or the necessity to initiate dialysis [10]. The urine output criteria of KDIGO for the diagnosis of AKI were not used because previous studies suggested that the threshold of oliguria for PO-AKI might be different from that of other AKIs [11,12] and because urine output data were not available in our database.
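The creatinine-based staging described above can be sketched as a small function. This is a minimal illustration under stated assumptions, not the study's actual code: the function name and arguments are hypothetical, only the sCr and dialysis criteria listed in the text are implemented, and the caller is assumed to have already restricted measurements to the 48-hour and 7-day windows.

```python
def kdigo_scr_stage(baseline_scr, peak_scr, rise_within_48h=0.0,
                    dialysis_started=False):
    """Return 0 (no AKI) or stage 1-3 from serum creatinine values in mg/dL.

    Hypothetical helper: implements only the criteria named in the text
    (urine output criteria are not used).
    """
    if baseline_scr <= 0:
        raise ValueError("baseline_scr must be positive")
    ratio = peak_scr / baseline_scr
    # Stage 3: tripling or more of baseline sCr, or need to initiate dialysis.
    if dialysis_started or ratio >= 3.0:
        return 3
    # Stage 2: 2.0 to 2.9 times baseline.
    if ratio >= 2.0:
        return 2
    # Stage 1: 1.5 to 1.9 times baseline, or an absolute rise of >= 0.3 mg/dL
    # within 48 hours.
    if ratio >= 1.5 or rise_within_48h >= 0.3:
        return 1
    return 0
```

For example, a patient with a baseline of 1.0 mg/dL and a peak of 2.5 mg/dL would be assigned stage 2 by this sketch.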
Data collection and cleansing
Data were extracted from the Clinical Data Warehouse of the Catholic Medical Center neuroUbiquitous (CMC nU) system, which is separately generated and managed redundantly from the electronic medical record systems of eight affiliated hospitals of College of Medicine, The Catholic University of Korea. Data on demographic characteristics, underlying comorbidities, preoperative and postoperative laboratory data, preoperative medication, duration of operation, and department of surgery were collected. Body mass index (BMI) was calculated as a patient’s weight in kilograms divided by height in meters squared (kg/m²). The underlying diseases of subjects were determined using the International Classification of Disease, 10th Revision (ICD-10) codes of principal and secondary diagnoses. The presence of each comorbidity was determined based on the first three characters of the diagnosis code in the diagnostic table in the database. Comorbid diseases and ICD-10 codes are shown in Supplementary Table 1 (available online).
Preoperative laboratory values measured closest to the day before surgery were collected, including blood levels of albumin, alanine aminotransferase, aspartate aminotransferase, blood urea nitrogen (BUN), calcium, chloride, creatinine, C-reactive protein, glucose, hemoglobin, potassium, lactate dehydrogenase (LDH), sodium, total protein, white blood cell count, sCr, and eGFR. Postoperative values for sCr and hemoglobin on the first day after surgery were also collected.
Preoperative medications included renin-angiotensin-aldosterone system inhibitors (angiotensin-converting enzyme inhibitor [ACEi] or angiotensin II type 1 receptor blocker [ARB]) and nonsteroidal anti-inflammatory drugs (NSAIDs). Specifically, variables for ACEi, ARB, and NSAIDs were determined based on their use within 2 weeks before the surgery. Preoperative eGFR was calculated using the CKD-EPI (Chronic Kidney Disease Epidemiology Collaboration) equation [13]. The extracted data were processed using R version 3.6.3 (R Foundation for Statistical Computing) and Python version 3.8.5 (Python Software Foundation).
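As an illustration of the eGFR calculation, the following sketch implements the 2009 CKD-EPI creatinine equation. That reference [13] corresponds to the 2009 version is an assumption made here, and the function name and arguments are hypothetical.

```python
def ckd_epi_2009_egfr(scr_mg_dl, age_years, female, black=False):
    """Estimate GFR (mL/min/1.73 m^2) with the 2009 CKD-EPI creatinine equation.

    Assumption: the 2009 equation is the one cited in the text; the race
    coefficient is kept for completeness and defaults to False.
    """
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Example: a 50-year-old man with sCr 0.9 mg/dL; the result can be checked
# against the exclusion threshold of 15 mL/min/1.73 m^2 used in the study.
example_egfr = ckd_epi_2009_egfr(0.9, 50, female=False)
is_excluded = example_egfr < 15.0
```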
For deriving a minimal set of variables that could be clinically applied, variables with a p-value of 0.05 or higher and those with a missing rate of 30% or more were excluded from the list of utilized variables. As a result, a total of 39 variables, including both continuous and categorical variables, were utilized in the analysis (Supplementary Fig. 1, available online). Missing continuous variables were filled with median values. In cases where a patient underwent multiple surgeries during a single hospital stay, the initial surgery was selected for inclusion in our analysis [14].
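The variable screening and imputation rule described above can be sketched as follows. This is a simplified illustration using only the standard library; the function name, the data layout (a dictionary of columns with None for missing values), and the choice to compute medians over all rows are assumptions, since the text does not specify these details.

```python
from statistics import median


def select_and_impute(columns, p_values, max_missing=0.30, alpha=0.05):
    """Drop variables with missing rate >= max_missing or p-value >= alpha,
    then fill remaining missing values with the column median.

    columns:  dict mapping variable name -> list of floats (None = missing)
    p_values: dict mapping variable name -> p-value from a univariable test
    """
    kept = {}
    for name, values in columns.items():
        missing_rate = sum(v is None for v in values) / len(values)
        if missing_rate >= max_missing or p_values[name] >= alpha:
            continue  # excluded from the list of utilized variables
        observed = [v for v in values if v is not None]
        fill = median(observed)
        kept[name] = [fill if v is None else v for v in values]
    return kept
```

In a modeling pipeline, one might prefer to compute the medians on the training set only and reuse them for the test set; the text does not state which approach was taken.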
Patient death was defined as any death that occurred at the hospitals included in this study, whether during the admission period for surgery or after discharge. For survival analysis, the criteria for the time to event were established as follows. For patients in the No-PO-AKI group who were deceased, the interval was calculated from the date of surgery to the date of death. For patients in the No-PO-AKI group who survived, the duration was determined from the date of surgery to the date of their last visit. For patients in the PO-AKI group who were deceased, the time between the date of AKI onset and the date of death was calculated. For patients in the PO-AKI group who survived, the time was measured from the date of AKI onset to the date of their last visit. When comparing survival rates between the No-PO-AKI group and the PO-AKI group, the initiation of follow-up therefore differed between the two groups. This discrepancy was unavoidable, since the main purpose of this study was to compare survivors and non-survivors after PO-AKI, and the No-PO-AKI group did not have an AKI event. In a previous study, survival models were initiated at the time of hospital discharge after surgery [7]; however, we did not use this definition in order to focus on survival after PO-AKI.
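The four time-to-event rules above reduce to choosing a start date (AKI onset if present, otherwise surgery) and an end date (death if observed, otherwise last visit). A minimal sketch, with a hypothetical function name and argument names:

```python
from datetime import date


def time_to_event(surgery, last_visit, aki_onset=None, death=None):
    """Return (follow-up in days, event indicator) for one patient.

    Follow-up starts at AKI onset for the PO-AKI group and at surgery for
    the No-PO-AKI group; it ends at death if observed, else at last visit.
    """
    start = aki_onset if aki_onset is not None else surgery
    end = death if death is not None else last_visit
    return (end - start).days, death is not None


# A PO-AKI patient who died, and a No-PO-AKI patient who was censored.
po_aki_case = time_to_event(date(2015, 1, 10), date(2015, 2, 20),
                            aki_onset=date(2015, 1, 12),
                            death=date(2015, 3, 1))
no_aki_case = time_to_event(date(2015, 1, 10), date(2019, 1, 10))
```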
Statistical analysis
Statistical tests were performed using R version 3.6.3. A comparison was made between survivors and non-survivors of the PO-AKI group. Continuous variables are presented as mean and standard deviation and were compared using an independent t test. Categorical data are presented as percentages and were compared using the chi-square test. Survival outcomes of patients who developed PO-AKI were depicted using Kaplan-Meier curves. The hazard ratio (HR) was derived by a Cox proportional hazard (CPH) model [15] in the PO-AKI group. Univariable Cox regression analyses were done for 39 explanatory variables (32 continuous variables and seven categorical variables), including stages of AKI and smoking status. Results are presented as HR with 95% confidence intervals (CI). A p-value of <0.05 was considered significant. For multivariable Cox regression analyses, sex and age were used in a baseline model. Statistically significant variables (p < 0.05) in the univariable Cox regression analyses were sequentially added to the model in descending order of their HRs, following a forward selection approach.
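To make the Kaplan-Meier step concrete, the product-limit estimator can be written in a few lines. This is an illustrative sketch with a hypothetical function name, not the R code used in the study.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each distinct event time.

    durations: follow-up times; events: True if death was observed,
    False if the observation was censored.
    """
    data = sorted(zip(durations, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                deaths += 1
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after time t.
        n_at_risk -= removed
    return curve
```

With durations [1, 2, 2, 3, 4] and event flags [True, True, False, True, False], the estimated survival drops to 0.8, 0.6, and 0.3 at times 1, 2, and 3.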
Machine learning model
Modeling was performed using Python version 3.8.5. In survival analysis, beyond the commonly used statistical-based multivariate Cox regression model, machine learning-based models such as survival support vector machine (SVM), gradient boosting model, random survival forest, and XGBoost with accelerated failure time (AFT) were applied. The concordance index (C-index) was used as a predictive performance metric to compare prediction capabilities among models. C-index values were compared by adding variables in the order of those indicating higher risk. An analysis was then conducted to identify key factors for predicting mortality. The C-index, a critical metric extensively employed for assessing the performance of survival models, serves as a pivotal measure for distinguishing between two survival distributions [16,17]. The C-index provides a quantitative measure of accuracy in predicting the relative risk and concordance of survival time [18], thereby facilitating the comparison between the order of predicted survival time and the order of observed survival time [19]. The dataset was split into a training set and a test set at a ratio of 7:3. The model was built using the training dataset (n = 1,474) and internal validation was performed using a stratified k-fold method with five splits. Its predictive performance was compared using the test dataset (n = 631) (Supplementary Fig. 1, available online). Demographic variables such as sex and age served as the baseline model. The predictive power was calculated by adding variables in order of their risk level derived from univariable Cox regression analyses.
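The C-index used to compare models can be illustrated with a direct pairwise implementation of Harrell's concordance index. This sketch is for exposition only: the function name is hypothetical, pairs with tied follow-up times are simply skipped, and the quadratic loop is not how a library would compute it at scale.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when subject i had an observed event and
    a shorter follow-up time than subject j. The pair is concordant when
    the subject who died earlier was assigned the higher risk score.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

For a model that predicts survival time rather than risk, such as an AFT model, the negative of the predicted time can be passed as the risk score.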
In this study, we employed an innovative approach by integrating the AFT model with XGBoost, optimizing the loss function through strategic modulation of hyperparameters. The AFT model offers a parametric framework to analyze survival data, enabling direct modeling of survival times based on covariates. Known for its effectiveness in handling structured data, XGBoost enhances this approach by applying gradient boosting techniques to optimize the AFT’s loss function, achieved through either hyperparameter adjustment or loss function modification. This integration not only boosts the model’s performance but also provides a robust framework for analyzing survival data.
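For reference, the AFT formulation on which XGBoost's survival objective is built can be written as follows. The notation here is ours and is not taken from the manuscript: $f(\mathbf{x}_i)$ denotes the output of the boosted tree ensemble, $Z_i$ a noise term with a chosen distribution (normal, logistic, or extreme value), and $\sigma$ the scale of that distribution.

```latex
\ln T_i = f(\mathbf{x}_i) + \sigma Z_i ,
\qquad
s_i = \frac{\ln t_i - f(\mathbf{x}_i)}{\sigma}

\ell_i =
\begin{cases}
-\ln \dfrac{f_Z(s_i)}{\sigma\, t_i} & \text{if death was observed at } t_i \\[2ex]
-\ln \bigl(1 - F_Z(s_i)\bigr) & \text{if the patient was censored at } t_i
\end{cases}
```

Here $f_Z$ and $F_Z$ are the density and cumulative distribution function of $Z$. The negative log-likelihood $\sum_i \ell_i$ is the loss minimized by gradient boosting, and $\sigma$ controls the spread of the modeled survival time distribution.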
In the context of XGBoost with the AFT model, it is possible to construct models that exhibit significantly different average predicted survival times while still maintaining strong C-indexes. This implies that by adjusting the model’s assumptions regarding characteristics of the data distribution, the AFT model can potentially offer a superior fit in scenarios where the proportional hazard assumption is violated [20]. Consequently, through adjustment of the scale parameter, one can reflect the actual distribution of survival time inherent in the data, thereby enabling an explicit estimation of average survival time. Therefore, by adjusting the ‘aft_loss_distribution_scale’ parameter to values of 0.5, 1.0, and 1.5, we modulated the scale of the survival time distribution, essentially altering the spread of the distribution to estimate the average survival time. This adjustment allowed us to fine-tune the model to reflect the actual spread of survival time inherent in the dataset more accurately, facilitating estimation of average survival duration.
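The scale adjustment described above corresponds to XGBoost's `aft_loss_distribution_scale` parameter, used together with the `survival:aft` objective. The sketch below only prepares the inputs and does not import XGBoost itself. The helper names are hypothetical, and the choice of a normal noise distribution is an assumption, since the text does not state which distribution was used.

```python
import math


def aft_label_bounds(times, events):
    """Build interval labels for the AFT objective.

    For an observed death, lower == upper == time. For a right-censored
    patient, the upper bound is +infinity.
    """
    lower = [float(t) for t in times]
    upper = [float(t) if e else math.inf for t, e in zip(times, events)]
    return lower, upper


def aft_param_grid(scales=(0.5, 1.0, 1.5), distribution="normal"):
    """One parameter dictionary per candidate scale value."""
    return [
        {
            "objective": "survival:aft",
            "eval_metric": "aft-nloglik",
            "aft_loss_distribution": distribution,  # assumed, not from the text
            "aft_loss_distribution_scale": scale,
        }
        for scale in scales
    ]


lower, upper = aft_label_bounds([120, 400, 30], [True, False, True])
grid = aft_param_grid()
```

In practice, the bounds would be attached to an `xgboost.DMatrix` via `set_float_info("label_lower_bound", lower)` and `set_float_info("label_upper_bound", upper)`, and each dictionary in the grid would be passed to `xgboost.train`.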
Results
Baseline characteristics
Baseline characteristics of patients with PO-AKI were compared between survivors and non-survivors (Table 1). The two groups differed significantly, with non-survivors exhibiting older age, higher systolic blood pressure (BP), lower diastolic BP, and a higher prevalence of male sex and chronic kidney disease (CKD). Departments of surgery also differed between survivors and non-survivors. Preoperative hemoglobin, eGFR, and serum albumin levels were significantly lower, while levels of BUN, sCr, and C-reactive protein were significantly higher in non-survivors than in survivors.
Kaplan-Meier curve
The median follow-up period of the total population (n = 199,403) was 144 months (interquartile range, 99.61–170.71 months). The survival rate was significantly lower in the PO-AKI group than in the No-PO-AKI group (p < 0.001) (Fig. 2A). Among 2,105 patients with PO-AKI, 472 (22.4%) died during follow-up. Among the 472 patients with PO-AKI who died, 213 (45.1%) died within 50 days and 387 (82.0%) died within 2 years (Fig. 2B).
Univariable Cox proportional hazard analysis for all-cause mortality
In univariable CPH analysis conducted on patients who developed PO-AKI, age, sex, systolic BP, diastolic BP, operation hours, and comorbidities including CKD and cerebrovascular disease were statistically significant factors associated with mortality. All preoperative laboratory tests except white blood cell count showed significant associations with death. However, preoperative use of ACEi, ARB, or NSAIDs did not affect mortality. Postoperative hemoglobin level and AKI stage 2 and stage 3 (vs. stage 1) were significant factors associated with death (Table 2).
Model prediction performance
For all-cause mortality, based on model 7, which included 24 variables that were statistically significant (p < 0.05), C-index values demonstrating predictive accuracy were in the following descending order: XGBoost with AFT at 0.7521, random survival forest at 0.7371, multivariable Cox regression model at 0.7318, survival SVM at 0.7304, and gradient boosting at 0.7277 (Table 3).
For model validation using a stratified k-fold method with five splits, the average C-index across folds was as follows: XGBoost with AFT at 0.7380, random survival forest at 0.7520, multivariable Cox regression model at 0.7494, survival SVM at 0.7496, and gradient boosting at 0.7421.
According to feature importance results for the XGBoost with AFT model, which demonstrated the highest C-index, the top 10 variables in descending order of importance were: diastolic BP, cerebrovascular disease, AKI stage, sodium, systolic BP, eGFR, C-reactive protein, chloride, LDH, and albumin (Supplementary Table 2, Supplementary Fig. 2; available online). CKD, which showed a high HR in the univariable Cox regression analysis, was absent from feature importance results because it occurred in only 11 of a total of 1,474 individuals in the training dataset. This omission might be attributed to the fact that features with very low variance or those that were rare in the dataset might not be selected for splits [21].
XGBoost’s utilization in survival analysis provides a way to reflect the actual distribution of survival time inherent in the data by adjusting the scale of the survival time distribution, i.e., the spread of the distribution. In our developed model, when aft_loss_distribution_scale was set to 1.0, the C-index reached 0.7521, with a predicted average survival time of 10 years. When the scale was increased to 1.5, the C-index decreased to 0.7120, with the average survival time extending to 12 years. Reducing the scale to 0.5 also lowered the C-index, to 0.7327, with a shorter average survival time of 6 years (Fig. 3).
Discussion
Using a multicenter database of 199,403 noncardiac surgeries, we developed a predictive model for mortality following PO-AKI using machine learning techniques. The XGBoost with AFT model, which included 24 variables, achieved the highest predictive power, with a C-index of 0.7521 and an average predicted survival time of 10 years. This model can be used not only to predict the long-term survival of patients with PO-AKI but also to identify high-risk patients and offer them more careful management after general anesthesia surgery.
The incidence of PO-AKI varies depending on the type of surgery and urgency [2]. In this study, PO-AKI occurred in 1.05% of the total population, which was lower than the incidences reported previously [3,6,22]. The reason might be that only noncardiac surgeries other than nephrectomy, and only subjects with postoperative sCr results within 1 week after surgery, were included. Cardiac surgery cases were excluded because cardiac surgery may cause AKI through a different mechanism [2]. Nephrectomy cases, including those related to kidney cancer and other diseases requiring nephrectomy, were excluded because nephrectomy inherently impacts kidney function and may cause AKI [23]. Other cancer-related surgeries were not excluded from our study. In addition, the urine output definition of the KDIGO criteria was not used in this study. Although urine output data were recorded during hospital stays, they were not included in our database. The long-term, multi-institutional collection of such data was challenging, and the data were therefore not utilized in this study. Postoperative oliguria is common and does not always accompany a rise in sCr. In some cases, postoperative oliguria is thought to be part of a physiologic response without truly reflecting kidney injury. It can be induced by antidiuretic hormone in response to pain, nausea, and surgical procedures [24,25]. In contrast, some studies have shown associations of intraoperative oliguria with adverse outcomes and a higher incidence of PO-AKI [12,26,27].
PO-AKI is associated with increased morbidity and mortality. It is associated with longer hospital length of stay [3,6], higher rates of readmission, progression to end-stage kidney disease within 1 year [3], and increased in-hospital mortality [3,4,6,8]. Most previous studies have focused on mortality within a few weeks to 1 year [3,4,6,8]. Bihorac et al. [7] analyzed survival rates of 10,518 patients after surgery for more than 10 years. The present study included a larger number of subjects (n = 199,403) with a comparable follow-up period (median, 144 months; interquartile range, 99.61–170.71 months). It consistently demonstrated that patients with PO-AKI had significantly lower survival rates than those without PO-AKI. This indicates the need for early identification of high-risk PO-AKI patients to improve their long-term outcomes.
In this study, the XGBoost with AFT model showed higher predictive performance than random survival forest, gradient boosting, multivariable Cox regression model, and survival SVM. This XGBoost with AFT model included a total of 24 variables (23 preoperative variables and postoperative hemoglobin level). For model validation, the difference in the C-index between the cross-validation results and the test results was not significant. This indicates that the XGBoost with the AFT model demonstrated enhanced generalization performance, with its C-index increasing from 0.7380 in cross-validation to 0.7521 in actual predictions. Such consistency suggests that the model does not overfit the training data and possesses reliable predictive capabilities on new data [28]. Consequently, the XGBoost with AFT model should be considered the preferred choice when prioritizing predictive accuracy on unseen data.
Our XGBoost with AFT model identified diastolic BP, cerebrovascular disease, AKI stage, systolic BP, and biochemical parameters such as sodium, eGFR, C-reactive protein, chloride, LDH, and albumin as key predictors of long-term mortality. Since these factors were baseline variables at the time of surgery, it is difficult to clearly explain how they affected long-term mortality. However, these factors are related to hemodynamics, cerebrovascular abnormality, severity of AKI, volume status, baseline kidney function, inflammation, and nutrition, all of which might have negatively influenced recovery and comorbid conditions after PO-AKI. This was similarly shown in a previous study, which demonstrated that AKI severity affected earlier mortality, while comorbid conditions affected later mortality in AKI patients [29]. In addition, a higher systolic BP and a lower diastolic BP, which together mean a wider pulse pressure, were associated with a high risk of mortality in this study. Pulse pressure has been shown to be associated with increased cardiovascular and all-cause mortality in many studies [30]. Since pulse pressure is an indicator of vessel stiffness and is dependent on stroke volume, a wide pulse pressure can subsequently increase the risk of cardiovascular disease and death in PO-AKI patients. Notably, CKD, despite its high HR in univariable Cox regression, was not a top predictor in the feature importance results, likely due to its low prevalence in the training dataset. This highlights a potential limitation in multivariable predictive models, where factors with low occurrence might not show strong predictive utility despite their significant univariable associations with mortality. Furthermore, the identified predictors have practical implications for clinical practice. By understanding these factors, clinicians can develop targeted interventions following clinical guidelines, which may reduce the principal causes of death in patients with PO-AKI, thereby improving patient management and prognostic accuracy.
The superior predictive power of machine learning models over traditional statistical approaches in forecasting mortality following PO-AKI might be attributed to several underlying mechanisms. For example, machine learning’s ability to analyze and interpret complex, non-linear relationships and interactions among many variables offers a significant advantage. Unlike traditional models that often rely on predefined assumptions about data distributions and relationships [28,31], machine learning algorithms can uncover hidden patterns in data without such constraints. This capability is particularly relevant in the context of PO-AKI, where the pathophysiology involves a multitude of factors ranging from the patient’s preoperative health status, and the nature of the surgery, to postoperative care [2], making the prediction of outcomes exceedingly complex. Previous studies have highlighted multifactorial risk factors associated with AKI and its subsequent impact on mortality [2]. Our study facilitates the development of models capable of more efficiently navigating the complexity associated with identifying patients at risk and implementing preventive measures.
This study has some limitations. First, patient deaths that occurred at other hospitals or elsewhere could not be counted in our analysis, as only deaths that occurred at the seven participating hospitals were identified in the study. This exclusion potentially overlooked a significant aspect of post-discharge outcomes. Such outcomes could provide a more comprehensive understanding of the long-term impact of AKI. Second, this study did not use the urine output KDIGO criteria for the definition of PO-AKI, which might have lowered the true incidence of PO-AKI in our study population. Third, intraoperative and postoperative factors other than sCr and hemoglobin, such as the type of surgery, intraoperative blood loss, and pertinent surgical characteristics, were not included in the risk prediction model, which might have also affected postoperative renal outcomes. Fourth, external validation was not performed in this study. However, we utilized extensive medical data resources spanning 10 years from the CMC nU system, which is separately generated and managed redundantly from the electronic medical records of seven hospitals from different regional locations. Although the CMC nU system is logically integrated by a central center, it is physically separated by region and pathway [32]. This integration allows the CMC nU system to function effectively as a source of external validation. Therefore, internal validation using a stratified k-fold method with our multicenter database provided validation efficacy comparable to external validation.
Lastly, the retrospective nature of our study and the potential for substantial confounding factors related to long-term mortality are notable limitations. Despite our efforts to adjust for covariates such as age, comorbidities, and AKI stage, residual confounding may persist. Factors like socioeconomic status, healthcare access, and variations in postoperative care likely play significant roles in long-term survival. Additionally, the inability to determine the exact causes of death further limits the comprehensiveness of our findings, as cause-of-death data were not available in our dataset. To address these concerns, we used robust techniques and comprehensive adjustments. However, the inherent limitations of retrospective studies mean not all confounders can be fully accounted for. Future research should explore these relationships and provide disease-related mortality, using longitudinal data and advanced modeling techniques.
Nonetheless, unlike other studies, which were limited either by the scope of their datasets or by the diversity of their patient populations studied, our research benefited from a comprehensive dataset from seven affiliated hospitals, enhancing the generalizability and applicability of our findings. Moreover, our study not only compared the predictive performances of machine learning models with traditional statistical models but also emphasized the development of practical guidelines for the prevention of mortality following PO-AKI. This dual focus on prediction and prevention sets our study apart, highlighting its relative strength and contributing valuable insights into the ongoing discussion about the best practices for managing AKI in postoperative patients.
Supplementary Materials
Supplementary data are available at Kidney Research and Clinical Practice online (https://doi.org/10.23876/j.krcp.24.106).
Notes
Conflicts of interest
Tae Hyun Ban is a Deputy Editor of Kidney Research and Clinical Practice and was not involved in the review process of this article. All authors have no other conflicts of interest to declare.
Funding
This research was supported by a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI23C047600), and by the Clinical Trials Center of The Catholic University of Korea, Incheon St. Mary’s Hospital with the financial support of the Catholic Medical Center Research Foundation made in the program year of 2022.
Data sharing statement
The data presented in this study are available from the corresponding author upon reasonable request.
Authors’ contributions
Conceptualization: BYC, WC, JM, HEY, IYC
Data curation: BYC, WC, JM, BHC, ESK, SYH, TB, YKK, IYC
Formal analysis, Visualization: BYC
Funding acquisition, Supervision: HEY, IYC
Investigation, Software: BYC, WC
Methodology: JM, HEY
Project administration: HEY
Resources: BHC, ESK, SYH, TB, YKK
Writing–original draft: BYC
Writing–review & editing: BYC, HEY, IYC
All authors read and approved the final manuscript.