Application of machine learning for the prediction of post-operative nausea and vomiting in adult surgical patients – A systematic review
Journal article   Open access   Peer reviewed

Santosh Patel and Franklin Dexter
Indian Journal of Anaesthesia, Vol. 70(4), pp. 516-525
04/01/2026
DOI: 10.4103/ija.ija_38_26
URL: https://doi.org/10.4103/ija.ija_38_26
Published (Version of record) Open Access

Abstract

Background and Aims: The clinical prediction of post-operative nausea and vomiting (PONV) is mainly based on scoring systems developed more than two decades ago. We systematically reviewed machine learning (ML) studies of PONV risk prediction.

Methods: We searched databases including PubMed, Scopus, Web of Science, and Google Scholar for studies published up to 14 September 2025. Using the area under the receiver operating characteristic curve and its standard error, we compared predictive performance with Apfel’s original 4-parameter pre-operative scoring system [area under curve (AUC) 0.68]. We assessed the quality of reporting of the studies using the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis + Artificial Intelligence (TRIPOD+AI) framework.

Results: Of 21 eligible studies, 16 were conducted in Asian countries. Three studies of mixed surgical populations reported estimated AUCs (0.714–0.814) numerically exceeding Apfel’s (AUC 0.68). These models were developed using not only pre-operative but also intra-operative variables (e.g., anaesthetic drugs). None of the studies provided their models in a form sufficient for implementation (e.g., computer code with estimated parameters or a web page for calculations). Furthermore, none specified how the standard errors were calculated, which prevented assessment of their reliability relative to Apfel’s logistic regression model. Secondary analyses found that models for specific surgical populations reported larger observed AUCs than those for mixed populations.

Conclusion: Although some ML algorithms reported higher discriminatory power than Apfel’s PONV risk prediction, none satisfied the TRIPOD+AI reporting criteria sufficiently to support clinical replacement of Apfel’s score by departments. Future research should prioritise open science principles to ensure that scientific advances can be tested for generalisability and efficacy in reducing PONV. The improved predictive performance may be realised in clinical decision-making shortly before the end of surgery rather than in prophylaxis chosen pre-operatively.
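
As an illustration of the kind of comparison the Methods describe, the following is a minimal sketch of how a reported AUC and its standard error could be compared with the reference value of 0.68. It is not the authors' procedure: it assumes a two-sided Wald z-test that treats 0.68 as a fixed constant, and it falls back on the Hanley and McNeil (1982) approximation when a study does not report a standard error. The function names and the sample sizes in the usage lines are hypothetical.

```python
import math

APFEL_AUC = 0.68  # reference AUC quoted in the abstract


def hanley_mcneil_se(auc, n_pos, n_neg):
    """Approximate standard error of an AUC (Hanley and McNeil, 1982).

    n_pos: number of patients with PONV; n_neg: number without.
    This is an assumption for illustration; the reviewed studies did not
    state how their standard errors were calculated.
    """
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)


def z_test_vs_reference(auc, se, reference=APFEL_AUC):
    """Two-sided Wald z-test of an AUC against a fixed reference value.

    Returns (z, p). Treating the reference as a constant ignores its own
    sampling uncertainty, which is a simplifying assumption.
    """
    z = (auc - reference) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


if __name__ == "__main__":
    # Hypothetical cohort: 300 patients with PONV, 700 without.
    se = hanley_mcneil_se(0.814, n_pos=300, n_neg=700)
    z, p = z_test_vs_reference(0.814, se)
    print(f"SE={se:.4f}, z={z:.2f}, p={p:.4g}")
```

A numerically larger AUC is only informative alongside its standard error, which is why unreported standard-error methods limit this kind of check.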
Keywords: Machine learning; Artificial intelligence