A Comparative Study of Machine Learning and Deep Learning Approaches For Hotel Booking Cancellation Prediction



Alkan T. Y.

International Journal Of Scientific Research In Engineering & Technology, vol. 05, no. 03, pp. 11-18, 2025 (Peer-Reviewed Journal)

Abstract

Hotel reservation cancellations pose significant operational and financial challenges for the hospitality industry. With the growing prevalence of online booking platforms and flexible cancellation policies, accurately predicting whether a reservation will be canceled has become increasingly critical for revenue management and resource optimization. This study compares a range of machine learning and deep learning models (XGBoost, Random Forest, TabNet, PyTorch-based neural networks, and Logistic Regression) for their effectiveness in predicting booking cancellations on a publicly available dataset of 36,275 reservations. Each model was evaluated using 5-fold stratified cross-validation, with performance assessed via accuracy, F1 score, and area under the ROC curve (AUC). The ensemble methods, XGBoost and Random Forest, achieved the best predictive performance (AUC scores of 0.9526 and 0.9553, respectively), outperforming both traditional statistical models and deep learning alternatives. The analysis showed that lead time, number of special requests, and market segment type are consistently strong predictors of cancellation behavior. The results highlight the potential of interpretable machine learning models to support proactive decision-making in hotel operations. By integrating these models into reservation systems, hotels can reduce revenue loss, better manage capacity, and personalize customer engagement strategies. This research offers a robust benchmarking framework and practical insights for applying predictive analytics in the hospitality domain.
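
The evaluation protocol named in the abstract (5-fold stratified cross-validation scored by accuracy, F1, and AUC) can be illustrated with a minimal standard-library sketch. This is not the author's code and does not reproduce the reported results: the toy data generator, the single lead-time feature, the threshold rule standing in for a trained model, and all function names are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs that preserve class ratios per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for pos, i in enumerate(idxs):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def roc_auc(y_true, scores):
    """AUC as P(random positive outranks random negative); ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_toy_bookings(n=500, seed=0):
    """Hypothetical data: cancellation probability rises with lead time (days)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        lead = rng.uniform(0, 300)
        X.append(lead)
        y.append(1 if rng.random() < 0.1 + 0.6 * lead / 300 else 0)
    return X, y

def fit_threshold(x_train, y_train):
    """Stand-in 'model': midpoint between the two class means of lead time."""
    pos = [x for x, t in zip(x_train, y_train) if t == 1]
    neg = [x for x, t in zip(x_train, y_train) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def evaluate(X, y, k=5):
    """Return mean (accuracy, F1, AUC) over k stratified folds."""
    rows = []
    for train, test in stratified_kfold(y, k):
        thr = fit_threshold([X[i] for i in train], [y[i] for i in train])
        y_te = [y[i] for i in test]
        scores = [X[i] for i in test]  # lead time itself is the ranking score
        preds = [1 if s > thr else 0 for s in scores]
        rows.append((accuracy(y_te, preds), f1_score(y_te, preds), roc_auc(y_te, scores)))
    return [sum(col) / k for col in zip(*rows)]

if __name__ == "__main__":
    X, y = make_toy_bookings()
    acc, f1, auc = evaluate(X, y)
    print(f"accuracy={acc:.3f}  F1={f1:.3f}  AUC={auc:.3f}")
```

Stratification matters here because cancellations are typically the minority class; dealing each class across folds separately keeps the cancellation rate roughly equal in every test fold, so per-fold F1 and AUC remain comparable. In practice the threshold rule would be replaced by the actual models under comparison, fitted on the training folds only.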