Predictive Modeling of Delivery Delays in Transportation Using Machine Learning: A Comparative Study of Service Types

Agus Purnomo; Nava Gia Ginasta; Syafrianita Syafrianita; Syafrial Fachri Pane

doi:10.38035/dijemss.v7i2.5736

Authors

Agus Purnomo, Department of Master of Logistics Management, Universitas Logistik Dan Bisnis Internasional, Indonesia
Nava Gia Ginasta, Department of Digital Business, Universitas Logistik Dan Bisnis Internasional, Indonesia
Syafrianita Syafrianita, Department of Transportation Management, Universitas Logistik Dan Bisnis Internasional, Indonesia
Syafrial Fachri Pane, Department of Informatics Engineering, Universitas Logistik Dan Bisnis Internasional, Indonesia

DOI:

https://doi.org/10.38035/dijemss.v7i2.5736

Keywords:

Machine learning, Delivery delay prediction, Logistics performance, Random Forest, XGBoost

Abstract

Traditional predictive models such as linear regression often struggle to capture the nonlinear interactions among operational factors that cause delivery delays in multi-category courier services. This study addresses that gap by developing and comparing machine learning (ML) algorithms to predict delivery delays across different service types at PT Pos Indonesia. The primary objective is to identify the most accurate predictive model and the dominant variables influencing delays across high-speed (Same Day, Next Day) and economical delivery services. A quantitative experimental design was employed using operational data from PT Pos Indonesia, consisting of 10,999 records and 12 variables. Three ML algorithms (Logistic Regression, Random Forest, and XGBoost) were trained and evaluated using standardized preprocessing, feature encoding, and stratified data splitting. Results show that Random Forest and XGBoost outperform Logistic Regression, each achieving approximately 65% accuracy with an AUC of 0.73, indicating moderate yet consistent predictive capability. Feature importance analysis reveals that Discount_offered, Weight_in_gms, and Prior_purchases are the most influential predictors of delivery timeliness. This study provides theoretical and practical contributions by introducing the first comparative ML framework for delay prediction in a national logistics context. The findings offer actionable insights for optimizing scheduling, load balancing, and promotional strategies, while advancing the integration of AI-based predictive analytics within postal logistics operations.
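The abstract reports results in terms of a stratified data split, accuracy, and AUC. As a reading aid, the sketch below shows one way these three pieces can be computed using only the Python standard library. It is an illustration under stated assumptions, not the authors' pipeline: the function names (stratified_split, accuracy, roc_auc), the 20% test fraction, and the toy labels and scores are all hypothetical and do not come from the PT Pos Indonesia data.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices into train/test so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the chance that a random delayed case (1) scores above a random on-time case (0).

    Ties count as half a win. This pairwise form is quadratic in the sample size,
    which is fine for a small illustration.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = delayed, 0 = on time.
toy_labels = [1] * 60 + [0] * 40
train_idx, test_idx = stratified_split(toy_labels, test_fraction=0.2)
print(len(train_idx), len(test_idx))

# Hypothetical model scores for four shipments, thresholded at 0.5 for accuracy.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(accuracy(y_true, y_pred))  # 0.5
print(roc_auc(y_true, scores))   # 0.75
```

Read this way, an AUC of 0.73 means that a randomly chosen delayed shipment receives a higher predicted risk than a randomly chosen on-time shipment about 73% of the time, which is consistent with the abstract's description of the models as moderate predictors.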
