Predictive Modeling of Delivery Delays in Transportation Using Machine Learning: A Comparative Study of Service Types
DOI:
https://doi.org/10.38035/dijemss.v7i2.5736
Keywords:
Machine learning, Delivery delay prediction, Logistics performance, Random Forest, XGBoost
Abstract
Traditional predictive models such as linear regression often struggle to capture the nonlinear interactions among operational factors that cause delivery delays in multi-category courier services. This study addresses that gap by developing and comparing machine learning (ML) algorithms that predict delivery delays across different service types at PT Pos Indonesia. The primary objective is to identify the most accurate predictive model and the dominant variables influencing delays across high-speed (Same Day, Next Day) and economical delivery services. A quantitative experimental design was applied to operational data from PT Pos Indonesia comprising 10,999 records and 12 variables. Three ML algorithms (Logistic Regression, Random Forest, and XGBoost) were trained and evaluated using standardized preprocessing, feature encoding, and stratified data splitting. Random Forest and XGBoost both outperform Logistic Regression, each achieving approximately 65% accuracy with an AUC of 0.73, which indicates moderate yet consistent predictive capability. Feature importance analysis shows that Discount_offered, Weight_in_gms, and Prior_purchases are the most influential predictors of delivery timeliness. The study makes theoretical and practical contributions by introducing the first comparative ML framework for delay prediction in a national logistics context. The findings offer actionable insights for optimizing scheduling, load balancing, and promotional strategies, and they advance the integration of AI-based predictive analytics within postal logistics operations.
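To make the evaluation terms in the abstract concrete, the following is a minimal sketch, not the authors' code, of a stratified split and of how accuracy and AUC can be computed from a model's delay scores. The function names, the 0.5 decision threshold, the 20% test share, and the toy labels and scores are all assumptions chosen for illustration; none of them come from the study.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps roughly the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_score, threshold=0.5):
    """Share of records whose thresholded score matches the true label."""
    hits = sum((s >= threshold) == bool(y) for y, s in zip(y_true, y_score))
    return hits / len(y_true)


def roc_auc(y_true, y_score):
    """Probability that a random delayed record outscores a random on-time one (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = delayed, 0 = on time) and made-up model scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.6, 0.4, 0.5, 0.2, 0.1, 0.8, 0.7]
print(accuracy(y_true, y_score), roc_auc(y_true, y_score))
```

On this toy data the script prints 0.625 and 0.8125. Accuracy depends on the chosen threshold, whereas AUC only depends on how the scores rank delayed records against on-time ones, which is why a model can sit near 65% accuracy while its AUC is noticeably higher.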
License
Copyright (c) 2025 Agus Purnomo, Nava Gia Ginasta, Syafrianita Syafrianita, Syafrial Fachri Pane

This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish their manuscripts in this journal agree to the following conditions:
- The copyright on each article belongs to the author(s).
- The author acknowledges that the Dinasti International Journal of Education Management and Social Science (DIJEMSS) has the right of first publication under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
- Authors may separately arrange for the non-exclusive distribution of manuscripts published in this journal in other versions (e.g., depositing them in the author's institutional repository or publishing them in a book), provided they acknowledge that the manuscript was first published in the Dinasti International Journal of Education Management and Social Science (DIJEMSS).