Conferences CIMPA, 18th International Federation of Classification Societies

Improving Employee Attrition with Data Analysis and Machine Learning
Sergio Ramirez Rodriguez, Alvaro Guevara Villalobos

Last modified: 2024-05-15

Abstract


The current business environment, characterized by intense competition in globalized markets and the need for innovation to secure a competitive advantage, has underscored the importance of data analytics as a fundamental tool for optimizing resources, increasing profits, and enhancing performance indicators in companies. One crucial indicator for any company is employee attrition, which must be kept low, especially in sectors such as Business Process Outsourcing (BPO). High attrition significantly increases recruitment costs, the time required to find replacements, and the expense of training new employees, and it causes the loss of the knowledge accumulated by departing staff, among other negative effects. It is therefore essential for companies to mitigate this risk by using data analytics and machine learning tools to develop predictive models that estimate the likelihood of employee attrition within a specific period.

Various models, including XGBoost, Random Forest, Logistic Regression, and Consensus, were evaluated by comparing metrics such as the area under the ROC curve, overall accuracy, sensitivity, and specificity. XGBoost emerged as superior in predictive capacity, which can be attributed to its ensemble of decision trees, regularization techniques that limit overfitting, hyperparameter optimization for an optimal configuration, and scalability to large and complex datasets. Additionally, the model's sensitivity was assessed through stress tests on both observations and predictor variables. Since implementation, the model has yielded millions of dollars in savings on the costs described above.
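The comparison metrics named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the scores are made-up toy values, the model names are placeholders, the 0.5 cutoff is an assumption, and the Consensus model is assumed here to be a simple majority vote, which the abstract does not specify.

```python
from typing import Dict, List, Sequence


def threshold(scores: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn predicted attrition probabilities into 0/1 labels (cutoff is an assumption)."""
    return [1 if s >= cutoff else 0 for s in scores]


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Overall accuracy, sensitivity (true positive rate), specificity (true negative rate)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive case is scored above a random negative case (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def consensus_vote(predictions: Sequence[Sequence[int]]) -> List[int]:
    """Hypothetical consensus rule: strict majority vote across models."""
    n_models = len(predictions)
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*predictions)]


# Toy data: 1 = employee left within the period, 0 = employee stayed.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Made-up predicted probabilities from three placeholder models.
model_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3],
    "model_b": [0.6, 0.4, 0.8, 0.7, 0.3, 0.2, 0.4, 0.1],
    "model_c": [0.8, 0.1, 0.3, 0.6, 0.5, 0.4, 0.9, 0.2],
}

predictions = {name: threshold(s) for name, s in model_scores.items()}
consensus = consensus_vote(list(predictions.values()))

for name, scores in model_scores.items():
    print(name, classification_metrics(y_true, predictions[name]), "auc:", roc_auc(y_true, scores))
print("consensus", classification_metrics(y_true, consensus))
```

Sensitivity is usually the metric of most interest in this setting, since a missed leaver (false negative) is the case that incurs the replacement and training costs described in the abstract.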


Keywords


Employee Attrition, Machine Learning, XGBoost, Random Forest, Consensus

References


Chen, T., Guestrin, C. (2016) XGBoost: A scalable tree boosting system. doi: 10.1145/2939672.2939785

Qin, C., et al. (2021) XGBoost optimized by adaptive particle swarm optimization for credit scoring. Mathematical Problems in Engineering, 2021. doi: 10.1155/2021/6655510

Wang, C., Deng, C., Wang, S. (2020) Imbalance-XGBoost: leveraging weighted and focal losses for binary label-imbalanced classification with XGBoost. doi: 10.1016/j.patrec.2020.05.035