Applying Bootstrapping for Class Imbalance and Weighted Information Gain for Feature Selection in the Support Vector Machine Algorithm for Customer Loyalty Prediction

Abdul Razak Naufal, Romi Satria Wahono, Abdul Syukur

Abstract


Customer loyalty prediction is an important business strategy for the modern telecommunications industry in the face of global competition, because acquiring a new customer costs five to six times more than retaining an existing one. Customer loyalty classification aims to identify customers who are likely to switch to a competitor, a behavior commonly called customer churn. Support Vector Machine (SVM) is a classification algorithm that can be used to predict customer loyalty. Its weakness in this task, which affects prediction accuracy, is the difficulty of choosing a kernel function and setting its parameter values. Large datasets also tend to suffer from class imbalance, that is, a significant difference in the number of instances per class, with the negative class larger than the positive class. This study proposes bootstrapping as a resampling method to address class imbalance. The dataset also contains irrelevant features, so two feature selection methods are used: Forward Selection (FS) and Weighted Information Gain (WIG). FS removes the least relevant features and requires relatively little computation time compared with backward elimination and stepwise selection. WIG assigns a weight to each attribute; it is better suited to selecting the best features than Principal Component Analysis (PCA), which is usually used to reduce high-dimensional data. The weights are used to rank the attributes, and those that meet a specified criterion (threshold) are retained for use by the SVM algorithm. The SVM parameters are selected with grid search, which finds the best parameter values within a given range of values. Compared with random search, grid search is also very reliable on datasets with few attributes. The experimental results over several parameter combinations show that customer loyalty prediction using bootstrap sampling, FS-WIG, and grid search is more accurate than SVM used on its own.
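As an illustration of the three ingredients named in the abstract, here is a minimal sketch using only the Python standard library. It is not the authors' implementation, and the abstract does not specify their exact procedure. The sketch assumes that bootstrapping means resampling each class with replacement up to the size of the largest class, that the WIG weights are information gain values scaled so the best attribute has weight 1, and that attributes are categorical. The function names, the toy data, and `toy_score` (a stand-in for cross-validated SVM accuracy) are all hypothetical. Forward Selection is not shown.

```python
import math
import random
from collections import Counter
from itertools import product


def bootstrap_balance(rows, labels, seed=0):
    """Draw a bootstrap sample (with replacement) of equal size from each class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.choices(members, k=target))  # sampling with replacement
        out_labels.extend([label] * target)
    return out_rows, out_labels


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(n / total * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Label entropy minus the entropy left after splitting on a categorical attribute."""
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def select_by_weight(rows, labels, names, threshold):
    """Weight attributes by information gain (best attribute = 1) and keep those >= threshold."""
    gains = [information_gain([r[i] for r in rows], labels) for i in range(len(names))]
    top = max(gains) or 1.0  # avoid dividing by zero when no attribute is informative
    weights = {name: gain / top for name, gain in zip(names, gains)}
    return weights, [name for name in names if weights[name] >= threshold]


def grid_search(score, grid):
    """Score every parameter combination in `grid`; return the best (params, score) pair."""
    keys = list(grid)
    candidates = [dict(zip(keys, combo)) for combo in product(*(grid[k] for k in keys))]
    return max(((params, score(params)) for params in candidates), key=lambda pair: pair[1])


# Toy data (hypothetical): attributes are (plan, region); 2 churners vs 6 loyal customers.
names = ["plan", "region"]
rows = [("prepaid", "north"), ("prepaid", "south")] + [("postpaid", "north"), ("postpaid", "south")] * 3
labels = ["churn", "churn"] + ["stay"] * 6

balanced_rows, balanced_labels = bootstrap_balance(rows, labels, seed=42)
weights, kept = select_by_weight(balanced_rows, balanced_labels, names, threshold=0.5)


def toy_score(params):
    """Hypothetical stand-in for cross-validated SVM accuracy; peaks at C=10, gamma=0.01."""
    return -abs(math.log10(params["C"]) - 1) - abs(math.log10(params["gamma"]) + 2)


best_params, best_score = grid_search(toy_score, {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1]})
print(Counter(balanced_labels), weights, kept, best_params)
```

In a real experiment, `toy_score` would be replaced by a function that trains an SVM with the given kernel parameters on the selected attributes and returns its validation accuracy. The grid of `C` and `gamma` values here is an arbitrary example, not the range used in the study.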

Journal of Intelligent Systems (JIS, ISSN 2356-3982)
Copyright © 2020 IlmuKomputer.Com. All rights reserved.