Stroke Disease Prediction Using Support Vector Machine Method
DOI:
https://doi.org/10.62951/icistech.v5i1.274
Keywords:
Stroke, Prediction, Support Vector Machine, SMOTE
Abstract
Stroke is one of the leading causes of death globally and is particularly prevalent in Indonesia. Early prediction of stroke is critical to reducing the risk of long-term disability and mortality. This study builds a stroke prediction model using the Support Vector Machine (SVM) classification method. The dataset, sourced from Kaggle, contains 5,110 records and is class-imbalanced. To address the imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied during preprocessing. Model performance was evaluated across three train/test splits (70:30, 80:20, 90:10) and with k-fold cross-validation (k = 5, 7, 10), using accuracy, precision, recall, and F1-score as metrics. The SVM was tested with three kernel types (linear, polynomial, and radial basis function (RBF)), with parameter tuning for C, gamma, and degree. The polynomial kernel achieved the highest prediction accuracy, 92%.
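The abstract does not describe how SMOTE was implemented, so the following is only a minimal sketch of the core idea, not the authors' code. It assumes purely numeric features, and the function name `smote`, its parameters, and the toy data are all hypothetical. Each synthetic record is placed at a random point on the line segment between a minority-class record and one of its k nearest minority-class neighbours.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration only; assumes numeric features.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point
        # (ranked by squared Euclidean distance).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Made-up toy data: 8 majority points and 3 minority points.
majority = [[float(i), float(i % 3)] for i in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic minority points to match the majority count.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 8 8
```

Because every synthetic point is an interpolation between two real minority records, it always lies inside the region those records span. A full pipeline would pass the balanced data on to an SVM classifier, which this sketch leaves out.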
License
Copyright (c) 2025 Proceeding of The International Conference of Inovation, Science, Technology, Education, Children, and Health

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.