An efficient approach for breast cancer classification using machine learning

Authors

  • Vedatrayee Chatterjee, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India
  • Arnab Maitra, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India
  • Soubhik Ghosh, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India
  • Hritik Banerjee, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India
  • Subhadeep Puitandi, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India
  • Ankita Mukherjee, Department of Computer Science & Engineering, Asansol Engineering College, Asansol, India

DOI:

https://doi.org/10.31181/jdaic10028012024c

Keywords:

Breast Cancer, Dataset, Machine Learning, Gradient Boosting Algorithm, Random Forest Algorithm

Abstract

Breast cancer is a life-threatening disease that affects millions of people worldwide. It is a condition in which breast cells grow abnormally and uncontrollably, forming a mass called a tumor. If left untreated, the cancer can spread to other regions of the body, including the bones, liver, and lungs. Diagnosis is challenging because manual assessment is time-consuming, carries risk, and is prone to human error.

Early diagnosis is crucial for effective treatment and improved patient outcomes. In this paper, we use machine learning models to quickly classify breast tumors as benign or malignant. The primary objective is to develop a visualization pattern that supports decision-making, using swarm plots and heat maps. To accomplish this, we applied the LightGBM (Light Gradient Boosting Machine) algorithm and compared its performance against four other established machine learning models: Logistic Regression, Gradient Boosting, Random Forest, and XGBoost. Our study shows that LightGBM achieves the highest accuracy, 96.98%, in distinguishing between benign and malignant breast tumors.
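
The comparison described above ranks classifiers by accuracy, that is, the fraction of tumors whose predicted class matches the true class. The following is a minimal sketch of that ranking step only, not the authors' implementation. The function names (`accuracy`, `rank_models`), the model labels, and the toy label lists are hypothetical and chosen purely for illustration; they are not data or results from the paper.

```python
from typing import Dict, List, Tuple


def accuracy(y_true: List[int], y_pred: List[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def rank_models(
    y_true: List[int], predictions: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Score each model's predictions and sort from highest to lowest accuracy."""
    scores = [(name, accuracy(y_true, preds)) for name, preds in predictions.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy labels (assumption): 0 = benign, 1 = malignant. Not data from the paper.
y_true = [0, 1, 1, 0, 1, 0, 0, 1, 0, 0]

# Hypothetical predictions from two unnamed models, for illustration only.
toy_predictions = {
    "model_a": [0, 1, 1, 0, 1, 0, 0, 1, 0, 1],
    "model_b": [0, 1, 0, 0, 1, 0, 1, 1, 0, 0],
}

ranking = rank_models(y_true, toy_predictions)
for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

In practice, each list of predictions would come from a trained classifier evaluated on a held-out test set, so that the reported accuracy reflects performance on unseen cases.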

Published

28.01.2024

How to Cite

Chatterjee, V., Maitra, A., Ghosh, S., Banerjee, H., Puitandi, S., & Mukherjee, A. (2024). An efficient approach for breast cancer classification using machine learning. Journal of Decision Analytics and Intelligent Computing, 4(1), 32–46. https://doi.org/10.31181/jdaic10028012024c