2P R Pote Patil College of Engineering and Management Amravati, Maharashtra, 444604, India
Abstract
This study predicts cardiovascular disease and forecasts patient survival using machine learning techniques. A publicly available dataset is used that contains several risk factors for cardiovascular disease, including age, anemia, hypertension, diabetes, smoking habits, gender, blood pressure, glucose levels, and alcohol consumption. The dataset is preprocessed to extract features relevant to predicting survival rates from these risk parameters. An ensemble model combining several machine learning algorithms (Logistic Regression, Random Forest, XGBoost, and Naive Bayes) is designed to classify cardiovascular disease. The study also applies the Kaplan-Meier estimator to estimate patient survival rates with respect to continuous variables. The results indicate that age, serum creatinine, and ejection fraction correlate significantly with the death event, whereas smoking, sex, platelets, and diabetes show uncertain statistical significance. The study provides insight into how these factors affect survival in cardiovascular disease. The novelty of this work lies in integrating survival analysis with ensemble learning for robust prediction and interpretability in clinical applications.
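For readers unfamiliar with the Kaplan-Meier estimator mentioned above, the following is a minimal sketch of the product-limit formula S(t) = ∏ (1 − d_i / n_i), where d_i is the number of deaths at event time t_i and n_i is the number of patients still at risk just before t_i. It is an illustration only, not the implementation used in this study: the function name `kaplan_meier` and the follow-up times in the usage example are hypothetical and are not taken from the dataset.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(t, S(t))] at each distinct time where a death was observed.

    durations: follow-up time per patient (any consistent unit).
    events:    1 if the death event was observed, 0 if the patient was censored.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Deaths observed at each time, and all exits (deaths + censorings) at each time.
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)

    at_risk = len(durations)  # n_i: patients still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        # Everyone who died or was censored at t leaves the risk set.
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: five patients, two of them censored.
toy_durations = [5, 8, 8, 12, 15]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_durations, toy_events)
for t, s in toy_curve:
    print(f"t={t}: S(t)={s:.2f}")
```

Censored patients reduce the risk set without lowering the survival estimate, which is what distinguishes this estimator from a simple fraction of patients still alive.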
