Titanic Survival Prediction Project
The Titanic dataset is a classic for ML learners. It seems simple. It is not. It teaches a full data science workflow.
The goal is easy to state: predict who survived the shipwreck. I used passenger features such as age, gender, and ticket fare.
Here is the process:
- Cleaned missing data in age and cabin columns.
- Used EDA to find patterns.
- Found that women and first-class passengers survived at higher rates.
- Created new features like Family Size and Titles.
- Built a pipeline for scaling and encoding.
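
Here is a minimal sketch of those steps, not the original project code. It assumes the Kaggle-style column names (Survived, Name, SibSp, Parch, Age, Fare, Sex, Pclass, Cabin), which this post does not spell out. The helper names and the HasCabin flag are illustrative choices.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Assumed Kaggle-style column names; adjust to your copy of the data.
NUMERIC = ["Age", "Fare", "FamilySize"]
CATEGORICAL = ["Sex", "Pclass", "Title"]


def survival_rate_by(df: pd.DataFrame, column: str) -> pd.Series:
    """EDA helper: mean of the 0/1 Survived column within each group."""
    return df.groupby(column)["Survived"].mean()


def add_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with FamilySize, Title and HasCabin columns added."""
    out = df.copy()
    # Family size = siblings/spouses + parents/children + the passenger.
    out["FamilySize"] = out["SibSp"] + out["Parch"] + 1
    # Names look like "Braund, Mr. Owen Harris": take the text between
    # the comma and the first period as the title.
    out["Title"] = (
        out["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
    )
    # Cabin is mostly missing, so keep only whether it was recorded.
    out["HasCabin"] = out["Cabin"].notna().astype(int)
    return out


def make_preprocessor() -> ColumnTransformer:
    """Impute and scale numeric columns, one-hot encode categorical ones."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # fills missing Age
        ("scale", StandardScaler()),
    ])
    return ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ("flag", "passthrough", ["HasCabin"]),
    ])
```

The preprocessor can be chained with any classifier inside a `Pipeline`, so the same imputation and encoding apply at training and prediction time.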
The Model:
- Tested several algorithms.
- Chose XGBoost, which suits tabular data well.
- Used Optuna for tuning parameters.
- Measured success with accuracy scores.
The Insight:
- Used SHAP to explain predictions.
- This opens up the black box.
- You can see which features drive each survival prediction.
This project covers raw data to final interpretation. It is a great way to learn practical ML.
Source: https://dev.to/argha_sarkar/titanic-survival-prediction-using-machine-learning-complete-data-science-project-1hd2
Optional learning community: https://t.me/GyaanSetuAi