Handling Data Imbalance for Data Science Projects

In data science projects, imbalanced data is common in classification problems. When you evaluate a model with a metric such as accuracy_score(), you may see high accuracy simply because the model predicts the dominant class (for example, Class "No") well, even though it fails to predict the class with fewer samples. This can be dangerous in medical use cases, where the minority class is often the condition you are trying to detect. This article is intended to help you understand how to handle data imbalance in real-world data science use cases.
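To make the accuracy pitfall concrete, here is a minimal sketch in plain Python. The 95/5 class split, the always-"No" model, and the helper names `accuracy` and `recall` are assumptions made up for illustration, not taken from any real dataset. The helpers follow the standard definitions of accuracy and recall; scikit-learn's accuracy_score() and recall_score() should give the same numbers on these inputs.

```python
# Hypothetical labels: 95 samples of the dominant class ("No" = 0)
# and 5 samples of the minority class ("Yes" = 1).
y_true = [0] * 95 + [1] * 5

# A degenerate "model" that always predicts the dominant class.
y_pred = [0] * 100


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual `positive` samples that were predicted as `positive`."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0


acc = accuracy(y_true, y_pred)
minority_recall = recall(y_true, y_pred, positive=1)

print(f"accuracy: {acc:.2f}")                 # accuracy: 0.95
print(f"minority recall: {minority_recall:.2f}")  # minority recall: 0.00
```

The model never identifies a single minority sample, yet it scores 95% accuracy. This is why a per-class metric such as recall is worth checking alongside accuracy when classes are imbalanced.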
