Data Leakage
Data leakage occurs when a predictive model is trained on information that will not be available once the model is in production. Consider the example below (source: Mostafa Saad Ibrahim). The main concern is how the data was split: images of the same animal may have appeared in both the train and test …
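One common way to guard against the splitting problem described above is to split by animal rather than by image, so that every image of a given animal lands on one side of the split only. The following is a minimal sketch using only the Python standard library. The function name `group_train_test_split`, the `animal_ids` list, and the ID strings are hypothetical and chosen for illustration; they do not come from the original example. scikit-learn's `GroupShuffleSplit` offers similar behaviour if that library is already in use.

```python
import random


def group_train_test_split(group_ids, test_fraction=0.25, seed=0):
    """Split sample indices so that each group falls entirely in one split.

    group_ids[i] is the group (here: the animal) that sample i belongs to.
    Returns (train_indices, test_indices).
    """
    unique_groups = sorted(set(group_ids))
    if len(unique_groups) < 2:
        raise ValueError("need at least two distinct groups to split")

    # Shuffle the groups, not the individual samples.
    rng = random.Random(seed)
    rng.shuffle(unique_groups)

    # Hold out at least one group, but never all of them.
    n_test = max(1, round(len(unique_groups) * test_fraction))
    n_test = min(n_test, len(unique_groups) - 1)
    test_groups = set(unique_groups[:n_test])

    train_idx = [i for i, g in enumerate(group_ids) if g not in test_groups]
    test_idx = [i for i, g in enumerate(group_ids) if g in test_groups]
    return train_idx, test_idx


# Hypothetical data: 8 images of 4 animals, two images per animal.
animal_ids = ["cat_1", "cat_1", "dog_1", "dog_1",
              "cat_2", "cat_2", "dog_2", "dog_2"]

train_idx, test_idx = group_train_test_split(animal_ids)
train_animals = {animal_ids[i] for i in train_idx}
test_animals = {animal_ids[i] for i in test_idx}
print("animals in both splits:", train_animals & test_animals)
```

A plain random split over images would shuffle the eight indices directly, which can place one image of `cat_1` in train and the other in test. Shuffling the animal IDs first rules that out by construction.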