Hi Summer! A general question about predictive modeling: I wonder whether it might be more advantageous to merge the training and testing datasets, then select random subsets for training and testing from the larger pooled dataset, so that more diverse data could be used for model development. This is of course balanced against the advantage of evaluating the model on a distinct dataset, which provides a fair estimate of model performance.
Do you think using more diverse data would improve model performance, though?
I.e., develop a better-performing model
Ok, that’s helpful
I’m really naive about predictive modeling, but I wonder if you could use these other datasets, in which you validated the models, to also improve the model, using something like “transfer learning”?
But since they are applied to the variables, the variables are treated similarly when applied, even though they were collected differently?
I can discuss with Summer offline maybe :)
Thanks. I may not have parsed the question adequately. If we have time at the end of the Q&A, we can open up your mic if you want to ask directly.