Stanford Center for Biomedical Informatics Research Colloquia - Shared screen with speaker view
kate vitale
31:07
Hi Summer! A general question about predictive modeling: I wonder whether it might be more advantageous to merge the training and testing datasets and then select random subsets for training and testing from the larger pooled dataset, so that more diverse data could be used for model development. This would of course have to be balanced against the advantage of evaluating the model on such a distinct dataset, which provides a fair estimate of model performance.
kate vitale
34:38
Do you think more diverse data would improve model performance, though?
kate vitale
35:11
I.e., develop a better-performing model
kate vitale
35:50
Ok, that’s helpful
kate vitale
46:40
I’m really naive about predictive modeling, but I wonder if you could use the other datasets in which you validated the models to also improve the model, using something like “transfer learning”?
kate vitale
48:18
But since they are applied to the variables, even though the variables were collected differently, the variables are treated similarly when applied?
kate vitale
50:04
I can discuss with Summer offline maybe :)
jonathan chen
50:51
Thanks. I may not have parsed the question adequately. If we have time for Q&A at the end, we can open up your mic if you want to ask directly.
wendi knapp
01:14:10
thanks!