Is there a rule of thumb for how best to divide data into training and validation sets? Is an even 50/50 split advisable?