When the convergence threshold (the predefined value used to test whether convergence has occurred) is set small enough, a small number of training samples is enough for the learning program to find a good enough hypothesis. There is a tradeoff effect here.
Put another way, when there are enough training samples, the convergence threshold can be set higher than it otherwise could be, so computing time can be decreased.
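The stopping rule described above can be sketched as batch gradient descent that halts once the change in the cost J(θ) between iterations falls below the convergence threshold. This is a minimal illustration, not the exact program used for the experiments below; the linear-regression cost, the learning rate, and the toy data are all assumptions for demonstration.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, threshold=1e-6, max_iters=100_000):
    """Batch gradient descent for linear regression that stops when the
    change in J(theta) between iterations drops below `threshold`."""
    m, n = X.shape
    theta = np.zeros(n)
    prev_cost = np.inf
    for i in range(max_iters):
        predictions = X @ theta
        cost = np.sum((predictions - y) ** 2) / (2 * m)  # J(theta)
        if abs(prev_cost - cost) < threshold:            # convergence test
            break
        prev_cost = cost
        gradient = X.T @ (predictions - y) / m
        theta -= alpha * gradient
    return theta, i

# Hypothetical toy data: y = 2x, with a bias column of ones.
X = np.c_[np.ones(10), np.arange(10.0)]
y = 2 * np.arange(10.0)

theta, iters = gradient_descent(X, y, threshold=1e-6)
```

Lowering `threshold` makes the fit tighter at the cost of more iterations, which is the tradeoff described above: with more samples, a looser threshold can still yield a good hypothesis while finishing sooner.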
When the sample size is 10 and the convergence threshold on J(θ) is set to 0.000001, the result is:
When the sample size is 10 and the convergence threshold on J(θ) is set to 0.0001, the result is:
When the sample size is 100 and the convergence threshold on J(θ) is set to 0.0001, the result is: