Binary classification case:
H(t, o) = -(t*log(o) + (1-t)*log(1-o))
where t is the target label (0 or 1) and o is the predicted probability of class 1.
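The binary formula can be checked numerically with a small sketch. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions made for illustration and come from no particular library. Clipping is one common way to avoid log(0).

```python
import math


def binary_cross_entropy(t, o, eps=1e-12):
    """Cross-entropy between a target t in {0, 1} and a predicted probability o.

    eps is a hypothetical clipping constant that keeps o away from 0 and 1
    so that log() never receives 0.
    """
    o = min(max(o, eps), 1.0 - eps)
    # The minus sign applies to both terms.
    return -(t * math.log(o) + (1.0 - t) * math.log(1.0 - o))


# A confident correct prediction gives a small loss,
# a confident wrong prediction gives a large one.
loss_good = binary_cross_entropy(1.0, 0.9)
loss_bad = binary_cross_entropy(1.0, 0.1)
print(loss_good, loss_bad)
```

When t = 1 only the first term is active and the loss is -log(o). When t = 0 only the second term is active and the loss is -log(1-o).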
Multi-class case:
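The note gives no formula for this case. The usual definition sums over classes: H(t, o) = -sum_i t_i*log(o_i), where t is assumed to be a one-hot (or probability) vector and o a vector of predicted class probabilities. A minimal sketch under those assumptions follows. The name `categorical_cross_entropy` and the constant `eps` are hypothetical.

```python
import math


def categorical_cross_entropy(t, o, eps=1e-12):
    """Cross-entropy between a target distribution t and predicted probabilities o.

    t and o are equal-length sequences. eps is a hypothetical floor that
    keeps log() away from 0.
    """
    if len(t) != len(o):
        raise ValueError("t and o must have the same length")
    return -sum(ti * math.log(max(oi, eps)) for ti, oi in zip(t, o))


# One-hot target for class 1 out of three classes.
target = [0.0, 1.0, 0.0]
predicted = [0.2, 0.7, 0.1]
loss = categorical_cross_entropy(target, predicted)
print(loss)
```

With a one-hot target only the true class contributes, so the loss is -log of the probability assigned to that class. With two classes, t = [1-t, t] and o = [1-o, o], the sum reduces to the binary case.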
http://www.stat.cmu.edu/~cshalizi/350/lectures/26/lecture-26.pdf
deeplearning.net/software/theano/library/tensor/nnet/nnet.html?highlight=cross%20entropy#theano.tensor.nnet.nnet.binary_crossentropy