I am trying to train a custom NER model to recognize 41 entities (the training set has about 6,000 lines):
java -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop austen.prop
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.ensure(AbstractCachingDiffFunction.java:136)
    at edu.stanford.nlp.optimization.AbstractCachingDiffFunction.derivativeAt(AbstractCachingDiffFunction.java:151)
    at edu.stanford.nlp.optimization.QNMinimizer.evaluateFunction(QNMinimizer.java:1150)
    at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:898)
    at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:856)
    at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:850)
    at edu.stanford.nlp.optimization.QNMinimizer.minimize(QNMinimizer.java:93)
    at edu.stanford.nlp.ie.crf.CRFClassifier.trainWeights(CRFClassifier.java:1935)
    at edu.stanford.nlp.ie.crf.CRFClassifier.train(CRFClassifier.java:1742)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:785)
    at edu.stanford.nlp.ie.AbstractSequenceClassifier.train(AbstractSequenceClassifier.java:756)
    at edu.stanford.nlp.ie.crf.CRFClassifier.main(CRFClassifier.java:3011)
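
For context, austen.prop follows the usual CRFClassifier training template from the Stanford NER FAQ; a sketch along those lines is below (the file paths are placeholders and the feature options shown are the FAQ's example values, not necessarily my exact file):

# training data and output model (placeholder paths)
trainFile = austen-train.tsv
serializeTo = ner-model.ser.gz
# column 0 is the token, column 1 is the gold label
map = word=0,answer=1

maxLeft = 1
useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useDisjunctive = true
useSequences = true
usePrevSequences = true
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC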
I tried adding -Xmx4096m to the java command to set the maximum heap size to 4 GB (the most my machine has available), but it still fails with the same error.
When I trained a model for 20 entities (about 1,500 lines), the same command worked flawlessly, without any heap space error.
Is this heap space related to RAM, or to available disk space?
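Concretely, the invocation with the heap flag looks roughly like this (note that -Xmx must appear before the class name; placed after it, the JVM would pass it to CRFClassifier as a program argument instead of raising the heap limit):

java -Xmx4096m -cp stanford-ner.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop austen.prop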