1. Run a compiled Java (bytecode) program
command: java XXXX
Do not include the ".class" suffix in the program name when calling the java command. If the suffix is included, an error is thrown.
2. Character encoding conversion in Linux
no.1 Check file encoding:
file --mime-encoding filename
no.2 Check the encodings available on the system
iconv -l
no.3 Convert (iconv writes the result to standard output, so redirect it to a new file)
iconv -f old_encoding -t new_encoding filename > new_filename
3. Java IO BufferedWriter
Data written through a BufferedWriter is held in an in-memory buffer and is not guaranteed to reach the target file until the buffer is flushed. close() flushes before closing, so the BufferedWriter must always be closed (or flush() called); otherwise the tail of the output may be lost.
4. Java IO R/W UTF-8 text file
The OutputStreamWriter constructor accepts an encoding parameter, so it can wrap a FileOutputStream object to write text in a specific encoding such as UTF-8. For reading, InputStreamReader wraps a FileInputStream in the same way.
5. ICTCLAS2015 user defined dictionary path
Data/FieldDict.pdat, Data/FieldDict.pos
6. The ICTCLAS2015 user dictionary must be ANSI-encoded to be imported correctly into the Data/FieldDict.pdat and Data/FieldDict.pos files
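7. Python ctypes.c_char_p()
c_char_p() can be constructed from a bytes object, from an integer memory address, or from None. When an address is passed, it must point at the NUL-terminated character data. Note that in CPython, id() returns the address of the Python object itself, not of its character data, so it is not a substitute for that address.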
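A minimal sketch of the two ways to build a c_char_p, assuming Python 3 on CPython. The variable names are illustrative only. It passes a bytes object directly, then passes an integer address obtained with ctypes.addressof() on a buffer created by ctypes.create_string_buffer().

```python
import ctypes

# Simplest case: hand c_char_p a bytes object; no address lookup is needed.
text = b"hello"
ptr = ctypes.c_char_p(text)
value = ptr.value

# Address case: the integer must point at the character data itself.
# create_string_buffer() allocates a NUL-terminated, mutable char array,
# and addressof() returns the address of that array's first byte.
buf = ctypes.create_string_buffer(b"world")
addr = ctypes.addressof(buf)
ptr_from_addr = ctypes.c_char_p(addr)
value_from_addr = ptr_from_addr.value

# For comparison: the address the pointer actually holds versus id(text).
# On CPython, id() is the address of the object, not of its bytes payload.
data_addr = ctypes.cast(ptr, ctypes.c_void_p).value
object_addr = id(text)
```

Keep buf alive for as long as ptr_from_addr is in use, since the pointer does not own the memory it refers to.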
8. The HanLP segment.seg() function removes, by default, all "\n" characters from the text strings. How to change this behavior is unknown.
9. Java:
If a source file has no package statement, its classes, interfaces, enumerations, and annotation types are placed in the unnamed package.
10. Python Requests Module Timeout
requests.get("XXX", timeout=1)
The module is named requests, and timeout is given in seconds.
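A hedged sketch of how the timeout might be handled in practice, assuming the requests package is installed. The helper name fetch is hypothetical, and the URL in the usage comment is only a placeholder. When the server does not answer within the limit, requests raises requests.exceptions.Timeout, which the helper turns into None.

```python
import requests


def fetch(url, timeout=1.0):
    """Return the response body, or None if the request timed out.

    `timeout` is in seconds and applies to connecting and to waiting
    for the server to send data.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raise on 4xx/5xx status codes
        return response.text
    except requests.exceptions.Timeout:
        # Covers both connect timeouts and read timeouts.
        return None


# Usage (placeholder URL, not from the original notes):
#   body = fetch("https://example.com/", timeout=1)
#   if body is None:
#       print("request timed out")
```

Without a timeout argument, requests.get() can wait indefinitely, so passing one explicitly is usually the safer choice.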