Results first:

Model \ Quantization | q4_0 | q4_1 | q5_0 | q5_1 | q8_0 | fp16 |
---|---|---|---|---|---|---|
llama-7b | 6.157 | 6.0915 | 5.9846 | 5.948 | 5.9063 | 5.68 |
llama-13b | 5.385 | 5.3608 | 5.285 | 5.2702 | 5.2547 | 5.09 |
llama-30b | 4.2707 | - | - | - | - | 4.1000 |
alpaca-30b | 4.4521 | - | - | - | - | - |
llama-2-7b | 5.9675 | 6.0398 | 5.8328 | 5.8435 | 5.7897 | - |
llama-2-7b-chat | 7.7641 | 7.7853 | 7.5055 | 7.5392 | 7.5014 | - |
llama-2-13b | 5.2172 | 5.2115 | 5.1343 | 5.1289 | 5.1005 | - |
llama-2-13b-chat | 6.6296 | 6.7059 | 6.5336 | 6.5771 | 6.5361 | - |