Paper: https://arxiv.org/pdf/2303.08774v1.pdf
GitHub: openai/evals, a framework for evaluating OpenAI models and an open-source registry of benchmarks.