The “it” in AI models is the dataset.
I’ve been at OpenAI for almost a year now. In that time, I’ve trained a lot of generative models. More than anyone really has any right to train. As I’ve spent these hours observing the effects of tweaking various model configurations and hyperparameters, one thing that has struck me is the similarity between all the training runs.
It’s becoming awfully clear to me that these models are truly approximating their datasets to an incredible degree. What that means is not only that they learn what it means to be a dog or a cat, but also that they learn the interstitial frequencies between distributions that don’t matter, like what photos humans are likely to take or words humans commonly write down.