1、What is an LLM?
- An LLM (large language model) is an instance of a foundation model: a model pre-trained on large amounts of unlabeled data using self-supervised learning.
2、How do they work?
$$\text{LLM} = \text{data} + \text{architecture} + \text{training}$$
- The architecture is the neural network design, for example the Transformer used by GPT.
- During training, the model learns to predict the next word in a sentence. At the beginning, GPT might predict that “The sky is” is followed by “Bug” (an essentially random guess). With each iteration, the model adjusts its internal parameters to reduce the difference between its predictions and the actual outcomes.
- After this pre-training, the model can be fine-tuned on a smaller, more specific dataset. This refines its behavior so that it handles a particular task more accurately.
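
The training and fine-tuning steps above can be sketched with a toy model. This is only an illustration under strong assumptions, not how GPT is built: instead of a Transformer, the "model" is a plain table of scores that looks at just the previous word, and the vocabulary, the sentences, the learning rate, and all function names (`train`, `predict`, `loss`, `to_pairs`) are made up for the example. What it does show is the loop described above: start from a bad guess, then repeatedly nudge the parameters to shrink the gap between prediction and actual next word, first on a general sentence and then on a smaller, more specific one.

```python
import math

# Hypothetical toy vocabulary, chosen only for this illustration.
VOCAB = ["the", "sky", "is", "blue", "bug", "cloudy"]
INDEX = {word: i for i, word in enumerate(VOCAB)}


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_pairs(sentence):
    """Split a sentence into (previous word, next word) training pairs."""
    words = sentence.split()
    return list(zip(words, words[1:]))


def loss(params, pairs):
    """Average cross-entropy: the gap between predictions and actual next words."""
    total = 0.0
    for prev, nxt in pairs:
        probs = softmax(params[INDEX[prev]])
        total -= math.log(probs[INDEX[nxt]])
    return total / len(pairs)


def train(params, pairs, steps, lr=0.5):
    """Adjust params in place with gradient descent on the cross-entropy loss."""
    for _ in range(steps):
        for prev, nxt in pairs:
            row = params[INDEX[prev]]
            probs = softmax(row)
            for j in range(len(VOCAB)):
                target = 1.0 if j == INDEX[nxt] else 0.0
                # Gradient of cross-entropy w.r.t. a score is (probability - target).
                row[j] -= lr * (probs[j] - target)


def predict(params, prev):
    """Return the highest-scoring next word after `prev`."""
    row = params[INDEX[prev]]
    return VOCAB[max(range(len(VOCAB)), key=row.__getitem__)]


# params[i][j] scores word j as the successor of word i.
params = [[0.0] * len(VOCAB) for _ in VOCAB]
params[INDEX["is"]][INDEX["bug"]] = 1.0  # stand-in for a bad initial guess

# "Pre-training" on a general sentence.
pretrain_pairs = to_pairs("the sky is blue")
before = predict(params, "is")
loss_before = loss(params, pretrain_pairs)
train(params, pretrain_pairs, steps=50)
after_pretraining = predict(params, "is")
loss_after = loss(params, pretrain_pairs)

# "Fine-tuning" on a smaller, more specific dataset (assumed: weather reports).
train(params, to_pairs("sky is cloudy"), steps=30)
after_finetuning = predict(params, "is")

print(before, after_pretraining, after_finetuning)
```

Run as written, the prediction after “is” should move from “bug” to “blue” after the first training pass, and then to “cloudy” after fine-tuning, with the loss on the first sentence dropping along the way. Because this toy model only sees one previous word, fine-tuning simply overwrites the earlier preference; a real LLM conditions on much more context and has far more parameters, so the effect of fine-tuning is more targeted.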