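A Roundup of Prompt & LLM Papers, Open-Source Data & Models, and AIGC Applications from China and Abroad (continuously updated; bookmark for reference)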


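The contents are organized as follows:

  1. Chinese and international LLMs, and vertical-domain LLMs
  2. Training frameworks for agents, instruction tuning, and more
  3. Open-source training data for instruction tuning, pre-training, RLHF, dialogue, and agents
  4. AIGC applications
  5. Prompt-writing guides, five-star blogs, and other resources
  6. Prompt and LLM papers, grouped by research direction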


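A prompt, in natural language processing (NLP) and especially when working with pre-trained language models, is a piece of text or a question used to steer the model toward a particular kind of output. Effective prompt design can significantly improve model performance and help the model adapt to a specific task. A prompt can be a concrete instruction, a question, or a set of examples; it forms part of the model's input and helps the model understand the task it is being asked to complete.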
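As a small illustration of the three prompt ingredients named above (instruction, examples, question), the sketch below assembles a few-shot prompt as a plain string. The function name `build_prompt`, the `Input:`/`Output:` labels, and the sample reviews are all assumptions made for illustration; no particular model or API is implied.

```python
def build_prompt(instruction, examples, query):
    """Assemble an instruction, worked examples, and a new query into one prompt."""
    lines = [instruction.strip(), ""]
    for source, target in examples:
        # Each demonstration shows the model the expected input/output format.
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final item is left open so the model completes the output.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


prompt = build_prompt(
    "Classify the sentiment of each review as positive or negative.",
    [
        ("The plot was gripping.", "positive"),
        ("I want my money back.", "negative"),
    ],
    "The soundtrack was wonderful.",
)
print(prompt)
```

The resulting string would be sent to a language model as its input; changing the instruction or the demonstrations changes the task without touching the model's weights.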

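Large language models (LLMs), such as GPT-3, BERT, and T5, are complex neural networks obtained by pre-training on large amounts of text. These models capture the deeper structure and semantics of language and therefore perform strongly across many NLP tasks, including text generation, translation, summarization, and question answering. Training an LLM usually takes substantial compute, so it places high demands on hardware and on dataset scale.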

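Open-source data and models are datasets and pre-trained models that are publicly available and can be freely used and modified. Open datasets such as Wikipedia, Common Crawl, and ImageNet provide rich resources for training and testing models. Open-source model tooling, such as Hugging Face's Transformers library, provides implementations of pre-trained models, so researchers and developers can innovate and build applications on top of existing work.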

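AIGC (AI-generated content) refers to applications that use artificial intelligence to generate content automatically. The content can be text, images, audio, or video. Examples of AIGC applications include:

  • Text generation: automatic writing assistants, news generation, social media content creation, and so on.
  • Artistic creation: AI painting, music composition, poetry generation, and so on.
  • Media editing: video editing, image restoration, speech synthesis, and so on.
  • Game development: automatic generation of game levels, character dialogue, and storylines.

The development of AIGC has transformed content creation, letting individuals and companies produce new content at lower cost and with higher efficiency.

Prompts, LLMs, open-source data and models, and AIGC applications together drive progress in AI content generation. Prompting techniques improve LLM performance on specific tasks, open-source data and models lower the barrier to research and development, and AIGC applications show what AI can do in creative tasks. As the technology advances, more innovative AIGC applications can be expected, further changing how we interact with information and media.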

Warm-Up

One-Minute Quick Start series: https://blog.csdn.net/u014374009/category_12451843.html

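LLMs

Model Evaluation

Leaderboard | Result
AlpacaEval: LLM-based automatic evaluation | Top open-source models: Vicuna, OpenChat, WizardLM
Hugging Face Open LLM Leaderboard | Evaluates open-source models only; Falcon takes first place on MMLU, and Vicuna takes first place on the LLM leaderboard built from four EleutherAI evaluation sets
https://opencompass.org.cn/ | Open-source leaderboard from Shanghai AI Laboratory
Berkeley's LLM ranking arena, which includes a quasi-Chinese leaderboard | Elo rating mechanism; GPT-4 unsurprisingly holds first place: GPT4 > Claude > GPT3.5 > Vicuna > others
CMU's open-source chatbot evaluation app | ChatGPT > Vicuna > others; training on conversational data may matter a lot
Z-Bench, ZhenFund's Chinese-language evaluation | Chinese domestic models are still relatively weak for programming and perform at roughly the same level; the two ChatGLM versions show a clear improvement
Chain-of-thought evaluation | Leaderboard for complex problems such as GSM8k and MATH
InfoQ comprehensive LLM capability evaluation | Chinese-oriented; ChatGPT > ERNIE Bot (文心一言) > Claude > Spark (星火)
ToolBench: tool-calling evaluation leaderboard | Compares tool-tuned models against ChatGPT and provides evaluation scripts
AgentBench: reasoning and decision-making leaderboard | Released by Tsinghua together with several universities; measures reasoning and decision-making across task environments such as shopping, household, and operating-system scenarios
FlagEval | BAAI's LLM leaderboard combining subjective and objective scoring
Bird-Bench | NL2SQL leaderboard built on very large databases that better reflect real-world applications and require domain knowledge; models still have some way to go before they catch up with humans
kola | Benchmark centered on world knowledge, covering known encyclopedic knowledge and unseen web content published in roughly the last 90 days; evaluates knowledge memorization, understanding, application, and creation
CEVAL | Chinese knowledge evaluation covering 52 subjects; machine-scored, mainly multiple choice
CMMLU | Chinese knowledge and reasoning evaluation across 67 topics; machine-scored multiple choice
LLMEval3 | Fudan's knowledge QA leaderboard covering university assignments and exam questions; the question bank comes from non-internet sources where possible, to keep models from cheating
QuantBench | Quantitative leaderboard for AI-driven investing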
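The Elo rating mechanism mentioned for the Berkeley arena leaderboard can be sketched as follows. This is a minimal illustration of the standard Elo update after a single head-to-head comparison, not the arena's actual code; the K-factor of 32 and the starting ratings of 1000 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated ratings after one comparison.

    score_a is 1.0 if A wins, 0.0 if B wins, and 0.5 for a tie.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two models start level; model A wins one human-judged comparison.
new_a, new_b = update_elo(1000.0, 1000.0, 1.0)
print(new_a, new_b)  # 1016.0 984.0
```

Because the two updates are symmetric, the total rating in the pool stays constant; a win against a higher-rated model moves both ratings more than a win against a lower-rated one.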


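Open-Source Models from Outside China

Model link | Model description
OpenSora | We waited for OpenAI and got OpenSora instead; nice joke
GROK | Musk open-sources Grok-1: 314 billion parameters, the largest to date, with weights and architecture fully open
Gemma | Google's open-source 2B and 7B models, free for commercial use; the open-source crown has changed hands
Mixtral | France's "OpenAI" open-sources an 8*7B MoE model with 32K context, trained with MegaBlocks
Mistral7B | France's "OpenAI" open-sources Mistral, which surpasses Llama 2 and is currently the best 7B model
Dolphin-2.2.1-Mistral-7B | Mistral 7B fine-tuned on the dolphin dataset
LLama2 | "Open" Meta arrives with the commercially usable open-source Llama 2 models
LLaMA | Meta's open-source foundation LLMs, ranging from 7 billion to 65 billion parameters
WizardLM | Microsoft's newly released 13B model, in the top 3 open-source models on AlpacaEval; fine-tunes Llama 2 on instructions whose complexity was evolved with ChatGPT
Falcon | Trained by the UAE's Technology Innovation Institute on 1 trillion very high-quality tokens; 1B, 7B, and 40B open-sourced and free for commercial use. For the deep-pocketed, money is apparently no object
Vicuna | Open-sourced by former Alpaca members and others; instruction-tuned from LLaMA 13B on ShareGPT data; proposed using GPT-4 to evaluate model quality
OpenChat | Llama-2 13B fine-tuned on 80k ShareGPT conversations; a standout among open-source models
Guanaco | LLaMA 7B base, fine-tuned on the Alpaca 52K data plus 534K multilingual instruction samples
MPT | MosaicML's open-source pre-trained + instruction-tuned model; commercially usable; supports very long inputs of 84k tokens
RedPajama | After open-sourcing its pre-training data, the RedPajama project released 3B and 7B pre-trained + instruction-tuned models
koala | LLaMA fine-tuned on open instruction sets such as Alpaca and HC3 plus ChatGPT data such as ShareGPT; ranks fairly high on leaderboards
ChatLLaMA | LLaMA fine-tuned with RLHF
Alpaca | Stanford's open-source model, fine-tuned from the 7B LLaMA on 52k samples
Alpaca-lora | LLaMA fine-tuned with LoRA
Dromedary | IBM self-aligned model with the LLaMA base
ColossalChat | HPC-AI Tech's open-source LLaMA + RLHF fine-tune
MiniGPT4 | Vicuna + BLIP2 for text-vision fusion
StackLLama | LLaMA trained on StackExchange data with SFT + RL
Cerebras | Cerebras open-sourced seven models from 100 million to 13 billion parameters, fully open from pre-training data to weights
Dolly-v2 | Commercially usable 7B instruction-tuned open-source model, fine-tuned from GPT-J-6B
OpenChatKit | Built by OpenAI researchers; fine-tuned GPT-NeoX-20B plus a 6B moderation model for filtering
MetaLM | Microsoft's open-source large-scale self-supervised pre-trained model
Amazon Titan | Amazon adds its own LLMs on AWS
OPT-IML | Meta's replication of GPT-3, up to 175B, though it does not match GPT-3 in quality
Bloom | From BigScience; largest size 176B
BloomZ | From BigScience; fine-tuned from Bloom
Galactica | Similar to Bloom, but trained more specifically for the scientific research domain
T0 | From BigScience; 3B to 11B models instruction-tuned from T5
EXLLama | Python/C++/CUDA implementation of Llama for use with 4-bit GPTQ weights
LongChat | llama-13b fine-tuned for long text using the condensing rotary embedding technique
MPT-30B | MosaicML's open-source large model trained on 8K-token sequences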

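Open-Source Models from China

Model link | Model description
Baichuan2 | Baichuan's second generation, with 7B/13B Base and Chat versions
Baichuan | Baichuan Intelligence's open-source 7B model, free for commercial use
ziya2 | Ziya2, trained on top of Llama 2, has finally finished training
ziya | IDEA Research Institute's continued pre-training on 7B/13B LLaMA + SFT + RM + PPO + HFTT + COHFT + RBRS
Qwen1.5 | Tongyi Qianwen upgraded to 1.5, supporting a 32K context
Qwen1-7B+14B+70B | Open-sourced by Alibaba and commercially usable; Tongyi Qianwen 7B, 14B, and 70B Base and Chat models
InternLM2 7B+20B | SenseTime's InternLM 2 (书生), supporting a 200K context
Orion-14B-LongChat | OrionStar's multilingual model, supporting a 320K context
ChatGLM3 | ChatGLM3 released, with tool calling and other new features, though its generalization remains to be assessed
ChatGLM2 | 32K long context, memory optimization via FlashAttention + Multi-Query Attention, and stronger reasoning; that said, it insists on chain-of-thought even for many simple questions, and its Chinese-English parallel ability seems to have slipped slightly. But it is free for commercial use!
ChatGLM | Tsinghua's open-source bilingual (Chinese-English) dialogue language model, trained with code, instruction tuning, and RLHF. ChatGLM2 supports very long text and is now free for commercial use!
Yuan-2.0 | Inspur releases Yuan2.0 at 2B, 51B, and 102B
YI-200K | 01.AI open-sources 6B and 34B models with a very long 200K context
YI | 01.AI open-sources 34B and 6B models
XVERSE-256K | XVERSE (元象) releases a 13B model free for commercial use; the context is very long, but
XVERSE | XVERSE (元象) releases a 13B model free for commercial use
DeepSeek-MOE | DeepSeek's DeepSeekMoE 16B Base and Chat models
DeepSeek | DeepSeek's 7B and 67B models
LLama2-chinese | We did not have to wait long: Llama 2 with Chinese pre-training and fine-tuning is here
YuLan-chat2 | Gaoling School of Artificial Intelligence: continued Chinese-English bilingual pre-training on Llama-2 + instruction/dialogue fine-tuning
BlueLM | Open-source LLM from the vivo AI Lab
zephyr-7B | The HuggingFace team trained Zephyr-7B on UltraChat and UltraFeedback
XWin-LM | llama2 + SFT + RLHF
Skywork | Kunlun Tech's Tiangong (Skywork) team open-sources a commercially usable 13B model
Chinese-LLaMA-Alpaca | Harbin Institute of Technology's Chinese instruction-tuned LLaMA
Moss | A vindication for Fudan! All pre-training and instruction-tuning data and models are open-sourced. Commercially usable
InternLM | InternLM (书生浦语), a multilingual hundred-billion-parameter base model trained on over a trillion tokens
Aquila2 | BAAI updates the Aquila2 series, including a brand-new 34B
Aquila | BAAI's open-source 7B model, free for commercial use
UltraLM series | ModelBest (面壁智能) open-sources UltraLM 13B, the reward model UltraRM, and the critique model UltraCM
PandaLLM | Continued pre-training on Chinese wiki on top of LLaMA 2 + COIG instruction tuning
XVERSE | XVERSE's open-source 13B model, said to surpass Llama 2 in Chinese
BiLLa | Three-stage training on LLaMA: vocabulary-expansion pre-training + SFT with a 1:1 mix of pre-training and task data + SFT on instruction samples
Phoenix | CUHK open-sources the Phoenix and Chimera LLMs; Bloom base; 40+ languages supported
Wombat-7B | DAMO Academy's open-source language model aligned with RRHF, with no reinforcement learning required; Alpaca base
TigerBot | TigerBot (虎博) open-sourced 7B and 180B models along with pre-training and fine-tuning corpora
Luotuo | Chinese instruction-tuned LLaMA and ChatGLM
OpenBuddy | LLaMA fine-tuned for multilingual dialogue
Chinese Vicuna | LLaMA 7B base, trained on Belle + Guanaco data
Linly | LLaMA 7B base, trained on seven instruction-tuning datasets including belle + guanaco + pclue + firefly + CSL + newscommentary
Firefly | Chinese 2.6B model that improves Chinese writing and classical Chinese ability; the full training code is yet to be open-sourced, and only the model is available for now
Baize | LLaMA fine-tuned on 100k self-chat dialogues
BELLE | Optimizes open-source models for Chinese using ChatGPT-generated data
Chatyuan | The earliest Chinese open-source dialogue model after ChatGPT came out; T5 architecture; a derivative of PromptCLUE below
PromptCLUE | Multi-task prompt language model
PLUG | Large model from Alibaba DAMO Academy; a download link is provided after submitting an application
CPM2.0 | BAAI releases CPM2.0
GLM | Tsinghua's bilingual Chinese-English 130B pre-trained model
BayLing | Based on LLaMA 7B/13B; an English/Chinese LLM with enhanced language alignment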


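LLM Applications with Free Trials, in China and Abroad

Model link | Model description
PPLX-7B/70B | Perplexity.ai's Playground supports their own PPLX models and many SOTA LLMs; Gemma is now supported too
kimi Chat | Moonshot's ultra-long-context LLM, accepting up to 200,000 characters of input; unbeatable at document summarization
iFLYTEK Spark (讯飞星火) | iFLYTEK
ERNIE Bot (文心一言) | Baidu
Tongyi Qianwen (通义千问) | Alibaba
Baichuan (百川) | Baichuan
ChatGLM | Zhipu Qingyan (智谱清言)
DeepSeek | DeepSeek (深度求索)
360 Zhinao (360智脑) | 360
Wukong (悟空) | ByteDance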

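Vertical-Domain Models & Progress

Domain | Model link | Model description
Medical | MedGPT | Released by Medlinker (医联)
Medical | MedPalm | Built by Google on Flan-PaLM through prompt-tuning/instruction tuning on several types of medical QA data; MultiMedQA was constructed alongside it
Medical | ChatDoctor | Instruction-tuned on 110K real doctor-patient dialogues + 5K ChatGPT-generated samples
Medical | Huatuo Med-ChatGLM | Chinese medical instruction dataset built from a medical knowledge graph and ChatGPT + multi-turn QA data built from medical literature and ChatGPT
Medical | Chinese-vicuna-med | Chinese-vicuna fine-tuned on cMedQA2 data
Medical | OpenBioMed | Tsinghua AIR's open-source lightweight BioMedGPT; knowledge graph & multimodal pre-trained models for 20+ biological research areas
Medical | DoctorGLM | GLM fine-tuned on ChatDoctor + MedDialog + CMD multi-turn dialogue and single-turn instruction samples
Medical | MedicalGPT-zh | ChatGPT-generated QA from a self-built medical database + self-constructed scenario dialogues across 16 settings
Medical | PMC-LLaMA | LLaMA fine-tuned on medical papers
Medical | PULSE | Bloom fine-tuning + continued pre-training
Medical | NHS-LLM | Model fine-tuned on ChatGPT-generated medical QA and dialogues
Medical | ShenNong (神农) medical LLM | LLaMA-7B fine-tuned on 110K+ traditional Chinese medicine (TCM) instruction samples generated around entities in a TCM knowledge graph
Medical | Qihuang Wendao (岐黄问道) LLM | Three sub-models: a clinical treatment model for diagnosed diseases + a symptom-based clinical diagnosis and treatment model + a TCM wellness and conditioning model; appears aimed at B2B deployment
Medical | Zhongjing | Chinese medical LLM based on Ziya-LLaMA + medical pre-training + SFT + RLHF
Medical | MeChat | Psychological counseling; 56k multi-turn dialogues rewritten with ChatGPT
Medical | SoulChat | Psychological counseling; ChatGLM-6B jointly instruction-tuned on Chinese long-text instructions and multi-turn empathetic dialogue data
Medical | MindChat | MindChat-Baichuan-13B, Qwen-7B, and MindChat-InternLM-7B use different bases and are strengthened for model safety, empathy, and alignment with human values
Medical | DISC-MedLLM | QA pairs built from a disease knowledge graph + QA pairs converted into single-turn dialogues + reconstructed real-world data + human-preference data filtering; SFT on Baichuan
Legal | LawGPT-zh | 52k single-turn QA pairs obtained by cleaning the CrimeKgAssitant dataset with ChatGPT + scenario QA that ChatGPT generated from the 9k most central legal provisions in the PRC law handbook + knowledge QA pairs that ChatGPT built from text
Legal | LawGPT | Based on LLaMA + expanded vocabulary with secondary pre-training + instruction tuning on QA built from legal provisions
Legal | Lawyer Llama | Instruction-tuned on a legal instruction dataset: consultation + legal exams + dialogue
Legal | LexiLaw | Instruction-tuned on a legal instruction dataset: QA + explanations of concepts from books + statute text
Legal | ChatLaw | Peking University's legal LLM; a novel application format, similar to an in-channel flow, in which every feature is folded into the conversation
Legal | Luwen (录问) model | 40G of secondary pre-training on Baichuan + 100K instruction-tuning samples; the knowledge base combines embeddings + intent + keyword association
Finance | FinChat.io | Trained on the latest financial data, earnings-call transcripts, quarterly and annual reports, investment books, and more
Finance | OpenGPT | Framework for generating domain LLM instruction samples + fine-tuning
Finance | Qianyuan BigBang (乾元BigBang) 200M-parameter finance model | Finance-domain pre-training + task fine-tuning
Finance | Du Xiaoman (度小满) hundred-billion-parameter finance LLM | Finance + Chinese pre-training and fine-tuning on top of Bloom-176B
Finance | bondGPT | GPT-4 applied to a niche bond market; applications open
Finance | IndexGPT | JPMorgan's generative investment advisor, under development
Finance | Hundsun LightGPT (恒生LightGPT) | Continued pre-training in finance + plugin-based design
Finance | Zhibi Alpha (知彼阿尔法) | Qichacha's (企查查) business-lookup LLM
Finance | AlphaBox | Shangjian Technology (熵简科技) releases an LLM finance application: multi-document QA + meeting transcription + document editing
Finance | Caozhi (曹植) | Datagrand (达观) releases a finance LLM that integrates finance tasks such as data2text to support report writing
Finance | Cornucopia (聚宝盆) | Model based on LLaMA-family base models, instruction-tuned on Chinese financial knowledge
Finance | PIXIU | Collected multiple financial task datasets and added time-series data for instruction tuning
Finance | ChatFund | The first fund LLM, released by Jiuquaner (韭圈儿); it appears to use multi-task instruction tuning and is fully integrated with the app's existing data features, from fund selection to holdings analysis
Finance | FinGPT | Fine-tuning on traditional finance tasks, or ChatGPT-generated financial tool calls
Finance | CFGPT | Finance pre-training + instruction tuning + retrieval augmentation such as RAG
Finance | Kuangke FOF intelligent investment advisor (况客FOF智能投顾) | Fund LLM application for fund advisory; supports NL2SQL-style data queries, fund comparison queries, and more
Finance | DISC-FinLLM | Fudan's finance system combining several fine-tuned models, covering financial knowledge QA, financial NLP tasks, financial computation, and financial retrieval QA
Finance | InvestLM | LLaMA-65+ fine-tuned on financial instructions built from CFA exams, SEC filings, StackExchange investment questions, and more
Finance | HithinkGPT | Tonghuashun (同花顺) releases the finance LLM Wencai (问财), covering queries, analysis, comparison, interpretation, prediction, and other question types
Finance | Wuya Infinity (无涯Infinity) | Finance LLM released by Transwarp (星环科技)
Finance | Miaoxiang (妙想) | East Money's self-developed finance LLM, open for trial
Finance | DeepMoney | Fine-tuned from yi-34b-200k on financial research reports
Coding | Starcoder | Coding LLM trained on 80 programming languages + issues + commits
Coding | ChatSQL | NL2SQL built on ChatGLM
Coding | codegeex | 13B pre-trained + fine-tuned multilingual coding LLM
Coding | codegeex2 | Built on ChatGLM2; CodeGeeX2-6B is further pre-trained on 600B tokens of code data
Coding | stablecode | Multilingual pre-training on 560B tokens + alignment on 120,000 Alpaca instructions
Coding | SQLCoder | 15B model fine-tuned from StarCoder; surpasses GPT-3.5
Math | MathGPT | Developed in-house by TAL (好未来) for math enthusiasts and research institutions worldwide; an LLM centered on problem-solving and problem-explanation algorithms
Math | MAmmoTH | Built the MathInstruct dataset with CoT + PoT and fine-tuned LLaMA; surpasses WizardLM on OOD datasets
Math | MetaMath | Solves math problems with backward reasoning; built the new MetaMathQA dataset and fine-tuned Llama 2
Transportation | TransGPT | LLaMA-7B + 346,000 domain pre-training samples + fine-tuning on 58,000 domain instruction dialogues (derived from document QA)
Transportation | TrafficGPT | ChatGPT + prompts handle planning and call specialized traffic-flow TFM models; the TFMs handle data analysis, task execution, visualization, and so on
Science & technology | Mozi | RedPajama pre-training + paper QA dataset + ChatGPT-expanded research dialogue data
Astronomy | StarGLM | Instruction tuning on astronomy knowledge; the project is ongoing, and later plans include secondary pre-training on astronomy data + KG
Writing | Yuewen (阅文) web-fiction LLM introduction | In closed beta with contracted authors; focuses on generating supporting passages such as fight scenes, plot transitions, scene description, character design, and worldbuilding
Writing | MediaGPT | LLaMA-7B with expanded vocabulary + instruction tuning; instructions come from 80 news-writing subtasks specified by Chinese media experts
E-commerce | EcomGPT | Instruction-tuned LLM for e-commerce tasks; 2.5 million instruction samples; the base model is Bloomz
Plant science | PLLaMa | Domain model extending LLaMA through continued pre-training on plant-science papers + SFT
Evaluation | Auto-J | Shanghai Jiao Tong University open-sources a 13B model for value-evaluation alignment
Evaluation | JudgeLM | BAAI open-sources the JudgeLM judge model, which can evaluate various LLMs efficiently and accurately
Evaluation | CritiqueLLM | Zhipu AI releases the scoring model CritiqueLLM, supporting evaluation with or without reference text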

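New-Generation RAG Embedding Models

Model link | Model description
Jina-ColBERT | Jina AI open-sources Chinese/English/German long-text embeddings with 8192 tokens
BGE-M3 | BAAI open-sources multilingual long-text embeddings with sparse + dense representations and 8192 tokens
BCE | NetEase open-sources an embedding model better suited to RAG tasks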
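To make the "sparse + dense" idea in the table above concrete, the sketch below blends a dense cosine similarity with a sparse lexical-overlap score. It illustrates hybrid retrieval scoring in general and is not the scoring used by any model listed here; the function names, the toy vectors, the per-token weights, and the mixing weight `alpha` are all assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def sparse_overlap(query_weights, doc_weights):
    """Sum of weight products over the tokens shared by query and document."""
    return sum(w * doc_weights[t] for t, w in query_weights.items() if t in doc_weights)


def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha=0.5):
    """Weighted blend: alpha for the dense score, (1 - alpha) for the sparse score."""
    return alpha * cosine(q_dense, d_dense) + (1.0 - alpha) * sparse_overlap(q_sparse, d_sparse)


score = hybrid_score(
    [1.0, 0.0], [1.0, 0.0],                 # toy dense embeddings
    {"llm": 0.5}, {"llm": 0.4, "rag": 1.0},  # toy per-token lexical weights
)
print(round(score, 3))  # 0.6
```

Dense similarity captures paraphrase-level matches, while the sparse term rewards exact token overlap; in practice the mixing weight would be tuned on a validation set.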

Tool and Library

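Inference Frameworks

Tool description | Link
FlexFlow: model deployment and inference framework | https://github.com/flexflow/FlexFlow
Medusa: inference acceleration framework for sampling-based decoding; can be combined with other strategies | https://github.com/FasterDecoding/Medusa
FlexGen: CPU-offload computing architecture for LLM inference | https://github.com/FMInference/FlexGen
VLLM: ultra-fast inference framework, the unsung hero behind Vicuna and the Arena; 24x faster than HF and supports many base models | https://github.com/vllm-project/vllm
Streamingllm: new attention-sink scheme that extends inference length without fine-tuning and also speeds up inference | https://github.com/mit-han-lab/streaming-llm
llama2.c: inference framework for llama2 in pure C | https://github.com/karpathy/llama2.c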

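Instruction Tuning, Pre-training, and RLHF Frameworks

Tool description | Link
LoRA: low-rank instruction fine-tuning approach | https://github.com/tloen/alpaca-lora
peft: toolkit for parameter-efficient prompt tuning | https://github.com/huggingface/peft
RL4LMs: AllenAI's RL toolkit | https://github.com/allenai/RL4LMs
RLLTE: open-source learning framework jointly released by HKU, DJI, and others | https://github.com/RLE-Foundation/rllte
trl: Transformer-based reinforcement learning training framework | https://github.com/lvwerra/trl
trlx: distributed-training version of trl | https://github.com/CarperAI/trlx
PKU's open-source Beaver (河狸) project: reproducible RLHF, supports most LLMs, provides RLHF data | https://github.com/PKU-Alignment/safe-rlhf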
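LMFlow: LLM fine-tuning framework from an HKUST lab; supports instruction tuning and RLHF for most of the open-source models above | https://github.com/OptimalScale/LMFlow
hugNLP: built on Hugging Face; integrates prompting, pre-training, knowledge injection, and other approaches | https://github.com/wjn1996/HugNLP
Deepspeed: integrated optimization for RL training and inference | https://github.com/microsoft/DeepSpeed
Uerpy: pre-training framework supporting lm, mlm, unilm, and more | https://github.com/dbiir/UER-py
TencentPretrain: refactored version of Uerpy, supporting LLaMA pre-training | https://github.com/Tencent/TencentPretrain/tree/main
lamini: library integrating instruction-data generation, SFT, and RLHF | https://github.com/lamini-ai/lamini/
Chain-of-thought-hub: platform for evaluating model reasoning ability | https://github.com/FranxYao/chain-of-thought-hub
EasyEdit: Zhejiang University's open-source tool for precise model knowledge editing, supporting multiple models and methods | https://github.com/zjunlp/EasyEdit
OpenDelta: open-source implementations of various delta-tuning methods | https://github.com/thunlp/OpenDelta
Megablocks: MoE training framework | https://github.com/stanford-futuredata/megablocks
Tutel: MoE training framework | https://github.com/microsoft/tutel
TradingGym: stock-trading reinforcement learning simulator modeled on OpenAI Gym | https://github.com/astrologos/tradinggym
LongLora: long-context fine-tuning framework | https://github.com/dvlab-research/LongLoRA
LlamaGym: online RL fine-tuning framework | https://github.com/KhoomeiK/LlamaGym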
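The low-rank idea behind the LoRA entry in the table above can be sketched in a few lines: the pre-trained weight W0 stays frozen, and only two small matrices B and A are trained, giving an effective weight W0 + (alpha / r) * B A. The code below uses plain Python lists so it runs without dependencies; the helper names and the toy numbers are assumptions for illustration, and none of the listed frameworks is implemented this way.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def lora_effective_weight(w0, b, a, alpha, rank):
    """Return W0 + (alpha / rank) * B @ A, with W0 left unchanged."""
    delta = matmul(b, a)  # shape (d_out, d_in), but of rank at most `rank`
    scale = alpha / rank
    return [
        [w0[i][j] + scale * delta[i][j] for j in range(len(w0[0]))]
        for i in range(len(w0))
    ]


def trainable_params(d_out, d_in, rank):
    """Parameter counts for full fine-tuning versus the LoRA factors."""
    return d_out * d_in, rank * (d_out + d_in)


w0 = [[1.0, 0.0], [0.0, 1.0]]  # frozen pre-trained weight (toy 2x2)
b = [[1.0], [2.0]]             # trainable, shape (d_out, r) with r = 1
a = [[0.5, 0.0]]               # trainable, shape (r, d_in)
print(lora_effective_weight(w0, b, a, alpha=2.0, rank=1))  # [[2.0, 0.0], [2.0, 1.0]]
print(trainable_params(4096, 4096, 8))                     # (16777216, 65536)
```

For a 4096 x 4096 layer at rank 8, the trainable factors hold 65,536 parameters against roughly 16.8 million for the full matrix, which is why the approach fits on much smaller hardware.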

Auto/Multi Agent

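Tool description | Link
AutoGen: Microsoft's open-source top-level multi-agent framework | https://github.com/microsoft/autogen
CrewAI: multi-agent framework with more flexible workflow definition than ChatDev | https://github.com/joaomdmoura/CrewAI
ChatDev: ModelBest's (面壁智能) open-source virtual software company run by collaborating agents | https://github.com/OpenBMB/ChatDev
Generative Agents: open-source code for Stanford's AI town | https://github.com/joonspk-research/generative_agents
BabyAGI: self-executing LLM agent | https://github.com/yoheinakajima/babyagi
AutoGPT: self-executing LLM agent | https://github.com/Torantulino/Auto-GPT
AutoGPT-Plugins: many official and third-party Auto-GPT plugins | https://github.com/Significant-Gravitas/Auto-GPT-Plugins
XAgent: ModelBest's open-source dual-loop AutoGPT | https://github.com/OpenBMB/XAgent
MetaGPT: AutoGPT covering the full software-company lifecycle, with roles such as product manager | https://github.com/geekan/MetaGPT
ResearchGPT: AutoGPT for paper writing, combining paper decomposition + web crawling | https://github.com/assafelovic/gpt-researcher
MiniAGI: self-executing LLM agent | https://github.com/muellerberndt/mini-agi
AI Legion: self-executing LLM agent | https://github.com/eumemic/ai-legion
AgentVerse: multi-model interaction environment | https://github.com/OpenBMB/AgentVerse
AgentSims: sandbox environment that, given a social setting, evaluates how well an LLM agent completes predefined task goals | https://github.com/py499372727/AgentSims/
GPTRPG: RPG environment; gamified AI agents | https://github.com/dzoba/gptrpg
GPTeam: multi-agent interaction | https://github.com/101dotxyz/GPTeam
GPTEngineer: automatic tool building and code generation | https://github.com/AntonOsika/gpt-engineer
WorkGPT: similar to AutoGPT | https://github.com/team-openpm/workgpt
AI-Town: virtual-world simulator | https://github.com/a16z-infra/ai-town
webarena: realistic web environment for testing autonomous agents; supports online shopping, forums, code repositories, etc. | https://github.com/web-arena-x/webarena
MiniWoB++: realistic environment with 100+ web interaction tasks | https://github.com/Farama-Foundation/miniwob-plusplus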

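Agent Tools and Frameworks

Tool description | Link
OpenAgents: framework for building an open-source version of ChatGPT Plus | https://github.com/xlang-ai/OpenAgents
langchain: LLM agent framework | https://github.com/hwchase17/langchain
llama index: LLM agent framework | https://github.com/jerryjliu/llama_index
Langroid: LLM agent framework | https://github.com/langroid/langroid
Ragas: framework for evaluating retrieval-augmented LLMs; uses LLM prompts to assess factuality, retrieval relevance, retrieved-content quality, answer relevance, and more | https://github.com/explodinggradients/ragas#fire-quickstart
fastRAG: retrieval framework with basics such as multi-index retrieval and KG construction | https://github.com/IntelLabs/fastRAG/tree/main
langflow: turns langchain and other agent components into a drag-and-drop UI | https://github.com/logspace-ai/langflow
PhiData: agent framework that abstracts tool calls as function calls | https://github.com/phidatahq/phidata
Haystack: LLM agent framework; in my view its pipeline design pattern is more flexible and concise than langchain's | https://github.com/deepset-ai/haystack
EdgeChain: LLM agents implemented through Jsonnet configuration files | https://github.com/arakoodev/EdgeChains/tree/main
semantic-kernel: SDK integrating LLMs with programming languages | https://github.com/microsoft/semantic-kernel
BMTools: Tsinghua's open-source multi-tool calling library; provides fine-tuning data and the ToolBench evaluation | https://github.com/OpenBMB/BMTools
Jarvis: framework in which a large model calls small models; giving small models a future! | https://github.com/search?q=jarvis
LLM-ToolMaker: lets the LLM build agents itself | https://github.com/ctlllll/LLM-ToolMaker
Gorilla: LLM calling a large number of APIs | https://github.com/ShishirPatil/gorilla
wenda: Wenda (闻达) integrates small models with search for knowledge injection | https://github.com/l15y/wenda
Alexandria: starting with arXiv papers, turns the whole internet into a vector index; free to download | https://alex.macrocosm.so/download
RapidAPI: unifying all the world's APIs; the largest API hub, with call success rates, latency, and more. True love! | https://rapidapi.com/hub
Open-Interpreter: command-line chat framework | https://github.com/KillianLucas/open-interpreter
AnythingLLM: framework from Mintplex Labs that supports local deployment of open-source models | https://github.com/Mintplex-Labs/anything-llm
PromptFlow: Microsoft's LLM application framework | https://github.com/microsoft/promptflow
Coze: ByteDance's app for building customized agents; supports multiple LLMs and a rich plugin set | https://www.coze.com/username?redirect=/explore
Anakin: agent-building app similar to Coze; fewer plugins, but the workflow is simpler to use | https://app.anakin.ai/discover
TaskingAI: API-oriented LLM application framework similar to langchain | https://www.tasking.ai/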

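Other Vertical-Domain Agents

Tool description | Link
Deep-KE: LLM-based intelligent data parsing for knowledge extraction | https://github.com/zjunlp/DeepKE
IncarnaMind: multi-document RAG solution; its dynamic chunking approach is worth borrowing | https://github.com/junruxiong/IncarnaMind
Vectara: platform for building LLM agents, from index construction and retrieval/ranking to fact-checked LLM generation | https://vectara.com/tour-vectara/
Data-Copilot: agent solution for analyzing structured data such as time series | https://github.com/zwq2018/Data-Copilot
DB-GPT: database-centered experimental GPT project; uses a local GPT model to interact with your data and environment | https://db-gpt.readthedocs.io/projects/db-gpt-docs-zh-cn/zh_CN/latest/index.html
guardrails: Python framework for reducing model hallucination: prompt templates + validation + correction | https://github.com/shreyar/guardrails
guidance: Microsoft's newly open-sourced framework, also aimed at reducing hallucination; an upgrade over prompt + chain that adds step-by-step generation and chains of thought | https://github.com/guidance-ai/guidance
SolidGPT: upload personal data and create project PRDs and more through command interaction | https://github.com/AI-Citizen/SolidGPT
HR-Agent: mimics HR-employee interaction; supports multi-tool calling | https://github.com/stepanogil/autonomous-hr-chatbot
BambooAI: data analysis agent | https://github.com/pgalko/BambooAI
AlphaCodium: completes coding tasks through Flow Engineering | https://github.com/Codium-ai/AlphaCodium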

Training Data

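Data type | Data description | Data link
Instruction tuning | self-instruct: instruction set automatically generated and filtered with GPT-3 | https://github.com/yizhongw/self-instruct
Instruction tuning | Stanford Alpaca: 52K self-instruct instruction dataset generated with text-davinci-003 | https://github.com/tatsu-lab/stanford_alpaca
Instruction tuning | GPT4-for-LLM: Chinese + English + comparison instructions | https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
Instruction tuning | GPTeacher: more diverse general instructions, role-play, and code instructions | https://github.com/teknium1/GPTeacher/tree/main
Instruction tuning | Chinese translation of Alpaca, plus some other instruction datasets | https://github.com/hikariming/alpaca_chinese_dataset https://github.com/carbonz0/alpaca-chinese-dataset
Instruction tuning | Alpaca instructions generated with GPT-4; noticeably higher quality and longer responses than the versions above | https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM/tree/main
Instruction tuning | Guanaco data: Alpaca instructions rewritten and generated in different languages, 534K in total; dialogue and non-dialogue types, plus supplementary generated QA samples | https://huggingface.co/datasets/JosephusCheung/GuanacoDataset
Instruction tuning | COIG Chinese instructions, including translated alpaca + natural + unnatural, multi-turn dialogue, exams, and leetcode instructions | https://github.com/BAAI-Zlab/COIG
Instruction tuning | Samples used to train Vicuna: user-ChatGPT conversation histories from ShareGPT collected via API; community members have organized some of them onto HF | https://github.com/domeccleston/sharegpt https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/tree/main
Instruction tuning | HC3 instruction data in Chinese and English, covering finance, open QA, encyclopedia, DBQA, medicine, and more, including human answers | https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning | MOSS open-source SFT data, including dialogues that use plugins | https://huggingface.co/datasets/Hello-SimpleAI/HC3-Chinese/tree/main
Instruction tuning | InstructWild data: ChatGPT instructions crawled from various places serve as seeds and are expanded via self-instruct; bilingual Chinese-English | https://github.com/XueFuzhao/InstructionWild/tree/main/data
Instruction tuning | BELLE: 1 million instruction samples generated with ChatGPT following Alpaca; includes math, multi-turn dialogue, role-play dialogue, and more | https://github.com/LianjiaTech/BELLE
Instruction tuning | PromptCLUE multi-task prompt dataset: template-built; contains only standard NLP tasks | https://github.com/CLUEbenchmark/pCLUE
Instruction tuning | Instruction dataset used to fine-tune TK-Instruct; fully human-annotated, 1600+ NLP tasks | https://instructions.apps.allenai.org/
Instruction tuning | Instruction dataset used to fine-tune T0 (P3) | https://huggingface.co/datasets/bigscience/P3
Instruction tuning | Multilingual dataset in 46 languages derived from P3 (xmtf) | https://github.com/bigscience-workshop/xmtf
Instruction tuning | Unnatural Instructions: generated with GPT-3 and then rewritten, 240k | https://github.com/orhonovich/unnatural-instructions
Instruction tuning | alpaca COT: cleans multiple data sources and unifies their format on HF; the highlight is manually curated CoT data | https://github.com/PhoebusSi/Alpaca-CoT
Instruction tuning | Human-written instruction data covering 23 common Chinese NLP tasks, oriented toward Chinese writing | https://github.com/yangjianxin1/Firefly
Instruction tuning | Amazon COT instruction samples, covering various QA, bigbench, math, and more | https://github.com/amazon-science/auto-cot
Instruction tuning | CSL: metadata (title, abstract, keywords, discipline, category) for 396,209 Chinese core-journal papers; usable for pre-training or for building NLP instruction tasks | https://github.com/ydli-ai/CSL
Instruction tuning | alpaca code: 20K code instruction samples | https://github.com/sahil280114/codealpaca#data-release
Instruction tuning | GPT4Tools: 71K GPT-4 instruction samples | https://github.com/StevenGrove/GPT4Tools
Instruction tuning | GPT-4 instructions + role-play + code instructions | https://github.com/teknium1/GPTeacher
Instruction tuning | Mol-Instructions: 2043K molecule + protein + biomolecular-text instructions, covering molecule design, protein function prediction, protein design, and other tasks | https://github.com/zjunlp/Mol-Instructions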
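Math | APE210k: math problems crawled from the web, released by Tencent AI Lab | https://github.com/Chenny0808/ape210k
Math | Math23K: elementary-school word problems open-sourced by Yuanfudao AI Lab | https://github.com/SCNU203/Math23k/tree/main
Math | grade school math: OpenAI's grade-school math problems converted into instruction samples with 2-8 reasoning steps | https://huggingface.co/datasets/qwedsacf/grade-school-math-instructions
Math | Math QA dataset with reasoning steps and multiple choice | https://huggingface.co/datasets/math_qa/viewer/default/test?row=2
Math | AMC competition math problems | https://huggingface.co/datasets/competition_math
Math | Pure-math computation problems such as linear algebra | https://huggingface.co/datasets/math_dataset
Code | APPS: problems collected from open-access coding sites such as Codeforces and Kattis | https://opendatalab.org.cn/APPS
Code | Lyra: Python code with embedded SQL; carefully annotated database-manipulation programs with Chinese and English comments | https://opendatalab.org.cn/Lyra
Code | Conala: from StackOverflow questions; 3k manually annotated; English | https://opendatalab.org.cn/CoNaLa/download
Code | code-alpaca: 20K code instruction samples generated with ChatGPT | https://github.com/sahil280114/codealpaca.git
Code | 32K Chinese code-related dialogues of four types and difficulty levels, generated by LLMs | https://github.com/zxx000728/CodeGPT
Dialogue | Manually selected subset of components from LAION's Open Instruction Generalist dataset; 40M (30,000 items) already open-sourced, 100M on the way | https://github.com/LAION-AI/Open-Instruction-Generalist
Dialogue | Baize: self-chat data built with ChatGPT | https://github.com/project-baize/baize-chatbot/tree/main/data
Dialogue | Facebook's open-source BlenderBot training dialogue data, ~6K | https://huggingface.co/datasets/blended_skill_talk
Dialogue | SODA: high-quality dataset of 385,000 dialogues open-sourced by AllenAI | https://realtoxicityprompts.apps.allenai.org/
Dialogue | InstructDial: instruction tuning on the single task type of dialogue | https://github.com/prakharguptaz/Instructdial
Dialogue | Ultra Chat: two separate ChatGPT Turbo API instances converse with each other to generate multi-turn dialogue data | https://github.com/thunlp/UltraChat
Dialogue | Awesome Open-domain Dialogue Models: provides multiple open-domain dialogue datasets | https://github.com/cingtiye/Awesome-Open-domain-Dialogue-Models#%E4%B8%AD%E6%96%87%E5%BC%80%E6%94%BE%E5%9F%9F%E5%AF%B9%E8%AF%9D%E6%95%B0%E6%8D%AE%E9%9B%86
Dialogue | DialogStudio: Salesforce's very comprehensive open-source collection | https://github.com/salesforce/DialogStudio
Dialogue | Chinese multi-turn QA data grounded in factual references; 50,000 released so far, with more to come | https://github.com/sufengniu/RefGPT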
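RLHF | PKU Beaver (河狸) open-source RLHF dataset: 10K available; the 1M version requires an application | https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF-10K
RLHF | Anthropic hh-rlhf dataset | https://huggingface.co/datasets/Anthropic/hh-rlhf
RLHF | Stack Exchange questions with multiple answers each, every answer scored | https://huggingface.co/datasets/HuggingFaceH4/stack-exchange-preferences/tree/main
RLHF | Facebook Bot Adversarial Dialogues dataset, 5K | https://github.com/facebookresearch/ParlAI
RLHF | AllenAI Real Toxicity prompts | https://github.com/facebookresearch/ParlAI
RLHF | OpenAssistant Conversations: 160K messages, 13,500 human-generated, mainly English | https://huggingface.co/datasets/OpenAssistant/oasst1
RLHF | Zhihu QA preference dataset | https://huggingface.co/datasets/liyucheng/zhihu_rlhf_3k
RLHF | Chinese translation of the hh-rlhf preference data | https://huggingface.co/datasets/liswei/rm-static-zhTW
RLHF | ModelBest's open-source large-scale preference data: for 64K prompts, four responses generated by different models and scored with GPT-4 | https://github.com/OpenBMB/UltraFeedback
Evaluation set | BigBench (Beyond the Imitation Game Benchmark) | https://github.com/google/BIG-bench
Evaluation set | Complex QA: evaluation instruction set for ChatGPT | https://github.com/tan92hl/Complex-Question-Answering-Evaluation-of-ChatGPT
Evaluation set | Langchain open-source evaluation datasets | https://huggingface.co/LangChainDatasets
Evaluation set | Questions from the 2010-2022 national Gaokao (college entrance exam) papers | https://github.com/OpenLMLab/GAOKAO-Bench
Evaluation set | SuperCLUE: comprehensive benchmark for Chinese general-purpose LLMs | https://github.com/CLUEbenchmark/SuperCLUE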
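English pre-training | RedPajama: open-source replication of the LLaMA pre-training dataset, 1.21 trillion tokens | https://github.com/togethercomputer/RedPajama-Data
English pre-training | High-quality dataset from Cerebras, obtained by cleaning and deduplicating RedPajama; 627 billion tokens | https://huggingface.co/datasets/cerebras/SlimPajama-627B/tree/main/train
English pre-training | Pile: 800G pre-training dataset mixing 22 high-quality datasets; fully open for download | https://pile.eleuther.ai/
General pre-training | UER's collection of CLUECorpusSmall + News Commentary, Chinese and English | https://github.com/dbiir/UER-py/wiki/%E9%A2%84%E8%AE%AD%E7%BB%83%E6%95%B0%E6%8D%AE
Chinese pre-training | BAAI's open-source WuDao 200G pre-training data | https://github.com/BAAI-WuDao/WuDaoMM
Chinese pre-training | MNBVC: Chinese internet corpus collected by the open-source community, initiated by the Liwu (里屋) community; the goal is to match ChatGPT's 40T | https://github.com/esbatmop/MNBVC
Chinese pre-training | Fudan's open-source download and extraction pipeline for 150,000 Chinese books | https://github.com/FudanNLPLAB/CBook-150K
Chinese pre-training | WanJuan (书生万卷) dataset: multimodal dataset from public web pages, including text, image-text, and video; the text portion is 1T and the image-text portion is 150G | https://opendatalab.org.cn/OpenDataLab/WanJuan1_dot_0
Chinese pre-training | Kunlun Tiangong (Skywork) open-sources a 3.2TB Chinese-English corpus | https://github.com/SkyworkAI/Skywork
Chinese pre-training | Inspur's open-source Chinese pre-training corpus used to train Yuan1.0 | https://www.airyuan.cn/home
Domain pre-training | Du Xiaoman open-sources a 60G finance pre-training corpus | https://github.com/Duxiaoman-DI/XuanYuan
Domain pre-training | CSL: the first Chinese scientific-literature dataset; also includes data for various NLP tasks | https://github.com/ydli-ai/CSL
Parallel corpus | news-commentary Chinese-English parallel corpus, for Chinese-English knowledge transfer | https://data.statmt.org/news-commentary/v15/training/
Multi-source dataset integration | opendatalab integrates multiple pre-training data sources | https://opendatalab.org.cn/?industry=9821&source=JUU3JTlGJUE1JUU0JUI5JThF
Tool-search augmentation | webCPM's open-source dataset for interactive QA with a search tool, including human-annotated data for extractive web-page summarization, multi-fact answers, and more | https://github.com/thunlp/WebCPM
Tool-multi-tool | BmTools' open-source multi-tool calling instruction dataset | https://github.com/OpenBMB/BMTools
Tool-multi-tool | AgentInstruct: six agent tasks, including ReAct-style CoT annotations | https://thudm.github.io/AgentTuning/
Tool-multi-tool | MSAgent-Bench: LLM tool-calling dataset with 598k training samples | https://modelscope.cn/datasets/damo/MSAgent-Bench/summary
Tool-multi-tool | MOSS open-source data: 300,000 multi-turn dialogues across four plugins (knowledge search, text-to-image, calculator, and equation solver) | https://github.com/OpenLMLab/MOSS#%E6%95%B0%E6%8D%AE
NL2SQL | DB-GPT-Hub: compiles text-to-SQL datasets from multiple sources | https://github.com/eosphoros-ai/DB-GPT-Hub
Long text | LongAlign-10k: Tsinghua's open-source long-text alignment dataset | https://huggingface.co/datasets/THUDM/LongAlign-10k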
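The RLHF rows in the table above are preference datasets: each record pairs a preferred response with a rejected one. A common way to use such pairs, shown here as a minimal sketch rather than the recipe of any listed project, is to train a reward model with a pairwise logistic (Bradley-Terry style) loss, -log(sigmoid(r_chosen - r_rejected)). The reward values below are made-up numbers standing in for a reward model's outputs.

```python
import math


def pairwise_preference_loss(reward_chosen, reward_rejected):
    """-log(sigmoid(r_chosen - r_rejected)), computed in a numerically stable way."""
    margin = reward_chosen - reward_rejected
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    # For negative margins, rewrite to avoid overflow in exp().
    return -margin + math.log1p(math.exp(margin))


def mean_loss(pairs):
    """Average the pairwise loss over (reward_chosen, reward_rejected) tuples."""
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)


# Hypothetical reward-model scores for three preference pairs.
batch = [(2.0, 0.5), (0.0, 0.0), (-1.0, 1.0)]
print(round(mean_loss(batch), 4))
```

The loss is small when the preferred response scores well above the rejected one and grows roughly linearly when the ordering is reversed, which pushes the reward model to rank responses the way the annotators did.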

AIGC

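Search

General search
  • Metaso (秘塔搜索): search app combining mind maps and tables for multimodal QA
  • You.COM: supports several retrieval-augmented QA modes
  • Walles.AI: assistant combining image chat, text chat, chatpdf, web-copilot, and more
  • webpilot.ai: browser retrieval plugin that works better than ChatGPT's built-in Web Browsing; better suited to complex searches, and API access is now available
  • New Bing: requires a VPN
  • Perplexity.ai: also requires a VPN; a remarkable ChatGPT-powered search engine that feels better than Bing, adding related recommendations and follow-up questions
  • sider.ai: browser plugin supporting multi-model chat and multimodal interaction
Code search
  • devv.ai: search engine for programmers, built on fine-tuned llama2 + RAG
  • Phind: AI search engine for developers
Knowledge management
  • glean: startup for enterprise knowledge search and project management; helps employees locate information quickly and helps companies consolidate it
  • Mem: personal knowledge management, for example knowledge graphs; funded by OpenAI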

ChatDoc

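  • Kimi-Chat: unbeatable at understanding very long documents; single-document summaries, structured multi-document comparison, it does it all, at any length!
  • ChatDoc: upgraded ChatPDF; requires a VPN; adds table parsing and QA on a selected region; very strong at PDF recognition
  • AskyourPdf: another app for QA and summarization over uploaded PDFs
  • DocsGPT: an early general-purpose Chat DOC solution
  • ChatPDF: the Chinese ChatPDF; after a PDF is uploaded it suggests the top 5 likely questions, then supports conversational QA and retrieval over the document; reads 30,000 characters in 10s
  • AlphaBox: document QA tool built around personal folder management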

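Paper research: daily updates and viewpoint summaries

  • SCISPACE: the gold standard for paper research; combines full-library search QA with knowledge-base QA over personally uploaded PDFs. It also supports related-paper discovery and highlight-to-explain on paper text, and explanations can be saved to a notebook for later lookup. A strong pairing of product and algorithms.
  • Consensus: AI-assisted paper search, multi-paper summarization, and viewpoint comparison. It ranks very high among products, though in my view its search has room to improve
  • Aminer: paper search, summarization, QA, and symbolic rewriting of search keywords; its paper knowledge-base QA hallucinates rather badly
  • cool.paper: kimi-based paper-reading site developed by Su Shen (苏神)
  • OpenRead: Chinese product for paper writing and reading; helps generate literature reviews and offers a smart Markdown editor similar to NotionAI
  • ChatPaper: given keywords, automatically downloads the latest arXiv papers and summarizes them; can be tried on Hugging Face
  • researchgpt: similar to ChatPDF; supports downloading arXiv papers and, once loaded, extracting key points conversationally
  • ChatGPT-academic: another gradio-based bundle of paper polishing, summarization, and similar features; many of them are worth borrowing
  • BriefGPT: daily arXiv updates with summaries and keyword extraction, helping researchers keep up with the latest work; nice UI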

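Writing and Productivity Tools

  • Cyber Maliang (赛博马良): as the name suggests, customizable AI employees that crawl the web around the clock for the writing topics you follow and push them to editors for rewriting
  • Miracleplus: Hacker News site operated entirely by AI agents
  • ChatMind: ChatGPT-generated mind maps; many templates and decent generalization; already acquired by XMind
  • Fanwen Miao (范文喵写作): writing tool covering topic selection, outlining, and drafting end to end
  • WriteSonic: AI writing; supports chat and targeted writing such as ad copy and product descriptions; web retrieval is a highlight; supports Chinese
  • copy.ai: WriteSonic competitor; its highlight is that, as with paper citations, every sentence has a source link, and content can be copied with one click into the Markdown editor on the right. Very handy!
  • NotionAI: smart Markdown and a joy to use! Call AI with a command while writing to polish, expand, retrieve content, or suggest ideas
  • Hix-AI: offers both a copilot mode and a comprehensive writing mode
  • AI-Write: workflow-based writing tool that I found pleasant to use
  • Jasper: same as above; they are all competitors, haha
  • copy.down: Chinese marketing-copy generation; targeted writing only; supports keyword-to-copy generation
  • Weaver AI: content-creation app from AIWaves (波形智能), supporting multi-scenario writing
  • ChatExcel: Excel calculations driven by instructions; of marginal use to those who know Excel, somewhat useful to those who do not
  • mindShow: free + paid PPT tool; custom PPT templates are not yet good enough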

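Finance Vertical

  • Alpha: ChatGPT-powered finance app; supports stock information lookup, asset analysis and diagnosis, earnings-report summaries, etc.
  • Composer: combines quant strategies with AI; chat-based + drag-and-drop portfolio construction and backtesting
  • Finalle.ai: real-time financial data streams fed into LLMs
  • ScopeChat: crypto app; as in ChatLaw, tool components are embedded in the conversation
  • AInvest: individual-stock investing; combines BI analysis and a community discussion board (it feels like it is turning into a Xueqiu-style popularity index)
  • Reportify: QA and summarization over company announcements, news, and earnings calls in finance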

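Personal Assistants & Chat

  • Mr.-Ranedeer-: personalized learning environment powered by prompts and GPT-4; personalized questions + model answers
  • AI Topiah: AI character chat from Lingxin AI (聆心智能); a short chat with Luffy was enough to stir up some youthful melodrama
  • chatbase: emotional character chat; not tried yet
  • Vana: virtual DNA; create a virtual self through chat! A flashy concept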

Agent

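  • NexusGPT: AutoGPT can now go out and work; the first all-AI freelance platform
  • cognosys: the most popular web-based AutoGPT; though, how to put it, trying it out nearly made me laugh my jaw off. No spoilers, try it and you will see
  • godmode: AutoGPT that lets a human intervene at every step
  • agentgpt: basic AutoGPT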

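Video Segmentation and Summarization

  • Eightify: Chrome extension that saves time on long videos; get the key ideas immediately, with per-section summaries + timestamped abstracts
  • BibiGPT: one-click summaries of Bilibili video content; multimodal documents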

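Code Copilots & BI Tools

  • AutoDev: AI programming assistant
  • AlphaCodium: Flow Engineering raises the overall pass rate of generated code
  • Codium: an open-source coding copilot has arrived
  • Copilot: paid
  • Fauxpilot: local open-source Copilot alternative
  • Codeium: Copilot alternative with a free tier and support for many plugins!
  • ai2sql: long-standing text2sql company; more complete than sqltranslate; supports SQL syntax checking, formatting, and formula generation
  • chat2query: text2sql; compared with the two above, supports more natural text instructions and more complex analytical SQL generation
  • OuterBase: text2sql; eye-catching design! Spreadsheets combined with mysql and dashboards; better suited to data analysts
  • Chat2DB: intelligent general-purpose database SQL client and reporting tool
  • ChatBI: NetEase Shufan releases the ChatBI conversational data analysis platform
  • Kyligence Copilot: Kyligence's AI assistant for its one-stop metrics platform; supports conversational metric search, anomaly attribution, and more
  • Wolverine: Python script that lets code debug itself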

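Multimodal Generation

  • dreamstudio.ai: pioneer; Stable Diffusion; trial quota available
  • midjourney: pioneer; mainly artistic styles
  • Dall.E: and that completes the big three
  • ControlNet: adds controllability to image creation
  • gemo.ai: multimodal chatbot, including text, image, and video generation
  • storybird: generates picture-book stories from prompts; the books can also be sold
  • Magnific.ai: AI image retoucher built by a two-person team
  • Morph Studio: Stability AI enters video production
  • Gamma: excellent PPT tool; number 1 in the ProductHunt monthly ranking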

Resources

GPTs App Directories

Prompt and Other Tutorials

Books and Blogs

Conferences & Interviews


Papers

paper List

  • https://github.com/dongguanting/In-Context-Learning_PaperList
  • https://github.com/thunlp/PromptPapers
  • https://github.com/Timothyxxx/Chain-of-ThoughtsPapers
  • https://github.com/thunlp/ToolLearningPapers
  • https://github.com/MLGroupJLU/LLM-eval-survey
  • https://github.com/thu-coai/PaperForONLG

Surveys

  • A Survey of Large Language Models
  • Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing ⭐️
  • Paradigm Shift in Natural Language Processing
  • Pre-Trained Models: Past, Present and Future
  • What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? ⭐️
  • Towards Reasoning in Large Language Models: A Survey
  • Reasoning with Language Model Prompting: A Survey ⭐️
  • An Overview on Language Models: Recent Developments and Outlook ⭐️
  • A Survey of Large Language Models [version updated June 29]
  • Unifying Large Language Models and Knowledge Graphs: A Roadmap
  • Augmented Language Models: a Survey ⭐️
  • Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
  • Challenges and Applications of Large Language Models
  • The Rise and Potential of Large Language Model Based Agents: A Survey
  • Large Language Models for Information Retrieval: A Survey
  • AI Alignment: A Comprehensive Survey
  • Trends in Integration of Knowledge and Large Language Models: A Survey and Taxonomy of Methods, Benchmarks, and Applications
  • Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
  • A Survey on Language Models for Code
  • Model-as-a-Service (MaaS): A Survey

Exploring LLM Capabilities

  • In Context Learning
    • LARGER LANGUAGE MODELS DO IN-CONTEXT LEARNING DIFFERENTLY
    • How does in-context learning work? A framework for understanding the differences from traditional supervised learning
    • Why can GPT learn in-context? Language Model Secretly Perform Gradient Descent as Meta-Optimizers ⭐️
    • Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? ⭐️
    • Trained Transformers Learn Linear Models In-Context
    • In-Context Learning Creates Task Vectors
  • Emergent abilities
    • Sparks of Artificial General Intelligence: Early experiments with GPT-4
    • Emergent Abilities of Large Language Models ⭐️
    • LANGUAGE MODELS REPRESENT SPACE AND TIME
    • Are Emergent Abilities of Large Language Models a Mirage?
  • Capability evaluation
    • IS CHATGPT A GENERAL-PURPOSE NATURAL LANGUAGE PROCESSING TASK SOLVER?
    • Can Large Language Models Infer Causation from Correlation?
    • Holistic Evaluation of Language Models
    • Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
    • Theory of Mind May Have Spontaneously Emerged in Large Language Models
    • Beyond The Imitation Game: Quantifying And Extrapolating The Capabilities Of Language Models
    • Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
    • Demystifying GPT Self-Repair for Code Generation
    • Evidence of Meaning in Language Models Trained on Programs
    • Can Explanations Be Useful for Calibrating Black Box Models
    • On the Robustness of ChatGPT: An Adversarial and Out-of-distribution Perspective
    • Language acquisition: do children and language models follow similar learning stages?
  • Domain capabilities
    • Capabilities of GPT-4 on Medical Challenge Problems
    • Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine

Prompt Tuning Paradigms

  • Tuning-Free Prompt
    • GPT2: Language Models are Unsupervised Multitask Learners
    • GPT3: Language Models are Few-Shot Learners ⭐️
    • LAMA: Language Models as Knowledge Bases?
    • AutoPrompt: Eliciting Knowledge from Language Models
  • Fix-Prompt LM Tuning
    • T5: Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
    • PET-TC(a): Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference ⭐️
    • PET-TC(b): PETSGLUE It’s Not Just Size That Matters Small Language Models are also few-shot learners
    • GenPET: Few-Shot Text Generation with Natural Language Instructions
    • LM-BFF: Making Pre-trained Language Models Better Few-shot Learners ⭐️
    • ADEPT: Improving and Simplifying Pattern Exploiting Training
  • Fix-LM Prompt Tuning
    • Prefix-tuning: Optimizing continuous prompts for generation
    • Prompt-tuning: The power of scale for parameter-efficient prompt tuning ⭐️
    • P-tuning: GPT Understands Too ⭐️
    • WARP: Word-level Adversarial ReProgramming
  • LM + Prompt Tuning
    • P-tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
    • PTR: Prompt Tuning with Rules for Text Classification
    • PADA: Example-based Prompt Learning for on-the-fly Adaptation to Unseen Domains
  • Fix-LM Adapter Tuning
    • LORA: LOW-RANK ADAPTATION OF LARGE LANGUAGE MODELS ⭐️
    • LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
    • Parameter-Efficient Transfer Learning for NLP
    • INTRINSIC DIMENSIONALITY EXPLAINS THE EFFECTIVENESS OF LANGUAGE MODEL FINE-TUNING

Mainstream LLMs and Pre-training

  • GLM-130B: AN OPEN BILINGUAL PRE-TRAINED MODEL
  • PaLM: Scaling Language Modeling with Pathways
  • PaLM 2 Technical Report
  • GPT-4 Technical Report
  • Backpack Language Models
  • LLaMA: Open and Efficient Foundation Language Models
  • Llama 2: Open Foundation and Fine-Tuned Chat Models
  • Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
  • OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch
  • Mistral 7B
  • Ziya2: Data-centric Learning is All LLMs Need
  • MEGABLOCKS: EFFICIENT SPARSE TRAINING WITH MIXTURE-OF-EXPERTS
  • TUTEL: ADAPTIVE MIXTURE-OF-EXPERTS AT SCALE
  • Phi1- Textbooks Are All You Need ⭐️
  • Phi1.5- Textbooks Are All You Need II: phi-1.5 technical report
  • Gemini: A Family of Highly Capable Multimodal Models
  • In-Context Pretraining: Language Modeling Beyond Document Boundaries
  • LLAMA PRO: Progressive LLaMA with Block Expansion
  • QWEN TECHNICAL REPORT

Instruction Tuning & Alignment (instruction_tunning)

  • Classic approaches
    • Flan: FINETUNED LANGUAGE MODELS ARE ZERO-SHOT LEARNERS ⭐️
    • Flan-T5: Scaling Instruction-Finetuned Language Models
    • ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
    • Instruct-GPT: Training language models to follow instructions with human feedback ⭐️
    • T0: MULTITASK PROMPTED TRAINING ENABLES ZERO-SHOT TASK GENERALIZATION
    • Natural Instructions: Cross-Task Generalization via Natural Language Crowdsourcing Instructions
    • Tk-INSTRUCT: SUPER-NATURALINSTRUCTIONS: Generalization via Declarative Instructions on 1600+ NLP Tasks
    • ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-shot Generalization
    • Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
    • INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
  • Less, but higher-quality and more diverse, instruction data brings a qualitative change
    • LIMA: Less Is More for Alignment ⭐️
    • Maybe Only 0.5% Data is Needed: A Preliminary Exploration of Low Training Data Instruction Tuning
    • AlpaGasus: Training A Better Alpaca with Fewer Data
    • InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4
    • Instruction Mining: High-Quality Instruction Data Selection for Large Language Models
    • Visual Instruction Tuning with Polite Flamingo
    • Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
    • Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
  • New alignment/fine-tuning approaches
    • WizardLM: Empowering Large Language Models to Follow Complex Instructions
    • Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning
    • Self-Alignment with Instruction Backtranslation ⭐️
    • Mixture-of-Experts Meets Instruction Tuning: A Winning Combination for Large Language Models
    • Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks
    • PROMPT2MODEL: Generating Deployable Models from Natural Language Instructions
    • OpinionGPT: Modelling Explicit Biases in Instruction-Tuned LLMs
    • Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
    • Human-like systematic generalization through a meta-learning neural network
    • Magicoder: Source Code Is All You Need
    • Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
    • Generative Representational Instruction Tuning
  • Instruction data generation
    • APE: LARGE LANGUAGE MODELS ARE HUMAN-LEVEL PROMPT ENGINEERS ⭐️
    • SELF-INSTRUCT: Aligning Language Model with Self Generated Instructions ⭐️
    • iPrompt: Explaining Data Patterns in Natural Language via Interpretable Autoprompting
    • Flipped Learning: Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
    • Fairness-guided Few-shot Prompting for Large Language Models
    • Instruction induction: From few examples to natural language task descriptions
    • SELF-QA: Unsupervised Knowledge Guided Alignment
    • GPT Self-Supervision for a Better Data Annotator
    • The Flan Collection: Designing Data and Methods
    • Self-Consuming Generative Models Go MAD
    • InstructEval: Systematic Evaluation of Instruction Selection Methods
    • Overwriting Pretrained Bias with Finetuning Data
    • Improving Text Embeddings with Large Language Models
  • How to reduce the loss of general capabilities
    • How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
    • TWO-STAGE LLM FINE-TUNING WITH LESS SPECIALIZATION AND MORE GENERALIZATION
  • Fine-tuning experience/experiment reports
    • BELLE: Exploring the Impact of Instruction Data Scaling on Large Language Models: An Empirical Study on Real-World Use Cases
    • Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data
    • A Comparative Study between Full-Parameter and LoRA-based Fine-Tuning on Chinese Instruction Data for Large LM
    • Exploring ChatGPT’s Ability to Rank Content: A Preliminary Study on Consistency with Human Preferences
    • Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation
  • Others
    • Crosslingual Generalization through Multitask Finetuning
    • Cross-Task Generalization via Natural Language Crowdsourcing Instructions
    • UNIFIEDSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
    • PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
    • ROLELLM: BENCHMARKING, ELICITING, AND ENHANCING ROLE-PLAYING ABILITIES OF LARGE LANGUAGE MODELS

Dialogue Models

  • LaMDA: Language Models for Dialog Applications
  • Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐️
  • BlenderBot 3: a deployed conversational agent that continually learns to responsibly engage
  • How NOT To Evaluate Your Dialogue System: An Empirical Study of Unsupervised Evaluation Metrics for Dialogue Response Generation
  • DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI
  • Enhancing Chat Language Models by Scaling High-quality Instructional Conversations
  • DiagGPT: An LLM-based Chatbot with Automatic Topic Management for Task-Oriented Dialogue

Chain of Thought (prompt_chain_of_thought)

  • Basic & advanced usage
    • [zero-shot-COT] Large Language Models are Zero-Shot Reasoners ⭐️
    • [few-shot COT] Chain of Thought Prompting Elicits Reasoning in Large Language Models ⭐️
    • SELF-CONSISTENCY IMPROVES CHAIN OF THOUGHT REASONING IN LANGUAGE MODELS
    • LEAST-TO-MOST PROMPTING ENABLES COMPLEX REASONING IN LARGE LANGUAGE MODELS ⭐️
    • Tree of Thoughts: Deliberate Problem Solving with Large Language Models
    • Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models
    • Decomposed Prompting: A Modular Approach for Solving Complex Tasks
    • Successive Prompting for Decomposing Complex Questions
    • Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
    • Beyond Chain-of-Thought, Effective Graph-of-Thought Reasoning in Large Language Models
    • Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning
    • LAMBADA: Backward Chaining for Automated Reasoning in Natural Language
    • Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
    • Graph of Thoughts: Solving Elaborate Problems with Large Language Models
    • Progressive-Hint Prompting Improves Reasoning in Large Language Models
    • LARGE LANGUAGE MODELS CAN LEARN RULES
    • DIVERSITY OF THOUGHT IMPROVES REASONING ABILITIES OF LARGE LANGUAGE MODELS
    • From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models
    • Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
    • LARGE LANGUAGE MODELS AS OPTIMIZERS
  • Domain-specific CoT [Math, Code, Tabular, QA]
    • Solving Quantitative Reasoning Problems with Language Models
    • SHOW YOUR WORK: SCRATCHPADS FOR INTERMEDIATE COMPUTATION WITH LANGUAGE MODELS
    • Solving math word problems with process- and outcome-based feedback
    • CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning
    • T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Large Language Model Signals for Science Question Answering
    • LEARNING PERFORMANCE-IMPROVING CODE EDITS
    • Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning
    • Tab-CoT: Zero-shot Tabular Chain of Thought
    • Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
  • Analysis of underlying principles
    • Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters ⭐️
    • TEXT AND PATTERNS: FOR EFFECTIVE CHAIN OF THOUGHT IT TAKES TWO TO TANGO
    • Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective
    • Large Language Models Can Be Easily Distracted by Irrelevant Context
    • Chain-of-Thought Reasoning Without Prompting
  • CoT distillation for small models
    • Specializing Smaller Language Models towards Multi-Step Reasoning ⭐️
    • Teaching Small Language Models to Reason
    • Large Language Models are Reasoning Teachers
    • Distilling Reasoning Capabilities into Smaller Language Models
    • The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning
  • Automatic construction/selection of CoT examples
    • STaR: Self-Taught Reasoner Bootstrapping Reasoning With Reasoning
    • AutoCOT: AUTOMATIC CHAIN OF THOUGHT PROMPTING IN LARGE LANGUAGE MODELS
    • Large Language Models Can Self-Improve
    • Active Prompting with Chain-of-Thought for Large Language Models
    • COMPLEXITY-BASED PROMPTING FOR MULTI-STEP REASONING
  • others
    • OlaGPT: Empowering LLMs With Human-like Problem-Solving Abilities
    • Challenging BIG-Bench tasks and whether chain-of-thought can solve them
    • Large Language Models are Better Reasoners with Self-Verification
    • ThoughtSource: A central hub for large language model reasoning data
    • Two Failures of Self-Consistency in the Multi-Step Reasoning of LLMs

RLHF

  • DeepMind
    • Teaching language models to support answers with verified quotes
    • Sparrow: Improving alignment of dialogue agents via targeted human judgements ⭐️
    • STATISTICAL REJECTION SAMPLING IMPROVES PREFERENCE OPTIMIZATION
    • Reinforced Self-Training (ReST) for Language Modeling
    • SLiC-HF: Sequence Likelihood Calibration with Human Feedback
    • CALIBRATING SEQUENCE LIKELIHOOD IMPROVES CONDITIONAL LANGUAGE GENERATION
    • REWARD DESIGN WITH LANGUAGE MODELS
    • Final-Answer RL: Solving math word problems with process- and outcome-based feedback
  • OpenAI
    • PPO: Proximal Policy Optimization Algorithms ⭐️
    • Deep Reinforcement Learning from Human Preferences
    • Fine-Tuning Language Models from Human Preferences
    • Learning to summarize from human feedback
    • InstructGPT: Training language models to follow instructions with human feedback ⭐️
    • Scaling Laws for Reward Model Overoptimization ⭐️
    • WEAK-TO-STRONG GENERALIZATION: ELICITING STRONG CAPABILITIES WITH WEAK SUPERVISION ⭐️
    • PRM: Let’s Verify Step by Step
  • Anthropic
    • A General Language Assistant as a Laboratory for Alignment
    • Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
    • Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback ⭐️
    • Constitutional AI: Harmlessness from AI Feedback ⭐️
    • Pretraining Language Models with Human Preferences
    • The Capacity for Moral Self-Correction in Large Language Models
    • Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
  • AllenAI, RL4LM: IS REINFORCEMENT LEARNING (NOT) FOR NATURAL LANGUAGE PROCESSING BENCHMARKS
  • Improved approaches
    • RRHF: Rank Responses to Align Language Models with Human Feedback without tears
    • Chain of Hindsight Aligns Language Models with Feedback
    • AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
    • RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
    • RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
    • Training Socially Aligned Language Models in Simulated Human Society
    • RAIN: Your Language Models Can Align Themselves without Finetuning
    • Generative Judge for Evaluating Alignment
    • PEERING THROUGH PREFERENCES: UNRAVELING FEEDBACK ACQUISITION FOR ALIGNING LARGE LANGUAGE MODELS
    • SALMON: SELF-ALIGNMENT WITH PRINCIPLE-FOLLOWING REWARD MODELS
    • Large Language Model Unlearning ⭐️
    • ADVERSARIAL PREFERENCE OPTIMIZATION ⭐️
    • Preference Ranking Optimization for Human Alignment
    • A Long Way to Go: Investigating Length Correlations in RLHF
    • ENABLE LANGUAGE MODELS TO IMPLICITLY LEARN SELF-IMPROVEMENT FROM DATA
    • REWARD MODEL ENSEMBLES HELP MITIGATE OVEROPTIMIZATION
    • LEARNING OPTIMAL ADVANTAGE FROM PREFERENCES AND MISTAKING IT FOR REWARD
    • ULTRAFEEDBACK: BOOSTING LANGUAGE MODELS WITH HIGH-QUALITY FEEDBACK
    • MOTIF: INTRINSIC MOTIVATION FROM ARTIFICIAL INTELLIGENCE FEEDBACK
    • STABILIZING RLHF THROUGH ADVANTAGE MODEL AND SELECTIVE REHEARSAL
    • Shepherd: A Critic for Language Model Generation
    • LEARNING TO GENERATE BETTER THAN YOUR LLM
    • Fine-Grained Human Feedback Gives Better Rewards for Language Model Training
    • Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision
    • Direct Preference Optimization: Your Language Model is Secretly a Reward Model
    • HIR: The Wisdom of Hindsight Makes Language Models Better Instruction Followers
  • RL investigations
    • UNDERSTANDING THE EFFECTS OF RLHF ON LLM GENERALISATION AND DIVERSITY
    • A LONG WAY TO GO: INVESTIGATING LENGTH CORRELATIONS IN RLHF
    • THE TRICKLE-DOWN IMPACT OF REWARD (IN-)CONSISTENCY ON RLHF
    • Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
    • HUMAN FEEDBACK IS NOT GOLD STANDARD
    • CONTRASTIVE POST-TRAINING LARGE LANGUAGE MODELS ON DATA CURRICULUM

LLM Agent: Letting Models Use Tools (llm_agent)

  • A Survey on Large Language Model based Autonomous Agents
  • PERSONAL LLM AGENTS: INSIGHTS AND SURVEY ABOUT THE CAPABILITY, EFFICIENCY AND SECURITY
  • General prompt-based approaches
    • ReAct: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS ⭐️
    • Self-ask: MEASURING AND NARROWING THE COMPOSITIONALITY GAP IN LANGUAGE MODELS ⭐️
    • MRKL Systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning
    • PAL: Program-aided Language Models
    • ART: Automatic multi-step reasoning and tool-use for large language models
    • ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models ⭐️
    • Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
    • Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models ⭐️
    • Faithful Chain-of-Thought Reasoning
    • Reflexion: Language Agents with Verbal Reinforcement Learning ⭐️
    • Verify-and-Edit: A Knowledge-Enhanced Chain-of-Thought Framework
    • RestGPT: Connecting Large Language Models with Real-World RESTful APIs
    • ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models
    • InstructTODS: Large Language Models for End-to-End Task-Oriented Dialogue Systems
    • TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents
    • ControlLLM: Augment Language Models with Tools by Searching on Graphs
    • Reflexion: an autonomous agent with dynamic memory and self-reflection
    • AutoAgents: A Framework for Automatic Agent Generation
    • GitAgent: Facilitating Autonomous Agent with GitHub by Tool Extension
    • PreAct: Predicting Future in ReAct Enhances Agent’s Planning Ability
    • TOOLLLM: FACILITATING LARGE LANGUAGE MODELS TO MASTER 16000+ REAL-WORLD APIS
  • General fine-tuning-based approaches
    • TALM: Tool Augmented Language Models
    • Toolformer: Language Models Can Teach Themselves to Use Tools ⭐️
    • Tool Learning with Foundation Models
    • Tool Maker: Large Language Models as Tool Makers
    • TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs
    • AgentTuning: Enabling Generalized Agent Abilities for LLMs
    • SWIFTSAGE: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks
    • FireAct: Toward Language Agent Fine-tuning
    • Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning
    • REST MEETS REACT: SELF-IMPROVEMENT FOR MULTI-STEP REASONING LLM AGENT
    • Efficient Tool Use with Chain-of-Abstraction Reasoning
  • Model-calling approaches
    • HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace
    • Gorilla: Large Language Model Connected with Massive APIs ⭐️
    • OpenAGI: When LLM Meets Domain Experts
  • Vertical domains
    • WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
    • ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
    • ChemCrow: Augmenting large language models with chemistry tools
    • Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow
    • Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System
    • GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information
    • PointLLM: Empowering Large Language Models to Understand Point Clouds
    • Interpretable Long-Form Legal Question Answering with Retrieval-Augmented Large Language Models
    • Generating Explanations in Medical Question-Answering by Expectation Maximization Inference over Evidence
    • CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering
    • A Multimodal Foundation Agent for Financial Trading: Tool-Augmented, Diversified, and Generalist
  • Evaluation
    • Evaluating Verifiability in Generative Search Engines
    • Mind2Web: Towards a Generalist Agent for the Web
    • Auto-GPT for Online Decision Making: Benchmarks and Additional Opinions
    • API-Bank: A Benchmark for Tool-Augmented LLMs
    • ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
    • Automatic Evaluation of Attribution by Large Language Models
    • Benchmarking Large Language Models in Retrieval-Augmented Generation
    • ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems
  • MultiAgent
    • Generative Agents: Interactive Simulacra of Human Behavior ⭐️
    • AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
    • CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society ⭐️
    • Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
    • Communicative Agents for Software Development ⭐️
    • METAAGENTS: SIMULATING INTERACTIONS OF HUMAN BEHAVIORS FOR LLM-BASED TASK-ORIENTED COORDINATION VIA COLLABORATIVE GENERATIVE AGENTS
    • LET MODELS SPEAK CIPHERS: MULTIAGENT DEBATE THROUGH EMBEDDINGS
    • MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning
    • War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars
    • More Agents Is All You Need
  • Autonomous learning and exploration
    • AppAgent: Multimodal Agents as Smartphone Users
    • Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution
    • LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
  • Others
    • LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
    • Inference with Reference: Lossless Acceleration of Large Language Models
    • RecallM: An Architecture for Temporal Context Understanding and Question Answering
    • LLaMA Rider: Spurring Large Language Models to Explore the Open World

RAG

  • WebGPT: Browser-assisted question-answering with human feedback
  • WebGLM: Towards An Efficient Web-Enhanced Question Answering System with Human Preferences
  • WebCPM: Interactive Web Search for Chinese Long-form Question Answering ⭐️
  • REPLUG: Retrieval-Augmented Black-Box Language Models ⭐️
  • Query Rewriting for Retrieval-Augmented Large Language Models
  • RETA-LLM: A Retrieval-Augmented Large Language Model Toolkit
  • Atlas: Few-shot Learning with Retrieval Augmented Language Models
  • RRAML: Reinforced Retrieval Augmented Machine Learning
  • Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation
  • PDFTriage: Question Answering over Long, Structured Documents
  • SELF-RAG: LEARNING TO RETRIEVE, GENERATE, AND CRITIQUE THROUGH SELF-REFLECTION ⭐️
  • Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading ⭐️
  • Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP
  • Search-in-the-Chain: Towards Accurate, Credible and Traceable Large Language Models for Knowledge-intensive Tasks
  • Active Retrieval Augmented Generation
  • kNN-LM Does Not Improve Open-ended Text Generation
  • Can Retriever-Augmented Language Models Reason? The Blame Game Between the Retriever and the Language Model
  • Query2doc: Query Expansion with Large Language Models ⭐️
  • RLCF: Aligning the Capabilities of Large Language Models with the Context of Information Retrieval via Contrastive Feedback
  • Augmented Embeddings for Custom Retrievals
  • DORIS-MAE: Scientific Document Retrieval using Multi-level Aspect-based Queries
  • Learning to Filter Context for Retrieval-Augmented Generation
  • THINK-ON-GRAPH: DEEP AND RESPONSIBLE REASONING OF LARGE LANGUAGE MODEL ON KNOWLEDGE GRAPH
  • RA-DIT: RETRIEVAL-AUGMENTED DUAL INSTRUCTION TUNING
  • Query Expansion by Prompting Large Language Models ⭐️
  • CHAIN-OF-NOTE: ENHANCING ROBUSTNESS IN RETRIEVAL-AUGMENTED LANGUAGE MODELS
  • IAG: Induction-Augmented Generation Framework for Answering Reasoning Questions
  • T2Ranking: A large-scale Chinese Benchmark for Passage Ranking
  • Factuality Enhanced Language Models for Open-Ended Text Generation
  • FRESHLLMS: REFRESHING LARGE LANGUAGE MODELS WITH SEARCH ENGINE AUGMENTATION
  • KwaiAgents: Generalized Information-seeking Agent System with Large Language Models
  • Rich Knowledge Sources Bring Complex Knowledge Conflicts: Recalibrating Models to Reflect Conflicting Evidence
  • Complex Claim Verification with Evidence Retrieved in the Wild
  • Retrieval-Augmented Generation for Large Language Models: A Survey
  • Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy
  • ChatQA: Building GPT-4 Level Conversational QA Models
  • RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
  • Benchmarking Large Language Models in Retrieval-Augmented Generation
  • HyDE: Precise Zero-Shot Dense Retrieval without Relevance Labels
  • PROMPTAGATOR: FEW-SHOT DENSE RETRIEVAL FROM 8 EXAMPLES
  • SYNERGISTIC INTERPLAY BETWEEN SEARCH AND LARGE LANGUAGE MODELS FOR INFORMATION RETRIEVAL
  • T-RAG: Lessons from the LLM Trenches
  • RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation
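  • ASK THE RIGHT QUESTIONS: ACTIVE QUESTION REFORMULATION WITH REINFORCEMENT LEARNING [traditional approach, for reference]
  • Query Expansion Techniques for Information Retrieval: a Survey [traditional approach, for reference]
  • Learning to Rewrite Queries [traditional approach, for reference]
  • Managing Diversity in Airbnb Search [traditional approach, for reference]
  • New embedding models for recall and ranking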

LLM+KG

  • Surveys
  • KGs for LLM reasoning
    • Using Large Language Models for Zero-Shot Natural Language Generation from Knowledge Graphs
    • MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models
    • Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering
    • Domain Specific Question Answering Over Knowledge Graphs Using Logical Programming and Large Language Models
    • BRING YOUR OWN KG: Self-Supervised Program Synthesis for Zero-Shot KGQA
    • StructGPT: A General Framework for Large Language Model to Reason over Structured Data
  • LLMs for KG construction
    • Enhancing Knowledge Graph Construction Using Large Language Models
    • LLM-assisted Knowledge Graph Engineering: Experiments with ChatGPT
    • ITERATIVE ZERO-SHOT LLM PROMPTING FOR KNOWLEDGE GRAPH CONSTRUCTION
    • Exploring Large Language Models for Knowledge Graph Completion

Humanoid Agents

  • HABITAT 3.0: A CO-HABITAT FOR HUMANS, AVATARS AND ROBOTS
  • Humanoid Agents: Platform for Simulating Human-like Generative Agents
  • Voyager: An Open-Ended Embodied Agent with Large Language Models
  • Shaping the future of advanced robotics
  • AUTORT: EMBODIED FOUNDATION MODELS FOR LARGE SCALE ORCHESTRATION OF ROBOTIC AGENTS
  • ROBOTIC TASK GENERALIZATION VIA HINDSIGHT TRAJECTORY SKETCHES

Pretraining Data (pretrain_data)

  • DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining
  • The Pile: An 800GB Dataset of Diverse Text for Language Modeling
  • CCNet: Extracting High Quality Monolingual Datasets from Web Crawl Data
  • WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models
  • CLUECorpus2020: A Large-scale Chinese Corpus for Pre-training Language Model
  • In-Context Pretraining: Language Modeling Beyond Document Boundaries

Domain Models (domain_llms)

  • MedGPT: Medical Concept Prediction from Clinical Narratives
  • BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining
  • Galactica: A Large Language Model for Science
  • PubMed GPT: A Domain-specific large language model for biomedical text ⭐️
  • BloombergGPT: A Large Language Model for Finance
  • ChatDoctor: Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge
  • Med-PaLM: Large Language Models Encode Clinical Knowledge [V1, V2] ⭐️
  • Augmented Large Language Models with Parametric Knowledge Guiding
  • XuanYuan 2.0: A Large Chinese Financial Chat Model with Hundreds of Billions Parameters
  • ChatLaw: Open-Source Legal Large Language Model ⭐️
  • MediaGPT: A Large Language Model For Chinese Media
  • SMILE: Single-turn to Multi-turn Inclusive Language Expansion via ChatGPT for Mental Health Support
  • KITLM: Domain-Specific Knowledge InTegration into Language Models for Question Answering
  • FinVis-GPT: A Multimodal Large Language Model for Financial Chart Analysis
  • EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce
  • FinGPT: Open-Source Financial Large Language Models
  • TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT
  • CFGPT: Chinese Financial Assistant with Large Language Model
  • Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue
  • LLEMMA: AN OPEN LANGUAGE MODEL FOR MATHEMATICS
  • CFBenchmark: Chinese Financial Assistant Benchmark for Large Language Model
  • InvestLM: A Large Language Model for Investment using Financial Domain Instruction Tuning
  • WeaverBird: Empowering Financial Decision-Making with Large Language Model, Knowledge Base, and Search Engine
  • FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design
  • MEDITAB: SCALING MEDICAL TABULAR DATA PREDICTORS VIA DATA CONSOLIDATION, ENRICHMENT, AND REFINEMENT
  • PLLaMa: An Open-source Large Language Model for Plant Science

LLM Long-Text Processing (long_input)

  • Positional encoding and attention mechanism optimization
  • Context compression and ranking approaches
    • Lost in the Middle: How Language Models Use Long Contexts ⭐️
    • LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
    • LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression ⭐️
    • Learning to Compress Prompts with Gist Tokens
    • Unlocking Context Constraints of LLMs: Enhancing Context Efficiency of LLMs with Self-Information-Based Content Filtering
    • LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration
  • Training and model architecture approaches
    • Never Train from Scratch: FAIR COMPARISON OF LONG-SEQUENCE MODELS REQUIRES DATA-DRIVEN PRIORS
    • Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon
    • Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering
    • Focused Transformer: Contrastive Training for Context Scaling
    • Effective Long-Context Scaling of Foundation Models
    • ON THE LONG RANGE ABILITIES OF TRANSFORMERS
    • Efficient Long-Range Transformers: You Need to Attend More, but Not Necessarily at Every Layer
    • POSE: EFFICIENT CONTEXT WINDOW EXTENSION OF LLMS VIA POSITIONAL SKIP-WISE TRAINING
    • LONGLORA: EFFICIENT FINE-TUNING OF LONG-CONTEXT LARGE LANGUAGE MODELS
    • LongAlign: A Recipe for Long Context Alignment of Large Language Models
    • Data Engineering for Scaling Language Models to 128K Context
  • Efficiency optimization
    • Efficient Attention: Attention with Linear Complexities
    • Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
    • HyperAttention: Long-context Attention in Near-Linear Time
    • FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
    • With Greater Text Comes Greater Necessity: Inference-Time Training Helps Long Text Generation

LLM Long-Text Generation (long_output)

  • Re3: Generating Longer Stories With Recursive Reprompting and Revision
  • RECURRENTGPT: Interactive Generation of (Arbitrarily) Long Text
  • DOC: Improving Long Story Coherence With Detailed Outline Control
  • Weaver: Foundation Models for Creative Writing
  • Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models

NL2SQL

  • LLM-based approaches
    • DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction ⭐️
    • C3: Zero-shot Text-to-SQL with ChatGPT ⭐️
    • SQL-PALM: IMPROVED LARGE LANGUAGE MODEL ADAPTATION FOR TEXT-TO-SQL
    • BIRD: Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQL ⭐️
    • A Case-Based Reasoning Framework for Adaptive Prompting in Cross-Domain Text-to-SQL
    • ChatDB: AUGMENTING LLMS WITH DATABASES AS THEIR SYMBOLIC MEMORY
    • A comprehensive evaluation of ChatGPT’s zero-shot Text-to-SQL capability
    • Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning
  • Domain Knowledge Intensive
    • Towards Knowledge-Intensive Text-to-SQL Semantic Parsing with Formulaic Knowledge
    • Bridging the Generalization Gap in Text-to-SQL Parsing with Schema Expansion
    • Towards Robustness of Text-to-SQL Models against Synonym Substitution
    • FinQA: A Dataset of Numerical Reasoning over Financial Data
  • others
    • RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL
    • MIGA: A Unified Multi-task Generation Framework for Conversational Text-to-SQL

Code Generation

  • Code Generation with AlphaCodium: From Prompt Engineering to Flow Engineering
  • Codeforces as an Educational Platform for Learning Programming in Digitalization
  • Competition-Level Code Generation with AlphaCode
  • CODECHAIN: TOWARDS MODULAR CODE GENERATION THROUGH CHAIN OF SELF-REVISIONS WITH REPRESENTATIVE SUB-MODULES

Reducing Model Hallucination (reliability)

  • Survey
    • Large language models and the perils of their hallucinations
    • Survey of Hallucination in Natural Language Generation
    • Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models
    • A Survey of Hallucination in Large Foundation Models
    • A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions
    • Calibrated Language Models Must Hallucinate
    • Why Does ChatGPT Fall Short in Providing Truthful Answers?
  • Prompt or Tuning
    • R-Tuning: Teaching Large Language Models to Refuse Unknown Questions
    • PROMPTING GPT-3 TO BE RELIABLE
    • ASK ME ANYTHING: A SIMPLE STRATEGY FOR PROMPTING LANGUAGE MODELS ⭐️
    • On the Advance of Making Language Models Better Reasoners
    • RefGPT: Reference → Truthful & Customized Dialogues Generation by GPTs and for GPTs
    • Rethinking with Retrieval: Faithful Large Language Model Inference
    • GENERATE RATHER THAN RETRIEVE: LARGE LANGUAGE MODELS ARE STRONG CONTEXT GENERATORS
    • Large Language Models Struggle to Learn Long-Tail Knowledge
  • Decoding Strategy
    • Trusting Your Evidence: Hallucinate Less with Context-aware Decoding ⭐️
    • SELF-REFINE: ITERATIVE REFINEMENT WITH SELF-FEEDBACK ⭐️
    • Enhancing Self-Consistency and Performance of Pre-Trained Language Models through Natural Language Inference
    • Inference-Time Intervention: Eliciting Truthful Answers from a Language Model
    • Enabling Large Language Models to Generate Text with Citations
    • Factuality Enhanced Language Models for Open-Ended Text Generation
    • KL-Divergence Guided Temperature Sampling
    • KCTS: Knowledge-Constrained Tree Search Decoding with Token-Level Hallucination Detection
    • CONTRASTIVE DECODING IMPROVES REASONING IN LARGE LANGUAGE MODELS
    • Contrastive Decoding: Open-ended Text Generation as Optimization
  • Probing and Detection
    • Automatic Evaluation of Attribution by Large Language Models
    • QAFactEval: Improved QA-Based Factual Consistency Evaluation for Summarization
    • Zero-Resource Hallucination Prevention for Large Language Models
    • LLM Lies: Hallucinations are not Bugs, but Features as Adversarial Examples
    • Language Models (Mostly) Know What They Know ⭐️
    • LM vs LM: Detecting Factual Errors via Cross Examination
    • Do Language Models Know When They’re Hallucinating References?
    • SELFCHECKGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
    • SELF-CONTRADICTORY HALLUCINATIONS OF LLMS: EVALUATION, DETECTION AND MITIGATION
    • Self-consistency for open-ended generations
    • Improving Factuality and Reasoning in Language Models through Multiagent Debate
    • Selective-LAMA: Selective Prediction for Confidence-Aware Evaluation of Language Models
    • Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
  • Reviewing and Calibration
    • Truth-o-meter: Collaborating with llm in fighting its hallucinations
    • RARR: Researching and Revising What Language Models Say, Using Language Models
    • CRITIC: LARGE LANGUAGE MODELS CAN SELF-CORRECT WITH TOOL-INTERACTIVE CRITIQUING
    • VALIDATING LARGE LANGUAGE MODELS WITH RELM
    • PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions
    • Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback
    • Adaptive Chameleon or Stubborn Sloth: Unraveling the Behavior of Large Language Models in Knowledge Clashes
    • Woodpecker: Hallucination Correction for Multimodal Large Language Models
    • Zero-shot Faithful Factual Error Correction

LLM Evaluation (evaluation)

  • Factuality evaluation
    • TRUSTWORTHY LLMS: A SURVEY AND GUIDELINE FOR EVALUATING LARGE LANGUAGE MODELS’ ALIGNMENT
    • TrueTeacher: Learning Factual Consistency Evaluation with Large Language Models
    • TRUE: Re-evaluating Factual Consistency Evaluation
    • FACTSCORE: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
    • KoLA: Carefully Benchmarking World Knowledge of Large Language Models
    • When Not to Trust Language Models: Investigating Effectiveness of Parametric and Non-Parametric Memories
    • FACTOOL: Factuality Detection in Generative AI A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
  • Detection tasks
    • Detecting Pretraining Data from Large Language Models
    • Scalable Extraction of Training Data from (Production) Language Models
    • Rethinking Benchmark and Contamination for Language Models with Rephrased Samples

Inference Optimization (inference)

  • Fast Transformer Decoding: One Write-Head is All You Need
  • Fast Inference from Transformers via Speculative Decoding
  • GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
  • Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
  • SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference
  • BatchPrompt: Accomplish more with less

Model Knowledge Editing (model_edit)

  • ROME: Locating and Editing Factual Associations in GPT
  • Transformer Feed-Forward Layers Are Key-Value Memories
  • MEMIT: Mass-Editing Memory in a Transformer
  • MEND: Fast Model Editing at Scale
  • Editing Large Language Models: Problems, Methods, and Opportunities
  • Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch

Model Merging and Pruning (model_merge)

  • Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
  • DARE: Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
  • EDITING MODELS WITH TASK ARITHMETIC
  • TIES-Merging: Resolving Interference When Merging Models
  • LM-Cocktail: Resilient Tuning of Language Models via Model Merging
  • SLICEGPT: COMPRESS LARGE LANGUAGE MODELS BY DELETING ROWS AND COLUMNS

Other Prompt Engineer(prompt_engineer)

  • Calibrate Before Use: Improving Few-Shot Performance of Language Models
  • In-Context Instruction Learning
  • LEARNING PERFORMANCE-IMPROVING CODE EDITS
  • Boosting Theory-of-Mind Performance in Large Language Models via Prompting
  • Generated Knowledge Prompting for Commonsense Reasoning
  • RECITATION-AUGMENTED LANGUAGE MODELS
  • kNN PROMPTING: BEYOND-CONTEXT LEARNING WITH CALIBRATION-FREE NEAREST NEIGHBOR INFERENCE
  • EmotionPrompt: Leveraging Psychology for Large Language Models Enhancement via Emotional Stimulus
  • Causality-aware Concept Extraction based on Knowledge-guided Prompting
  • LARGE LANGUAGE MODELS AS OPTIMIZERS

Multimodal

  • InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning
  • Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
  • PaLM-E: An Embodied Multimodal Language Model
  • LLaVA: Visual Instruction Tuning
  • MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
  • TabLLM: Few-shot Classification of Tabular Data with Large Language Models
  • BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
  • mPLUG-Owl : Modularization Empowers Large Language Models with Multimodality
  • LVLM-eHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
  • Mirasol3B: A Multimodal Autoregressive model for time-aligned and contextual modalities
  • Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models
  • AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
  • Sora tech report

Timeseries LLM

  • TimeGPT-1
  • Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook
  • TIME-LLM: TIME SERIES FORECASTING BY REPROGRAMMING LARGE LANGUAGE MODELS
  • Large Language Models Are Zero-Shot Time Series Forecasters
  • TEMPO: PROMPT-BASED GENERATIVE PRE-TRAINED TRANSFORMER FOR TIME SERIES FORECASTING
  • Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing
  • Lag-Llama: Towards Foundation Models for Time Series Forecasting
  • PromptCast: A New Prompt-based Learning Paradigm for Time Series Forecasting

Quantization

  • AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  • LLM-QAT: Data-Free Quantization Aware Training for Large Language Models
  • LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
  • SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Adversarial Attacking

  • Curiosity-driven Red-teaming for Large Language Models
  • Red Teaming Language Models with Language Models
  • EXPLORE, ESTABLISH, EXPLOIT: RED-TEAMING LANGUAGE MODELS FROM SCRATCH

Others

  • Pretraining on the Test Set Is All You Need (ha, the author clearly knows satire)
  • Learnware: Small Models Do Big
  • The economic potential of generative AI
  • A PhD Student’s Perspective on Research in NLP in the Era of Very Large Language Models

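In recent years AI has advanced rapidly, and within natural language processing (NLP), large language models (LLMs) and prompt engineering have become major research topics. Open sharing of datasets and models and applications of AI-generated content (AIGC) are expanding as well. A brief summary of these areas follows: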

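Prompt Engineering

Prompt engineering is the practice of designing and optimizing input prompts to steer a large language model (LLM) toward a specific output. It relies on the knowledge the LLM acquired during pretraining: a carefully designed prompt draws out that knowledge, so the model can handle many tasks in a zero-shot or few-shot setting.

Key points

  • Zero-shot learning: the model completes the task from the prompt alone, with no task-specific training data.
  • Few-shot learning: a small number of labeled examples are used to adapt the model quickly to a new task; in prompting, they are usually placed directly in the prompt as demonstrations.
  • Prompt design: choosing a suitable prompt is critical to model performance. Prompts range from hand-written fixed prompts to dynamic prompts generated by an algorithm.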
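The difference between a zero-shot and a few-shot prompt can be sketched in a few lines of Python. This is only an illustration: the function names, the "Input:/Output:" template, and the sentiment task are assumptions made for the example, not a format prescribed by any paper listed above or by a particular model API.

```python
from typing import Sequence, Tuple


def build_zero_shot_prompt(instruction: str, query: str) -> str:
    """Zero-shot: the instruction and the query only, no demonstrations."""
    return f"{instruction}\n\nInput: {query}\nOutput:"


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
) -> str:
    """Few-shot: labeled (input, output) pairs are placed before the query."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")  # blank line between demonstrations
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sentiment task, used only to show the two prompt shapes.
    instruction = "Classify the sentiment as positive or negative."
    demos = [("I loved this film.", "positive"), ("Terrible service.", "negative")]
    print(build_zero_shot_prompt(instruction, "The food was great."))
    print("---")
    print(build_few_shot_prompt(instruction, demos, "The food was great."))
```

The resulting string would then be sent to whichever LLM is in use. With an empty example list the few-shot builder produces the same text as the zero-shot one, so the two settings differ only in the demonstrations included.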

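Open-Source Data & Models

Sharing open-source datasets and models is essential to AI research and applications. Open datasets supply rich, diverse data, and open models let researchers and developers build on and improve existing work.

Key points

  • Datasets: high-quality open datasets such as WikiText and SQuAD provide the basis for model training.
  • Pretrained models: large pretrained models such as BERT and GPT-3 learn rich language representations from training on massive corpora. (Of these two, only BERT's weights are openly released.)
  • Community contributions: researchers and developers advance the field together by contributing datasets, models, and tools.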

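AIGC (AI-Generated Content)

AIGC refers to content generated automatically with AI, including text, images, audio, and video. Its rise has made creating personalized content more efficient and convenient.

Key points

  • Content creation: AIGC can be used for news writing, fiction, artwork generation, and more.
  • Personalization: content is generated to match a user's preferences and needs.
  • Copyright and ethics: AIGC raises challenges around copyright, originality, and ethics.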

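Summary

Prompt engineering, open-source data and models, and AIGC applications are three closely related parts of the AI field. Prompt engineering improves LLM performance by optimizing the input prompt, shared open data and models provide the foundation for AI research, and AIGC applications show what AI can do in content creation. As the technology advances, all three areas will keep developing and bring further innovation and change.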

