SIGIR + RecSys + CIKM + ACL + AAAI 2024 Paper Notes

SIGIR

Scaling Laws For Dense Retrieval

Explores scaling laws for information retrieval.
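
As background, such studies typically fit a power law in model size N and data size D; a generic form (given for orientation, not necessarily the paper's exact parameterization) is:

```latex
L(N, D) = \left(\frac{N_c}{N}\right)^{\alpha_N} + \left(\frac{D_c}{D}\right)^{\alpha_D}
```

where L is the evaluation loss and N_c, D_c, alpha_N, alpha_D are fitted constants.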

Large Language Models for Intent-Driven Session Recommendations

Motivation: existing intent-aware session recommendation (ISR) methods (1) assume all sessions possess a consistent, fixed number of intentions, and (2) learn latent intentions only within the embedding space, which greatly impedes the transparency of ISR.

  1. Prompt initialization: generate an initial ranked list.
  2. Prompt optimization: correct erroneous cases and infer the reasons behind the errors.
  3. Prompt selection: choose the best ranked list (a minimal sketch of the loop follows below).
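
A minimal sketch of this loop, assuming a hypothetical call_llm wrapper around any chat-completion API (prompt templates and round count are illustrative, not the paper's):

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")  # hypothetical wrapper

def rank_session(session_items, candidates, n_rounds=3):
    # 1. prompt initialization: get a first ranked list
    ranked = [call_llm(f"User session: {session_items}\nCandidates: {candidates}\n"
                       "Infer the user's intentions and rank the candidates.")]
    # 2. prompt optimization: diagnose mistakes, explain why, and re-rank
    for _ in range(n_rounds):
        ranked.append(call_llm(f"Previous ranking: {ranked[-1]}\n"
                               "Identify likely errors, explain the reasons, "
                               "then produce a corrected ranking."))
    # 3. prompt selection: keep whichever list is judged best
    return call_llm(f"Among these rankings, return the single best one: {ranked}")
```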

LLaRA: Large Language-Recommendation Assistant

Generative Retrieval as Multi-Vector Dense Retrieval

Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors

To reduce the cost of serving the LLM online, its layers are split: the frozen low layers handle a preliminary understanding of each individual item, aggregating it into an atomic representation and thereby compressing the sequence length.
The high layers process the atomic representations of each type of behavior sequence together with the candidate item; everything is then concatenated and passed through a CTR head.
The model is updated daily on 50M CTR samples; user and item representations can be computed offline.
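
A minimal sketch of the layer split, using generic transformer layers in place of the real LLM (all names and sizes here are hypothetical):

```python
import torch
import torch.nn as nn

d_model = 64  # toy size; the production model is far larger

def make_layers(n):
    return nn.ModuleList(
        nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        for _ in range(n))

class SplitLLMCTR(nn.Module):
    def __init__(self):
        super().__init__()
        self.low = make_layers(2)            # frozen: per-item understanding
        self.high = make_layers(2)           # trainable: cross-item fusion
        for p in self.low.parameters():      # freeze the low layers
            p.requires_grad = False
        self.ctr_head = nn.Linear(d_model, 1)

    def atom(self, tokens):                  # (1, n_tokens, d) -> (1, d)
        h = tokens
        for layer in self.low:
            h = layer(h)
        return h.mean(dim=1)                 # compress item text to one vector

    def forward(self, behavior_items, candidate):
        atoms = torch.cat([self.atom(t) for t in behavior_items]
                          + [self.atom(candidate)]).unsqueeze(0)  # (1, k+1, d)
        for layer in self.high:
            atoms = layer(atoms)
        return torch.sigmoid(self.ctr_head(atoms[:, -1]))  # score at candidate slot

model = SplitLLMCTR()
behaviors = [torch.randn(1, 8, d_model) for _ in range(3)]  # 3 items, 8 tokens each
print(model(behaviors, torch.randn(1, 8, d_model)))         # toy CTR probability
```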

Data-efficient Fine-tuning for LLM-based Recommendation

Selects the most effective data for training the LLM. Existing work relies on heuristics or requires optimization over large-scale data.

Two goals: (1) high accuracy and (2) low cost.
A small surrogate model computes an influence score, measuring each sample's effect on accuracy; an effort score identifies samples that are hard for the LLM.
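
A rough sketch of the two scores (assumptions: influence approximated by a gradient dot product on the surrogate, effort by the LLM's per-sample loss; the paper's exact estimators may differ):

```python
import torch

def influence_score(surrogate, loss_fn, train_sample, val_batch):
    """Grad-dot-product influence on the small surrogate: how much this
    sample's gradient agrees with the validation gradient."""
    def flat_grad(batch):
        surrogate.zero_grad()
        loss_fn(surrogate, batch).backward()
        return torch.cat([p.grad.flatten() for p in surrogate.parameters()
                          if p.grad is not None])
    return torch.dot(flat_grad(train_sample), flat_grad(val_batch)).item()

@torch.no_grad()
def effort_score(llm, loss_fn, sample):
    """Hard-for-the-LLM proxy: higher loss = more learning effort needed."""
    return loss_fn(llm, sample).item()

# Keep the samples scoring high on both, then fine-tune the LLM on that subset.
```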

Towards a Search Engine for Machines: Unified Ranking for Multiple Retrieval-Augmented Large Language Models

GraphGPT: Graph Instruction Tuning for Large Language Models

graph grounding to link textual and graph structures
Aligns graph representations with the LLM.
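
A minimal sketch of the alignment idea, assuming a simple linear projector from GNN node embeddings into the LLM token-embedding space (GraphGPT's actual graph-instruction tuning is richer than this):

```python
import torch
import torch.nn as nn

class GraphProjector(nn.Module):
    """Map frozen GNN node embeddings into the LLM token-embedding space,
    so graph 'tokens' can be spliced into the prompt."""
    def __init__(self, d_graph=128, d_llm=4096):
        super().__init__()
        self.proj = nn.Linear(d_graph, d_llm)

    def forward(self, node_embs):        # (n_nodes, d_graph)
        return self.proj(node_embs)      # (n_nodes, d_llm) pseudo-token embeddings

graph_tokens = GraphProjector()(torch.randn(5, 128))
# graph_tokens would be concatenated with text token embeddings before the LLM.
```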

LLMGR: Large Language Model-based Generative Retrieval in Alipay Search

Instead of a multi-stage retrieval pipeline, the knowledge inside the LLM is used to generate results directly.
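
Generative retrieval is typically implemented by constraining decoding to valid item identifiers. A minimal Hugging Face sketch, with an illustrative model and a toy catalog (the paper's identifier design is not reproduced here):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("t5-small")           # illustrative model
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Toy "catalog": the token-id sequences of valid item identifiers.
item_ids = [tok.encode(s) for s in ["item_001", "item_042"]]

def allowed(batch_id, prefix):
    # Only allow continuations that stay on some catalog identifier.
    p = prefix.tolist()[1:]                  # drop the decoder start token
    nxt = {seq[len(p)] for seq in item_ids
           if len(seq) > len(p) and seq[:len(p)] == p}
    return sorted(nxt) or [tok.eos_token_id]

out = model.generate(**tok("query: red shoes", return_tensors="pt"),
                     prefix_allowed_tokens_fn=allowed, num_beams=2)
print(tok.decode(out[0], skip_special_tokens=True))
```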

CoSearchAgent: A Lightweight Collaborative Search Agent with Large Language Models

Sequential Recommendation with Latent Relations based on Large Language Model

Previous recommendation models that consider item relations use relations from knowledge graphs, which are sparse and require manual definition.
Proposes using an LLM to provide new inter-item relations.
(Figure: knowledge graph)
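
A minimal sketch of how an LLM could be asked for latent inter-item relations (the relation taxonomy and prompt are hypothetical, not the paper's):

```python
# Ask an LLM to label the latent relation between two items; the returned
# relation can then link items as extra context for the recommender.
RELATIONS = ["complement", "substitute", "same_brand", "same_scenario"]

def relation_prompt(item_a: str, item_b: str) -> str:
    return (f"Items: '{item_a}' and '{item_b}'.\n"
            f"Which relation best describes them? Options: {RELATIONS}.\n"
            "Answer with exactly one option.")

print(relation_prompt("iPhone 15", "MagSafe charger"))
```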

Self-Improving Teacher Cultivates Better Student: Distillation Calibration for Multimodal Large Language Models

Multimodal knowledge distillation.

Dynamic In-Context Learning from Nearest Neighbors for Bundle Generation

Unsupervised Large Language Model Alignment for Information Retrieval via Contrastive Feedback

Dimension Importance Estimation for Dense Information Retrieval

Graded Relevance Scoring of Written Essays with Dense Retrieval

I3: Intent-Introspective Retrieval Conditioned on Instructions

Drop your Decoder: Pre-training with Bag-of-Word Prediction for Dense Passage Retrieval

Generative Retrieval via Term Set Generation

EASE-DR: Enhanced Sentence Embeddings for Dense Retrieval

Fine-Tuning LLaMA for Multi-Stage Text Retrieval

Large Language Models and Future of Information Retrieval: Opportunities and Challenges

C-Pack: Packed Resources For General Chinese Embeddings

Provides a package of resources for training Chinese embeddings.
Training has three stages: pre-training, fine-tuning on general data, and task-specific fine-tuning.
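
The two fine-tuning stages train the embedding model contrastively on text pairs; a minimal in-batch InfoNCE sketch (pooling, temperature, and batch construction here are assumptions, not C-Pack's exact recipe):

```python
import torch
import torch.nn.functional as F

def infonce(q, p, temperature=0.05):
    """In-batch contrastive loss over L2-normalized query/passage embeddings
    (q, p: (batch, d)); the diagonal pairs are the positives."""
    q, p = F.normalize(q, dim=-1), F.normalize(p, dim=-1)
    logits = q @ p.T / temperature
    return F.cross_entropy(logits, torch.arange(q.size(0)))

loss = infonce(torch.randn(8, 768), torch.randn(8, 768))  # toy embeddings
```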

RecSys

Scaling Law of Large Sequential Recommendation Models

Explores scaling laws on purely ID-based sequential recommendation.
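
A minimal sketch of how such a law is fitted; the data points below are made-up placeholders, not results from the paper:

```python
import numpy as np

# Power laws L(N) = c * N**alpha are straight lines in log-log space,
# so a linear fit recovers the exponent.
sizes = np.array([1e6, 1e7, 1e8, 1e9])     # parameter counts
losses = np.array([2.1, 1.7, 1.4, 1.2])    # placeholder eval losses
alpha, b = np.polyfit(np.log(sizes), np.log(losses), 1)
print(f"L(N) ~ {np.exp(b):.2f} * N^{alpha:.3f}")  # alpha < 0: loss falls with size
```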

CIKM

Large Language Models Enhanced Collaborative Filtering

First fine-tunes the large model so that ...

ACL

Grounding Language Model with Chunking-Free In-Context Retrieval

Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval

Spiral of Silence: How is Large Language Model Killing Information Retrieval?—A Case Study on Open Domain Question Answering

Synergistic Interplay between Search and Large Language Models for Information Retrieval

Search-Adaptor: Embedding Customization for Information Retrieval

Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval

Distillation Enhanced Generative Retrieval

Token-wise Influential Training Data Retrieval for Large Language Models

Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation

ADAM: Dense Retrieval Distillation with Adaptive Dark Examples

VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval

History-Aware Conversational Dense Retrieval

Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models

A Multi-Task Embedder For Retrieval Augmented LLMs

DAPR: A Benchmark on Document-Aware Passage Retrieval

DADA: Distribution-Aware Domain Adaptation of PLMs for Information Retrieval

Retrieval-Augmented Retrieval: Large Language Models are Strong Zero-Shot Retriever

ContextBLIP: Doubly Contextual Alignment for Contrastive Image Retrieval from Linguistically Complex Descriptions

D2LLM: Decomposed and Distilled Large Language Models for Semantic Search

Distills a single-tower (cross-encoder) model into a dual-tower one; modules are added on top of the LLM without training the LLM itself, to retain accuracy while gaining efficiency.
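
A minimal sketch of the distillation objective, assuming KL between the teacher's cross-encoder scores and the student's dot-product scores (the extra modules D2LLM adds on the frozen LLM are omitted):

```python
import torch
import torch.nn.functional as F

def distill_loss(student_q, student_d, teacher_scores, temperature=1.0):
    """KL from the single-tower teacher's relevance logits to the dual-tower
    student's dot-product scores.
    student_q: (batch, d), student_d: (batch, n_docs, d),
    teacher_scores: (batch, n_docs) cross-encoder logits."""
    s = torch.einsum("bd,bnd->bn", student_q, student_d)
    return F.kl_div(F.log_softmax(s / temperature, dim=-1),
                    F.softmax(teacher_scores / temperature, dim=-1),
                    reduction="batchmean")

loss = distill_loss(torch.randn(4, 64), torch.randn(4, 8, 64), torch.randn(4, 8))
```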

RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking

In-Batch Negatives for Knowledge Distillation with Tightly-Coupled Teachers for Dense Retrieval

AAAI

Fine-Grained Distillation for Long Document Retrieval

TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback

Dense Text Retrieval based on Pretrained Language Models: A Survey

Optimizing Dense Retrieval Model Training with Hard Negatives

Theoretically proves the advantage of hard negatives: they better teach the model to rank the top positions correctly.
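
For intuition, a common static hard-negative mining scheme (one of the strategies such analyses cover; the details here are illustrative):

```python
import numpy as np

def mine_hard_negatives(q_emb, doc_embs, positive_ids, k=5):
    """Take the top-scoring documents that are not labeled positive."""
    scores = doc_embs @ q_emb                    # (n_docs,)
    ranked = np.argsort(-scores)                 # best first
    return [i for i in ranked if i not in set(positive_ids)][:k]

negs = mine_hard_negatives(np.random.randn(64), np.random.randn(100, 64), {3, 7})
```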

Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder

A theoretical argument about model size, showing the decoder needs to be small (weak) so that the encoder is forced to produce strong representations.

Dataset Regeneration for Sequential Recommendation

Focuses on regenerating the dataset so that the same model trained on it performs better.
