Original: [Arxiv 2024] Self-Rewarding Language Models
2024-08-28 11:42:02 1034
Original: [NeurIPS 2023] Self-Refine: Iterative Refinement with Self-Feedback
2024-08-25 10:57:32 287
Original: [ACL 2024] Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
2024-08-23 00:46:47 728
Original: [ACL 2024] Revisiting Knowledge Distillation for Autoregressive Language Models
2024-08-21 10:55:25 763
Original: [Arxiv 2024] TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Spec Dec
2024-08-05 15:59:28 301
Original: [Arxiv 2024] EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
2024-08-05 15:09:40 618
Original: [ICLR 2024] On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes
2024-08-04 21:40:55 956
Original: [ACL 2023] Distilling Step-by-Step! Outperforming LLMs with Less Data and Smaller Model
2024-08-04 11:31:14 839
Original: [NeurIPS 2022] Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
2024-08-04 09:54:14 168
Original: Multi-Head Latent Attention: Boosting Inference Efficiency
2024-08-01 16:17:48 910
Original: LLM Preference Alignment (PPO, DPO, SimPO)
2024-08-01 11:18:05 717
Original: Introduction to Deep Reinforcement Learning (Policy Gradient, Actor-Critic, PPO)
2024-07-30 10:48:51 987
Original: Introduction to Popular LLM Components
2024-06-05 16:51:43 599
Original: [NeurIPS 2022] FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
2024-05-29 15:47:34 1037
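Original: One Simple Trick to Improve Quantization Accuracy! IntactKV: An LLM Quantization Method That Keeps Pivot Tokens Lossless
This post introduces IntactKV, our work on large language model quantization. It can be used as a plug-in to improve existing mainstream quantization methods such as GPTQ, AWQ, and QuaRot. The paper's authors are from Tsinghua University, Huawei Noah's Ark Lab, the Institute of Automation of the Chinese Academy of Sciences, and The Chinese University of Hong Kong. The code is open source, and everyone is welcome to use it!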
2024-05-29 15:07:29 1128
Original: [SC 2020] ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
2024-05-10 14:52:27 633
Original: [Blog 2023] Flash-Decoding for long-context inference
2024-05-07 21:08:31 480
Original: [ICLR 2024] FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
2024-05-07 18:32:36 702
Original: [Arxiv 2023] GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints
2023-11-19 15:47:18 36
Original: [Arxiv 2019] Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
2023-09-27 14:35:36 509 1
Original: [ICLR 2023] LPT: Long-tailed Prompt Tuning for Image Classification
2023-08-21 16:00:20 1869
Original: [NeurIPS 2019] GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
2023-06-03 19:55:38 56
Original: [ECCV 2022] VL-LTR: Learning Class-wise Visual-Linguistic Representation for LTR
2023-05-29 20:44:22 357
Original: [NeurIPS 2022] Relational Proxies: Emergent Relationships as Fine-Grained Discriminators
2023-05-11 16:29:11 193 1
Original: [CVPR 2023] HIER: Metric Learning Beyond Class Labels via Hierarchical Regularization
2023-05-05 10:52:55 333
Original: [NeurIPS 2019] Hyperspherical Prototype Networks
2023-03-28 23:29:48 225
Original: [Arxiv 2022] InstructGPT: Training language models to follow instructions with human feedback
2023-03-25 00:46:33 215
Original: [Arxiv 2022] HIRL: A General Framework for Hierarchical Image Representation Learning
2023-03-12 16:10:08 131
Original: [CVPR 2022] HCSC: Hierarchical Contrastive Selective Coding
2023-03-12 14:39:18 205
Original: [NIPS 2017] Improved Training of Wasserstein GANs (WGAN-GP)
2023-03-11 14:06:06 617 4
Original: [ICML 2017] Wasserstein Generative Adversarial Networks (WGAN)
2023-03-11 09:02:45 752
Original: [ICLR 2016] Unsupervised representation learning with DCGANs
Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. Contents: Introduction; Approach and Model Architecture. Introduction: Motivation and Contributions: (1) Representation learning from unlabeled data. Learning reusable.
2023-03-10 20:36:10 676
Original: [Arxiv 2023] Hyperbolic Contrastive Learning
2023-03-06 09:28:04 399
Original: [CVPR 2022] Balanced Contrastive Learning for Long-Tailed Visual Recognition
2023-03-04 20:32:49 1291
Original: [NeurIPS 2020] Supervised Contrastive Learning
2023-03-02 16:42:59 838
Original: [ACM MM 2021] RAMS-Trans: Recurrent Attention Multi-scale Transformer for FGVC
2023-02-28 09:35:11 293
软件加密解密.rar
2021-02-07