Large Language Models: Foundational Literature

Large Model Foundations

1. Attention Is All You Need https://arxiv.org/abs/1706.03762

2. Sequence to Sequence Learning with Neural Networks https://arxiv.org/abs/1409.3215

A sequence-to-sequence learning method based on deep neural networks (DNNs)

3. Neural Machine Translation by Jointly Learning to Align and Translate https://arxiv.org/abs/1409.0473

4. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/abs/1810.04805

5. Scaling Laws for Neural Language Models https://arxiv.org/pdf/2001.08361.pdf

6. Emergent Abilities of Large Language Models https://openreview.net/pdf?id=yzkSU5zdwD

7. Training Compute-Optimal Large Language Models (Chinchilla scaling law) https://arxiv.org/abs/2203.15556

8. Scaling Instruction-Finetuned Language Models https://arxiv.org/pdf/2210.11416.pdf

9. Direct Preference Optimization: Your Language Model is Secretly a Reward Model https://arxiv.org/pdf/2305.18290.pdf

10. Progress measures for grokking via mechanistic interpretability https://arxiv.org/abs/2301.05217

11. Language Models Represent Space and Time https://arxiv.org/abs/2310.02207

12. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts https://arxiv.org/abs/2112.06905

13. Adam: A Method for Stochastic Optimization https://arxiv.org/abs/1412.6980

14. Efficient Estimation of Word Representations in Vector Space (Word2Vec) https://arxiv.org/abs/1301.3781

15. Distributed Representations of Words and Phrases and their Compositionality https://arxiv.org/abs/1310.4546
