【S2ST】PolyVoice: Language Models for Speech to Speech Translation


LM-based method in S2ST

Contributions

  • A decoder-only model for speech-to-speech translation.
  • A unit-based audio LM that predicts SoundStream codec tokens.

Overview of PolyVoice

PolyVoice has two LM-based components: an S2UT (speech-to-unit translation) front-end for translation and a U2S (unit-to-speech) back-end for synthesis.
An extra language model handles duration prediction.
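
The data flow between the three language models can be sketched roughly as below. This is only an illustration of how the pieces connect, not the paper's implementation: the function name `polyvoice_pipeline`, the callable signatures, and the choice to pass durations alongside the units into the U2S stage are all assumptions made for this sketch.

```python
from typing import Callable, List

Units = List[int]        # discrete semantic unit IDs
CodecTokens = List[int]  # discrete acoustic (codec) token IDs


def polyvoice_pipeline(
    source_units: Units,
    s2ut: Callable[[Units], Units],
    duration_lm: Callable[[Units], List[int]],
    u2s: Callable[[Units, List[int]], CodecTokens],
) -> CodecTokens:
    """Hypothetical wiring of the three LMs (all names are assumptions)."""
    # 1. Front-end: translate source-language units into target-language units.
    target_units = s2ut(source_units)
    # 2. Extra LM: predict one duration per target unit.
    durations = duration_lm(target_units)
    if len(durations) != len(target_units):
        raise ValueError("expected one duration per target unit")
    # 3. Back-end: synthesize acoustic tokens from units and their durations.
    return u2s(target_units, durations)


# Toy stand-ins so the sketch runs end to end; real models would replace these.
toy_s2ut = lambda units: [u + 100 for u in units]
toy_duration_lm = lambda units: [2] * len(units)
toy_u2s = lambda units, durs: [u for u, d in zip(units, durs) for _ in range(d)]

tokens = polyvoice_pipeline([1, 2], toy_s2ut, toy_duration_lm, toy_u2s)
```

Keeping each stage behind a plain callable makes the point of the overview visible: translation, duration prediction, and synthesis are separate models chained through discrete units.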

  • Semantic units are extracted by mHuBERT.
  • Acoustic units are SoundStream codec tokens (residual vector quantizer), modeled with an autoregressive model and a non-autoregressive model.
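
To make "residual vector quantizer" concrete, here is a minimal sketch of the general technique in plain Python. It is not SoundStream's code: the tiny two-level codebooks and the function names `rvq_encode` and `rvq_decode` are made up for illustration, and a real codec learns its codebooks and operates on encoder feature frames. Each level quantizes whatever the previous levels failed to capture, so one vector becomes one code index per level.

```python
from typing import List, Sequence

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest(codebook: Codebook, x: Vector) -> int:
    """Index of the codebook entry closest to x (squared Euclidean distance)."""
    def dist(c: Vector) -> float:
        return sum((a - b) ** 2 for a, b in zip(c, x))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))


def rvq_encode(x: Vector, codebooks: Sequence[Codebook]) -> List[int]:
    """Quantize x level by level; each level encodes the remaining residual."""
    residual = list(x)
    codes = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        codes.append(idx)
        # Subtract the chosen entry so the next level sees what is left over.
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return codes


def rvq_decode(codes: Sequence[int], codebooks: Sequence[Codebook]) -> List[float]:
    """Reconstruct by summing the selected entry from every level."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


# Toy codebooks (assumed values): a coarse level followed by a fine level.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]],
    [[0.0, 0.0], [0.25, -0.25], [-0.25, 0.25]],
]
codes = rvq_encode([1.2, 0.8], codebooks)
approx = rvq_decode(codes, codebooks)
```

Because each frame yields several code indices (one per quantizer level), a model has to predict a stack of tokens per frame, which is one motivation for combining an autoregressive model with a non-autoregressive one.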
