Setting up whisper speech transcription and real-time recording on the Jetson AGX Orin platform

1: Download the C++ version of whisper

whisper.cpp

Build with: WHISPER_CUDA=1 make -j

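Errors

A: nvcc on this platform rejects the default architecture value 'all'. Run nvcc --list-gpu-arch to see which compute architectures are supported, then edit CUDA_ARCH_FLAG in the Makefile (compute_87 for Orin).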

nvcc fatal   : Value 'all' is not defined for option 'gpu-architecture'
make: *** [Makefile:290: ggml-cuda.o] Error 1
make: *** Waiting for unfinished jobs....
make: *** [Makefile:287: ggml-cuda/getrows.o] Error 1
nvcc fatal   : Value 'all' is not defined for option 'gpu-architecture'
nvcc fatal   : Value 'all' is not defined for option 'gpu-architecture'
make: *** [Makefile:287: ggml-cuda/diagmask.o] Error 1
make: *** [Makefile:287: ggml-cuda/mmvq.o] Error 1
nvcc fatal   : Value 'all' is not defined for option 'gpu-architecture'
make: *** [Makefile:287: ggml-cuda/quantize.o] Error 1
nvidia@ubuntu:~/TTS/whisper.cpp$ nvcc --list-gpu-arch
compute_35
compute_37
compute_50
compute_52
compute_53
compute_60
compute_61
compute_62
compute_70
compute_72
compute_75
compute_80
compute_86
compute_87
ifdef WHISPER_CUDA
	ifeq ($(shell expr $(NVCC_VERSION) \>= 11.6), 1)
		CUDA_ARCH_FLAG ?= native
	else
		CUDA_ARCH_FLAG ?= compute_87
	endif
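
Because the Makefile excerpt above assigns CUDA_ARCH_FLAG with ?=, a value already set in the environment should take precedence, so the same fix can probably be applied without editing the file. The sketch below is an assumption-laden helper, not part of whisper.cpp: arch_supported is a hypothetical function name, and the script only prints the suggested build command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the architecture named in $1
# appears as a whole line in the list read from stdin.
arch_supported() {
    grep -qx "$1"
}

ARCH=compute_87   # Orin reports compute capability 8.7

if command -v nvcc >/dev/null 2>&1 && nvcc --list-gpu-arch | arch_supported "$ARCH"; then
    # ?= in the Makefile only assigns when the variable is not already set,
    # so passing it through the environment overrides the default.
    echo "WHISPER_CUDA=1 CUDA_ARCH_FLAG=$ARCH make -j"
else
    echo "nvcc does not list $ARCH; choose a value from nvcc --list-gpu-arch" >&2
fi
```

If the printed command builds cleanly, the Makefile edit for error A is not needed; the edit for error B still is.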

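B: Error: make tries to run the CFLAGS line as a command. Fix: comment out the block that starts at line 339 of the Makefile.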

CFLAGS   += -mcpu=native
make: CFLAGS: Command not found
# ifneq ($(filter aarch64%,$(UNAME_M)),)
# 	CFLAGS   += -mcpu=native
# 	CXXFLAGS += -mcpu=native
# endif

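With these changes the build succeeds. The stream example does real-time transcription; a larger model gives somewhat better results.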

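Downloading the models with the script from the GitHub repository fails; they can be downloaded from the following links instead: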

https://huggingface.co/ggerganov/whisper.cpp
https://ggml.ggerganov.com
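
A manual download can be scripted with curl. This is a minimal sketch under assumptions: the resolve/main/ggml-<name>.bin URL layout on Hugging Face is assumed rather than taken from this post, model_url is a hypothetical helper, and the file is saved under the ggml-model-whisper-<name>.bin name that the commands in this post expect.

```shell
#!/bin/sh
# Hypothetical helper: print the assumed Hugging Face URL for a model name.
model_url() {
    printf 'https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-%s.bin\n' "$1"
}

# Download every model named on the command line into ./models,
# for example: sh get_model.sh base
for name in "$@"; do
    mkdir -p models
    # -L follows redirects, --fail makes curl exit non-zero on HTTP errors
    curl -L --fail -o "models/ggml-model-whisper-$name.bin" "$(model_url "$name")"
done
```

Run it from the whisper.cpp directory so that the models/ path matches the commands used later.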

Build the main program and the stream example, then run them:

WHISPER_CUDA=1 make -j

WHISPER_CUDA=1 make stream -j

./main -m models/ggml-model-whisper-base.bin -f ../news.wav -l Chinese

./stream -m ./models/ggml-model-whisper-base.bin -t 6 --step 0 --length 15000 -vth 1 -l chinese -c 1
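
The stream command above packs several options into one line. The sketch below spells them out as named variables; the variable names are illustrative only, and the comments reflect my reading of the stream example's options, so check ./stream --help on your build. The script prints the assembled command rather than running it.

```shell
#!/bin/sh
MODEL=./models/ggml-model-whisper-base.bin
THREADS=6         # -t: number of CPU threads
STEP_MS=0         # --step: 0 selects sliding-window mode, which uses VAD
LENGTH_MS=15000   # --length: length of the audio window in milliseconds
VAD_THOLD=1       # -vth: voice activity detection threshold
LANG_NAME=chinese # -l: spoken language
CAPTURE_ID=1      # -c: ID of the audio capture device

STREAM_CMD="./stream -m $MODEL -t $THREADS --step $STEP_MS --length $LENGTH_MS -vth $VAD_THOLD -l $LANG_NAME -c $CAPTURE_ID"
echo "$STREAM_CMD"
# To start real-time transcription from the whisper.cpp directory:
# $STREAM_CMD
```

If no audio is captured, the capture device ID is the first value worth changing, since the numbering depends on which microphones are attached.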

Many more example applications can be found in the examples directory.

1: Audio must be converted to 16-bit PCM WAV before transcription (the command below also resamples to 16 kHz mono)

ffmpeg -i input.mp3 -ar 16000 -ac 1 -c:a pcm_s16le output.wav
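
When several recordings need converting, the same ffmpeg options can be wrapped in a loop. This is a sketch only: wav_name is a hypothetical helper, and it assumes every input file name has an extension and that its directory path contains no dots.

```shell
#!/bin/sh
# Hypothetical helper: replace the extension of $1 with .wav.
wav_name() {
    printf '%s\n' "${1%.*}.wav"
}

# Convert every file given on the command line to 16 kHz, mono, 16-bit PCM,
# for example: sh to_wav.sh a.mp3 b.m4a
for src in "$@"; do
    dst=$(wav_name "$src")
    # Do not overwrite an input that is already a .wav file
    [ "$src" = "$dst" ] && dst="${src%.*}_16k.wav"
    ffmpeg -nostdin -i "$src" -ar 16000 -ac 1 -c:a pcm_s16le "$dst"
done
```

The -nostdin option keeps ffmpeg from reading the terminal inside the loop.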

2: Transcription results

nvidia@ubuntu:~/TTS/whisper.cpp$ time ./main -m models/ggml-model-whisper-base.bin -f ../news.wav -l Chinese -pp -ps
whisper_init_from_file_with_params_no_state: loading model from 'models/ggml-model-whisper-base.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs       = 99
whisper_backend_init: using CUDA backend
ggml_cuda_init: GGML_CUDA_FORCE_MMQ:   no
ggml_cuda_init: CUDA_USE_TENSOR_CORES: yes
ggml_cuda_init: found 1 CUDA devices:
  Device 0: Orin, compute capability 8.7, VMM: yes
whisper_model_load:    CUDA0 total size =   147.37 MB
whisper_model_load: model size    =  147.37 MB
whisper_backend_init: using CUDA backend
whisper_init_state: kv self size  =   16.52 MB
whisper_init_state: kv cross size =   18.43 MB
whisper_init_state: compute buffer (conv)   =   16.39 MB
whisper_init_state: compute buffer (encode) =  132.07 MB
whisper_init_state: compute buffer (cross)  =    4.78 MB
whisper_init_state: compute buffer (decode) =   96.48 MB

system_info: n_threads = 4 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | METAL = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 | CUDA = 1 | COREML = 0 | OPENVINO = 0

main: processing '../news.wav' (3134182 samples, 195.9 sec), 4 threads, 1 processors, 5 beams + best of 5, lang = chinese, task = transcribe, timestamps = 1 ...


[00:00:00.000 --> 00:00:04.720]  [_BEG_]早啊 新聞來了[_TT_236]

[00:02:51.080 --> 00:02:54.200]  [_BEG_]吉林東部遼寧東北部新疆伊莉河谷[_TT_156]
[00:02:54.200 --> 00:02:58.840]  貴州南部雲南東部廣西西部等地部份地區有中道大雨[_TT_388]
[00:02:58.840 --> 00:03:04.120]  其中貴州西南部雲南東北部廣西西北部等地局部地區有暴雨[_TT_652]
[00:03:04.120 --> 00:03:08.320]  感謝關注央視新聞[_TT_862]
[00:03:08.320 --> 00:03:11.160]  更多資訊可以下載央視新聞客戶專[_TT_1004]
[00:03:11.160 --> 00:03:12.520]  我們明天早上見[_TT_1072]
whisper_print_progress_callback: progress =  98%
[00:03:12.520 --> 00:03:15.520]  [_BEG_]祝祝祝祝祝祝祝祝祝祝祝祝祝祝[_TT_150]
whisper_print_progress_callback: progress =  99%


whisper_print_timings:     load time =   221.28 ms
whisper_print_timings:     fallbacks =   0 p /   1 h
whisper_print_timings:      mel time =   167.07 ms
whisper_print_timings:   sample time =  3298.24 ms /  5411 runs (    0.61 ms per run)
whisper_print_timings:   encode time =  1261.24 ms /     8 runs (  157.66 ms per run)
whisper_print_timings:   decode time =     0.00 ms /     1 runs (    0.00 ms per run)
whisper_print_timings:   batchd time =  5198.60 ms /  5377 runs (    0.97 ms per run)
whisper_print_timings:   prompt time =   111.11 ms /  1260 runs (    0.09 ms per run)
whisper_print_timings:    total time = 10290.02 ms

real    0m10.412s
user    0m6.978s
sys     0m0.526s
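
The log above reports 195.9 seconds of audio and a total time of 10290.02 ms, so the base model ran at roughly 19 times real-time speed in this test. A small sketch of that arithmetic, where rtf is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: audio duration ($1, seconds) divided by
# processing time ($2, seconds), printed with one decimal place.
rtf() {
    awk -v audio="$1" -v elapsed="$2" 'BEGIN { printf "%.1f\n", audio / elapsed }'
}

# Values taken from the whisper log: 195.9 s of audio, 10290.02 ms total time
rtf 195.9 10.29002
```

Any value above 1 means transcription finishes faster than the audio plays, which is the headroom the stream example relies on.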
