Calling the Qwen API

GitHub - QwenLM/vllm-gptq: A high-throughput and memory-efficient inference and serving engine for LLMs

pip install fschat

python -m fastchat.serve.controller

python -m fastchat.serve.vllm_worker --model-path $model_path --tensor-parallel-size 1 --trust-remote-code

python -m fastchat.serve.openai_api_server --host localhost --port 8000
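Once the controller, worker, and API server are running, one way to confirm that the server is reachable is to list the models it serves. The sketch below uses only the standard library. It assumes the server started above is listening on localhost:8000 and exposes the OpenAI-style /v1/models endpoint; the helper names models_url, parse_model_ids, and list_models are illustrative and not part of FastChat.

```python
import json
import urllib.request


def models_url(api_base: str) -> str:
    """Build the model-listing URL from an OpenAI-style base URL."""
    return api_base.rstrip("/") + "/models"


def parse_model_ids(body: str) -> list:
    """Extract model names from an OpenAI-style model list response."""
    payload = json.loads(body)
    return [entry["id"] for entry in payload.get("data", [])]


def list_models(api_base: str = "http://localhost:8000/v1", timeout: float = 10.0) -> list:
    """Query the running server (assumed address) and return its model names."""
    with urllib.request.urlopen(models_url(api_base), timeout=timeout) as resp:
        return parse_model_ids(resp.read().decode("utf-8"))
```

Calling list_models() against a running server should return the names that can be passed as the model argument in a chat request.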

pip install --upgrade openai==0.28

import openai
# to get proper authentication, make sure to use a valid key that's listed in
# the --api-keys flag. if no flag value is provided, the `api_key` will be ignored.
openai.api_key = "EMPTY"
openai.api_base = "http://localhost:8000/v1"

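# must match the model name registered by the worker
# (by default FastChat derives it from the last component of --model-path)
model = "qwen"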
call_args = {
    'temperature': 1.0,
    'top_p': 1.0,
    'top_k': -1,
    'max_tokens': 2048, # maximum number of tokens to generate
    'presence_penalty': 1.0,
    'frequency_penalty': 0.0,
}
# create a chat completion
completion = openai.ChatCompletion.create(
  model=model,
  messages=[{"role": "user", "content": "Hello! What is your name?"}],
  **call_args
)
# print the completion
print(completion.choices[0].message.content)
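The same request can also be sent without the openai package, which may be convenient when pinning openai to 0.28 is not an option. The following is a minimal sketch using only the standard library. It assumes the server address and the "qwen" model name used above, and the helper names build_chat_request, extract_reply, and chat are hypothetical.

```python
import json
import urllib.request


def build_chat_request(api_base, model, prompt, api_key="EMPTY", **sampling):
    """Build a POST request for the OpenAI-style /chat/completions endpoint."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    body.update(sampling)  # e.g. temperature, top_p, max_tokens
    return urllib.request.Request(
        api_base.rstrip("/") + "/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def extract_reply(response_body: str) -> str:
    """Return the assistant text from a chat completion response."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


def chat(prompt, model="qwen", api_base="http://localhost:8000/v1",
         timeout=60.0, **sampling):
    """Send one user message to the (assumed) local server and return the reply."""
    req = build_chat_request(api_base, model, prompt, **sampling)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_reply(resp.read().decode("utf-8"))
```

With the server running, chat("Hello! What is your name?", temperature=1.0, max_tokens=2048) mirrors the openai call above.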
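
To accept requests from other machines, start the API server with the host's IP address in place of localhost (and point openai.api_base at that address):

python -m fastchat.serve.openai_api_server --host IP --port 8000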
