Loading the Mixtral 8*7B Large Language Model in 4bit/8bit

2024-01-18 16:00:04

0. Background
1. Modify the code

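0. Background

My personal computer simply cannot run the Mixtral 8*7B large language model in float16, so I load it with 4bit or 8bit quantized weights instead.

In my tests, inference was noticeably faster at 4bit, while at 8bit it was still very slow.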
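As a rough sanity check on why float16 is out of reach for a personal machine, the sketch below estimates the memory needed for the weights alone at each precision. It assumes roughly 46.7 billion total parameters (the figure Mistral AI has published for Mixtral 8x7B, not something stated in this post) and ignores the KV cache, activations, and any quantization overhead, so real usage will be higher.

```python
# Back-of-the-envelope estimate of weight memory for Mixtral 8x7B.
# Assumption: about 46.7e9 total parameters (published by Mistral AI,
# not taken from this post).
TOTAL_PARAMS = 46.7e9

# Approximate storage cost of one parameter at each precision, in bytes.
BYTES_PER_PARAM = {"float16": 2.0, "8bit": 1.0, "4bit": 0.5}


def weight_memory_gb(precision: str, total_params: float = TOTAL_PARAMS) -> float:
    """Return the approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for precision in BYTES_PER_PARAM:
        print(f"{precision:>8}: ~{weight_memory_gb(precision):.1f} GB")
```

Under these assumptions, float16 needs on the order of 90 GB for the weights, while 4bit needs about a quarter of that.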

The inference framework used is fastchat.

1. Modify the code

vi fastchat/model/model_adapter.py

Before the change:

class MistralAdapter(BaseModelAdapter):
    """The model adapter for Mistral AI models"""

    def match(self, model_path: str):
        return "mistral" in model_path.lower() or "mixtral" in model_path.lower()

    def load_model(self, model_path: str, from_pretrained_kwargs: dict):
        model, tokenizer = super().load_model(model_path, from_pretrained_kwargs)
        model.config.eos_token_id = tokenizer.eos_token_id
        model.config.pad_token_id = tokenizer.pad_token_id
        return model, tokenizer

After the change:

class MistralAdapter(BaseModelAdapter):
    """The model adapter for Mistral AI models"""

    def match(self, model_path: str):
        return "mistral" in model_path.lower() or "mixtral" in model_path.lower()

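    def load_model(self, model_path: str, from_pretrained_kwargs: dict):
        # Bypass the base adapter's loader so quantization flags can be passed in.
        # model, tokenizer = super().load_model(model_path, from_pretrained_kwargs)
        tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        if "mixtral" in model_path.lower():
            # Mixtral only: load quantized weights (requires the bitsandbytes package).
            model = AutoModelForCausalLM.from_pretrained(
                model_path,
                low_cpu_mem_usage=True,
                trust_remote_code=True,
                # attn_implementation="flash_attention_2",
                # Enable exactly one of the two flags below.
                # load_in_8bit=True,
                load_in_4bit=True,
                **from_pretrained_kwargs,
            )
        else:
            # Other Mistral models are loaded without quantization.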
            model = AutoModelForCausalLM.from_pretrained(
                model_path,
                low_cpu_mem_usage=True,
                trust_remote_code=True,
                **from_pretrained_kwargs,
            )
        model.config.eos_token_id = tokenizer.eos_token_id
        model.config.pad_token_id = tokenizer.pad_token_id
        return model, tokenizer
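Switching between 4bit and 8bit in the code above means editing comments by hand. One possible variation, shown here only as a sketch, is to compute the extra keyword arguments in a small helper. The names `quantization_kwargs`, `bits_from_env`, and the environment variable `MIXTRAL_QUANT_BITS` are all hypothetical; none of them exist in fastchat or transformers.

```python
import os


def quantization_kwargs(model_path: str, bits: int = 4) -> dict:
    """Build the extra from_pretrained keyword arguments for a model path.

    Hypothetical helper: only Mixtral checkpoints are quantized, mirroring
    the branch in the modified load_model.
    """
    if "mixtral" not in model_path.lower():
        return {}
    if bits == 4:
        return {"load_in_4bit": True}
    if bits == 8:
        return {"load_in_8bit": True}
    raise ValueError(f"unsupported bit width: {bits}")


def bits_from_env(default: int = 4) -> int:
    """Read the desired bit width from the environment.

    MIXTRAL_QUANT_BITS is a made-up variable name used only in this sketch.
    """
    return int(os.environ.get("MIXTRAL_QUANT_BITS", default))
```

With such a helper, the two branches in load_model could collapse into a single from_pretrained call that passes `**quantization_kwargs(model_path, bits_from_env())` ahead of `**from_pretrained_kwargs`.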

That's all!

Original article: https://blog.csdn.net/engchina/article/details/135581973
