git clone https://github.com/microsoft/DeepSpeed.git
Then a few code changes are needed:
1. Open csrc/quantization/pt_binding.cpp in the repository. On line 203, change:
std::vector<long int>
to:
std::vector<long long>
2. Open csrc/transformer/inference/csrc/pt_binding.cpp. On line 534:
auto prev_key = torch::from_blob(workspace + offset, {bsz, heads, all_tokens, k}, {hidden_dim * InferenceContext::Instance().GetMaxTokenLength(), k * InferenceContext::Instance().GetMaxTokenLength(), k, 1}, options);
add an unsigned cast to each stride factor, i.e., change it to:
auto prev_key = torch::from_blob(workspace + offset, {bsz, heads, all_tokens, k}, {hidden_dim * (unsigned)InferenceContext::Instance().GetMaxTokenLength(), k * (unsigned)InferenceContext::Instance().GetMaxTokenLength(), k, 1}, options);
and change line 1570 from:
auto intermediate_gemm = at::from_blob(intermediate_ptr, {input.size(0), input.size(1), mlp_1_out_neurons}, options);
to:
auto intermediate_gemm = at::from_blob(intermediate_ptr, {input.size(0), input.size(1), (int)mlp_1_out_neurons}, options);
3. In the Start menu, find x64 Native Tools Command Prompt for VS 2022, right-click it and run it as administrator, then execute build_win.bat.