Notice: In order to resolve issues efficiently, please follow the template.
(Note: to help resolve your issue more efficiently, please ask using the template and fill in the details.)
Before asking
- Search existing issues: https://github.com/modelscope/FunASR/issues
- Search the docs: https://modelscope.github.io/FunASR/
- Check the README quick start and deployment section.
Question
I am testing models with the code from this repository. Running `iic/SenseVoiceSmall` works on both CPU and NPU. Running `FunAudioLLM/Fun-ASR-Nano-2512` fails with `RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype`.
Code or command
Code that was run and the error log:
python
Python 3.11.6 (main, Apr 1 2026, 19:46:38) [GCC 12.3.1 (openEuler 12.3.1-38.oe2403)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from funasr import AutoModel
>>> model2 = AutoModel(model="FunAudioLLM/Fun-ASR-Nano-2512", device="npu:1")
funasr version: 1.3.14.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
You are using the latest version of funasr-1.3.14
Downloading Model from https://www.modelscope.cn to directory: /root/.cache/modelscope/hub/models/FunAudioLLM/Fun-ASR-Nano-2512
WARNING:root:trust_remote_code: False
>>> result = model2.generate(input="/data/models/FunAudioLLM/example/zh.mp3")
0%| | 0/1 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/sh/fun-asr/fun-asr-gitee/funasr/auto/auto_model.py", line 477, in generate
results = self.inference(
^^^^^^^^^^^^^^^
File "/data/sh/fun-asr/fun-asr-gitee/funasr/auto/auto_model.py", line 560, in inference
res = model.inference(**batch, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/data/sh/fun-asr/fun-asr-gitee/funasr/models/fun_asr_nano/model.py", line 712, in inference
return self.inference_llm(
^^^^^^^^^^^^^^^^^^^
File "/data/sh/fun-asr/fun-asr-gitee/funasr/models/fun_asr_nano/model.py", line 856, in inference_llm
with torch.autocast(
File "/usr/local/lib64/python3.11/site-packages/torch/amp/autocast_mode.py", line 329, in __enter__
torch.set_autocast_cpu_dtype(self.fast_dtype) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype
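The last frames of the traceback are in the CPU branch of `torch.autocast` (`torch.set_autocast_cpu_dtype`). That suggests the autocast context was entered with `device_type="cpu"` and a dtype other than `torch.bfloat16`, even though the model was created with `device="npu:1"`. A minimal sketch of deriving the autocast arguments from the device string follows. `pick_autocast_args` is a hypothetical helper, not part of FunASR, and passing `device_type="npu"` to `torch.autocast` assumes that torch-npu registers that backend.

```python
def pick_autocast_args(device: str, requested_dtype: str = "float16") -> dict:
    """Return keyword arguments for torch.autocast as plain strings.

    Hypothetical helper for illustration only; not part of FunASR.
    """
    # "npu:1" -> "npu", "cuda:0" -> "cuda", "cpu" -> "cpu"
    device_type = device.split(":", 1)[0].strip().lower()
    dtype = requested_dtype.strip().lower()
    # The RuntimeError in the log states that CPU autocast accepts only
    # bfloat16, so any other request is overridden on CPU.
    if device_type == "cpu":
        dtype = "bfloat16"
    return {"device_type": device_type, "dtype": dtype}


if __name__ == "__main__":
    print(pick_autocast_args("npu:1", "float16"))
    print(pick_autocast_args("cpu", "float16"))
```

With `torch` imported, the result could be applied as `torch.autocast(device_type=args["device_type"], dtype=getattr(torch, args["dtype"]))`.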
What have you tried?
I changed the Qwen3-0.6B model configuration from Bfloat16 to float16. With Bfloat16, on both NPU and CPU I get `AttributeError: module 'torch' has no attribute 'Bfloat16'`. After changing it to float16, I instead get `RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype`.
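On the `AttributeError`: the PyTorch attribute is spelled `torch.bfloat16`, in lowercase, so a value such as `Bfloat16` fails when it is looked up with `getattr(torch, name)`. Whether FunASR or the Qwen3-0.6B config performs exactly that lookup is an assumption. A minimal sketch of a case-insensitive lookup, where `normalize_dtype_name` and `resolve_dtype` are hypothetical helpers:

```python
VALID_DTYPES = {"bfloat16", "float16", "float32"}
ALIASES = {"bf16": "bfloat16", "fp16": "float16", "half": "float16", "fp32": "float32"}


def normalize_dtype_name(name: str) -> str:
    """Map spellings such as 'Bfloat16' or 'torch.bfloat16' to 'bfloat16'."""
    key = name.strip().lower()
    if key.startswith("torch."):
        key = key[len("torch."):]
    key = ALIASES.get(key, key)
    if key not in VALID_DTYPES:
        raise ValueError(f"unsupported dtype name: {name!r}")
    return key


def resolve_dtype(torch_module, name: str):
    """Look up the dtype attribute on the given module (normally `torch`)."""
    return getattr(torch_module, normalize_dtype_name(name))


if __name__ == "__main__":
    print(normalize_dtype_name("Bfloat16"))
```

With `torch` imported, `resolve_dtype(torch, "Bfloat16")` would return `torch.bfloat16` instead of raising.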
Environment
- OS:
- Python version: 3.11.6
- FunASR version: 1.3.14
- ModelScope version: 1.37.1
- PyTorch / torchaudio version: 2.1.0
- torch-npu: 2.1.0.post18.dev20251112
- Install method (pip, source, Docker): pip
- Device (cuda, cpu, mps): npu, cpu
- NPU model: 910B4
- CANN: 8.5.1
- Driver: 25.5.1
- Docker image tag, if used:
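The version fields above can be collected with a short script. This is a minimal sketch using only the standard library. The distribution names in `PACKAGES` are assumptions about how the packages were installed, and `collect_versions` is a hypothetical helper.

```python
import platform
from importlib import metadata

# Assumed pip distribution names; adjust them to match the local install.
PACKAGES = ("funasr", "modelscope", "torch", "torchaudio", "torch-npu")


def collect_versions(packages=PACKAGES) -> dict:
    """Return the Python version plus the installed version of each package."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"- {name}: {version}")
```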