Running FunAudioLLM/Fun-ASR-Nano-2512 on NPU #3034

Notice: In order to resolve issues efficiently, please follow the template.

## Before asking

1. Search existing issues: https://github.com/modelscope/FunASR/issues
2. Search the docs: https://modelscope.github.io/FunASR/
3. Check the README quick start and deployment section.

## Question


I ran the models using the code from the repository. `iic/SenseVoiceSmall` works correctly on both CPU and NPU. Running `FunAudioLLM/Fun-ASR-Nano-2512` fails with `RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype`.

## Code or command
Code and error log:
```bash
 python
Python 3.11.6 (main, Apr  1 2026, 19:46:38) [GCC 12.3.1 (openEuler 12.3.1-38.oe2403)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from funasr import AutoModel
>>> model2 = AutoModel(model="FunAudioLLM/Fun-ASR-Nano-2512", device="npu:1")
funasr version: 1.3.14.
Check update of funasr, and it would cost few times. You may disable it by set `disable_update=True` in AutoModel
You are using the latest version of funasr-1.3.14
Downloading Model from https://www.modelscope.cn to directory: /root/.cache/modelscope/hub/models/FunAudioLLM/Fun-ASR-Nano-2512
WARNING:root:trust_remote_code: False
>>> result = model2.generate(input="/data/models/FunAudioLLM/example/zh.mp3")
  0%|                                                                                                   | 0/1 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
        - Avoid using `tokenizers` before the fork if possible
        - Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/data/sh/fun-asr/fun-asr-gitee/funasr/auto/auto_model.py", line 477, in generate
    results = self.inference(
              ^^^^^^^^^^^^^^^
  File "/data/sh/fun-asr/fun-asr-gitee/funasr/auto/auto_model.py", line 560, in inference
    res = model.inference(**batch, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/data/sh/fun-asr/fun-asr-gitee/funasr/models/fun_asr_nano/model.py", line 712, in inference
    return self.inference_llm(
           ^^^^^^^^^^^^^^^^^^^
  File "/data/sh/fun-asr/fun-asr-gitee/funasr/models/fun_asr_nano/model.py", line 856, in inference_llm
    with torch.autocast(
  File "/usr/local/lib64/python3.11/site-packages/torch/amp/autocast_mode.py", line 329, in __enter__
    torch.set_autocast_cpu_dtype(self.fast_dtype)  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype
```
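
The last frame of the traceback is `torch.set_autocast_cpu_dtype`, which suggests that `torch.autocast` was entered with `device_type="cpu"` even though the model was created with `device="npu:1"`. Below is a minimal sketch of deriving the autocast device type from the device string, assuming that is the cause. `autocast_device_type` is a hypothetical helper, not part of FunASR.

```python
def autocast_device_type(device: str) -> str:
    """Return the backend part of a device string, e.g. "npu:1" -> "npu".

    Hypothetical helper: the name and behavior are assumptions made for
    illustration, not FunASR's actual implementation.
    """
    # Drop the optional ":<index>" suffix and normalize case/whitespace.
    backend = device.split(":", 1)[0].strip().lower()
    if not backend:
        raise ValueError(f"empty device string: {device!r}")
    return backend


if __name__ == "__main__":
    for dev in ("npu:1", "cuda:0", "cpu"):
        print(dev, "->", autocast_device_type(dev))
```

The returned value would be passed as `device_type` to `torch.autocast`. Whether `"npu"` is accepted there depends on the installed torch-npu build, which I have not verified.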

## What have you tried?
I changed the dtype in the Qwen3-0.6B model configuration from `Bfloat16` to `float16`. With `Bfloat16`, both NPU and CPU raise `AttributeError: module 'torch' has no attribute 'Bfloat16'`. After changing it to `float16`, I again get `RuntimeError: Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype`.
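
For reference, the `AttributeError` is consistent with PyTorch's dtype attributes being lowercase (`torch.bfloat16`, `torch.float16`), so a capitalized `Bfloat16` cannot be found by attribute lookup on `torch`. Below is a small sketch of normalizing the configured name before the lookup. `normalize_dtype_name` and its alias table are hypothetical, and I am assuming the config value is resolved by attribute name.

```python
# Hypothetical alias table: common shorthand spellings mapped to the
# lowercase attribute names that PyTorch exposes (torch.bfloat16, ...).
_ALIASES = {
    "bf16": "bfloat16",
    "fp16": "float16",
    "half": "float16",
    "fp32": "float32",
    "float": "float32",
}


def normalize_dtype_name(name: str) -> str:
    """Map a user-supplied dtype string to a lowercase torch attribute name."""
    key = name.strip().lower()
    # Accept fully qualified spellings such as "torch.float16".
    if key.startswith("torch."):
        key = key[len("torch."):]
    return _ALIASES.get(key, key)


if __name__ == "__main__":
    for raw in ("Bfloat16", "torch.float16", "bf16"):
        print(raw, "->", normalize_dtype_name(raw))
```

With this, `getattr(torch, normalize_dtype_name("Bfloat16"))` would look up `torch.bfloat16` instead of the nonexistent `torch.Bfloat16`.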


## Environment

- OS:
- Python version: 3.11.6
- FunASR version: 1.3.14
- ModelScope version: 1.37.1
- PyTorch / torchaudio version: 2.1.0
- torch-npu: 2.1.0.post18.dev20251112
- Install method (`pip`, source, Docker): pip
- Device (`cuda`, `cpu`, `mps`): npu, cpu
- NPU model: 910B4
- CANN: 8.5.1
- Driver: 25.5.1
- Docker image tag, if used:
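
The package versions listed above can be gathered with a short script like the following sketch. The distribution names in `PACKAGES` are my assumption of how the packages are registered with pip, and `collect_versions` is a hypothetical helper.

```python
import platform
from importlib import metadata

# Assumed pip distribution names for the packages listed in this section.
PACKAGES = ("funasr", "modelscope", "torch", "torchaudio", "torch-npu")


def collect_versions(packages=PACKAGES):
    """Return a dict of installed versions, marking missing packages."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    for name, version in collect_versions().items():
        print(f"- {name}: {version}")
```

The CANN and driver versions are not Python packages, so they are not covered by this script.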
