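Setting Up a FunASR Speech Recognition Service
This post covers speech recognition with FunASR, Alibaba's open-source speech recognition project. It describes two deployment options: the Windows community build and a Docker container. The Windows community build is suitable for local development; the container is recommended for production.
1. Deploying the Windows community build
1.1 Environment setup
The software needs the Visual Studio 2022 C++ runtime. If it is not installed, double-click VC_redist.x64(2022).exe to install the libraries required by C++ programs compiled with Visual Studio 2022.
1.2 Download the Windows community package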
https://www.modelscope.cn/models/iic/funasr-runtime-win-cpu-x64/files

Any version will do; version 0.2.0 is used here.
1.3 Download the required models
git clone https://www.modelscope.cn/damo/speech_fsmn_vad_zh-cn-16k-common-onnx.git
git clone https://www.modelscope.cn/damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx.git
git clone https://www.modelscope.cn/damo/speech_ngram_lm_zh-cn-ai-wesp-fst.git
git clone https://www.modelscope.cn/damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx.git
git clone https://www.modelscope.cn/thuduj12/fst_itn_zh.git
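Before starting the server it can help to confirm that all five model directories were actually cloned. The following is a minimal sketch of such a check, not part of FunASR; the class name ModelDirCheck and the method missing are hypothetical, and the directory names are simply the repository names from the git clone commands above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: reports which of the cloned model directories are absent.
final class ModelDirCheck {
    // Directory names created by the five git clone commands.
    static final List<String> EXPECTED = Arrays.asList(
            "speech_fsmn_vad_zh-cn-16k-common-onnx",
            "speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx",
            "speech_ngram_lm_zh-cn-ai-wesp-fst",
            "punc_ct-transformer_cn-en-common-vocab471067-large-onnx",
            "fst_itn_zh");

    // Returns the expected directory names that do not exist under modelsRoot.
    static List<String> missing(Path modelsRoot) {
        List<String> result = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!Files.isDirectory(modelsRoot.resolve(name))) {
                result.add(name);
            }
        }
        return result;
    }
}
```

An empty list from ModelDirCheck.missing(Paths.get("D:/developTest/funasr-runtime-resources/models")) would mean every directory is in place.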
1.4 Start the service
Unzip the Windows community package downloaded above, open PowerShell, change to the unzipped directory, and run the following command:
./funasr-wss-server.exe --vad-dir D:/developTest/funasr-runtime-resources/models/speech_fsmn_vad_zh-cn-16k-common-onnx --model-dir D:/developTest/funasr-runtime-resources/models/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx --lm-dir D:/developTest/funasr-runtime-resources/models/speech_ngram_lm_zh-cn-ai-wesp-fst --punc-dir D:/developTest/funasr-runtime-resources/models/punc_ct-transformer_cn-en-common-vocab471067-large-onnx --itn-dir D:/developTest/funasr-runtime-resources/models/fst_itn_zh --certfile 0
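Parameters:
--model-dir  ModelScope model ID or local path of the ASR model
--vad-dir    ModelScope model ID or local path of the VAD model
--punc-dir   ModelScope model ID or local path of the punctuation model
--lm-dir     ModelScope model ID or local path of the language model
--itn-dir    ModelScope model ID or local path of the ITN model
--certfile   SSL certificate file; set it to 0 to disable SSL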
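When the server is launched from code rather than typed by hand, the flags listed above can be assembled into an argument list. This is only a sketch under assumptions: ServerCommand and build are hypothetical names, the models are assumed to sit under a single root directory with the names used in section 1.3, and SSL is disabled as in the command above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds the funasr-wss-server argument list from one models root.
final class ServerCommand {
    static List<String> build(String exePath, String modelsRoot) {
        List<String> cmd = new ArrayList<>();
        cmd.add(exePath);
        add(cmd, "--vad-dir", modelsRoot + "/speech_fsmn_vad_zh-cn-16k-common-onnx");
        add(cmd, "--model-dir", modelsRoot
                + "/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx");
        add(cmd, "--lm-dir", modelsRoot + "/speech_ngram_lm_zh-cn-ai-wesp-fst");
        add(cmd, "--punc-dir", modelsRoot
                + "/punc_ct-transformer_cn-en-common-vocab471067-large-onnx");
        add(cmd, "--itn-dir", modelsRoot + "/fst_itn_zh");
        add(cmd, "--certfile", "0"); // 0 disables SSL
        return cmd;
    }

    // Appends a flag followed by its value.
    private static void add(List<String> cmd, String flag, String value) {
        cmd.add(flag);
        cmd.add(value);
    }
}
```

The resulting list can be handed to new ProcessBuilder(cmd) to start the server process.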
1.5 Calling the service from the client
The unzipped Windows community package contains the client executable funasr-wss-client.exe:
./funasr-wss-client.exe --server-ip 127.0.0.1 --port 10095 --wav-path asr_example_zh.wav
The service listens on port 10095 by default; --wav-path specifies the path of the audio file.
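If the client cannot connect, a quick first check is whether anything is listening on the port at all. The sketch below uses only the JDK socket API; PortProbe and isReachable are hypothetical names, and a successful TCP connection only shows that some process accepts connections on the port, not that it is FunASR.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class PortProbe {
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            // connect() throws if the port is closed or the timeout elapses
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }
}
```

For the setup in this post the call would be PortProbe.isReachable("127.0.0.1", 10095, 1000).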
2. Docker deployment
2.1 Pull the Docker image
docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
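2.2 Start the container
Create a models directory on the host and put the models in it. Downloading the models manually is recommended (use the git clone commands from section 1.3); letting the startup command download them automatically is very slow.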
docker run -p 10095:10095 -it --privileged=true -v D:\developTest\funasr-runtime-resources\models:/workspace/models registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.4.6
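This maps the container port and mounts the models directory created earlier into the container.
2.3 Start the service
Enter the container and change to the FunASR/runtime directory.

Run the following command to start the service: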
nohup bash run_server.sh \
  --certfile 0 \
  --vad-dir speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
  --punc-dir punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
  --lm-dir speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir fst_itn_zh > log.txt 2>&1 &
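The command specifies the model directories. The models were downloaded beforehand, so the startup command does not need to download them. Setting certfile to 0 disables SSL.
3. Invocation examples
Two ways of calling the service from Java are sketched here: one through ProcessBuilder and one through WebSocketClient. Use them as a reference.
- Use ProcessBuilder to run the client command shown above and capture its output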
// Imports needed by this method (MultipartRequest and MultipartFile come from Spring)
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import org.springframework.web.multipart.MultipartFile;
import org.springframework.web.multipart.MultipartRequest;

public String localTranslation(MultipartRequest multipartRequest) {
    StringBuffer resultBuffer = new StringBuffer();
    // Arguments passed to the client executable
    String exePath = "D:\\developTest\\funasr-runtime-resources\\funasr-runtime-win-cpu-x64\\funasr-runtime-win-cpu-x64-v0.2.0\\funasr-wss-client.exe";
    String serveIp = "127.0.0.1"; // address of the FunASR server
    String port = "10095";
    File targetFile = null;
    try {
        // Save the uploaded audio to a temporary .wav file
        MultipartFile mFile = multipartRequest.getFile("file");
        File dir = new File("D:\\developTest\\funasr-runtime-resources\\wav");
        if (!dir.exists()) {
            dir.mkdirs();
        }
        targetFile = File.createTempFile("tmp_", ".wav", dir);
        mFile.transferTo(targetFile);
        String wavPath = targetFile.getAbsolutePath();

        String[] cmd = new String[]{exePath, "--server-ip", serveIp, "--port", port, "--wav-path", wavPath};
        ProcessBuilder pb = new ProcessBuilder();
        pb.command(cmd);
        // Merge stderr into stdout. This only takes effect when set before start(),
        // and reading a single stream avoids blocking on one pipe while the other fills up.
        pb.redirectErrorStream(true);
        Process process = pb.start();

        int timeoutSeconds = 30; // give up after 30 seconds
        // Read the output on a single worker thread so the wait can be bounded
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> future = executor.submit(() -> {
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(process.getInputStream()))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    System.out.println(line);
                    // Keep only the payload the client logs after "on_message ="
                    String marker = "on_message =";
                    int idx = line.indexOf(marker);
                    if (idx >= 0) {
                        resultBuffer.append(line.substring(idx + marker.length()));
                    }
                }
                // Wait for the client process to exit
                process.waitFor();
            } catch (Exception e) {
                e.printStackTrace();
            } finally {
                if (process.isAlive()) {
                    process.destroy();
                }
            }
        });
        try {
            // Wait for the process to finish or for the timeout to expire
            future.get(timeoutSeconds, TimeUnit.SECONDS);
            System.out.println("Process finished within the time limit.");
        } catch (Exception e) {
            System.out.println("Timeout warning: the process may be hanging.");
            resultBuffer.append("timeout");
        } finally {
            if (process.isAlive()) {
                process.destroy();
            }
            executor.shutdownNow(); // cancel the task and shut down the pool
        }
    } catch (Exception e) {
        e.printStackTrace();
        resultBuffer.append("error");
    } finally {
        // targetFile is still null if the upload could not be saved
        if (targetFile != null && targetFile.exists()) {
            targetFile.delete();
        }
    }
    System.out.println(resultBuffer.toString());
    return resultBuffer.toString();
}
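The method above recovers the recognition result from client log lines containing "on_message =". That step can be isolated into a small helper that is easy to test on its own. This is a sketch: ClientLogParser and extractMessage are hypothetical names, and the exact log line layout of funasr-wss-client.exe is an assumption; only the "on_message =" marker is taken from the code above.

```java
// Hypothetical helper: extracts the payload that follows "on_message =" in a log line.
final class ClientLogParser {
    private static final String MARKER = "on_message =";

    // Returns the trimmed text after the marker, or null when the marker is absent.
    static String extractMessage(String line) {
        int idx = line.indexOf(MARKER);
        if (idx < 0) {
            return null;
        }
        return line.substring(idx + MARKER.length()).trim();
    }
}
```

Searching with indexOf instead of indexing into the result of split avoids an ArrayIndexOutOfBoundsException on lines that mention on_message without the "=" part.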
- Use WebSocketClient to call the FunASR service directly
Client utility class
package com.example.demo1.web;

import java.io.File;
import java.io.FileInputStream;
import java.net.URI;
import java.util.Arrays;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.java_websocket.client.WebSocketClient;
import org.java_websocket.drafts.Draft;
import org.java_websocket.handshake.ServerHandshake;
import org.json.simple.JSONArray;
import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class FunasrWsClient extends WebSocketClient {
    private static final Logger logger = LoggerFactory.getLogger(FunasrWsClient.class);

    // Written by the sender thread, read by the WebSocket thread
    private volatile boolean iseof = false;
    // Instance fields rather than static ones, so concurrent clients do not overwrite each other's settings
    private String wavPath;
    private String mode = "offline";
    private String strChunkSize = "5,10,5";
    private int chunkInterval = 10;
    // 60 ms of 16 kHz 16-bit audio: 16 samples/ms * 60 ms * 2 bytes = 1920 bytes
    private int sendChunkSize = 1920;
    private String hotwords = "";
    private String fsthotwords = "";
    private String wavName = "javatest";
    private MyCallBack callBack;

    public FunasrWsClient(URI serverUri, MyCallBack callBack) {
        super(serverUri);
        this.callBack = callBack;
    }

    public FunasrWsClient(URI serverUri, String wavPath, MyCallBack callBack) {
        super(serverUri);
        this.callBack = callBack;
        this.wavPath = wavPath;
    }

    public FunasrWsClient(URI serverUri, String strChunkSize, int chunkInterval, String mode,
                          String hotwords, String wavPath, MyCallBack callBack) {
        super(serverUri);
        this.callBack = callBack;
        this.strChunkSize = strChunkSize;
        this.chunkInterval = chunkInterval;
        this.mode = mode;
        // Note: sendJson() reads the hotwords field, which this constructor leaves empty
        this.fsthotwords = hotwords;
        this.wavPath = wavPath;
        // Derive the number of bytes per packet from the chunk settings
        int RATE = 16000;
        String[] chunkList = strChunkSize.split(",");
        int int_chunk_size = 60 * Integer.parseInt(chunkList[1].trim()) / chunkInterval; // milliseconds per packet
        int CHUNK = RATE / 1000 * int_chunk_size; // samples per packet
        this.sendChunkSize = CHUNK * 2; // 2 bytes per 16-bit sample
    }

    public FunasrWsClient(URI serverUri, Draft draft) {
        super(serverUri, draft);
    }

    public FunasrWsClient(URI serverURI) {
        super(serverURI);
    }

    public FunasrWsClient(URI serverUri, Map<String, String> httpHeaders) {
        super(serverUri, httpHeaders);
    }

    // Sends the audio file on its own thread once the connection is open
    public class RecWavThread extends Thread {
        private FunasrWsClient funasrClient;

        public RecWavThread(FunasrWsClient funasrClient) {
            this.funasrClient = funasrClient;
        }

        public void run() {
            this.funasrClient.recWav();
        }
    }

    public void getSslContext(String keyfile, String certfile) {
        // TODO
        return;
    }

    // Sends the initial JSON message that describes the audio stream
    public void sendJson(String mode, String strChunkSize, int chunkInterval, String wavName,
                         boolean isSpeaking, String suffix) {
        try {
            JSONObject obj = new JSONObject();
            obj.put("mode", mode);
            JSONArray array = new JSONArray();
            String[] chunkList = strChunkSize.split(",");
            for (int i = 0; i < chunkList.length; i++) {
                array.add(Integer.valueOf(chunkList[i].trim()));
            }
            obj.put("chunk_size", array);
            obj.put("chunk_interval", Integer.valueOf(chunkInterval));
            obj.put("wav_name", wavName);

            // Hotwords are given as "word weight word weight ..."; a purely numeric item ends the current word
            if (hotwords.trim().length() > 0) {
                String regex = "\\d+";
                JSONObject jsonitems = new JSONObject();
                String[] items = hotwords.trim().split(" ");
                Pattern pattern = Pattern.compile(regex);
                String tmpWords = "";
                for (int i = 0; i < items.length; i++) {
                    Matcher matcher = pattern.matcher(items[i]);
                    if (matcher.matches()) {
                        jsonitems.put(tmpWords.trim(), items[i].trim());
                        tmpWords = "";
                        continue;
                    }
                    tmpWords = tmpWords + items[i] + " ";
                }
                obj.put("hotwords", jsonitems.toString());
            }

            // The WAV header is skipped in recWav(), so .wav content is sent as raw PCM
            if (suffix.equals("wav")) {
                suffix = "pcm";
            }
            obj.put("wav_format", suffix);
            obj.put("is_speaking", Boolean.valueOf(isSpeaking));
            logger.info("sendJson: " + obj);
            send(obj.toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Tells the server that no more audio will follow
    public void sendEof() {
        try {
            JSONObject obj = new JSONObject();
            obj.put("is_speaking", Boolean.FALSE);
            logger.info("sendEof: " + obj);
            send(obj.toString());
            iseof = true;
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    // Reads the audio file and sends it to the server in chunks
    public void recWav() {
        String fileName = wavPath;
        String[] nameParts = fileName.split("\\.");
        String suffix = nameParts[nameParts.length - 1];
        sendJson(mode, strChunkSize, chunkInterval, wavName, true, suffix);
        File file = new File(wavPath);
        int chunkSize = sendChunkSize;
        byte[] bytes = new byte[chunkSize];
        int readSize = 0;
        try (FileInputStream fis = new FileInputStream(file)) {
            if (wavPath.endsWith(".wav")) {
                fis.read(bytes, 0, 44); // skip the 44-byte WAV header
            }
            readSize = fis.read(bytes, 0, chunkSize);
            while (readSize > 0) {
                if (readSize == chunkSize) {
                    send(bytes);
                } else {
                    // Last, partial chunk: send only the bytes that were read
                    send(Arrays.copyOf(bytes, readSize));
                }
                // In streaming modes, pace the packets in real time (32 bytes per millisecond)
                if (!mode.equals("offline")) {
                    Thread.sleep(chunkSize / 32);
                }
                readSize = fis.read(bytes, 0, chunkSize);
            }
            if (!mode.equals("offline")) {
                Thread.sleep(2000);
                sendEof();
                Thread.sleep(3000);
                close();
            } else {
                sendEof();
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    @Override
    public void onOpen(ServerHandshake handshakedata) {
        RecWavThread thread = new RecWavThread(this);
        thread.start();
    }

    @Override
    public void onMessage(String message) {
        JSONObject jsonObject = new JSONObject();
        JSONParser jsonParser = new JSONParser();
        logger.info("received: " + message);
        try {
            jsonObject = (JSONObject) jsonParser.parse(message);
            logger.info("text: " + jsonObject.get("text"));
            // Hand the recognized text to the caller
            callBack.callBack(jsonObject.get("text"));
            if (jsonObject.containsKey("timestamp")) {
                logger.info("timestamp: " + jsonObject.get("timestamp"));
            }
        } catch (org.json.simple.parser.ParseException e) {
            e.printStackTrace();
        }
        // In offline mode, close the connection once the result has arrived
        if (iseof && mode.equals("offline") && !jsonObject.containsKey("is_final")) {
            close();
        }
        if (iseof && mode.equals("offline") && jsonObject.containsKey("is_final")
                && jsonObject.get("is_final").equals("false")) {
            close();
        }
    }

    @Override
    public void onClose(int code, String reason, boolean remote) {
        logger.info("Connection closed by " + (remote ? "remote peer" : "us")
                + " Code: " + code + " Reason: " + reason);
    }

    @Override
    public void onError(Exception ex) {
        logger.info("ex: " + ex);
        ex.printStackTrace();
    }
}
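The client class depends on a MyCallBack type that the post does not show. Judging only from the call callBack.callBack(jsonObject.get("text")), a single-method interface is enough. The definition below is therefore an assumption, not the author's original code, and MyCallBackDemo is a hypothetical class that just demonstrates how such a callback collects results.

```java
// Assumed shape of the callback type used by FunasrWsClient (not shown in the post).
interface MyCallBack {
    void callBack(Object obj);
}

// Hypothetical demo: feeds values through a MyCallBack and concatenates them.
final class MyCallBackDemo {
    static String collect(Object... results) {
        StringBuilder text = new StringBuilder();
        MyCallBack cb = obj -> text.append(obj); // a lambda works because there is one abstract method
        for (Object r : results) {
            cb.callBack(r);
        }
        return text.toString();
    }
}
```

Because the client passes jsonObject.get("text") straight through, the argument can be null when a server message has no text field, so implementations should not call obj.toString() without a null check.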
// Imports needed by this method
import java.net.URI;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public static void main(String[] args) {
    String srvIp = "localhost";
    String srvPort = "10095";
    String wavPath = "D:\\developTest\\funasr-runtime-resources\\wav\\tmp_84677349854990998.wav";
    Object lock = new Object();
    StringBuffer text = new StringBuffer();
    ExecutorService executor = Executors.newSingleThreadExecutor();
    Future<?> future = executor.submit(() -> {
        try {
            String wsAddress = "ws://" + srvIp + ":" + srvPort;
            FunasrWsClient c = new FunasrWsClient(new URI(wsAddress), wavPath, new MyCallBack() {
                @Override
                public void callBack(Object obj) {
                    // Store the recognized text and wake up the waiting task
                    text.append(obj.toString());
                    synchronized (lock) {
                        try {
                            lock.notify();
                        } catch (Exception e) {
                            e.printStackTrace();
                        }
                    }
                }
            });
            synchronized (lock) {
                c.connect();  // non-blocking; the result arrives through the callback
                lock.wait();  // block until the callback calls notify()
            }
        } catch (Exception e) {
            // e.printStackTrace();
        }
    });
    try {
        future.get(10, TimeUnit.SECONDS);
        System.out.println("Finished within the time limit");
    } catch (Exception e) {
        // e.printStackTrace();
        System.out.println("Task timed out");
        text.append("Task timed out");
    } finally {
        executor.shutdownNow(); // cancel the task and shut down the pool
    }
    System.out.println(text.toString());
}
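The main method hands the recognized text back through wait/notify on a lock object, wrapped in an executor for the timeout. An alternative worth considering is a CompletableFuture, which combines the hand-off and the timeout in one object. The following is a sketch under assumptions, not the author's approach: ResultWaiter and await are hypothetical names, and the recognition start is abstracted as a function that receives the result callback.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Consumer;

// Hypothetical helper: waits for a callback-delivered result with a timeout.
final class ResultWaiter {
    // startRecognition receives the callback it must invoke once with the result.
    static String await(Consumer<Consumer<Object>> startRecognition, long timeoutSeconds) {
        CompletableFuture<String> result = new CompletableFuture<>();
        // The callback completes the future; later calls are ignored by complete()
        startRecognition.accept(obj -> result.complete(String.valueOf(obj)));
        try {
            return result.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return "error";
        } catch (ExecutionException e) {
            return "error";
        }
    }
}
```

With the client from this post, and assuming MyCallBack is a single-method interface, the call might look like ResultWaiter.await(cb -> new FunasrWsClient(uri, wavPath, cb::accept).connect(), 10), where uri and wavPath are the values built in main.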
