Speech Synthesis Research: Collected Reference Material

# 1. Existing TTS products on the market

a. Baidu TTS

b. Alibaba TTS

c. iFLYTEK TTS

d. Tencent TTS

e. NetEase TTS

f. AISpeech TTS

# 2. Papers

a. For concatenative synthesis, see: 基于HMM的单元挑选语音合成方法研究 (Research on HMM-Based Unit-Selection Speech Synthesis), by 何鑫 (He Xin)
b. Publications of 凌振华 (Zhen-Hua Ling), University of Science and Technology of China (USTC):
https://dblp.org/pid/70/5210.html
c. Recent Advances in Google Real-time HMM-driven Unit Selection Synthesizer
https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/45564.pdf
d. Google’s Next-Generation Real-Time Unit-Selection Synthesizer using Sequence-To-Sequence LSTM-based Autoencoders

e. Siri On-Device Deep Learning-Guided Unit Selection Text-to-Speech System
https://www.isca-speech.org/archive/pdfs/interspeech_2017/capes17_interspeech.pdf
f. Paper: 基于 HMM 的可训练中文语音合成 (HMM-Based Trainable Chinese Speech Synthesis)
g. Paper: 基于混合基元模型的非定长基元选取算法 (A Non-Uniform Unit Selection Algorithm Based on a Mixed-Unit Model)
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.660.6287&rep=rep1&type=pdf
h. Paper: 基于决策树的语音基元语境特征权重训练算法 (A Decision-Tree-Based Algorithm for Training Context Feature Weights of Speech Units)
i. Paper: Improving the Performance of HMM-Based Voice Conversion using Context Clustering Decision Tree and Appropriate Regression Matrix Format
https://www.cs.cmu.edu/~lqin/cmu_files/interspeech2006.pdf
j. Paper: HMM-based Unit Selection Using Frame Sized Speech Segments
k. Survey from Microsoft Research Asia:
https://www.msra.cn/zh-cn/news/features/neural-speech-synthesis-survey

# 3. Theory

a. Fundamentals of acoustics
https://wenku.baidu.com/view/f4e5e1297e1cfad6195f312b3169a4517723e5bc.html
b. Explanation of formants:
https://www.zhihu.com/question/24190826
c. How to read formants from a spectrogram
https://blog.csdn.net/BEIERMODE666/article/details/121640622
d. The EM algorithm in detail and its application to Gaussian mixture models
https://www.cnblogs.com/jerrylead/archive/2011/04/06/2006936.html
e. GMM in detail
https://zhuanlan.zhihu.com/p/30483076
f. GMM covariance types (diagonal covariance)
https://stats.stackexchange.com/questions/326671/different-covariance-types-for-gaussian-mixture-models
g. GMM basics:
https://blog.csdn.net/zeronose/article/details/104643677
h. GMM-HMM
https://www.cnblogs.com/IO382/p/13205135.html

# 4. HTS training

a. Festival documentation:
http://www.festvox.org/docs/manual-1.4.2/festival_15.html#SEC56
b. Steps:
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.700.880&rep=rep1&type=pdf
https://zhuanlan.zhihu.com/p/63753017
c. Problems encountered:
https://ubuntuforums.org/showthread.php?t=2349398
d. Blog post on the HTK installation steps:
http://blog.huati365.com/772dc3cd77642673
e. The problem may be related to the gcc version:
https://www.linuxquestions.org/questions/linux-software-2/configure-error-c-compiler-cannot-create-executables-4175557896/page2.html
f. Using an HTK Docker image:
https://github.com/loretoparisi/htk
g. Setting up the HTS demo:
https://blog.csdn.net/sunflower_yolanda/article/details/51646020
h. Problem encountered when compiling with gcc 3.4:
https://stackoverflow.com/questions/6329887/compiling-problems-cannot-find-crt1-o
i. Installation procedure on Ubuntu 14.04:
https://blog.csdn.net/qq_41337100/article/details/90509311
j. Complete guide to installing and using HTK:
https://www.cnblogs.com/mingzhao810/archive/2012/08/03/2617674.html
k. Introduction to the HTK parameters:
https://blog.csdn.net/qq_34611579/article/details/79754324
l. Procedure for running the HTS demo:
http://www.doc88.com/p-9713794709488.html
m. Installation tutorial for Windows:
http://blog.sina.com.cn/s/blog_e5c6a0ea0102x5qc.html
n. Open-source vocoders in speech synthesis:
https://blog.csdn.net/vn9PLgZvnPs1522s82g/article/details/86697936
o. Mapping words to phonemes:
i. Word-to-phoneme conversion methods:
https://zhuanlan.zhihu.com/p/336872753
ii. Festival explained in detail:
https://irw.ncut.edu.tw/peterju/festival.html
iii. Festival documentation in Chinese:
http://t.zoukankan.com/sztom-p-14960473.html
iv. Phonemes within a syllable (can be used to find the phonemes within a word):
http://www.cs.columbia.edu/~ecooper/tts/festival.html
p. Extracting acoustic information:
i. Related operations:
https://zhuanlan.zhihu.com/p/157526275
ii. HMM principles:
https://blog.csdn.net/abcjennifer/article/details/27346787
iii. Explanation of deltas, delta-deltas, and dynamic features:
https://zhuanlan.zhihu.com/p/23305179
iv. hts-engine-world:
https://github.com/mipuc/hts-engine-world/blob/master/lib/synthworld.cpp
v. Generating label files with Festival:
http://hts.sp.nitech.ac.jp/hts-users/spool/2014/msg00120.html
vi. http://hts.sp.nitech.ac.jp/hts-users/spool/2011/msg00003.html
vii. Distinguishing syllable-related concepts: http://www.360doc.com/content/20/0405/14/44865632_904003110.shtml
viii. Explanation of what forced alignment means:
https://medium.com/@pilarsoledad/construyendo-un-sintetizador-de-texto-a-voz-usando-python-y-selección-de-unidades-a5dc2e11a091
ix. FMA forced alignment: mark the start and end position of each unit, then use a tool to cut the audio at those positions to build a new audio library
https://blog.csdn.net/weixin_38638559/article/details/115053918
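
The cutting step described in the forced-alignment item above (mark start and end positions, then cut the audio into a new library) can be sketched roughly as follows. This is only a minimal illustration using Python's standard `wave` module, not the method from the linked article. The function name `cut_segments` and the `(label, start_sec, end_sec)` interval format are assumptions made for this example; a real aligner would produce its own output format that has to be parsed first.

```python
import wave
from pathlib import Path


def cut_segments(wav_path, intervals, out_dir):
    """Write one WAV clip per (label, start_sec, end_sec) interval.

    Hypothetical helper for illustration: the interval format is an
    assumption, not the output format of any particular aligner.
    Returns the list of clip paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with wave.open(str(wav_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        for index, (label, start_sec, end_sec) in enumerate(intervals):
            # Convert seconds to frame indices and clamp to the file length.
            start = max(0, min(total, round(start_sec * rate)))
            end = max(start, min(total, round(end_sec * rate)))
            src.setpos(start)
            frames = src.readframes(end - start)
            out_path = out_dir / f"{index:04d}_{label}.wav"
            with wave.open(str(out_path), "wb") as dst:
                # Copy channels, sample width, and rate from the source;
                # the frame count in the header is corrected on write.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

For example, `cut_segments("utt001.wav", [("ni", 0.12, 0.31), ("hao", 0.31, 0.58)], "units")` would write two clips into a `units` directory; the file name and the timings here are made up for illustration.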

# 5. Merlin speech synthesis

a. Documentation for Chinese parametric synthesis with Merlin:
https://mtts.readthedocs.io/zh_CN/stable/toolkit.html
b. Project for Chinese parametric synthesis with Merlin:
https://github.com/Jackiexiao/MTTS/blob/master/misc/mandarin_label.md
c. Toolkits related to Merlin speech synthesis:
https://mtts.readthedocs.io/zh_CN/stable/toolkit.html
d. Implementation steps for Chinese parametric speech synthesis:
https://mtts.readthedocs.io/zh_CN/stable/mtts_implement/speech_synthesis.html
e. Notes from practical experience:
https://github.com/Jackiexiao/MTTS/blob/dev/docs/tutorial.rst
f. Merlin installation guide (blog post):
http://jrmeyer.github.io/tts/2017/02/14/Installing-Merlin.html
g. Unit selection with Merlin:
https://github.com/CSTR-Edinburgh/merlin/issues/226
h. Building a unit database with Festvox:
http://festvox.org/bsv/c2645.html#AEN2665
i. Blog: speech recognition with HMMs:
https://www.cnblogs.com/ansersion/p/4155828.html
j. Blog: HMM training with Merlin:
https://www.cnblogs.com/zhanxiage1994/p/7828668.html
k. Automatic labeling of prosodic boundaries:
https://blog.csdn.net/shaopengfei/article/details/117091143
l. Praat as an annotation aid:
https://blog.csdn.net/shaopengfei/article/details/109378097
m. HTS training process:
i. https://blog.csdn.net/qq_22337113/article/details/108009873
ii. https://blog.csdn.net/wz_0728/article/details/76044581
n. Merlin training process:
https://blog.csdn.net/qq_41571456/article/details/103733082
o. Detailed Merlin workflow:
https://www.cnblogs.com/zhanxiage1994/p/7797969.html
p. Using the SPPAS alignment software:
https://blog.csdn.net/shaopengfei/article/details/18351809
q. Building a voice with Merlin, in detail:
https://shartoo.github.io/2017/04/16/merlin-tts/
r. In-depth understanding of HTS and Merlin:
http://vsooda.github.io/tag/#merlin

# 6. Other

a. DurIAN:
https://github.com/ivanvovk/DurIAN
b. FastSpeech 2:
https://zhuanlan.zhihu.com/p/363808377
c. Introduction to TTS:
https://zhuanlan.zhihu.com/p/321798376
d. WaveNet principles and implementation:
https://zhuanlan.zhihu.com/p/28849767
e. Comprehensive introduction to speech synthesis technology:
https://zhuanlan.zhihu.com/p/113282101
f. Baidu concatenative speech synthesis technology (patent):
https://patents.google.com/patent/CN105719641A/zh
g. 捷华通声 concatenative speech synthesis technology (patent):
https://patents.google.com/patent/CN110047463A/zh
h. Fujitsu speech synthesis:
https://www.fujitsu.com/cn/about/local/businesspolicy/tech/list/voice-processing-p04.html
i. Overview of speech synthesis from 极限元:
https://www.36kr.com/p/1721814335489
j. Applications of deep learning in speech:
https://www.leiphone.com/category/ai/trT0kxuTmx67dtPk.html
k. Overview of speech synthesis systems:
https://zhuanlan.zhihu.com/p/41355600
l. Principles and key techniques of speech synthesis:
https://zhuanlan.zhihu.com/p/41355959
m. 极限元:
https://zhuanlan.zhihu.com/p/27395458
n. Speech synthesis technology in Siri:
https://www.jiqizhixin.com/articles/2017-08-25-7
o. Difference between stress and accent:
http://sky.cssn.cn/yyx/yyx_xwtt/201705/t20170528_3533662_1.shtml
p. Inside Apple's Siri speech synthesis:
https://www.jiqizhixin.com/articles/2017-08-25-7

Posted 2022-04-20 16:58 by 热风丶1921