Post category - read
Mostly excerpts, recording methods or ideas that might come in handy
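Abstract: Pre In this post, SVC refers to singing voice conversion (SVC), for example the common open-source So-VITS-SVC, RVC, DDSP-SVC. Keywords: singing voice conversion, voice cloning, AI covers. I originally had no plan to write about ReFlow-VAE-SVC, but the VAE in its name really caught my attention, and because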
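Abstract: Pre In this post, SVC refers to singing voice conversion (SVC), for example the common open-source So-VITS-SVC, RVC, DDSP-SVC. Keywords: singing voice conversion, voice cloning, AI covers. DDSP-SVC trains quickly but always suffers from timbre leakage. RIFT-SVC trains much more slowly, with slightly better results. So-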
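Abstract: Pre In this post, SVC refers to singing voice conversion (SVC), for example the common open-source So-VITS-SVC, RVC, DDSP-SVC. Keywords: singing voice conversion, voice cloning, AI covers. DDSP-SVC trains quickly but always suffers from timbre leakage, and a few blind tweaks did not seem to help. So I gave RIFT-SVC a try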
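Abstract: Pre In this post, SVC refers to singing voice conversion (SVC), for example the common open-source So-VITS-SVC, RVC, DDSP-SVC. Keywords: singing voice conversion, voice cloning, AI covers. I first came across the 惠惠 cover of 冬之花 in 2023 and was blown away. I have been very interested in this area ever since, but I had other research going on at the time, and normally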
Abstract: Pre title: IF-Font: Ideographic Description Sequence-Following Font Generation source: NeurIPS 2024 paper: https://proceedings.neurips.cc/paper_files/
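Abstract: Pre I wanted to organize this properly but had no time, which is frustrating; this will have to do. Zero-Shot Text-to-Image Generation (DALL-E) code https://github.com/openai/DALL-E Idea: proposes dVAE, which relaxes the discrete sampling problem into a continuous approximation; VQ-VAE forces the model, in all cases,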
Abstract: Pre title: Few shot font generation via transferring similarity guided global style and quantization local style accepted: ICCV 2023 paper: https://ar
Abstract: Pre title: Language Model Beats Diffusion - Tokenizer is Key to Visual Generation accepted: ICLR 2024 paper: https://arxiv.org/abs/2310.05737 code: no
Abstract: Pre title: Vector Quantized Image-to-Image Translation accepted: ECCV 2022 paper: https://arxiv.org/abs/2207.13286 code: https://github.com/cyj407/VQ-
Abstract: Pre title: VQ-Font: Few-Shot Font Generation with Structure-Aware Enhancement and Quantization accepted: arXiv 2023 paper: https://arxiv.org/abs/2308.
Abstract: Pre title: Radical Analysis Network for Zero-Shot Learning in Printed Chinese Character Recognition accepted: ICME 2018 paper: https://arxiv.org/abs/1
Abstract: Pre title: Vector-quantized Image Modeling with Improved VQGAN accepted: ICLR 2022 paper: https://arxiv.org/abs/2110.04627 code: https://github.com/th
Abstract: Pre title: Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation with Stroke Sequence Modeling accepted: EMNLP 2022
Abstract: Pre ref: "An Introduction to Autoencoders" ref: https://zhuanlan.zhihu.com/p/388620573 ref: https://www.spaces.ac.cn/archives/5253 ref: https://zhuanl
Abstract: Pre title: Drawing and Recognizing Chinese Characters with Recurrent Neural Network source: TPAMI 2018 paper: https://arxiv.org/abs/1606.06539 code: h
Abstract: Pre title: Calligraphy Font Generation via Explicitly Modeling Location-Aware Glyph Component Deformations source: TMM 2023 paper: https://ieeexplore.
Abstract: Pre title: BBDM: Image-to-Image Translation With Brownian Bridge Diffusion Models source: CVPR 2023 paper: https://arxiv.org/abs/2205.07680 code: http
Abstract: Pre title: Small-scale proxies for large-scale Transformer training instabilities source: ICLR 2024 paper: https://arxiv.org/abs/2309.14322 code: ref:
Abstract: Pre title: DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation accepted: CVPR 2023 paper: https://arxiv.org/abs/2305.10462 cod
Abstract: 1. Pre title: Design and Development of a Framework For Stroke-Based Handwritten Gujarati Font Generation source: arXiv 2024 paper: https://arxiv.org/a
