Collection: Decoding Multimodality

Summary: BLIP, short for Bootstrapping Language-Image Pre-training, is a multimodal model from Salesforce Research, introduced in the paper "BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation".
posted @ 2025-05-06 21:56 彼得虫