Summary:
Contents: UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning | TL;DR | Method | MLLM-as-a-Judge for Hard Negatives Mining | MLLM Judgment Based Trainin…
Summary:
Contents: Qwen2.5-VL Technical Report | TL;DR | Method | Fast and Efficient Vision Encoder | Aligning MRoPE to Absolute Time Information | Pre-Training | Interleaved Image-Text Data | Grounding Data with Absol…
Summary:
Contents: SAIL-Embedding Technical Report: Omni-modal Embedding Foundation Model | TL;DR | Data | Recommendation-aware Data Construction | Dynamic Hard Negative Mining | Q: 动…
Summary:
Contents: VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents | TL;DR | Method | Q: How does VLM2Vec-V2 differ from the original VLM2Vec algorithm? | Benchmark | Q&A | Q: CLS, QA, R…
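Summary:
Contents: REACT: SYNERGIZING REASONING AND ACTING IN LANGUAGE MODELS | TL;DR | Method | Experimental Design | Comparison of Methods | Bad Case Analysis | Q&A | Experiment | WebShop | Summary and Reflections | Related Links
REACT: SYNERGIZING REASONIN…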
Summary:
Contents: MemGPT: Towards LLMs as Operating Systems | TL;DR | Method | Main context | Experiment | Summary and Reflections | Related Links
MemGPT: Towards LLMs as Operating Systems | link | Date: 23.10 | Affiliation: UC Be…
Summary:
Contents: Qwen2-VL: Enhancing Vision-Language Model’s Perception of the World at Any Resolution | TL;DR | Method | Naive Dynamic Resolution | Multimodal Rotary Position E…