Summary: Abstract Background: Competitors: GCG uses gradient-based search to generate adversarial suffixes in order to jailbreak LLMs. Drawbacks of GCG: low computational efficiency, and no … for transferability or scalab…
posted @ 2025-02-02 00:49 雪溯 Views (21) Comments (0) Recommendations (0)
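Summary: Abstract Background: Current jailbreak mutator approaches concentrate on the semantic level, which makes them easier for defenses to detect. This paper: AdaPPA (Adaptive Position Pre-Filled Jailbreak Attack) Task: adaptive position…
posted @ 2025-01-15 23:13 雪溯 Views (43) Comments (0) Recommendations (0)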
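Summary: Abstract Background: The paper argues that existing jailbreaking methods require either manual effort or a large model, whereas its own method requires neither. This paper: ReNELLM Task: black-box LLM jailbreaking Method: Prompt Rewriting, Scenario Nesting, …
posted @ 2025-01-15 23:12 雪溯 Views (85) Comments (0) Recommendations (0)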
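Summary: Abstract This paper: Tasks: Decomposition Attacks: elicit information leakage from LLMs Method: use an LLM (called ADVLLM) plus few-shot examples to split one malicious question into many smaller questions, send them to the victim LLMs, and then use…
posted @ 2025-01-13 23:52 雪溯 Views (14) Comments (0) Recommendations (0)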
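Summary: Abstract Github: https://github.com/verazuo/jailbreak_llms Method: collect and summarize jailbreaking prompts and patterns from multiple data sources; the prompts attack directly, but the emphasis is on summarization. Tasks: Tool: JAILBREAKHUB Task: jailbre…
posted @ 2025-01-12 00:08 雪溯 Views (82) Comments (0) Recommendations (0)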
Summary: Goal: reduce biases of LLMs (length, concreteness, empty reference, content continuation, nested instruction, familiar knowledge) Tool: OffsetBias: pairwis…
posted @ 2024-12-30 18:58 雪溯 Views (27) Comments (0) Recommendations (0)
Summary: Abstract This paper: Speculative RAG Task: improving retrieval results by combining RAG with LLM refinement Method: use a larger generalist LM to verify RAG drafts…
posted @ 2024-12-25 01:54 雪溯 Views (65) Comments (0) Recommendations (0)
Summary: Abstract Good words: subjectivity, variability, scale Task: Survey of LLM-as-a-Judge; benchmark and evaluation of LLM-as-a-Judge systems Core question: …
posted @ 2024-12-21 00:46 雪溯 Views (125) Comments (0) Recommendations (0)
Summary: Abstract Task: Defend LLMs against prompt injection attacks Tool: TaskTracker Methods: use activation deltas (the difference in activations before and af…
posted @ 2024-12-13 15:58 雪溯 Views (58) Comments (0) Recommendations (0)
Summary: Abstract Github: https://github.com/JailbreakBench/jailbreakbench https://jailbreakbench.github.io/ Task: Open-source benchmark; an evolving repository…
posted @ 2024-12-10 22:42 雪溯 Views (92) Comments (0) Recommendations (0)