Summary:
Abstract. Background. Competitors: GCG, which uses gradient-based search to generate adversarial suffixes that jailbreak LLMs. Drawbacks of GCG: low computational efficiency; does not account for transferability or scalab…
Summary:
Abstract. Good words: subjectivity, variability, scale. Task: survey of LLM-as-a-Judge; benchmarking and evaluation of LLM-as-a-Judge systems. Core question: …
Summary:
Abstract. Task: defend LLMs against prompt injection attacks. Tool: TaskTracker. Methods: use activation deltas (the difference in activations before and af…