Skill Discovery | A Summary of Classic Work on Unsupervised Skill Discovery

🐱 Unsupervised
🦜 Guided

🐱 Unsupervised

Diversity is All You Need: Learning Skills without a Reward Function (DIAYN)

ICLR 2019.
arXiv: https://arxiv.org/abs/1802.06070
pdf: https://arxiv.org/pdf/1802.06070
html: https://ar5iv.labs.arxiv.org/html/1802.06070
website: https://sites.google.com/view/diayn
Blog: Paper speed-reading notes | 2025.01

Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills (EDL)

ICML 2020.
arXiv: https://arxiv.org/abs/2002.03647
GitHub: https://github.com/victorcampos7/edl
Blog: Paper speed-reading notes | 2026.02

CIC: Contrastive Intrinsic Control for Unsupervised Skill Discovery

Possibly ICML 2022.
arXiv: https://arxiv.org/abs/2202.00161
pdf: https://arxiv.org/pdf/2202.00161
html: https://ar5iv.labs.arxiv.org/html/2202.00161
website: https://sites.google.com/view/cicrl/
GitHub: https://github.com/rll-research/cic
Blog: Paper speed-reading notes | 2025.01

Lipschitz-constrained Unsupervised Skill Discovery (LSD)

ICLR 2022.
arXiv: https://arxiv.org/abs/2202.00914
pdf: https://arxiv.org/pdf/2202.00914
html: https://ar5iv.labs.arxiv.org/html/2202.00914
Blog: Paper speed-reading notes | 2025.12 (1)

Controllability-Aware Unsupervised Skill Discovery (CSD)

ICML 2023.
arXiv: https://arxiv.org/abs/2302.05103
pdf: https://arxiv.org/pdf/2302.05103
html: https://ar5iv.labs.arxiv.org/html/2302.05103
Blog: Paper speed-reading notes | 2025.12 (1)

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

ICLR 2024 oral.
arXiv: https://arxiv.org/abs/2310.08887
pdf: https://arxiv.org/pdf/2310.08887
html: https://arxiv.org/html/2310.08887
website: https://seohong.me/projects/metra/
GitHub: https://github.com/seohongpark/METRA
Blog: Skill Discovery | METRA: letting the policy explore a compact embedding space of states

Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning (CSF)

ICLR 2025 oral.
arXiv: https://arxiv.org/abs/2412.08021
pdf: https://arxiv.org/pdf/2412.08021
html: https://arxiv.org/html/2412.08021v3
OpenReview: https://openreview.net/forum?id=xoIeVdFO7U
GitHub: https://github.com/Princeton-RL/contrastive-successor-features
Blog: Paper speed-reading notes | 2025.06

Foundation Policies with Hilbert Representations (HILP, offline METRA)

ICML 2024.
arXiv: https://arxiv.org/abs/2402.15567
website: https://seohong.me/projects/hilp/
Blog: Paper speed-reading notes | 2025.12 (2)

Disentangled Unsupervised Skill Discovery for Efficient Hierarchical Reinforcement Learning (DUSDi)

NeurIPS 2024 poster, scores 7 5 5 4.
arXiv: https://arxiv.org/abs/2410.11251
OpenReview: https://openreview.net/forum?id=ePOBcWfNFC
website: https://jiahenghu.github.io/DUSDi-site/
Blog: Paper speed-reading notes | 2025.09

SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions

NeurIPS 2024 poster, scores 8 6 5 5 (5 is borderline accept).
arXiv: https://arxiv.org/abs/2410.18416
OpenReview: https://openreview.net/forum?id=i816TeqgVh
website: https://wangzizhao.github.io/SkiLD/
Blog: Paper speed-reading notes | 2025.09

Efficient Skill Discovery via Regret-Aware Optimization

ICML 2025 poster, scores 3 3 2 1.
arXiv: https://arxiv.org/abs/2506.21044
OpenReview: https://openreview.net/forum?id=4qMJ8Ignmp
GitHub: https://github.com/ZhHe11/RSD
Blog: Paper speed-reading notes | 2025.10

🦜 Guided

Safety-Aware Unsupervised Skill Discovery

ICRA 2023.
paper: https://safe-skill.github.io/static/pdfs/safe-skill.pdf
website: https://safe-skill.github.io/
Blog: CSDN | [ICRA 2023] SASD paper reading notes: a safety-aware unsupervised skill discovery method

Do's and Don'ts: Learning Desirable Skills with Instruction Videos (DoDont)

NeurIPS 2024 poster.
arXiv: https://arxiv.org/abs/2406.00324
pdf: https://arxiv.org/pdf/2406.00324
html: https://arxiv.org/html/2406.00324
website: https://mynsng.github.io/dodont/
OpenReview: https://openreview.net/forum?id=7X5zu6GIuW
Blog: Skill Discovery | DoDont: using do + don't example videos to guide the agent toward the skills humans want

Language Guided Skill Discovery (LGSD)

ICLR 2025 poster, scores 8 8 6 6.
arXiv: https://arxiv.org/abs/2406.06615
pdf: https://arxiv.org/pdf/2406.06615
html: https://arxiv.org/html/2406.06615v2
OpenReview: https://openreview.net/forum?id=i3e92uSZCp
Blog: Skill Discovery | LGSD: using the distance between language embeddings of state descriptions as METRA's d(x,y) distance constraint

Reference Guided Skill Discovery (RGSD)

ICLR 2026.
arXiv: https://arxiv.org/abs/2510.06203
pdf: https://arxiv.org/pdf/2510.06203
html: https://arxiv.org/html/2510.06203
OpenReview: https://openreview.net/forum?id=IaGf8Eh5Uo
Blog: Skill Discovery | RGSD: pretraining the skill space on high-quality reference trajectories

Controlled Diversity with Preference: Towards Learning a Diverse Set of Desired Skills (CDP)

AAMAS 2023.
arXiv: https://arxiv.org/abs/2303.04592
GitHub: https://github.com/HussonnoisMaxence/CDP
Journal version: Human-informed skill discovery: Controlled diversity with preference in reinforcement learning (ScienceDirect).
Blog: Paper speed-reading notes | 2025.11

Human-Aligned Skill Discovery Balancing Behaviour Exploration and Alignment (HaSD)

AAMAS 2025.
arXiv: https://arxiv.org/abs/2501.17431
GitHub: https://github.com/HussonnoisMaxence/HaSD-AAMAS
Blog: Paper speed-reading notes | 2025.11

Guiding Skill Discovery with Foundation Models (FoG)

Latest paper link: https://liacs.leidenuniv.nl/~plaata1/papers/4848.pdf
OpenReview paper link for the ICLR 2025 version: https://openreview.net/pdf?id=nZBUtzJhf8
Latest website: https://sites.google.com/view/submission-fog (unfortunately, some of the visualizations appear to be broken)
Blog: Skill Discovery | FoG: using an LLM / CLIP to produce DoDont weights that guide the agent toward safe exploration

posted @ 2026-04-12 15:25 MoonOut
