Hi, I am a PhD student researching artificial intelligence at KAIST. I write posts about interesting problems or ideas after I think about them for a while. I also write educational posts, e.g. on physics.
These days I spend most of my time thinking about the landscape of “agents”. Recently I constructed a model-free algorithm that performs well in general environments, showing that model-free policy iteration is a generally applicable concept, like model-based planning. Some questions I’m currently interested in are: how values might become more coherent and systematized in agents that don’t explicitly optimize for them, e.g. LLMs, especially values that affect long-horizon behavior, for which supervision is very sparse; modeling a single agent as a system of subagents that each maximize their own utility; self-awareness and introspection, and their implications for decision-making; and how this all relates to bounded rationality, embedded agency, and deep learning.
On the practical side, I have a keen interest in the ability of AIs to understand human minds at an intuitive level, a.k.a. their theory of mind. This has downstream implications for psychiatry, education, alignment, etc. I also have a broad interest in AI safety, and one issue I find particularly pressing is that society is already hard to understand, and it will only become harder to understand with the proliferation of AIs. Studying the properties of current AI systems in multi-agent environments, and developing tools to understand them, therefore seems crucial. Other topics I have previously been interested in include AI superforecasting and scalable oversight.
Education
Korea Advanced Institute of Science and Technology (KAIST)
Ph.D. in Graduate School of Artificial Intelligence
2025 - Present
Korea Advanced Institute of Science and Technology (KAIST)
M.S. in Graduate School of Artificial Intelligence
2023 - 2025
Korea Advanced Institute of Science and Technology (KAIST)
B.S. in Mathematics | Summa Cum Laude
2019 - 2023
Publications
- Yegon Kim, Juho Lee. A Model-Free Universal AI. Preprint.
- Yegon Kim, Juho Lee. Mitigating Legibility Tax with Decoupled Prover-Verifier Games. ICLR 2026 Workshop Trustworthy AI.
- Yegon Kim, Seungyoo Lee, Chaeyun Jang, Hyungi Lee, Juho Lee. Parallel Test-Time Scaling with Multi-Sequence Verifiers. Preprint.
- Chaeyun Jang, Moonseok Choi, Yegon Kim, Hyungi Lee, Juho Lee. Verbalized Confidence Triggers Self-Verification: Emergent Behavior Without Explicit Reasoning Supervision. ICML 2025 Workshop R2-FM.
- Yegon Kim, Hyunsu Kim, Gyeonghoon Ko, Juho Lee. Active Learning with Selective Time-Step Acquisition for PDEs. ICML 2025.
- Hyunsu Kim, Yegon Kim, Hongseok Yang, Juho Lee. Variational Partial Group Convolutions for Input-Aware Partial Equivariance of Rotations and Color-Shifts. ICML 2024.
Talks
- A Model-Free Universal AI. Invited talk for UAI regular meeting, 2026. [slides]
- Active Learning with Selective Time-Step Acquisition for PDEs. Master’s Thesis Defense, KAIST, 2025. [slides]
- Lab reading group 2026/03/10: AI Safety via Debate. [slides]
- Lab reading group 2025/10/01: AlphaEvolve. [slides]
- Lab reading group 2025/01/08: On the Measure of Intelligence. [slides]
- Lab reading group 2024/10/15: An Observation on Generalization. [slides]