Hi, I'm Jaeyoon Jung. I received my B.S. in Artificial Intelligence from Soongsil University and work at MAUM.AI as a senior AI research scientist. My research focuses on building intelligent agents that perceive, reason, and act across diverse modalities, with interests in embodied AI, automated fact-checking, and multimodal understanding.
(*: Equal contribution; C: Conference, J: Journal, W: Workshop, P: Preprint, R: Competition Report).
[P3] WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation
Jisu Nam, Yicong Hong, Chun-Hao Paul Huang, Feng Liu, JoungBin Lee, Jiyoung Kim, Siyoon Jin, Yunsung Lee, Jaeyoon Jung, Suhwan Choi, Seungryong Kim, Yang Zhou
Under Review
[P2] CostNav: A Navigation Benchmark for Cost-Aware Evaluation of Embodied Agents
Haebin Seong, Sungmin Kim, Minchan Kim, Yongjun Cho, Myunchul Joe, Suhwan Choi, Jaeyoon Jung, Jiyong Youn, Yoonshik Kim, Samwoo Seong, Yubeen Park, Youngjae Yu, Yunsung Lee
Under Review
[P1][W6] Is a Picture Worth a Thousand Words? Agentic Multimodal Fact-Checking for Adaptive Use of Visual Evidence
Jaeyoon Jung, Yejun Yoon, Kunwoo Park
Under Review
Workshop on Human-Centric AI at CIKM, 2025
[R2] The PokeAgent Challenge: Competitive and Long-Context Learning at Scale
Seth Karten, Jake Grigsby, Tersoo Upaa Jr, Junik Bae, Seonghun Hong, Hyunyoung Jeong, Jaeyoon Jung, Kun Kerdthaisong, Gyungbo Kim, Hyeokgi Kim, Yujin Kim, Eunju Kwon, Dongyu Liu, Patrick Mariglia, Sangyeon Park, Benedikt Schink, Xianwei Shi, Anthony Sistilli, Joseph Twin, Arian Urdu, Matin Urdu, Qiao Wang, Ling Wu, Wenli Zhang, Kunsheng Zhou, Stephanie Milani, Kiran Vodrahalli, Amy Zhang, Fei Fang, Yuke Zhu, Chi Jin
NeurIPS 2025 Competition Report
[C5] D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI
Suhwan Choi*, Jaeyoon Jung*, Haebin Seong*, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yubeen Park, Youngjae Yu, Yunsung Lee
International Conference on Learning Representations (ICLR), 2026
[W7] VILLAIN at AVerImaTeC: Verifying Image-Text Claims via Multi-Agent Collaboration
Jaeyoon Jung*, Yejun Yoon*, Seunghyun Yoon, Kunwoo Park
Ninth Workshop on Fact Extraction and VERification (FEVER) at EACL, 2026 (Oral)
[C4] Exploring Fine-Tuning of Large Audio Language Models for Spoken Language Understanding under Limited Speech Data
Youngwon Choi, Jaeyoon Jung, Hyeonyu Kim, Huu-Kim Nguyen, Hwayeon Kim
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2026 (Oral)
[W5] Team HUMANE at AVeriTeC 2025: HerO 2 for Efficient Fact Verification
Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park
Eighth Workshop on Fact Extraction and VERification (FEVER) at ACL, 2025
[C3] Hypothetical Documents or Knowledge Leakage? Rethinking LLM-based Query Expansion
Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park
Findings of the Association for Computational Linguistics (ACL Findings), 2025
[W4] KOFFVQA: An Objectively Evaluated Free-form VQA Benchmark for Large Vision-Language Models in the Korean Language
Yoonshik Kim*, Jaeyoon Jung*
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2025
[C2][W3] CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction
Suhwan Choi*, Yongjun Cho*, Minchan Kim*, Jaeyoon Jung*, Myunchul Joe, Yubeen Park, Minseo Kim, Sungwoong Kim, Sungjae Lee, Hwiseong Park, Jiwan Chung, Youngjae Yu
International Conference on Robotics and Automation (ICRA), 2025
Workshop on Open-World Agents at NeurIPS, 2024 (Oral; Outstanding Paper Award, 3/97 = 3.1%)
[W2] The Herd of Open LLMs for Verifying Real-World Claims
Yejun Yoon*, Jaeyoon Jung*, Seunghyun Yoon, Kunwoo Park
Seventh Workshop on Fact Extraction and VERification (FEVER) at EMNLP, 2024 (Oral)
[W1] EnCLAP++: Analyzing the EnCLAP Framework for Optimizing Automated Audio Captioning Performance
Jaeyeon Kim, Minjeong Jeon, Jaeyoon Jung, Sang Hoon Woo, Jinjoo Lee
Detection and Classification of Acoustic Scenes and Events (DCASE) Workshop, 2024
[R1] Expanding on EnCLAP with Auxiliary Retrieval Model for Automated Audio Captioning
Jaeyeon Kim, Jaeyoon Jung, Minjeong Jeon, Sang Hoon Woo, Jinjoo Lee
DCASE 2024 Challenge Technical Report
[C1] EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning
Jaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2024
Soongsil University Mar. 2020 - Feb. 2026
B.S. in Artificial Intelligence GPA: 4.2/4.5, Summa Cum Laude
Includes a two-year mandatory alternative military service at MAUM.AI
Summa Cum Laude, Department of Artificial Intelligence, Soongsil University February 2026
Best Thesis Award, Department of Artificial Intelligence, Soongsil University December 2025
For paper 'Is a Picture Worth a Thousand Words? Agentic Multimodal Fact-Checking for Adaptive Use of Visual Evidence'
1st Place, AVerImaTeC shared task hosted by the ninth FEVER workshop at EACL 2026 December 2025
Code Built a multimodal fact-checking pipeline with agentic multimodal models
5th Place, Judge's Choice Award, PokéAgent Challenge at NeurIPS 2025 November 2025
Code
Award: $400, competition report co-authorship.
Track 2: Speedrunning (long-horizon RPG gameplay)
1st Place, 5th Uni-DTHON Challenge November 2025
Code Award: ₩2,000,000 (≈ $1,400)
2nd Place, AVeriTeC shared task hosted by the eighth FEVER workshop at ACL 2025 May 2025
Code Enhanced the fact-checking pipeline with a new retrieval method and quantization
Outstanding Paper Award, Workshop on Open-World Agents at NeurIPS 2024 December 2024
For paper 'CANVAS: Commonsense-Aware Navigation System for Intuitive Human-Robot Interaction'
2nd Place, AVeriTeC shared task hosted by the seventh FEVER workshop at EMNLP 2024 August 2024
Code Built a fact-checking pipeline with open LLMs, selected as the baseline for the eighth workshop
2nd Place, Detection and Classification of Acoustic Scenes and Events 2024 Challenge June 2024
Task 6: Automated Audio Captioning and Task 8: Language-Based Audio Retrieval
4th Prize, SWUniv AI Challenge November 2023
1st Place, LG Display Product Quality Classification (LG Aimers 2) March 2023
Code
Award: ₩5,000,000 (≈ $3,500).
Built a robust regressor and classifier, then combined them with custom hard voting into a generalizable ensemble model
2nd Prize, Soongsil University AI Contest November 2022
7th Place, SWUniv AI Challenge November 2022
1st Place, Samsung AI Challenge (3D Metrology) October 2022
Code
Award: ₩10,000,000 (≈ $7,000).
Used CycleGAN for unpaired transfer from simulated to real SEM images, and improved depth map prediction using cosine similarity-based KNN to retrieve relevant depth maps as references
2nd Place, LG Innotek Radar Performance Prediction (LG Aimers 1) September 2022
Code
Award: ₩3,000,000 (≈ $2,100).
Analyzed radar factory tabular data using domain knowledge and SHAP to engineer and select features, then trained a boosting ensemble on the refined imbalanced dataset with stratified k-fold
Finalist, Naver AI RUSH August 2022
7th Place, Dankook University AI Challenge July 2022
Participation Prize, Soongsil University AI Contest November 2021
Academic Scholarship, Soongsil University 2020 - 2022
Award: ₩6,000,000 (≈ $4,200)
Full Resume in PDF.