Kun Yuan (袁坤)
Researcher & Engineer
in Agentic LLM and AIGC
Kuaishou Technology
Research Interests: Large Language Models, Agentic Coding, Reinforcement Learning,
Diffusion-based Video Generation, Large-scale AI/RL Infrastructure
Email: yuankunbupt at gmail dot com
Wechat: yuankun_casia
[Google Scholar]
[LinkedIn]
Short Bio
I have been an algorithm R&D engineer at Kuaishou
Technology since 2021, where my work has followed two main directions. Since 2025, I have
been working on foundation Large Language Models (LLMs), focusing on enhancing agentic coding
capabilities through RL-based post-training. This work facilitated the development of the KAT (Kwai
Auto-Think) models and the coding agent Codeflickr,
which currently serves over 10,000 internal developers and provides external services. Prior to this, my
focus was on video algorithms, where I applied multimodal LLM-based content understanding (KVQ, Kwai
Visual Quality) and diffusion-based generative models (LPM, Large Processing Model) to improve the
visual quality of video-on-demand (VoD) and live streaming.
Before joining Kuaishou, I was a Computer Vision Researcher at SenseTime Research from 2018 to 2021. My work there focused
on improving the accuracy of face recognition algorithms, which were deployed in smart city projects and
mobile devices. I also contributed to deep learning open-source toolchains, including SenseSpring and
OpenMMLab. From 2016 to 2017, I was an algorithm intern at Horizon
Robotics, participating in the development of
edge-chip-based person re-identification systems.
I received my Master's degree from the National Laboratory of Pattern Recognition (NLPR) at the Institute of Automation, Chinese Academy of Sciences in 2018,
and my Bachelor's degree from Beijing University of Posts and
Telecommunications (BUPT) in 2015. My research interests include Agentic LLMs, AIGC (Image and
Video Generation), and Large-scale AI Infrastructure.
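Work Overview
Since joining Kuaishou in 2021, I have worked continuously on artificial intelligence and large models. My work covers three areas:
(1) LLM post-training and agentic LLM development;
(2) video content understanding based on multimodal large models, and video processing based on AIGC generative models;
(3) AI infrastructure, including a reinforcement learning training platform and inference performance optimization for large models.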
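1. LLM post-training and agentic LLM development: RL-based post-training is used to improve the software engineering (SWE) capabilities of large models at the hundred-billion-parameter scale. An automated pipeline continuously collects and filters high-quality code data from GitHub across multiple languages (Python, Java, Go, C++, etc.) and multiple scenarios (frontend, backend, algorithms, testing, etc.), producing on the order of 100,000 {task, environment, verifier} triples for RL training. Combined with the GRPO/GSPO algorithms, this improved model performance on code generation, completion, and repair. In addition, multiple agent scaffolds were introduced into trajectory generation to improve generalization to real usage scenarios, adapting the model to many black-box agent types (Claude Code, OpenCode, Kilocode, etc.). The resulting KAT (Kwai Auto-Think) model achieves results close to Claude Opus 4.6 on the SWE-bench Verified and Multilingual datasets, and ranked first among Chinese models on the Artificial Analysis Coding Index leaderboard (2026/03/31).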
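As a rough illustration of the group-relative idea behind GRPO mentioned above (not the KAT training code), the sketch below normalizes the verifier rewards of several rollouts sampled for the same task. The function name, the `eps` constant, the use of the population standard deviation, and the example rewards are all assumptions made for this example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one group's rewards by the group mean and standard deviation.

    `rewards` holds one scalar per rollout sampled for the same task,
    e.g. 1.0 if a (hypothetical) verifier passed and 0.0 otherwise.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical verifier outcomes for four rollouts of one task.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)

# Rollouts that passed the verifier get a positive advantage,
# failed ones a negative advantage.
for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} advantage={a:+.3f}")
```

When every rollout in a group receives the same reward, all advantages are zero, so such a group contributes no learning signal in this simplified form.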
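2. Video content understanding based on multimodal large models, and video processing based on AIGC generative models: (1) Video content understanding. Trained on massive video data with multimodal models, we developed KVQ, Kuaishou's in-house video quality assessment system, which quantifies picture quality across the video production and consumption pipeline, including encoding, processing, and transmission. With the in-house QPT series of algorithms, we established a technical route for training quality-aware models on massive data without supervision; after fine-tuning on high-quality vertical-domain data, the models outperform Golden Eye across 100+ vertical scenarios at Kuaishou. We also introduced multimodal large models with high-quality instruction tuning to provide interpretable (white-box) attribution analysis. The system is deployed in Kuaishou's VoD and live-streaming scenarios, where it guides intelligent encoding, multi-bitrate delivery decisions, content moderation and risk control, recommendation, and search ranking, with an average of 200 million calls per day.
(2) Generative video processing. To address picture quality problems in Kuaishou videos, we designed and implemented several Transformer-based video processing algorithms, including KEP (Kuaishou Enhanced Processing) and KRP (Kuaishou Restoration Processing), which significantly improved video quality and let viewers see a clearer picture than the source uploaded by the creator. We then developed XPSR, a diffusion-based enhancement algorithm, and VARSR, the industry's first autoregressive-based enhancement algorithm. Stronger generative capability raises the ceiling on achievable picture quality; trained on data at the hundred-million scale, these models deliver impressive enhancement and restoration results. Deployed in server-side VoD, they produced a significant gain in user watch time. They also support e-commerce and monetization, where improved clarity drives GMV and ad spend.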
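3. AI Infrastructure: (1) RL infrastructure. For LLM training, we built an efficient distributed RL training platform on Megatron and SGLang, explored the best combination of TP/PP/CP/EP parallelism, and support stable, efficient training on thousand-GPU clusters. Building on Kuaishou's mature container cloud platform, we run sandboxes at a concurrency on the order of ten thousand, shorten cold-start time by pre-building images, and destroy sandboxes promptly once their tasks finish, which supports fast iteration on agentic coding.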
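To make the TP/PP/CP/EP search space mentioned above concrete, here is a minimal sketch that enumerates candidate parallel layouts for a fixed GPU count. It is not the platform's actual search procedure. The function name `candidate_layouts`, the candidate sizes, the `gpus_per_node` limit on TP, and the rule that EP must divide the data-parallel size are simplifying assumptions; the exact constraints depend on the training framework and its version.

```python
from itertools import product


def candidate_layouts(world_size, sizes=(1, 2, 4, 8), gpus_per_node=8):
    """Enumerate (tp, pp, cp, ep, dp) layouts that fit `world_size` GPUs.

    Assumed (simplified) constraints:
      * tp * pp * cp * dp == world_size
      * tp stays within one node
      * ep divides the data-parallel size dp
    """
    layouts = []
    for tp, pp, cp, ep in product(sizes, repeat=4):
        model_parallel = tp * pp * cp
        if tp > gpus_per_node or world_size % model_parallel:
            continue
        dp = world_size // model_parallel
        if dp % ep:
            continue
        layouts.append({"tp": tp, "pp": pp, "cp": cp, "ep": ep, "dp": dp})
    return layouts


# Hypothetical thousand-GPU job.
layouts = candidate_layouts(1024)
print(len(layouts), "candidate layouts; first:", layouts[0])
```

In practice each candidate would still need to be benchmarked for throughput and memory, since the enumeration only filters out layouts that cannot be placed at all.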
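(2) AIGC model inference optimization. For video generation model inference, we developed a multi-model single-engine deployment scheme, low-precision quantization for diffusion models, and consistency-model distillation, reducing diffusion inference to a single step. In close collaboration with NVIDIA, we used TensorRT-LLM and FP8 quantization to greatly improve large-model inference efficiency in video processing, for an overall speedup of more than 80x. This provides a solid technical foundation for deploying AI capabilities at scale, significantly lowering machine cost and increasing service coverage. The work was presented at GTC 2025 in the talk "Reshaping the Short-Video Visual Experience: Large Models for Intelligent Video Quality Assessment and Processing".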
News
- [2026-03] KAT-Coder-V2 released.
- [2026-03] One paper accepted by CVPR 2026.
- [2025-05] One paper accepted by ICML 2025.
- [2025-03] I gave a talk at NVIDIA GTC 2025 on "Redefining Visual Experience of Short-form Videos: Accelerating Large Models for Intelligent Video Quality Assessment and Processing by TensorRT-LLM".
- [2025-03] One paper accepted by CVPR 2025.
- [2024-07] Two papers accepted by ACM MM 2024.
- [2024-07] One paper accepted by ECCV 2024.
- [2024-03] Two papers accepted by CVPR 2024.
- [2023-10] Two papers accepted by ACM MM 2023.
- [2023-03] One paper accepted by CVPR 2023.
- [2022-03] One paper accepted by CVPR 2022.
- [2021-02] One paper accepted by ICLR 2021.
- [2021-02] Two papers accepted by ICCV 2021.
- [2020-08] One paper accepted by ECCV 2020.
- [2018-07] One paper accepted by IJCAI 2018.
Publications
(* denotes equal contribution, # denotes corresponding author)
2026
KAT-Coder-V2 Technical Report
KwaiKAT Team
- KAT-Coder-V2 achieves 79.6% on SWE-bench Verified (vs. Claude Opus 4.6 at 80.8%)
- 88.7 on PinchBench (surpassing GLM-5 and MiniMax M2.7)
Technical Report, 2026.
[Paper] [Project Page] [Wechat Sharing]
Bridging Video Quality Scoring and Justification via Large Multimodal Models
Qizhi Xie, Kun Yuan#, Yunpeng Qu, Jiachao Gong, Mingda Wu, Ming Sun, Jihong Zhu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026.
[Paper]
2025
Visual Autoregressive Modeling for Image Super-Resolution
Yunpeng Qu, Kun Yuan#, Jinhua Hao, Kai Zhao, Qizhi Xie, Ming Sun, Chao Zhou
International Conference on Machine Learning (ICML), 2025.
[Paper] [Project Page] [Wechat Sharing]
KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception
Yunpeng Qu, Kun Yuan#, Qizhi Xie, Ming Sun, Chao Zhou, Jian Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025.
[Paper] [Project Page]
2024
QPT V2: Masked Image Modeling Advances Visual Scoring
Qizhi Xie, Kun Yuan#, Yunpeng Qu, Mingda Wu, Ming Sun, Chao Zhou, Jihong Zhu
ACM International Conference on Multimedia (ACM MM), 2024.
[Paper] [Project Page]
QNCD: Quantization Noise Correction for Diffusion Models
Huanpeng Chu, Wei Wu, Chengjie Zang, Kun Yuan
ACM International Conference on Multimedia (ACM MM), 2024.
[Paper] [Project Page]
XPSR: Cross-modal Priors for Diffusion-based Image Super-Resolution
Yunpeng Qu*, Kun Yuan*, Kai Zhao, Qizhi Xie, Jinhua Hao, Ming Sun, Chao Zhou
European Conference on Computer Vision (ECCV), 2024.
[Paper] [Project Page] [Wechat Sharing]
KVQ: Kwai Video Quality Assessment for Short-form Videos
Yiting Lu*, Xin Li*, Yajing Pei*, Kun Yuan#, Qizhi Xie, Yunpeng Qu, Ming Sun, Chao Zhou, Zhibo Chen#
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[Paper] [Supp] [Project Page]
PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild
Kun Yuan*, Hongbo Liu*, Mading Li*, Muyi Sun, Ming Sun, Jiachao Gong, Jinhua Hao, Chao Zhou, Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
[Paper]
2023
Capturing Co-existing Distortions in User-Generated Content for No-reference Video Quality Assessment
Kun Yuan*, Zishang Kong*, Chuanchuan Zheng, Ming Sun, Xing Wen
ACM International Conference on Multimedia (ACM MM), 2023.
[Paper]
Ada-DQA: Adaptive Diverse Quality-aware Feature Acquisition for Video Quality Assessment
Hongbo Liu*, Mingda Wu*, Kun Yuan*, Ming Sun, Yansong Tang, Chuanchuan Zheng, Xing Wen, Xiu Li
ACM International Conference on Multimedia (ACM MM), 2023.
[Paper]
Quality-aware Pre-trained Models for Blind Image Quality Assessment
Kai Zhao*, Kun Yuan*, Ming Sun, Mading Li, Xing Wen
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
[Paper] [Wechat Sharing]
2022
ShowFace: Coordinated Face Inpainting with Memory-Disentangled Refinement Networks
Zhuojie Wu, Xingqun Qi, Zijian Wang, Wanting Zhou, Kun Yuan, Muyi Sun, Zhenan Sun
British Machine Vision Conference (BMVC), 2022.
[Paper]
Self-supervised Correlation Mining Network for Person Image Generation
Zijian Wang, Xingqun Qi, Kun Yuan, Muyi Sun
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[Paper]
2021
Learning N:M Fine-grained Structured Sparse Neural Networks from Scratch
Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li
International Conference on Learning Representations (ICLR), 2021.
[Paper] [Project Page]
Incorporating Convolution Designs into Visual Transformers
Kun Yuan, Shaopeng Guo, Ziwei Liu, Aojun Zhou, Fengwei Yu, Wei Wu
IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
[Paper] [Project Page]
Differentiable Dynamic Wirings for Neural Networks
Kun Yuan, Quanquan Li, Shaopeng Guo, Dapeng Chen, Aojun Zhou, Fengwei Yu, Ziwei Liu
IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
[Paper]
Earlier
Learning Connectivity of Neural Networks from a Topological Perspective
Kun Yuan, Quanquan Li, Jing Shao, Junjie Yan
European Conference on Computer Vision (ECCV), 2020.
[Paper]
SafeNet: Scale-normalization and Anchor-based Feature Extraction Network for Person Re-identification
Kun Yuan, Qian Zhang, Chang Huang, Shiming Xiang, Chunhong Pan
International Joint Conference on Artificial Intelligence (IJCAI), 2018.
[Paper]
Deep Networks for Degraded Document Image Binarization through Pyramid Reconstruction
Gaofeng Meng, Kun Yuan, Ying Wu, Shiming Xiang, Chunhong Pan
International Conference on Document Analysis and Recognition (ICDAR), 2017.
[Paper]
Efficient Cloud Detection in Remote Sensing Images using Edge-aware Segmentation Network and Easy-to-hard Training Strategy
Kun Yuan, Gaofeng Meng, Dongcai Cheng, Jun Bai, Shiming Xiang, Chunhong Pan
IEEE International Conference on Image Processing (ICIP), 2017.
[Paper]
Workshops
NTIRE 2025 Challenge on Short-form UGC Video Quality Assessment and Enhancement: Methods and Results
Xin Li, Kun Yuan, Bingchen Li, Fengbin Guan, Yizhen Shao, Zihao Yu, Xijun Wang, Yiting Lu, Wei Luo, Suhang Yao, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2025.
[Paper] [Project Page]
NTIRE 2024 Challenge on Short-form UGC Video Quality Assessment: Methods and Results
Xin Li, Kun Yuan, Yajing Pei, Yiting Lu, Ming Sun, Chao Zhou, Zhibo Chen, Radu Timofte
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2024.
[Paper] [Project Page]
Zoom-VQA: Patches, Frames and Clips Integration for Video Quality Assessment
Kai Zhao, Kun Yuan, Ming Sun, Xing Wen
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshop (CVPRW), 2023.
[Paper] [Project Page]
Awards
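- Kuaishou R&D Excellent Project Award: "Research and Deployment of Transformer-based Video Processing Models", 2024
- Kuaishou Lhotse Award (洛子峰奖): "KVQ: AI-based Video Quality Assessment", 2023
- Kuaishou Lhotse Award (洛子峰奖): "Joint Optimization of Subjective-Quality-Based Intelligent Video Enhancement and Codec Architecture", 2023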