CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding
Fuzhou University
Paper: CEED-VLA: Consistency Vision-Language-Action Model with Early-Exit Decoding
Source: arXiv preprint, 2025.06
Authors: Wenxuan Song, Jiayi Chen, Yuxin Huang, Pengxiang Ding
Date: 2025-11-20
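From VLM to VLA: The Evolution of Embodied Intelligence
VLA builds on the VLM (Vision-Language Model). A VLM extends an LLM with vision and combines two fields:
1. Computer vision (CV): CNN, Vision Transformer
2. Natural language processing (NLP): LLMs such as GPT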
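Core VLM Architecture
A VLM has three parts: Vision Encoder, Projector, and LLM. The Projector connects the Vision Encoder to the LLM by mapping visual features into the LLM's embedding space. Common Projector choices:
- Linear layer (Linear Projector)
- MLP
- Q-Former (Querying Transformer)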
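The role of the Projector can be illustrated with a minimal sketch of the linear variant. Everything here is a toy assumption for illustration: the names `linear_projector` and `build_llm_input`, the tiny dimensions, and the choice to place visual tokens before text embeddings are not taken from any specific VLM.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def linear_projector(patch_features: Matrix, weight: Matrix, bias: Vector) -> Matrix:
    """Project each vision feature (size d_v) to the LLM embedding size (d_l).

    weight has d_l rows of length d_v; bias has length d_l.
    """
    projected = []
    for feature in patch_features:
        row = [sum(f * w for f, w in zip(feature, w_row)) + b
               for w_row, b in zip(weight, bias)]
        projected.append(row)
    return projected


def build_llm_input(visual_tokens: Matrix, text_embeddings: Matrix) -> Matrix:
    """Concatenate projected visual tokens with text embeddings (assumed order)."""
    return visual_tokens + text_embeddings


# Toy example: 2 image patches with d_v = 2, projected to d_l = 3.
patches = [[1.0, 2.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [0.0, 0.0, 0.5]
visual_tokens = linear_projector(patches, W, b)
llm_input = build_llm_input(visual_tokens, [[0.1, 0.2, 0.3]])
```

An MLP projector would stack two such layers with a nonlinearity in between; a Q-Former instead uses learned queries that attend to the visual features.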
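VLA (Vision-Language-Action Model)
A VLA (Vision-Language-Action Model) is a model for Embodied AI: it lets an AI system act in the physical world.
VLM -> VLA: VLA = VLM + action
VLM: vision and language input -> text output
VLA: vision input + language instruction -> action output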
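VLA (Vision-Language-Action Model)
VLA designs fall into two types: VLM + action tokens, and VLM + latent space. In the first type, the action is emitted as tokens, for example 7 tokens for one action.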
VLA (Vision-Language-Action Model)
In the latent-space type, a Vision-Language Encoder produces a latent vector, and an Action decoder turns that latent vector into actions.
Existing VLA Models
Existing VLA models fall into two groups:
1. RT-2, OpenVLA, π0
2. Helix, GR00T N1
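Example: OpenVLA
OpenVLA is a 7-billion-parameter VLA released in June 2024. It is trained on the Open X-Embodiment dataset (21 institutions, 22 robot embodiments). It uses DINOv2 and CLIP as the Vision Encoder and Llama-2 as the Language Encoder. The output is a 7-dimensional action:
1. Position (x, y, z)
2. Orientation (rotation about the x, y, z axes)
3. Gripper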
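To make the "one action = 7 tokens" idea concrete, here is a minimal sketch of uniform binning. The 256 bins, the normalized range [-1, 1], and the function names are assumptions for illustration, not a description of OpenVLA's actual tokenizer code.

```python
from typing import List

N_BINS = 256  # assumed number of bins per action dimension


def discretize_action(action: List[float], low: float = -1.0,
                      high: float = 1.0, n_bins: int = N_BINS) -> List[int]:
    """Map each continuous action dimension to one integer token id."""
    tokens = []
    for value in action:
        clipped = min(max(value, low), high)
        index = int((clipped - low) / (high - low) * n_bins)
        tokens.append(min(index, n_bins - 1))  # keep value == high inside the last bin
    return tokens


def decode_tokens(tokens: List[int], low: float = -1.0,
                  high: float = 1.0, n_bins: int = N_BINS) -> List[float]:
    """Map token ids back to the centers of their bins."""
    width = (high - low) / n_bins
    return [low + (t + 0.5) * width for t in tokens]


# 7-dimensional action: position (3), orientation (3), gripper (1).
action = [0.10, -0.25, 0.50, 0.0, 0.0, 0.30, 1.0]
tokens = discretize_action(action)
recovered = decode_tokens(tokens)
```

The round trip loses at most one bin width of precision per dimension, which is the price of treating actions as vocabulary items the LLM can predict.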
Current Limitations and Challenges of VLA
Current VLA models face five main challenges, among them token-by-token action generation, as in OpenVLA.
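1. Abstract
VLA inference is slow. CEED-VLA speeds it up with Jacobi decoding and early-exit decoding, reaching a 4x acceleration over baseline models.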
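2. Introduction
VLA models usually decode actions autoregressively (AR), which limits inference speed. Jacobi decoding, based on the Jacobi fixed-point iteration method, treats decoding as solving a system of nonlinear equations and updates tokens in parallel. Applied directly to a VLA, however, Jacobi decoding gives only a 1.28x speedup over AR. The reason is that the VLA was trained with ground-truth tokens as context, so it rarely predicts a correct token when the preceding tokens are wrong.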
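2. Introduction
Contribution: CEED-VLA uses consistency distillation so that the model predicts more correct tokens per Jacobi iteration, and adds early-exit decoding. It reports a 4.1x inference speedup and a 4.3x higher execution frequency.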
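2. Introduction - Jacobi fixed-point iteration method
AR Decoding: given a prompt x and an LLM p(·|x), AR decoding generates the tokens y1, y2, ..., yn one at a time. The i-th token is

$$y_i = \arg\max_{y} \; p(y \mid y_{<i}, x), \quad i = 1, \dots, n$$

where y_{<i} denotes the tokens generated before position i, x is the prompt, and y ranges over the vocabulary. Generating n tokens therefore takes n sequential forward passes of the LLM; a 7-token action takes 7 passes.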
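The sequential nature of AR decoding can be sketched in a few lines. The function `toy_next_token` is a hypothetical deterministic stand-in for the arg max over p(y | y_<i, x); a real VLA would run a forward pass of the LLM at that point.

```python
from typing import Callable, List

VOCAB_SIZE = 16  # toy vocabulary size (assumption for the demo)


def toy_next_token(prompt: List[int], prefix: List[int]) -> int:
    """Hypothetical stand-in for arg max_y p(y | y_<i, x)."""
    return (sum(prompt) + 3 * sum(prefix) + len(prefix)) % VOCAB_SIZE


def ar_decode(prompt: List[int], n_tokens: int,
              next_token: Callable[[List[int], List[int]], int] = toy_next_token
              ) -> List[int]:
    """Greedy AR decoding: token i can only be computed after tokens before i."""
    generated: List[int] = []
    for _ in range(n_tokens):
        # One model call per token, each conditioned on everything so far.
        generated.append(next_token(prompt, generated))
    return generated


prompt = [2, 5]
action_tokens = ar_decode(prompt, n_tokens=7)  # 7 tokens -> 7 sequential calls
```

Because each call needs the output of the previous one, the loop cannot be parallelized across positions, which is the bottleneck Jacobi decoding targets.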
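2. Introduction
Jacobi Decoding: unlike AR decoding, Jacobi decoding updates all tokens in parallel and iterates until it reaches a Fixed Point.
1. j is the Jacobi iteration index, starting from an initial guess at j = 0.
2. Each iteration updates the whole sequence, mapping Y(j) to Y(j+1).
3. The sequence Y(0) -> Y(1) -> Y(2) -> ... -> Y(k) is the Jacobi Trajectory. The fixed point is reached when Y(k) = Y(k-1).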
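The iteration above can be sketched with a toy model. This is an illustrative sketch under stated assumptions, not the paper's implementation: `toy_next_token` is a hypothetical greedy predictor, the initial guess Y(0) is all zeros, and the per-position calls inside one iteration stand for what a real model computes in a single parallel forward pass.

```python
from typing import List, Tuple

VOCAB_SIZE = 16  # toy vocabulary size (assumption)


def toy_next_token(prompt: List[int], prefix: List[int]) -> int:
    """Hypothetical greedy predictor for the token that follows `prefix`."""
    weighted = sum((k + 1) * t for k, t in enumerate(prefix))
    return (sum(prompt) + 7 * weighted + len(prefix)) % VOCAB_SIZE


def ar_decode(prompt: List[int], n_tokens: int) -> List[int]:
    """Reference AR decoding, used only to check the Jacobi result."""
    out: List[int] = []
    for _ in range(n_tokens):
        out.append(toy_next_token(prompt, out))
    return out


def jacobi_decode(prompt: List[int], n_tokens: int
                  ) -> Tuple[List[int], List[List[int]]]:
    """Iterate Y(j) -> Y(j+1) until the fixed point Y(k) == Y(k-1)."""
    current = [0] * n_tokens            # Y(0): arbitrary initial guess
    trajectory = [list(current)]
    for _ in range(n_tokens + 1):       # greedy Jacobi needs at most n + 1 rounds
        # Every position is updated from the PREVIOUS iterate, so all
        # positions could be computed in one parallel forward pass.
        updated = [toy_next_token(prompt, current[:i]) for i in range(n_tokens)]
        trajectory.append(updated)
        if updated == current:          # fixed point reached
            break
        current = updated
    return trajectory[-1], trajectory


fixed_point, trajectory = jacobi_decode([2, 5], n_tokens=7)
```

With greedy decoding the fixed point equals the AR output, and each iteration fixes at least one more leading token. The speedup comes only when an iteration gets several tokens right at once, which is what consistency training aims for.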
2. Introduction - Jacobi fixed-point iteration method
Jacobi Decoding
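3. Related Work
(1) Acceleration for VLA. Building on LLM acceleration techniques, VLA acceleration work includes DeeR-VLA, QAIL, RoboMamba, and TinyVLA. FAST works on the action tokens themselves. Other approaches include MoLe-VLA, PD-VLA (parallel decoding), and OpenVLA-OFT. CEED-VLA continues this line of work on faster VLA inference.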
4. Model - Overview
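4. Model - 1
Teacher Model: uses OpenVLA / LLaVA-VLA (Vanilla VLAs), denoted P_θ. Given the observation s_t and the language instruction l, the VLA with action chunking predicts m actions at time step t. Each action a_t has 7 dimensions and is decoded as 7 tokens:
1. Position (x, y, z)
2. Orientation (rotation about the x, y, z axes)
3. Gripper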
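4. Model - Overview
Teacher Model: uses OpenVLA / LLaVA-VLA.
1. Run the Teacher Model with Jacobi Decoding to generate tokens (actions) and record the intermediate iterates. These form the Jacobi Trajectory dataset used to train the Student Model. The Student Model learns to map any intermediate point y(j) to the fixed point y(k) of Jacobi decoding.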
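4. Model - Overview
Student Model: CEED-VLA.
2. Train the Student Model with two objectives:
- Consistency Distillation (Consistency Loss): pushes the Student Model to map an intermediate point y(j) to the fixed point y(k).
- Mixed-label AR Supervision (Mixed-label AR Loss): supervises the Student Model with the Teacher Model output y* or, where needed, the ground truth.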
4. Model - Overview
Inference: Early-exit Decoding.
3. At inference, the Student Model decodes with Jacobi decoding and exits the Jacobi iteration early instead of waiting for full convergence.
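4. Model - 2
Student Model: CEED-VLA
1. The target VLA is P. CEED-VLA is written Q_θ(·|x), with parameters θ initialized from P.
2. Training exploits the inherent consistency of Jacobi trajectories, which are collected by running the VLA P with Jacobi decoding.
3. The student is trained on the resulting Jacobi trajectory dataset D with 2 losses.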
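4. Model - 2
Student Model: CEED-VLA, Consistency Loss
For a prompt x with Jacobi trajectory J, sample an intermediate point Y and the fixed point Y* from J. CEED-VLA is trained to output Y* when given Y.
- Q_θ− is the Teacher model (target) and Q_θ is the Student Model, with θ− = stopgrad(θ).
- KL(·||·) is the KL divergence, summed over the token positions i.
- Minimizing the KL term pulls the Student Model's prediction at Y toward the Teacher model's prediction at y*.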
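A minimal numeric sketch of such a loss is shown below. It assumes the loss is a sum over token positions of KL(target || student), where the target distributions are computed at the fixed point Y* and held constant (the stopgrad), and the student distributions are computed at the intermediate point Y. The function names and the toy distributions are hypothetical.

```python
import math
from typing import List

Distribution = List[float]


def kl_divergence(p: Distribution, q: Distribution, eps: float = 1e-12) -> float:
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0.0)


def consistency_loss(target_dists: List[Distribution],
                     student_dists: List[Distribution]) -> float:
    """Sum over token positions i of KL(target_i || student_i).

    target_dists: per-position distributions at the fixed point Y*
                  (treated as constants, i.e. no gradient flows through them).
    student_dists: per-position distributions at the intermediate point Y.
    """
    return sum(kl_divergence(p, q) for p, q in zip(target_dists, student_dists))


# Toy example: 2 token positions, vocabulary of size 2.
target = [[1.0, 0.0], [0.5, 0.5]]
student = [[0.5, 0.5], [0.5, 0.5]]
loss = consistency_loss(target, student)
```

The loss is zero only when the student already predicts, from the intermediate point, the same distributions the target produces at the fixed point.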
4. Model - 2
Student Model: CEED-VLA, Mixed-label AR Supervision
An AR loss L_AR on the Teacher Model's outputs keeps the student close to the Teacher Model's distribution.
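4. Model - 2
Student Model: CEED-VLA, Mixed-label AR Supervision
The L1 distance between the teacher's Jacobi output y* and the Ground truth (GT) is computed for each sample in the dataset D. If the distance |y* − GT| exceeds the threshold δ_max, y* is replaced by the Ground truth as the AR label.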
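The label selection rule can be sketched as follows. It assumes the comparison is made on numeric action vectors and that the rule is a simple threshold on the summed absolute difference; `select_ar_label` and the example values are hypothetical.

```python
from typing import List


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute differences between two action vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_ar_label(teacher_action: List[float], ground_truth: List[float],
                    delta_max: float) -> List[float]:
    """Use the teacher output as AR label unless it is too far from the GT."""
    if l1_distance(teacher_action, ground_truth) > delta_max:
        return list(ground_truth)   # teacher is unreliable here: fall back to GT
    return list(teacher_action)     # teacher is close enough: keep its output


gt = [0.10, 0.20, 0.30]
good_teacher = [0.11, 0.19, 0.30]   # L1 distance 0.02
bad_teacher = [0.50, 0.20, 0.30]    # L1 distance 0.40

label_a = select_ar_label(good_teacher, gt, delta_max=0.1)
label_b = select_ar_label(bad_teacher, gt, delta_max=0.1)
```

Mixing the two label sources keeps the student near the teacher's distribution while preventing it from inheriting the teacher's larger errors.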
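4. Model - 2
Student Model: CEED-VLA
The total training objective combines the Consistency Loss and the Mixed-label AR Loss, with ω as the weighting coefficient between them.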
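4. Model - 2
Inference: Early-exit Decoding
CEED-VLA decodes with Jacobi decoding rather than AR decoding. Standard Jacobi decoding stops only when the strict convergence condition Y(k) = Y(k-1) holds.
Figure 1: CEED-VLA decoding, comparing AR, Jacobi, and early-exit decoding.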
4. Model - 2
Inference: Early-exit Decoding
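4. Model - 2
Inference: Early-exit Decoding
Early-exit decoding sets an exit point σ: the Jacobi iteration stops after at most σ iterations, even if the fixed point has not been reached.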
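A minimal sketch of decoding with an exit point, using a toy predictor. The names `toy_next_token` and `early_exit_jacobi_decode`, the all-zero initial guess, and the toy vocabulary are assumptions for illustration; a real model would compute each iteration in one parallel forward pass.

```python
from typing import List, Tuple

VOCAB_SIZE = 16  # toy vocabulary size (assumption)


def toy_next_token(prompt: List[int], prefix: List[int]) -> int:
    """Hypothetical greedy predictor for the token that follows `prefix`."""
    weighted = sum((k + 1) * t for k, t in enumerate(prefix))
    return (sum(prompt) + 7 * weighted + len(prefix)) % VOCAB_SIZE


def ar_decode(prompt: List[int], n_tokens: int) -> List[int]:
    """Reference AR decoding, used only for comparison."""
    out: List[int] = []
    for _ in range(n_tokens):
        out.append(toy_next_token(prompt, out))
    return out


def early_exit_jacobi_decode(prompt: List[int], n_tokens: int,
                             exit_point: int) -> Tuple[List[int], int]:
    """Jacobi decoding that stops at convergence or after `exit_point` rounds."""
    current = [0] * n_tokens
    iterations = 0
    while iterations < exit_point:
        updated = [toy_next_token(prompt, current[:i]) for i in range(n_tokens)]
        iterations += 1
        converged = updated == current
        current = updated
        if converged:               # strict condition Y(k) == Y(k-1)
            break
    return current, iterations      # may be an unconverged iterate


reference = ar_decode([2, 5], 7)
tokens_full, iters_full = early_exit_jacobi_decode([2, 5], 7, exit_point=8)
tokens_early, iters_early = early_exit_jacobi_decode([2, 5], 7, exit_point=3)
```

With this untrained toy predictor, exiting after 3 rounds only guarantees the first 3 tokens. The point of consistency training is that most tokens are already correct after a few rounds, so a small σ costs little accuracy.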
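5. Experiments
CEED-VLA is evaluated against VLA baselines in simulation on two widely used benchmarks, CALVIN and LIBERO.
- CALVIN (ABC->D): 500 evaluation rollouts
- LIBERO: 10 tasks, 50 per task
- 2 baselines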
5. Ablation Studies
Ablations are run with OpenVLA and examine early-exit decoding.
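5. Ablation Studies
Early-exit point σ: AR decoding takes 37 iterations. The ablation compares exit points σ = 16 and σ = 8. CEED-VLA-Turbo exits at 4 iterations using Early-exit Decoding.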
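5. Ablation Studies
Mixed-label AR Supervision: the AR labels for the Student Model can come from the Teacher Model, from the Ground Truth, or from a mix of both. The ablation compares Avg. len when using Teacher Model labels against using Ground Truth labels for the AR loss.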
5. Ablation Studies
Loss Design: ablation of the loss terms, including the AR loss.
5. Ablation Studies
Data Size: ablation of the amount of Jacobi trajectory data used to train CEED-VLA, comparing 60k and 120k.
5. Real-World Experiments
The VLA runs on an AgileX PiPer robot arm. Cameras: RealSense L515 and ORBBEC Dabai.
Project page: https://irpn-eai.github.io/CEED-VLA/
5. Real-World Experiment Results
5. 4 20
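6. Conclusion
CEED-VLA accelerates VLA inference with consistency distillation and early-exit decoding. Extensive experiments support the effectiveness of CEED-VLA.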
6. Conclusion
6. VLA GPU 使 使 广
Thank you!