Academic

Publications

Unified Embodied VLM Reasoning with Robotic Action via Autoregressive Discretized Pre-training

[PDF(coming soon)] [Code(coming soon)]

InternVLA-M1: A Spatially Grounded Foundation Model for Generalist Robot Policy
InternVLA-M1 Contributors
[Code] [Project]

BibTeX

@misc{internvla2024,
  title  = {InternVLA-M1: Latent Spatial Grounding for Instruction-Following Robotic Manipulation},
  author = {InternVLA-M1 Contributors},
  year   = {2025},
  howpublished = {arXiv},
}

Beyond ‘Templates’: Category-Agnostic Object Pose, Size, and Shape Estimation from a Single View, Under Review.
Jinyu Zhang, Haitao Lin, Jiashu Hou, Xiangyang Xue, Yanwei Fu.
[PDF] [Code(coming soon)]

LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion, IROS 2024 (Oral Pitch).
Jinyu Zhang, Yongchong Gu, Jianxiong Gao, Haitao Lin, Qiang Sun, Xinwei Sun, Xiangyang Xue, Yanwei Fu.
[PDF] [Code] [Project]

BibTeX

@misc{zhang2024lacnetlinearfusionattentionguidedconvolutional,
      title={LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion}, 
      author={Jinyu Zhang and Yongchong Gu and Jianxiong Gao and Haitao Lin and Qiang Sun and Xinwei Sun and Xiangyang Xue and Yanwei Fu},
      year={2024},
      eprint={2408.03238},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2408.03238}, 
}

Open-Source Contributions

InternManip — Contributor & Maintainer
A unified framework for robotic training and evaluation.
GitHub Repo
InternData-M1 dataset — Core Contributor
A large-scale embodied robotics dataset (~250k simulated demonstrations) with rich annotations, including 2D/3D boxes, trajectories, grasp points, and semantic masks.
Hugging Face
InternVLA-M1 — Core Contributor
A Spatially Grounded Foundation Model for Generalist Robot Policy.
GitHub Repo
IROS 2025 Challenge Organization — Contributor & Maintainer
Core contributor to the IROS 2025 Challenge on Dual-Arm Manipulation, which features 10 carefully designed scenarios with diverse tasks that assess multi-scale robotic capabilities.
The dataset can be accessed here, and everyone is welcome to participate! Hugging Face