Siyuan Li

PhD student@ETH Zurich

Email: siyuan.li AT vision.ee.ethz.ch

© 2026

Bio


I am a PhD student at the Computer Vision Laboratory, ETH Zurich, Switzerland, supervised by Dr. Martin Danelljan and Prof. Luc Van Gool. I have been fortunate to work with Prof. Helge Rhodin at UBC, Prof. Pascal Fua and Prof. Alexandre Alahi at EPFL, and Prof. Jingyi Yu at ShanghaiTech. I am interested in computer vision, machine learning, and their applications to autonomous driving, robotics, and augmented reality.

Before that, I obtained my B.Sc. in Computer Science from Wuhan University, Wuhan, China, and my M.Sc. in Computer Science from EPFL, Lausanne, Switzerland.

Work Experience


  • 04. 2025 - 11. 2025
  • Research Scientist Intern (SAM Team)
    FAIR, Meta, Menlo Park
  • 10. 2020 - 08. 2021
  • Research Scientist Intern
    Disney Research Zurich
  • 07. 2020 - 10. 2020
  • Research Scientist Intern
    Tencent AI Lab
  • 02. 2019 - 11. 2019
  • Research Assistant
    Computer Vision Lab, EPFL

    Publications


    • SAM 3: Segment Anything with Concepts

      Nicolas Carion*, Laura Gustafson*, ... Siyuan Li° (random author ordering), ... Piotr Dollar†, Nikhila Ravi†, Kate Saenko†, Pengchuan Zhang†, Christoph Feichtenhofer†

      arXiv 2026

      Paper

      Code

      Page

    • MVTracker: Multi-View 3D Point Tracking

      Frano Rajič, Haofei Xu, Marko Mihajlovic, Siyuan Li, Irem Demir, Emircan Gündoğdu, Lei Ke, Sergey Prokudin, Marc Pollefeys, Siyu Tang

      International Conference on Computer Vision (ICCV 2025 Oral)

      Paper

      Code

      Page

    • ProDyG: Progressive Dynamic Scene Reconstruction via Gaussian Splatting from Monocular Videos

      Shi Chen, Erik Sandström, Sandro Lombardi, Siyuan Li, Martin R. Oswald

      Conference on Neural Information Processing Systems (NeurIPS 2025)

      Paper

      Slides

    • UniDepthV2: Universal Monocular Metric Depth Estimation Made Simpler

      Luigi Piccinelli, Christos Sakaridis, Yung-Hsu Yang, Mattia Segu, Siyuan Li, Wim Abbeloos, Luc Van Gool

      IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI 2025)

      arXiv

      Code

      Page

    • 3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

      Yung-Hsu Yang, Luigi Piccinelli, Mattia Segu, Siyuan Li, Rui Huang, Yuqian Fu, Marc Pollefeys, Hermann Blum, Zuria Bauer

      International Conference on Computer Vision (ICCV 2025)

      Paper

      Code

      Page

    • UniK3D: Universal Camera Monocular 3D Estimation

      Luigi Piccinelli, Christos Sakaridis, Mattia Segu, Yung-Hsu Yang, Siyuan Li, Wim Abbeloos, Luc Van Gool

      Conference on Computer Vision and Pattern Recognition (CVPR 2025)

      Paper

      Code

      Page

    • One2Any: One-Reference 6D Pose Estimation for Any Object

      Mengya Liu, Siyuan Li, Ajad Chhatkuli, Prune Truong, Luc Van Gool, Federico Tombari

      Conference on Computer Vision and Pattern Recognition (CVPR 2025)

      Paper

      Code

      Page

    • Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking

      Mattia Segu, Luigi Piccinelli, Siyuan Li, Yung-Hsu Yang, Bernt Schiele, Luc Van Gool

      International Conference on Learning Representations (ICLR 2025 Spotlight)

      Paper

      Code

      Page

    • Matching Anything By Segmenting Anything

      Siyuan Li, Lei Ke, Martin Danelljan, Luigi Piccinelli, Mattia Segu, Luc Van Gool, Fisher Yu

      Conference on Computer Vision and Pattern Recognition (CVPR 2024 Highlight)

      arXiv

      Code

      Page

    • UniDepth: Universal Monocular Metric Depth Estimation

      Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc Van Gool, Fisher Yu

      Conference on Computer Vision and Pattern Recognition (CVPR 2024 Highlight)

      arXiv

      Code

      Page

    • SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

      Siyuan Li, Lei Ke, Yung-Hsu Yang, Luigi Piccinelli, Mattia Segu, Martin Danelljan, Luc Van Gool

      European Conference on Computer Vision (ECCV 2024)

      arXiv

    • Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Object Appearance Graphs

      Mattia Segu, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele

      European Conference on Computer Vision (ECCV 2024)

      Paper

      Code

    • Cascade-DETR: Delving into High-Quality Universal Object Detection

      Mingqiao Ye*, Lei Ke*, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

      International Conference on Computer Vision (ICCV 2023)

      arXiv

      Code

    • OVTrack: Open-Vocabulary Multiple Object Tracking

      Siyuan Li*, Tobias Fischer*, Lei Ke, Henghui Ding, Martin Danelljan, Fisher Yu

      Conference on Computer Vision and Pattern Recognition (CVPR 2023)

      arXiv

      Code

      Page

    • Tracking Every Thing in the Wild

      Siyuan Li, Martin Danelljan, Henghui Ding, Thomas E. Huang, Fisher Yu

      European Conference on Computer Vision (ECCV 2022)

      arXiv

      Code

      Page

    • Semantically-aware Discriminator: The Power of a Shared Representation for Image Synthesis

      Saeed Saadatnejad, Siyuan Li, Taylor Mordan, Alexandre Alahi

      IEEE Transactions on Intelligent Transportation Systems (T-ITS 2021)

      arXiv

    • Deformation-aware Unpaired Image Translation for Pose Estimation on Laboratory Animals

      Siyuan Li, Semih Günel, Mirela Ostrek, Pavan Ramdya, Pascal Fua, Helge Rhodin

      Conference on Computer Vision and Pattern Recognition (CVPR 2020)

      Paper

      Code