Yufan Ren

Hello! I am currently a direct Ph.D. student (2020-) at the Image and Visual Representation Lab, part of the School of Computer and Communication Sciences at EPFL, Switzerland. Under the guidance of Prof. Sabine Süsstrunk and Dr. Tong Zhang, I work on 3D vision, neural rendering, and diffusion models.

Prior to my Ph.D., I earned my bachelor's degree from Zhejiang University, Hangzhou, where I was honored to receive the Chu Kochen Award. Those four years were some of the happiest of my life.

I'm passionate about recent advances in generative models (diffusion models) and large language models (ChatGPT), and I'm always eager to discuss these topics. If you're a master's student at EPFL and share similar interests, I'm currently seeking a research assistant to work with me. Some of my available master's projects are on the lab's website here, and don't hesitate to reach out if you have any questions or want to chat!

Email  /  Google Scholar  /  LinkedIn  /  GitHub

[March, 2024]: I'm glad to share that, starting this March, I will be a Research Intern at NVIDIA in Zurich, working on 3D reconstruction!

🎉 [Oct, 2023]: The next International Conference on Computational Photography (ICCP) is happening at EPFL, Switzerland in 2024. I am excited to announce my role as website chair.

🎉 [Sept, 2023]: I am invited to give a talk on DeepFake detection at AWS LauzHack Cloud Research Day.

🎉 [April, 2023]: I have been admitted to the International Computer Vision Summer School (ICVSS). See you in Sicily 🏝!

DiffusionPCR: Diffusion Models for Robust Multi-Step Point Cloud Registration
Zhi Chen*, Yufan Ren*, Tong Zhang, Zheng Dang, Wenbing Tao, Sabine Süsstrunk, Mathieu Salzmann
ArXiv'23, arXiv, Project Page, Code (Coming soon)

Point Cloud Registration (PCR) estimates the relative rigid transformation between two point clouds. We propose formulating PCR as a denoising diffusion probabilistic process, mapping noisy transformations to the ground truth.

VolRecon: Volume Rendering of Signed Ray Distance Functions for Generalizable Multi-View Reconstruction
Yufan Ren*, Fangjinhua Wang*, Tong Zhang, Marc Pollefeys, Sabine Süsstrunk
CVPR'23, arXiv, Project Page, Code

We introduce VolRecon, a novel generalizable implicit reconstruction method based on the Signed Ray Distance Function (SRDF). To reconstruct the scene with fine details and little noise, VolRecon combines projection features aggregated from multi-view features with volume features interpolated from a coarse global feature volume.

Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion
Ruiqi Gao, Jianwen Xie, Siyuan Huang, Yufan Ren, Song-Chun Zhu, Ying Nian Wu
AAAI'22, arXiv

We propose a representational model for image pairs such as consecutive video frames that are related by local pixel displacements, in the hope that the model may shed light on motion perception in primary visual cortex (V1). The model couples the following two components: (1) the vector representations of local contents of images and (2) the matrix representations of local pixel displacements caused by the relative motions between the agent and the objects in the 3D scene.

BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks
Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, Long Quan
CVPR'20, arXiv, Dataset

We introduce BlendedMVS, a novel large-scale dataset, to provide sufficient training ground truth for learning-based MVS. To create the dataset, we apply a 3D reconstruction pipeline to recover high-quality textured meshes from images of well-selected scenes. Then, we render these mesh models to color images and depth maps. To introduce the ambient lighting information during training, the rendered color images are further blended with the input images to generate the training input.

Selected Projects and Activities

International Computer Vision Summer School 2023
Università di Catania

The school aims to provide a stimulating opportunity for young researchers and Ph.D. students. The participants will benefit from direct interaction and discussions with world leaders in Computer Vision. Participants will also have the possibility to present the results of their research, and to interact with their scientific peers, in a friendly and constructive environment.

Multimodal Fake Media Detection: AI Singapore Trusted Media Challenge
AI Singapore 2022, EPFL News

Peter Grönquist and I took part in this challenge and won the 100,000 USD prize (incl. grant). We designed machine learning models to detect three types of fakeness, i.e., fake faces (DeepFakes), manipulated audio, and mis-synchronization (lip-sync), and used engineering tricks to make detection fast.

Thanks to Jon Barron for the awesome template.