Jiachen Zhu

About Me (Jiachen Zhu / 朱家晨)

I am a fifth-year PhD candidate in Computer Science at NYU Courant, advised by Yann LeCun.
I am also a Visiting Researcher at FAIR, Meta, where I am hosted by Zhuang Liu.

I am actively seeking full-time research positions in vision models or vision-language models starting in late 2025. If you think my background aligns with your team’s goals, please reach out!

Education

Research Interests

My research focuses on self-supervised learning for images and videos, as well as pretraining vision encoders for vision-language models (VLMs). I am also interested in understanding the design of all kinds of neural network architectures.

Papers

Transformers without Normalization

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Variance-Covariance Regularization Improves Representation Learning

VoLTA: Vision-Language Transformer with Weakly-Supervised Local-Feature Alignment

Masked Siamese ConvNets

TiCo: Transformation Invariance and Covariance Contrast for Self-Supervised Visual Representation Learning

Contact

jiachen DOT zhu AT nyu DOT edu

Google Scholar

Appendix

My Favourite Illusion!

Two ideas that I find both shockingly simple and extremely clever: 1, 2