Han (Paris) Zhang 章晗

I am currently an Applied Research Engineer at NVIDIA.

I earned my Master's degree in Computer Science from Stanford University, where I was honored as a Siebel Scholar, Class of 2025.

I received my B.A. in Computer Science and Economics from UC Berkeley, where I was awarded the prestigious EECS Department Citation. During my time there, I had the privilege of conducting research in vision and language under the guidance of Prof. Trevor Darrell, Prof. Joseph E. Gonzalez, and Ph.D. candidate Lisa Dunlap.

Email  /  Google Scholar  /  LinkedIn

Research

My research interests broadly span computer vision, with a particular focus on vision-and-language multimodal learning, generative models, and the robustness and interpretability of AI systems.

Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Lisa Dunlap, Alyssa Umino, Han Zhang, Jiezhi Yang, Joseph E. Gonzalez, Trevor Darrell
NeurIPS 2023
website / arXiv / code

We introduce ALIA, a method that uses large vision and language models to automatically generate natural language descriptions of a dataset's domains and augment the training data via language-guided image editing.
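A minimal sketch of the augmentation step, not the released ALIA code (see the repo for the full pipeline): it assumes the diffusers and Pillow packages, a GPU, and hand-written domain prompts standing in for the captions-plus-LLM-summarization step that produces them automatically in ALIA.

```python
# Illustrative sketch only; model name, prompts, and file paths are example assumptions.
import torch
from PIL import Image
from diffusers import StableDiffusionInstructPix2PixPipeline

# In ALIA, domain descriptions are generated automatically by captioning the training
# images and summarizing the captions with a language model. Here they are hard-coded.
domain_prompts = [
    "put the animal in a snowy forest",
    "put the animal in a grassy field at dusk",
]

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

image = Image.open("train_example.jpg").convert("RGB")  # hypothetical training image

# Language-guided editing: each domain prompt yields an augmented copy of the image.
augmented = [
    pipe(prompt, image=image, num_inference_steps=20, image_guidance_scale=1.5).images[0]
    for prompt in domain_prompts
]
for i, aug in enumerate(augmented):
    aug.save(f"augmented_{i}.png")
```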


Using Language to Extend to Unseen Domains
Lisa Dunlap, Clara Mohri, Han Zhang, Devin Guillory, Trevor Darrell, Joseph E. Gonzalez, Aditi Raghunathan, Anna Rohrbach
ICLR 2023 (Spotlight)
website / arXiv / code / blog

We propose LADS, a method that uses language to learn a transformation of the image embeddings from the training domain to each unseen test domain, while preserving task-relevant information.
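A toy sketch of the idea, not the authors' implementation: an augmentation network shifts CLIP image embeddings along the text-space direction between the source and unseen domain descriptions, while a class-consistency term keeps the zero-shot classifier's prediction unchanged. The embedding size, network architecture, and loss weighting are assumptions.

```python
# Illustrative sketch only; dimensions, names, and loss details are example assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

d = 512  # CLIP ViT-B/32 embedding size (assumed)

# Augmentation network: maps a training-domain image embedding toward the unseen domain.
f_aug = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))

def lads_losses(img_emb, text_src, text_tgt, class_weights, labels):
    """img_emb: (B, d) CLIP image embeddings from the training domain.
    text_src / text_tgt: (d,) CLIP text embeddings of the source / unseen domain descriptions.
    class_weights: (C, d) CLIP text embeddings of the class names (zero-shot classifier).
    """
    aug = f_aug(img_emb)
    # Domain alignment: the embedding shift should follow the text-space domain direction.
    shift = F.normalize(aug - img_emb, dim=-1)
    direction = F.normalize(text_tgt - text_src, dim=-1)
    domain_loss = (1 - (shift * direction).sum(-1)).mean()
    # Class consistency: the augmented embedding should keep its original class.
    logits = F.normalize(aug, dim=-1) @ F.normalize(class_weights, dim=-1).T
    class_loss = F.cross_entropy(100 * logits, labels)
    return domain_loss + class_loss
```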


Delayed and Indirect Impacts of Link Recommendations
Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei
FAccT 2023
arXiv / code

Using a simulation-based approach with an explicit dynamic network-formation model, we find that link recommendations have surprising delayed and indirect effects on the structural properties of networks.
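A toy simulation in the spirit of the paper, not the released code: nodes add links either organically or via a friend-of-friend recommender, and a structural statistic (here, average clustering) is tracked over time. The graph size, recommendation rate, and tracked statistic are illustrative choices.

```python
# Illustrative sketch only; all parameters and the recommender are example assumptions.
import random
import networkx as nx

random.seed(0)
G = nx.erdos_renyi_graph(n=200, p=0.03, seed=0)

def recommend(G, u):
    """Friend-of-friend recommendation: suggest a non-neighbor two hops from u, if any."""
    fofs = {w for v in G.neighbors(u) for w in G.neighbors(v)} - set(G.neighbors(u)) - {u}
    return random.choice(sorted(fofs)) if fofs else None

clustering_over_time = []
for step in range(500):
    u = random.randrange(G.number_of_nodes())
    if random.random() < 0.5:   # recommended link
        v = recommend(G, u)
    else:                       # organic link to a uniformly random non-neighbor
        candidates = list(nx.non_neighbors(G, u))
        v = random.choice(candidates) if candidates else None
    if v is not None:
        G.add_edge(u, v)
    if step % 50 == 0:
        # Track a structural property over time to observe delayed effects.
        clustering_over_time.append(nx.average_clustering(G))

print(clustering_over_time)
```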


Industry
Applied Research Engineer Intern, NVIDIA
January 2025 - March 2025, June 2024 - September 2024

Developed and released Commercial VILA, a Vision-Language Model (VLM) trained exclusively on commercial datasets.

Designed autolabeling workflows and trained a 3D-VLM for spatial reasoning in warehouse environments, with a demo featured at the GTC 2025 Spatial AI session and exhibition booth.

Research Intern, Tencent AI Lab
June 2023 - February 2024

Conducted research on 3D-aware human body generation using diffusion models, ControlNet, and autoencoders.

Engineering Summer Analyst, Goldman Sachs
June 2022 - August 2022

Developed and modularized a calculator for the GS Select loan product, enabling automated daily computation of key financial metrics, including risk-weighted assets.

Software Engineer Intern, Bytedance
March 2021 - August 2021

Optimized TikTok's Android package size and cold startup time by developing custom optimization passes for ReDex, an open-source Android bytecode optimizer.


Website template from Jon Barron. Thanks for stopping by :)
Last updated: May 27, 2025