|
Research
My research interests broadly span computer vision, with a particular focus on vision-and-language multimodal learning, generative models, and the robustness and interpretability of AI systems.
|
|
CHURRO: A Large Dataset for Handwriting and Print Recognition in Historical Documents with Large Multimodal Models
Sina Semnani,
Han Zhang,
Xinyan He,
Merve Tekgurler,
Monica Lam
EMNLP 2025
website /
arXiv /
code
We introduce CHURRO, a large-scale, multimodal OCR benchmark and dataset of historical documents, revealing that even the best large multimodal models (LMMs) struggle with historical text and demonstrating that targeted fine-tuning significantly improves their performance.
|
|
Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation
Lisa Dunlap,
Alyssa Umino,
Han Zhang,
Jiezhi Yang,
Joseph E. Gonzalez,
Trevor Darrell
NeurIPS 2023
website /
arXiv /
code
We introduce ALIA, a method that uses large vision and language models to automatically generate natural language descriptions of a dataset's domains and augment the training data via language-guided image editing.
|
|
Using Language to Extend to Unseen Domains
Lisa Dunlap,
Clara Mohri,
Han Zhang,
Devin Guillory,
Trevor Darrell,
Joseph E. Gonzalez,
Aditi Raghunathan,
Anna Rohrbach
ICLR 2023 (Spotlight)
website /
arXiv /
code /
blog
We propose LADS, a method that uses language to guide a learned transformation of image embeddings from the training domain to each unseen test domain while preserving task-relevant information.
|
|
Delayed and Indirect Impacts of Link Recommendations
Han Zhang,
Shangen Lu,
Yixin Wang,
Mihaela Curmei
FAccT 2023
arXiv /
code
Using a simulation-based approach with an explicit dynamic formation model, we find that link recommendations have surprising delayed and indirect effects on the structural properties of networks.
|
|
Applied Research Scientist, NVIDIA
May 2025 - Present
Applied Research Intern, NVIDIA
January 2025 - March 2025, June 2024 - September 2024
|
|
Engineering Summer Analyst, Goldman Sachs
June 2022 - August 2022
Developed and modularized a calculator for the GS Select loan product, enabling automated daily computation of key financial metrics, including risk-weighted assets.
|
|
Software Engineer Intern, ByteDance
March 2021 - August 2021
Reduced TikTok's Android package size and cold startup time by developing custom optimization passes built on ReDex, an open-source Android bytecode optimizer.
|
Website template from Jon Barron. Thanks for stopping by :)
Last updated: May 15, 2026
|
|