Aditya Mehta

I'm an undergraduate studying Computer Science at Caltech.

I'm currently a researcher in Georgia Gkioxari's lab, where I have the privilege of working with Damiano Marsili on improving personalized recognition and visual reasoning for multimodal LLMs.

In summer 2025, I was at Encharge AI (a Series B startup making AI accelerators), where I worked with the efficient ML research team on LLM quantization techniques for Llama 3.2 and on improving math reasoning performance (GSM8K). Previously, I was at the Caltech Vision Lab, where I developed methods for adapting vision transformers for unsupervised and semi-supervised segmentation of microscopy images.

My interests broadly revolve around efficient ML (quantization) and post-training, especially for large vision-language models and reasoning tasks. I'm always happy to connect, learn more, and collaborate!

Feel free to reach out to me at amehta [at] caltech [dot] edu

Resume / LinkedIn / Google Scholar / GitHub / Twitter

News

[Feb. 2026] TWIN is accepted to CVPR 2026!
[Dec. 2025] TWIN is released!
[Jun. - Aug. 2025] Completed a research engineering internship at Encharge AI, working on Llama 3 quantization
[Dec. 2024] Began research with Georgia Gkioxari and Damiano Marsili
[Feb. 2024] Began research at the Caltech Vision Lab (with Markus Marks, Neehar Kondapaneni, and Pietro Perona)
[May 2022] Represented India at Regeneron ISEF and won the 2nd Grand Award

Publications

Same or Not? Enhancing Visual Perception in Vision-Language Models
Damiano Marsili, Aditya Mehta, Ryan Lin, Georgia Gkioxari
CVPR, 2026
project page / arXiv / code / bibtex

Website Template credits: Jon Barron