Zach Nussbaum
I’m a principal machine learning engineer at Nomic, where I lead the development of Nomic Embed, state-of-the-art text and vision embedding models.
Prior to Nomic, I worked on machine learning for drug discovery at Deep Genomics. There I reimplemented gene expression prediction from sequence (Enformer) and helped train BigRNA, a model that predicts tissue-specific RNA expression, splicing, microRNA sites, and RNA binding protein specificity from DNA sequence.
I have also participated in open-source/open-community research groups with ML Collective and OpenBioML.
In a past life, I was a Division 1 baseball player at Davidson College (yes, Steph Curry’s Davidson).
news
Jan 25, 2025 | CoRNStack: High-Quality Contrastive Data for Better Code Ranking accepted to ICLR 2025! |
---|---|
Jun 01, 2024 | Nomic Embed Vision technical report, weights, and blog released. |
Mar 04, 2024 | DNA Diffusion accepted to ICLR MLGenX and selected for Outstanding Paper Award |
Feb 14, 2024 | Nomic Embed Text v1.5 blog and weights released. |
Feb 01, 2024 | Nomic Embed Text v1 technical report, weights, and training code released. |
Sep 26, 2023 | BigRNA Preprint and blog post released |
Jul 21, 2021 | A Tale of Two Long Tails presented at the UDL Workshop at ICML |
selected publications
2025
- CoRNStack: High-Quality Contrastive Data for Better Code Ranking. In International Conference on Learning Representations (ICLR), 2025
2024
- Nomic embed: Training a reproducible long context text embedder. arXiv preprint arXiv:2402.01613, 2024
- DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. In ICLR 2024 Workshop on Machine Learning for Genomics Explorations, 2024
2023
- GPT4All: Training an assistant-style chatbot with large scale data distillation from GPT-3.5-Turbo. GitHub, 2023
- An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. bioRxiv, 2023
2021
- A tale of two long tails. arXiv preprint arXiv:2107.13098, 2021