Zach Nussbaum
I’m a principal machine learning engineer at Nomic, where I lead the development of Nomic Embed, state-of-the-art text and vision embedding models.
Prior to Nomic, I worked on machine learning for drug discovery at Deep Genomics. There I worked on reimplementations of Enformer, a model that predicts gene expression from sequence, and helped train BigRNA, a model that predicts tissue-specific RNA expression, splicing, microRNA sites, and RNA binding protein specificity from DNA sequence.
I have also participated in open-source/open-community research groups with ML Collective and OpenBioML.
In a past life, I was a Division 1 baseball player at Davidson College (yes, Steph Curry’s Davidson).
news
| Jan 25, 2025 | CoRNStack: High-Quality Contrastive Data for Better Code Ranking accepted to ICLR 2025! |
|---|---|
| Jun 01, 2024 | Nomic Embed Vision technical report, weights, and blog released. |
| Mar 04, 2024 | DNA Diffusion accepted to ICLR MLGenX and selected for Outstanding Paper Award |
| Feb 14, 2024 | Nomic Embed Text v1.5 blog and weights released. |
| Feb 01, 2024 | Nomic Embed Text v1 technical report, weights, and training code released. |
| Sep 26, 2023 | BigRNA Preprint and blog post released |
| Jul 21, 2021 | A Tale of Two Long Tails presented at the UDL Workshop at ICML |
selected publications
2025
- CoRNStack: High-Quality Contrastive Data for Better Code Ranking. In International Conference on Learning Representations (ICLR), 2025
2024
- Nomic embed: Training a reproducible long context text embedder. arXiv preprint arXiv:2402.01613, 2024
- DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. In ICLR 2024 Workshop on Machine Learning for Genomics Explorations, 2024
2023
- Gpt4all: Training an assistant-style chatbot with large scale data distillation from gpt-3.5-turbo. GitHub, 2023
- An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. bioRxiv, 2023
2021
- A tale of two long tails. arXiv preprint arXiv:2107.13098, 2021