publications
publications by category in reverse chronological order. generated by jekyll-scholar.
An up-to-date list is available on Google Scholar
2025
- CoRNStack: High-Quality Contrastive Data for Better Code Ranking. In International Conference on Learning Representations (ICLR), 2025
2024
- Nomic embed: Training a reproducible long context text embedder. arXiv preprint arXiv:2402.01613, 2024
- DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. bioRxiv, 2024
- DNA-Diffusion: Leveraging Generative Models for Controlling Chromatin Accessibility and Gene Expression via Synthetic Regulatory Elements. In ICLR 2024 Workshop on Machine Learning for Genomics Explorations, 2024
- Nomic Embed Vision: Expanding the Latent Space. arXiv preprint arXiv:2406.18587, 2024
2023
- Gpt4all: Training an assistant-style chatbot with large scale data distillation from gpt-3.5-turbo. GitHub, 2023
- An RNA foundation model enables discovery of disease mechanisms and candidate therapeutics. bioRxiv, 2023
- GPT4All: An ecosystem of open source compressed language models. arXiv preprint arXiv:2311.04931, 2023
2021
- Machine learning methods for extracting structure functions from experimental data. In APS April Meeting Abstracts, 2021
- A tale of two long tails. arXiv preprint arXiv:2107.13098, 2021
2020
- Using Adversarial Networks to Generate Realistic Structure Function Surfaces. In APS Division of Nuclear Physics Meeting Abstracts, 2020