ICML 2024 Workshop on Accessible and Efficient Foundation Models for Biological Discovery
There is a growing gap between machine learning (ML) research on biology-inspired problems and the actual broad-based use of ML in the lab or the clinic. This gap is especially pressing in the context of foundation models and other large ML models. Accessibility and efficiency concerns limit the adoption of these models by biologists and clinicians. Large ML models may require extensive GPU clusters to train, while most biological labs only have access to much more modest computational resources. The usability of these models for non-expert users is also a concern, as is the need to iteratively adapt these models based on lab discoveries.
Call for Papers
This workshop seeks to bring ML and biomedical researchers together to identify interdisciplinary approaches to design and apply large, complex ML models for biomedical discovery. We invite researchers from academia and industry to submit original papers to bridge the accessibility and efficiency gap between ML research and wet lab use. All accepted papers will be invited to present posters at the workshop, and a few will be invited to give individual spotlight presentations.
We are seeking original submissions in topics including, but not limited to:
- Parameter-, memory-, and compute-efficient foundation models for biological data, including model compression and quantization techniques
- Algorithms for training efficient generative models in biology
- Efficient fine-tuning and adaptation of biological foundation models
- Accessible cloud/web-based methods for foundational biological discovery
- Knowledge distillation and transfer learning across biological contexts
- Lab in the loop: iterative approaches to refine ML models based on initial experimental results
- Hypothesis-driven machine learning in biology and uncertainty modeling in biological foundation models
Submissions
Submissions must present original research that has not been previously published. Each submitted manuscript should consist of a main body of up to four pages, followed by an unlimited number of pages for references and appendices, all in a single file. All submissions must be anonymous and must not include any information that violates the double-blind review process, such as citing the authors' prior work or sharing links in a way that could reveal the authors' identities to potential reviewers. Submissions that do not conform to these instructions may be desk-rejected at the Program Committee's discretion to ensure a fair review process for all authors. After submission and during the review period, authors may post their papers as technical reports on bioRxiv, arXiv, or other public forums. For details concerning paper formatting, please see the LaTeX style files on Overleaf.
Submissions should be made through OpenReview.
Important Dates
- Submission Opens: April 5, 2024
- Submission Deadline: May 28, 2024 (extended from May 21, 2024; submissions are now closed)
- Acceptance Notification: June 17, 2024 (Anywhere on Earth)
- Camera Ready Papers Due: July 26, 2024
- Workshop Date: July 27, 2024
AccMLBio will be an in-person workshop at ICML 2024 in Vienna, Austria on July 27, 2024.
Student Travel Grants
We will provide a limited number of grants to student authors of accepted papers to cover travel expenses to the conference. Travel grant applications, along with letters of recommendation, are due by June 17, 2024 (Anywhere on Earth). Applications for student travel grants are now closed.
Journal Partner
We will be partnering with Cell Reports Methods for expedited consideration of accepted papers. Cell Reports Methods is a fully Open Access journal from Cell Press focused on methodological advances and insights for a broad audience. Consideration at the journal will be at the authors' discretion and will adhere to the journal's normal editorial standards and publication agreements.
Schedule
All times are listed in CEST.
9:00 | Opening Remarks
9:10 | Invited Speaker - Burkhard Rost: Artificial Intelligence Deciphers the Code of Life Written in Proteins
9:40 | Spotlight Session
- Cramming Protein Language Model Training in 24 GPU Hours (Nathan C. Frey, Taylor Joren, Aya Abdelsalam Ismail, Allen Goodman, Richard Bonneau, Kyunghyun Cho, Vladimir Gligorijevic)
- Likelihood-based fine-tuning of protein language models for few-shot fitness prediction and design (Alex Hawkins-Hooker, Jakub Kmec, Oliver Bent, Paul Duckworth)
- ProtMamba: a homology-aware but alignment-free protein state space model (Damiano Sgarbossa, Cyril Malbranke, Anne-Florence Bitbol)
- Training Compute-Optimal Protein Language Models (Xingyi Cheng, Bo Chen, Pan Li, Jing Gong, Jie Tang, Le Song)
10:30 | Coffee Break
10:40 | Invited Speaker - David Page: Perspectives on a Possible Foundation Model for Health
11:10 | Invited Speaker - Bryan Bryson: Learning the Rules of Pathogen-Derived Antigen Presentation on MHC-I and MHC-II
11:40 | Spotlight Session
- scTree: Discovering Cellular Hierarchies in the Presence of Batch Effects in scRNA-seq Data (Moritz Vandenhirtz, Florian Barkmann, Laura Manduchi, Julia E Vogt, Valentina Boeva)
- MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning (Kerstin Klaser, Blazej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew W Fitzgibbon)
- Simple and Effective Masked Diffusion Language Models (Subham Sekhar Sahoo, Marianne Arriola, Aaron Gokaslan, Edgar Mariano Marroquin, Alexander M Rush, Yair Schiff, Justin T Chiu, Volodymyr Kuleshov)
- One-Versus-Others Attention: Scalable Multimodal Integration for Biomedical Data (Michal Golovanevsky, Eva Schiller, Akira A Nair, Ritambhara Singh, Carsten Eickhoff)
12:30 | Lunch
14:00 | Invited Speaker - Lenore Cowen: Learning Protein Function and Organization in Non-Model Organisms with PHILHARMONIC
14:30 | Panel Discussion
15:30 | Coffee Break
16:00 | Poster Session
16:50 | Closing Remarks