ICML 2024 Workshop on Accessible and Efficient Foundation Models for Biological Discovery
There is a growing gap between machine learning (ML) research on biology-inspired problems and the actual broad-based use of ML in the lab or the clinic. This gap is especially pressing in the context of foundation models and other large ML models. Accessibility and efficiency concerns limit the adoption of these models by biologists and clinicians. Large ML models may require extensive GPU clusters to train, while most biological labs only have access to much more modest computational resources. The usability of these models for non-expert users is also a concern, as is the need to iteratively adapt these models based on lab discoveries.
Call for Papers
This workshop seeks to bring ML and biomedical researchers together to identify interdisciplinary approaches to design and apply large, complex ML models for biomedical discovery. We invite researchers from academia and industry to submit original papers to bridge the accessibility and efficiency gap between ML research and wet lab use. All accepted papers will be invited to present posters at the workshop, and a few will be invited to give individual spotlight presentations.
We are seeking original submissions in topics including, but not limited to:
- Parameter-, memory-, and compute-efficient foundation models for biological data, including model compression and quantization techniques
- Algorithms for training efficient generative models in biology
- Efficient fine-tuning and adaptation of biological foundation models
- Accessible cloud/web-based methods for foundational biological discovery
- Knowledge distillation and transfer learning across biological contexts
- Lab in the loop: iterative approaches to refine ML models based on initial experimental results
- Hypothesis-driven machine learning in biology and uncertainty modeling in biological foundation models
Submissions
Submissions must present original research that has not been previously published. Each submitted manuscript should consist of a main body of up to four pages, followed by an unlimited number of pages for references and appendices, all in a single file. All submissions must be anonymous and must not include any information that violates the double-blind review process, such as citing the authors' prior work or sharing links in a way that could reveal the authors' identities to potential reviewers. Submissions that do not conform to these instructions may be desk-rejected at the Program Committee's discretion to ensure a fair review process for all authors. After submission and during the review period, authors may post their papers as technical reports on bioRxiv, arXiv, or other public forums. For details concerning paper formatting, please see the LaTeX style files on Overleaf.
Submissions should be made through OpenReview.
Important Dates
- Submission Opens: April 5, 2024
- Submission Deadline: May 28, 2024 (extended from May 21, 2024; submissions are now closed)
- Acceptance Notification: June 17, 2024 (Anywhere on Earth)
- Camera Ready Papers Due: July 26, 2024
- Workshop Date: July 27, 2024
AccMLBio will be an in-person workshop at ICML 2024 in Vienna, Austria on July 27, 2024.
Student Travel Grants
We will provide a limited number of grants to student authors of accepted papers to cover travel expenses to the conference. Travel grant applications, along with letters of recommendation, are due by June 17, 2024 (Anywhere on Earth). Applications for student travel grants are now closed.
Journal Partner
We will be partnering with Cell Reports Methods for expedited consideration of accepted papers. Cell Reports Methods is a fully Open Access journal from Cell Press focused on methodological advances and insights for a broad audience. Consideration at the journal will be at the authors' discretion and will adhere to the journal's normal editorial standards and publication agreements.
Schedule
All times are listed in CEST.
9:00 | Opening Remarks
9:10 | Invited Speaker - Burkhard Rost: Artificial Intelligence Deciphers the Code of Life Written in Proteins
9:40 | Spotlight Session
- Cramming Protein Language Model Training in 24 GPU Hours (Nathan C. Frey, Taylor Joren, Aya Abdelsalam Ismail, Allen Goodman, Richard Bonneau, Kyunghyun Cho, Vladimir Gligorijevic)
- Likelihood-based fine-tuning of protein language models for few-shot fitness prediction and design (Alex Hawkins-Hooker, Jakub Kmec, Oliver Bent, Paul Duckworth)
- ProtMamba: a homology-aware but alignment-free protein state space model (Damiano Sgarbossa, Cyril Malbranke, Anne-Florence Bitbol)
- Training Compute-Optimal Protein Language Models (Xingyi Cheng, Bo Chen, Pan Li, Jing Gong, Jie Tang, Le Song)
10:30 | Coffee Break
10:40 | Invited Speaker - David Page: Perspectives on a Possible Foundation Model for Health
11:10 | Invited Speaker - Bryan Bryson: Learning the Rules of Pathogen-Derived Antigen Presentation on MHC-I and MHC-II
11:40 | Spotlight Session
- scTree: Discovering Cellular Hierarchies in the Presence of Batch Effects in scRNA-seq Data (Moritz Vandenhirtz, Florian Barkmann, Laura Manduchi, Julia E Vogt, Valentina Boeva)
- MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning (Kerstin Klaser, Blazej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew W Fitzgibbon)
- Simple and Effective Masked Diffusion Language Models (Subham Sekhar Sahoo, Marianne Arriola, Aaron Gokaslan, Edgar Mariano Marroquin, Alexander M Rush, Yair Schiff, Justin T Chiu, Volodymyr Kuleshov)
- One-Versus-Others Attention: Scalable Multimodal Integration for Biomedical Data (Michal Golovanevsky, Eva Schiller, Akira A Nair, Ritambhara Singh, Carsten Eickhoff)
12:30 | Lunch
14:00 | Invited Speaker - Lenore Cowen: Learning Protein Function and Organization in Non-Model Organisms with PHILHARMONIC
14:30 | Panel Discussion
15:30 | Coffee Break
16:00 | Poster Session
16:50 | Closing Remarks