ICML 2024 Workshop on Accessible and Efficient Foundation Models for Biological Discovery 

There is a growing gap between machine learning (ML) research on biology-inspired problems and the actual broad-based use of ML in the lab or the clinic. This gap is especially pressing in the context of foundation models and other large ML models. Accessibility and efficiency concerns limit the adoption of these models by biologists and clinicians. Large ML models may require extensive GPU clusters to train, while most biological labs only have access to much more modest computational resources. The usability of these models for non-expert users is also a concern, as is the need to iteratively adapt these models based on lab discoveries.


Call for Papers

This workshop seeks to bring ML and biomedical researchers together to identify interdisciplinary approaches for designing and applying large, complex ML models for biomedical discovery. We invite researchers from academia and industry to submit original papers that help bridge the accessibility and efficiency gap between ML research and wet lab use. All accepted papers will be invited to present posters at the workshop, and a few will be invited to give individual spotlight presentations.

We are seeking original submissions on topics including, but not limited to:

  • Parameter-, memory-, and compute-efficient foundation models for biological data, including 
    model compression and quantization techniques
  • Algorithms for training efficient generative models in biology
  • Efficient fine-tuning and adaptation of biological foundation models
  • Accessible cloud/web-based methods for foundational biological discovery
  • Knowledge distillation and transfer learning across biological contexts
  • Lab in the loop: iterative approaches to refine ML models based on initial experimental results
  • Hypothesis-driven machine learning in biology and uncertainty modeling in biological foundation models

Submissions

Submissions must present original research that has not been previously published. Each submitted manuscript should consist of a main body of up to four pages, followed by an unlimited number of pages for references and appendices, all in a single file. All submissions must be anonymous and should not include any information that violates the double-blind review process, including citing authors' prior work or sharing links in a way that can reveal the identities of authors to potential reviewers. Submissions that do not conform to these instructions may be desk-rejected at the Program Committee's discretion to ensure a fair review process for all potential authors. After submission and during the review period, authors are allowed to post their papers as technical reports on bioRxiv, arXiv, or other public forums. For details concerning the format of the papers, please see the LaTeX style files on Overleaf.

Submissions should be made through OpenReview.

Important Dates

  • Submission Opens: April 5, 2024
  • Submission Deadline: May 28, 2024 (originally May 21, 2024). Submissions are now closed.
  • Acceptance Notification: June 17, 2024 (Anywhere on Earth)
  • Camera Ready Papers Due: July 26, 2024
  • Workshop Date: July 27, 2024

AccMLBio will be an in-person workshop at ICML 2024 in Vienna, Austria on July 27, 2024.

Student Travel Grants

We will provide a limited number of grants to student authors of accepted papers to cover travel expenses to the conference. Travel grant applications, along with letters of recommendation, are due by June 17, 2024 (Anywhere on Earth). Applications for student travel grants are now closed.

Journal Partner

We will be partnering with Cell Reports Methods for expedited consideration of accepted papers. Cell Reports Methods is a fully open-access journal from Cell Press focused on methodological advances and insights for a broad audience. Consideration at the journal will be at the authors' discretion and will adhere to the journal's normal editorial standards and publication agreements.


Invited Talks

Speakers are listed in alphabetical order.

Bryan Bryson

MIT

Lenore Cowen

Tufts University

David Page

Duke University

Burkhard Rost

Technical University of Munich


Panelists

Panelists are listed in alphabetical order.

Pranam Chatterjee

Duke University

Irene Chen

University of California, Berkeley

Martin Steinegger

Seoul National University

Shweta Yadav

University of Illinois-Chicago


Schedule

All times are listed in CEST.

9:00 Opening Remarks
9:10 Invited Speaker - Burkhard Rost
Artificial Intelligence Deciphers the Code of Life Written in Proteins
9:40 Spotlight Session
Cramming Protein Language Model Training in 24 GPU Hours 
Nathan C. Frey, Taylor Joren, Aya Abdelsalam Ismail, Allen Goodman, Richard Bonneau, Kyunghyun Cho, Vladimir Gligorijevic
Likelihood-based fine-tuning of protein language models for few-shot fitness prediction and design
Alex Hawkins-Hooker, Jakub Kmec, Oliver Bent, Paul Duckworth
ProtMamba: a homology-aware but alignment-free protein state space model
Damiano Sgarbossa, Cyril Malbranke, Anne-Florence Bitbol
Training Compute-Optimal Protein Language Models
Xingyi Cheng, Bo Chen, Pan Li, Jing Gong, Jie Tang, Le Song
10:30 Coffee Break
10:40 Invited Speaker - David Page
Perspectives on a Possible Foundation Model for Health
11:10 Invited Speaker - Bryan Bryson
Learning the Rules of Pathogen-Derived Antigen Presentation on MHC-I and MHC-II
11:40 Spotlight Session
scTree: Discovering Cellular Hierarchies in the Presence of Batch Effects in scRNA-seq Data
Moritz Vandenhirtz, Florian Barkmann, Laura Manduchi, Julia E Vogt, Valentina Boeva
MiniMol: A Parameter-Efficient Foundation Model for Molecular Learning
Kerstin Klaser, Blazej Banaszewski, Samuel Maddrell-Mander, Callum McLean, Luis Müller, Ali Parviz, Shenyang Huang, Andrew W Fitzgibbon
Simple and Effective Masked Diffusion Language Models
Subham Sekhar Sahoo, Marianne Arriola, Aaron Gokaslan, Edgar Mariano Marroquin, Alexander M Rush, Yair Schiff, Justin T Chiu, Volodymyr Kuleshov
One-Versus-Others Attention: Scalable Multimodal Integration for Biomedical Data
Michal Golovanevsky, Eva Schiller, Akira A Nair, Ritambhara Singh, Carsten Eickhoff
12:30 Lunch
14:00 Invited Speaker - Lenore Cowen
Learning Protein Function and Organization in Non-Model Organisms with PHILHARMONIC
14:30 Panel Discussion
15:30 Coffee Break
16:00 Poster Session
16:50 Closing Remarks

Accepted Papers


Organizers

Organizers are listed in alphabetical order.

Kanchan Jha

Duke University

Quincey Justman

Harvard Medical School

Meghana Kshirsagar

Microsoft AI for Good

Navid NaderiAlizadeh

Duke University

Rohit Singh

Duke University

Samuel Sledzieski

MIT


Sponsors