Can Large Language Models Classify and Generate Antimicrobial Resistance Genes?

Hyunwoo Yoo; Haebin Shin; Gail Rosen

doi:10.18653/v1/2025.bionlp-1.21

Conference proceeding

Proceedings of the 24th Workshop on Biomedical Language Processing, pp 240-248

01 Jan 2025

DOI: https://doi.org/10.18653/v1/2025.bionlp-1.21

Files and links (1)

url

https://doi.org/10.18653/v1/2025.bionlp-1.21

Published, Version of Record (VoR) Open

Subjects: Computer Science, Artificial Intelligence; Engineering, Biomedical; Language & Linguistics; Linguistics; Science & Technology; Computer Science; Engineering; Social Sciences; Technology

Abstract

This study explores the application of generative Large Language Models (LLMs) in DNA sequence analysis, highlighting their advantages over encoder-based models like DNABERT2 and Nucleotide Transformer. While encoder models excel in classification, they struggle to integrate external textual information. In contrast, generative LLMs can incorporate domain knowledge, such as BLASTn annotations, to improve classification accuracy even without fine-tuning. We evaluate this capability on antimicrobial resistance (AMR) gene classification, comparing generative LLMs with encoder-based baselines. Results show that LLMs significantly enhance classification when supplemented with textual information. Additionally, we demonstrate their potential in DNA sequence generation, further expanding their applicability. Our findings suggest that LLMs offer a novel paradigm for integrating biological sequences with external knowledge, bridging gaps in traditional classification methods.
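
As a rough illustration of the idea described in the abstract, the sketch below builds a zero-shot prompt that places a DNA sequence next to a BLASTn annotation string and then maps a free-text model reply to a label. It is not the authors' code: the function names `build_amr_prompt` and `parse_label`, the prompt wording, the binary AMR / non-AMR label set, and the example inputs are all assumptions made for illustration, and no model is actually called.

```python
from typing import Optional, Sequence


def build_amr_prompt(sequence: str, blast_annotation: Optional[str] = None) -> str:
    """Build a zero-shot classification prompt (hypothetical wording)."""
    lines = [
        "Decide whether the following DNA sequence is an antimicrobial "
        "resistance (AMR) gene.",
        f"Sequence: {sequence.strip().upper()}",
    ]
    if blast_annotation:
        # External textual knowledge, e.g. the description of a BLASTn hit.
        lines.append(f"BLASTn annotation: {blast_annotation.strip()}")
    lines.append("Answer with exactly one word: AMR or non-AMR.")
    return "\n".join(lines)


def parse_label(
    response: str, labels: Sequence[str] = ("non-AMR", "AMR")
) -> Optional[str]:
    """Map free-text model output to a label, or None if no label is found.

    Longer labels are checked first so "non-AMR" is not mistaken for "AMR".
    """
    text = response.lower()
    for label in sorted(labels, key=len, reverse=True):
        if label.lower() in text:
            return label
    return None


if __name__ == "__main__":
    # Placeholder inputs for illustration only; not data from the paper.
    prompt = build_amr_prompt(
        "atggctagctagga", "hypothetical hit: beta-lactamase family protein"
    )
    print(prompt)
    # A real pipeline would send `prompt` to a generative LLM here.
    print(parse_label("The sequence is AMR."))
```

Calling `build_amr_prompt` without the annotation argument gives a sequence-only prompt, which makes it easy to compare the two settings the abstract contrasts (with and without external textual information).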

Details

Title: Can Large Language Models Classify and Generate Antimicrobial Resistance Genes?
Creators: Hyunwoo Yoo (Drexel University); Haebin Shin (KAIST AI, Seoul, South Korea); Gail Rosen (Drexel University)
Contributors: D Demner-Fushman (Editor); S Ananiadou (Editor); M Miwa (Editor); J Tsujii (Editor)
Publication Details: Proceedings of the 24th Workshop on Biomedical Language Processing, pp 240-248
Conference: Biomedical Natural Language Processing Workshop (BioNLP), 24 (Vienna, Austria, 01 Aug 2025)
Publisher: Association for Computational Linguistics
Number of pages: 9
Grant note: 2107108 / National Science Foundation (NSF); Instituto Politecnico Nacional - Mexico
Resource Type: Conference proceeding
Language: English
Academic Unit: Electrical and Computer Engineering
Web of Science ID: WOS:001616252100021
Other Identifier: 991022152238004721

InCites Highlights

Data related to this publication, from InCites Benchmarking & Analytics tool:

Collaboration types: International collaboration
Web of Science research areas: Computer Science, Artificial Intelligence; Engineering, Biomedical; Language & Linguistics
