AI Genomics Research: Transformer Models for DNA and Biosecurity Implications
Transformer-based AI models are increasingly used in genomics research to analyze and design DNA sequences, in some cases at the scale of whole genomes. This curated collection covers key papers, preprints, and the biosecurity debates they have sparked.
Papers & Preprints
megaDNA
Nature Communications, 2024
A GPT-style Transformer DNA model with long-context capabilities for understanding and generating genomic sequences. It is the model behind the synthetic bacteriophage genomes evaluated in the next entry, and it processes and generates DNA at the scale of entire phage genomes.
Evaluation of Transformer-Generated Bacteriophage Genomes
NAR Genomics and Bioinformatics, 2024
Analyzes whether synthetic phage genomes generated by megaDNA appear "natural" according to composition metrics and whether they can be detected as AI-generated.
Generative Design of Novel Bacteriophages with Genome Language Models
bioRxiv, 2025 (Preprint)
Reports generative design of phage genomes, with experimental validation showing that a subset of the designs are viable. The focus is on phage-bacteria interactions, not human pathogens.
Protein Set Transformer
Nature Communications, 2025
A Transformer architecture for representing and classifying viral genomes as sets of proteins. Primarily focused on analysis rather than generation of new sequences.
AAV Capsid Design with Protein Language Models
Thesis/Dissertation, 2025
Uses Transformer-based Protein Language Models (PLMs) to propose AAV variants. This is a common application in gene therapy and viral vector engineering, not one designed for pathogen creation.
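The "composition metrics" mentioned in the bacteriophage genome evaluation entry above can be illustrated with a small sketch. The source does not say which metrics that paper uses, so GC content and k-mer frequencies are assumptions chosen here as common examples. The function names and toy sequences are hypothetical and do not come from any of the listed papers.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def gc_content(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in VALID_BASES]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)


def kmer_frequencies(seq: str, k: int = 2) -> dict[str, float]:
    """Relative frequency of each overlapping k-mer, skipping ambiguous bases."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    kmers = [m for m in kmers if set(m) <= VALID_BASES]
    if not kmers:
        return {}
    counts = Counter(kmers)
    return {m: c / len(kmers) for m, c in counts.items()}


def l1_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Sum of absolute differences between two frequency profiles."""
    return sum(abs(p.get(m, 0.0) - q.get(m, 0.0)) for m in set(p) | set(q))


if __name__ == "__main__":
    # Toy sequences for illustration only; not real or generated genomes.
    reference = "ATGCGCATTAGC"
    candidate = "ATATATATGCGC"
    print("GC content:", gc_content(reference), gc_content(candidate))
    print("2-mer L1 distance:",
          l1_distance(kmer_frequencies(reference), kmer_frequencies(candidate)))
```

A larger distance between a candidate's profile and the profile of known natural genomes is one simple signal that a sequence is compositionally unusual. A real analysis would use far longer sequences and a reference set of many genomes.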
Biosecurity & Dual-Use Debates
Dual-Use Concerns
These genomic AI models raise significant biosecurity questions. While designed for beneficial applications like phage therapy and gene therapy, they could theoretically be misused.
DNA Synthesis Screening Gaps
Nature has covered cases where "paraphrased" sequences can bypass DNA synthesis screening filters, prompting industry and research responses to strengthen safeguards.
Emerging Risk Reviews
Perspective articles analyze emerging GenAI risks in biosciences, including threat vectors and the need for enhanced safeguards in an era of increasingly capable genomic models.
Virus Design Discussions
Articles discussing how genomic models can design viruses (especially phages) and why this capability is raising biosecurity concerns across the scientific community.
Want a Curated List?
We can provide filtered lists by category:
- Genome LMs: Language models for DNA/RNA sequences
- Phage Design: Bacteriophage genome generation
- Capsid/Vectors: AAV and viral vector engineering
- Biosecurity: Dual-use risks and safeguards
Specify if you prefer peer-reviewed papers only or if preprints (bioRxiv) are acceptable.
This article curates publicly available research. GenAI Hub does not endorse or provide guidance on dual-use applications.