AI Genomics Research: Transformer Models for DNA and Biosecurity Implications

Transformer-based AI models are increasingly used in genomics research to analyze and design DNA sequences, in some cases at the scale of whole genomes. This curated collection covers key papers and preprints, along with the biosecurity debates they have sparked.

📄 Papers & Preprints

megaDNA

Nature Communications, 2024

A GPT-style Transformer DNA model with a long context window for analyzing and generating genomic sequences. It works at the scale of whole bacteriophage genomes rather than short sequence fragments, and it is the generator evaluated in the follow-up study listed next.

Evaluation of Transformer-Generated Bacteriophage Genomes

NAR Genomics and Bioinformatics, 2024

Analyzes whether synthetic phage genomes generated by megaDNA appear "natural" according to composition metrics and whether they can be detected as AI-generated.
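
To make "composition metrics" concrete, here is a minimal sketch of two common ones: GC content and a k-mer frequency profile, plus a distance between two profiles. The function names and the choice of Euclidean distance are illustrative assumptions, not the metrics or code used in the paper.

```python
from collections import Counter
from math import sqrt


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / counted


def kmer_frequencies(seq: str, k: int = 2) -> dict[str, float]:
    """Relative frequency of each overlapping k-mer made only of A/C/G/T."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def profile_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Euclidean distance between two k-mer profiles (hypothetical choice)."""
    keys = set(a) | set(b)
    return sqrt(sum((a.get(key, 0.0) - b.get(key, 0.0)) ** 2 for key in keys))
```

In a comparison of this kind, one would compute these values for a reference set of natural genomes and for the generated ones, then check whether the generated values fall inside the natural range.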

Generative Design of Novel Bacteriophages with Genome Language Models

bioRxiv, 2025 (Preprint)

Reports the generative design of phage genomes, with experimental validation showing that a subset of the designs are viable. The focus is on phage-bacteria interactions, not human pathogens.

Protein Set Transformer

Nature Communications, 2025

A Transformer architecture for representing and classifying viral genomes as sets of proteins. Primarily focused on analysis rather than generation of new sequences.
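
The defining property of a set representation is that the result does not depend on the order of the proteins. The sketch below illustrates only that property, using plain averaging of per-protein vectors as the simplest order-invariant operation. The function name and the toy vectors are hypothetical, and averaging is a stand-in here, not the published architecture.

```python
def pool_protein_set(embeddings: list[list[float]]) -> list[float]:
    """Average per-protein vectors into one genome-level vector.

    Averaging is order-invariant: shuffling the proteins gives the same result.
    """
    if not embeddings:
        raise ValueError("need at least one protein embedding")
    dim = len(embeddings[0])
    if any(len(vec) != dim for vec in embeddings):
        raise ValueError("all embeddings must have the same length")
    return [sum(vec[i] for vec in embeddings) / len(embeddings) for i in range(dim)]


# Toy two-dimensional vectors standing in for real protein embeddings.
genome = [[1.0, 2.0], [3.0, 4.0]]
assert pool_protein_set(genome) == pool_protein_set(list(reversed(genome)))
```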

AAV Capsid Design with Protein Language Models

Thesis/Dissertation, 2025

Uses Transformer-based protein language models (PLMs) to propose adeno-associated virus (AAV) capsid variants. This is a common application in gene therapy and viral vector engineering and is not designed for pathogen creation.

🔒 Biosecurity & Dual-Use Debates

Dual-Use Concerns

These genomic AI models raise significant biosecurity questions. While designed for beneficial applications like phage therapy and gene therapy, they could theoretically be misused.

🧪 DNA Synthesis Screening Gaps

Nature has covered cases where "paraphrased" sequences can bypass DNA synthesis screening filters, prompting industry and research responses to strengthen safeguards.

📊 Emerging Risk Reviews

Perspective articles analyze emerging GenAI risks in biosciences, including threat vectors and the need for enhanced safeguards in an era of increasingly capable genomic models.

🦠 Virus Design Discussions

Articles discussing how genomic models can be used to design viruses (especially phages) and why this capability is raising biosecurity concerns across the scientific community.

📚 Want a Curated List?

We can provide filtered lists by category:

Genome LMs — Language models for DNA/RNA sequences
Phage Design — Bacteriophage genome generation
Capsid/Vectors — AAV and viral vector engineering
Biosecurity — Dual-use risks and safeguards

Specify if you prefer peer-reviewed papers only or if preprints (bioRxiv) are acceptable.

This article curates publicly available research. GenAI Hub does not endorse or provide guidance on dual-use applications.