Welcome to the Prokaryotic Virus Remote Homologous Groups database (a.k.a. PHROGs)

PHROG: families of prokaryotic virus proteins clustered using remote homology.
Terzian P*, Olo Ndela E*, Galiez C, Lossouarn J, Pérez Bucio RE, Mom R, Toussaint A, Petit MA, Enault F.
NAR Genomics and Bioinformatics, Volume 3, Issue 3, September 2021, lqab067, https://doi.org/10.1093/nargab/lqab067

This database contains 38,880 PHROGs (protein orthologous groups) comprising 868,340 proteins from complete genomes of viruses infecting bacteria or archaea (2,318 from RefSeq and 2,669 from GenBank, April 2018), as well as from 12,498 curated prophages derived from cultivated microbial isolates (Roux et al., 2015).
Only one standardized annotation was attributed to each PHROG (using RefSeq annotations and comparison of each PHROG to the Pfam, UniProt, KEGG and ACLAME databases).
Download PHROGs as:


Note: each of these compressed archive files (.tar.gz) contains 38,880 files (one FASTA file, one MSA and HMM profile for each PHROG).
The current release is version 4 of the PHROG annotations; see the News page for more information.
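To check a downloaded archive against the file count stated above, a minimal Python sketch follows. The function name and the archive name in the usage comment are hypothetical placeholders, not names taken from this page; substitute the path of the archive you actually downloaded.

```python
import tarfile


def count_regular_files(archive_path):
    """Return the number of regular files stored in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Iterating a TarFile yields one TarInfo per member; directories are skipped.
        return sum(1 for member in archive if member.isfile())


# Usage (hypothetical file name, shown as an assumption only):
#   print(count_regular_files("phrogs_archive.tar.gz"))
```

If the archive is complete, the printed number should match the count given on this page.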

Out of the initial set of 938,864 proteins, 70,524 were not grouped with other proteins into a PHROG and remained singletons (or ORFans). You can also download these singleton proteins as a single FASTA file:
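As a quick sanity check on these figures, the sketch below confirms that the grouped and singleton proteins add up to the initial set, and counts the records in a FASTA file by its header lines. It assumes a plain, uncompressed FASTA file; the function name and the file name in the usage comment are hypothetical.

```python
# Figures quoted on this page.
GROUPED_PROTEINS = 868_340
SINGLETON_PROTEINS = 70_524
INITIAL_PROTEINS = 938_864

# Proteins in PHROGs plus singletons should equal the initial set.
assert GROUPED_PROTEINS + SINGLETON_PROTEINS == INITIAL_PROTEINS


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting lines that start with '>'."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        return sum(1 for line in handle if line.startswith(">"))


# Usage (hypothetical file name, shown as an assumption only):
#   print(count_fasta_records("singletons.faa") == SINGLETON_PROTEINS)
```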

Check out the manual on the Documentation page to compare your proteins to PHROGs.

Virus and PHROG tables in tab-delimited format for downloading: