Welcome to the Prokaryotic Virus Remote Homologous Groups database (a.k.a. PHROGs)

PHROG : families of prokaryotic virus proteins clustered using remote homology.
Terzian P*, Olo Ndela E*, Galiez C, Lossouarn J, Pérez Bucio RE, Mom R, Toussaint A, Petit MA, Enault F.
NAR Genomics and Bioinformatics, Volume 3, Issue 3, September 2021, lqab067, https://doi.org/10.1093/nargab/lqab067




This database contains 38,880 PHROGs (protein orthologous groups) comprising 868,340 proteins from complete genomes of viruses infecting bacteria or archaea (2,318 from RefSeq and 2,669 from GenBank, April 2018), in addition to 12,498 curated prophages derived from cultivated microbial isolates (Roux et al., 2015).
A single standardized annotation was attributed to each PHROG (using RefSeq annotations and a comparison of each PHROG to Pfam, UniProt, KEGG and the ACLAME database).
This website provides access to : Viruses and PHROGs can also be accessed using the search tools below.


Find a PHROG of interest by searching in :

    PHROG annotation terms

    RefSeq terms

    Pfam domain number or annotation

    KEGG Orthology or annotation

    GO term ID or annotation


    Or, if you already know the ID of your PHROG, you can go to its page directly here:

    PHROG number


Find a virus of interest by searching in :

    Virus ID


    If you don't know the accession of your virus, use the search box below :

    Virus name



Find your protein :

    If you have a protein from the NCBI and you wish to know whether it belongs to a PHROG, you can search for it by its sequence or its NCBI ID.

    NCBI prot ID


    Sequence search (5 residues minimum)
    Proteins that are not from the NCBI can
    also be retrieved through sequence search
    (PVOG, NUCCORE, and VirSorter).





Download PHROGs as :

                   

Each of these compressed archives (.tar.gz) contains 38,880 files, one per PHROG (a FASTA file, an MSA or an HMM profile, depending on the archive).
The current release is version 4 of the PHROG annotations; take a look at the News page for more information.
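As a rough illustration of how a downloaded archive could be checked against the expected file count, the Python sketch below counts the regular files in a .tar.gz archive without extracting it. The archive name in the usage comment is hypothetical, and nothing is assumed about the internal layout of the archive (directory names or file extensions).

```python
import tarfile


def count_archive_files(archive_path):
    """Return the number of regular files stored in a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Directories and links are skipped; only regular files are counted.
        return sum(1 for member in archive if member.isfile())


# Hypothetical usage (the file name is an assumption, not taken from this site):
# count_archive_files("phrogs_archive.tar.gz")
```

If the count differs from 38,880, the download may be incomplete.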

Out of the initial set of 938,864 proteins, 70,524 were not grouped with other proteins into a PHROG and remained singletons (or ORFans). You can also download these singleton proteins as a single FASTA file :
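The figures quoted on this page are consistent with each other: 938,864 - 70,524 = 868,340, the number of proteins grouped into PHROGs. A minimal Python sketch for counting the records in a downloaded FASTA file, assuming standard FASTA formatting in which every record starts with a ">" header line; the file name in the usage comment is hypothetical.

```python
def count_fasta_records(lines):
    """Count FASTA records in an iterable of text lines (one '>' header per record)."""
    return sum(1 for line in lines if line.startswith(">"))


# Figures quoted on this page.
INITIAL_PROTEINS = 938_864
SINGLETONS = 70_524
assert INITIAL_PROTEINS - SINGLETONS == 868_340  # proteins grouped into PHROGs

# Hypothetical usage (the file name is an assumption):
# with open("singletons.fasta") as handle:
#     print(count_fasta_records(handle))
```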

Check out the manual on the Documentation page to compare your proteins to PHROGs.


Virus and PHROG tables in tab-delimited format for downloading :
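Tables of this kind can be read with any parser that handles tab separators. A minimal sketch using Python's standard csv module, assuming the tables are tab-separated with a header row; the column names and the file name below are hypothetical placeholders, not the actual column names of the PHROG tables.

```python
import csv
import io


def read_tab_delimited(handle):
    """Read a tab-delimited table with a header row into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


# Small in-memory example; "phrog" and "annot" are hypothetical column names.
example = io.StringIO("phrog\tannot\n1\texample annotation\n")
rows = read_tab_delimited(example)
print(rows[0]["annot"])

# Hypothetical usage with a downloaded file (the name is an assumption):
# with open("phrog_table.tsv", newline="") as handle:
#     rows = read_tab_delimited(handle)
```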