Prophages & Defense Systems

Overview

This page displays prophages and defense systems found in the genome.

A prophage is a bacteriophage genome that is integrated within a prokaryote genome. We use Phigaro to detect such regions.

A defense system is a molecular system used to defend the prokaryote against bacteriophages. We use DefenseFinder to detect such regions.

To gather more information about Cas systems, we use CRISPRCasFinder to detect CRISPR sequences around each Cas system.

What is Phigaro?

Phigaro is a standalone command-line application that detects prophage regions, taking raw genome and metagenome assemblies as input. It also produces dynamic annotated “prophage genome maps” and marks possible transposon insertion spots inside prophages. It can be used to mine prophage regions from large metagenomic datasets. Phigaro uses the pVOG HMM profiles to detect bacteriophage genes.

Learn more about Phigaro.

Reference:

Elizaveta V. Starikova, Polina O. Tikhonova, Nikita A. Prianichnikov, Chris M. Rands, Evgeny M. Zdobnov, Vadim M. Govorun. Phigaro: high throughput prophage sequence annotation.

Note

By default, Phigaro predicts genes with Prodigal. However, we use the gene calls provided by our own pipeline.

What is DefenseFinder?

DefenseFinder is a program, based on MacSyFinder, that systematically detects known anti-phage systems. The decision rule for a given system is typically defined by a list of mandatory, accessory, or forbidden proteins.
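To make the idea of a decision rule concrete, here is a minimal sketch of how a rule built from mandatory, accessory, and forbidden proteins could be evaluated against the proteins found in one locus. It is a simplified illustration, not MacSyFinder's actual algorithm or model format: the `SystemRule` class, the `evaluate` function, the `min_mandatory` quorum, and all protein names are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemRule:
    """Hypothetical, simplified decision rule for one defense system."""
    name: str
    mandatory: frozenset
    accessory: frozenset = frozenset()
    forbidden: frozenset = frozenset()
    min_mandatory: int = 1  # assumed quorum, not taken from a real model


def evaluate(rule, hits):
    """Check a set of protein hits from one locus against a rule."""
    found = set(hits)
    mandatory_found = sorted(rule.mandatory & found)
    accessory_found = sorted(rule.accessory & found)
    forbidden_found = sorted(rule.forbidden & found)
    # Detected when enough mandatory proteins are present and
    # no forbidden protein is present; accessory hits are only reported.
    detected = (len(mandatory_found) >= rule.min_mandatory
                and not forbidden_found)
    return {
        "system": rule.name,
        "detected": detected,
        "mandatory": mandatory_found,
        "accessory": accessory_found,
        "forbidden": forbidden_found,
    }


# Invented rule and protein names, for illustration only.
example_rule = SystemRule(
    name="ExampleSystem",
    mandatory=frozenset({"ProtA", "ProtB"}),
    accessory=frozenset({"ProtC"}),
    forbidden=frozenset({"ProtX"}),
    min_mandatory=2,
)

print(evaluate(example_rule, ["ProtA", "ProtB", "ProtC"])["detected"])
print(evaluate(example_rule, ["ProtA", "ProtB", "ProtX"])["detected"])
```

In this sketch the first locus satisfies the rule, while the second is rejected because it contains a forbidden protein. Real models can carry additional constraints (for example on gene distance), which are left out here.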

Learn more about DefenseFinder.

Reference:

Tesson, F., Hervé, A., Mordret, E., Touchon, M., d’Humières, C., Cury, J., & Bernheim, A. (2022). Systematic and quantitative view of the antiviral arsenal of prokaryotes. Nature Communications, 13(1), 2561.

Defense systems detected by DefenseFinder are:

What is CRISPRCasFinder?

CRISPRCasFinder is a tool that identifies CRISPR arrays and Cas proteins. CRISPR detection is based on Vmatch (software for large-scale sequence analysis), which identifies all regularly interspaced repeated sequences. CRISPRCasFinder assigns an evidence level to each detected CRISPR using three criteria:

  • An entropy-based conservation index of repeats (EBcon);

  • The number of spacers;

  • The overall percentage identity of spacers.

../../_images/CRISPR_confidence_lvl.PNG
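As a rough illustration of how the three criteria can combine into a single evidence level, the sketch below maps them to a level from 1 to 4. The function name, the parameter names, and especially the default cut-offs (4 spacers, EBcon of 70, 8% spacer identity) are assumptions chosen for this example; the figure above and the CRISPRCasFinder publication remain the reference for the actual rules.

```python
def evidence_level(n_spacers, repeat_ebcon, spacer_identity_pct,
                   min_spacers=4, ebcon_cutoff=70.0,
                   spacer_identity_cutoff=8.0):
    """Return an illustrative evidence level (1 = weakest, 4 = strongest).

    All default cut-offs are assumptions for this sketch and should be
    checked against the CRISPRCasFinder documentation before any reuse.
    """
    # Very short arrays get the lowest level, whatever their repeats look like.
    if n_spacers < min_spacers:
        return 1
    # Poorly conserved repeats: weak candidate.
    if repeat_ebcon < ebcon_cutoff:
        return 2
    # Conserved repeats, but spacers that resemble each other too much.
    if spacer_identity_pct > spacer_identity_cutoff:
        return 3
    # Conserved repeats and diverse spacers: strongest candidate.
    return 4


print(evidence_level(n_spacers=12, repeat_ebcon=95.0, spacer_identity_pct=2.5))
```

The order of the tests reflects the intuition behind the criteria: a genuine CRISPR array is expected to have several spacers, well-conserved repeats, and spacers that differ from one another.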

More information about CRISPRCasFinder here.

Note

In MicroScope, CRISPRCasFinder is used only to detect CRISPR systems. Cas systems are detected by DefenseFinder.

Reference:

D. Couvin et al. 2018. CRISPRCasFinder, an update of CRISPRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins, Nucleic Acids Research.

How to access Prophage & Defense System predictions?

Prophage & Defense System predictions are available through the Comparative Genomics section of the main navigation menu. This page presents the prophages and the defense systems found in the current genome and lets you explore their content.

What is the Prophages table?

This table enumerates all prophages predicted for the selected genome:

../../_images/prophages_prediction.png
  • MoveTo: displays the region in the Genome Browser

  • Prophage Id: identifier of the prophage in the genome; clicking on this element will open an interface to explore the content of this region (see below)

  • Replicon name: identification of the replicon

  • Replicon type: chromosome, plasmid or WGS

  • Begin / End: position of the prophage on the replicon

  • Length: length of the prophage

  • Prophage Family: family of the bacteriophage

How to explore a prophage?

The prophage visualization interface can be accessed by clicking on the Prophage Id field of the Prophages table. This interface displays the detailed description of a selected prophage.

../../_images/prophage_vizualization.png

The table Genomic objects provides information regarding the genomic objects composing the prophage, such as:

  • Label, Begin, End, Gene, Product: correspond to the annotation of the object in MicroScope

  • pVOG: the pVOG corresponding to the genomic object (if any); clicking on this will open the detailed description of the pVOG

  • Eval: E-value of the match between the genomic object and the pVOG

You can export the genes by clicking on Export to Gene Cart.

What is the Defense Systems table?

This table enumerates all defense systems predicted for the selected genome:

../../_images/defensefinder1_systemstab.png
  • MoveTo: Displays the region in the Genome Browser.

  • System name: Name of the defense system; clicking on it will open a detailed description of this system (see below).

  • System type: Type of the defense system; clicking on it will open a description of this type of system on the DefenseFinder Wiki website.

  • Replicon name: Name of the replicon.

  • Replicon type: Type of the replicon (chromosome, plasmid, WGS).

  • Begin and End: Location of the defense system on the replicon.

  • Length: Length of the defense system.

  • Mandatory proteins in system: List of mandatory proteins of the system identified in the genome.

  • Nb of mandatory present: Number of mandatory proteins of the system identified in the genome.

  • Accessory proteins in system: List of accessory proteins of the system identified in the genome.

  • Nb of accessory present: Number of accessory proteins of the system identified in the genome.

  • Neutral proteins in system: List of neutral proteins of the system identified in the genome.

  • Nb of neutral present: Number of neutral proteins of the system identified in the genome.

How to explore a defense system?

The defense system visualization interface can be accessed by clicking on the System name field of the Defense Systems table. This interface displays the detailed description of a selected defense system.

../../_images/defensefinder1_GOtab.png

The table Genomic Objects provides information regarding the genomic objects composing the defense system, such as:

  • Label: Label of the genomic object. Clicking on it opens its annotation page.

  • Begin and End: Location of the genomic object on the sequence.

  • Gene: Gene name if any.

  • Product: Description of the gene product of the genomic object.

  • Protein name: Name of the protein detected by DefenseFinder.

  • Eval: E-value of the match with MacSyFinder models.

  • Status: Status of the protein in the system, as defined by MacSyFinder (mandatory, accessory, neutral).

You can export the genes by clicking on Export to Gene Cart.
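The per-status columns of the Defense Systems table (lists and counts of mandatory, accessory, and neutral proteins) can be understood as a summary of the Status column of the Genomic Objects table. The sketch below shows that relationship on invented rows; the dictionary keys, the function name, and the protein names are assumptions for illustration and do not describe the MicroScope database or export format.

```python
from collections import defaultdict

STATUSES = ("mandatory", "accessory", "neutral")


def summarize_system(genomic_objects):
    """Group protein names by status and count them.

    `genomic_objects` is a list of dicts with hypothetical keys
    "protein_name" and "status", mimicking rows of the Genomic Objects table.
    """
    by_status = defaultdict(list)
    for obj in genomic_objects:
        by_status[obj["status"]].append(obj["protein_name"])
    # Always report the three statuses, even when a list is empty.
    return {
        status: {"proteins": by_status.get(status, []),
                 "count": len(by_status.get(status, []))}
        for status in STATUSES
    }


# Invented rows, for illustration only.
rows = [
    {"protein_name": "ProtA", "status": "mandatory"},
    {"protein_name": "ProtB", "status": "mandatory"},
    {"protein_name": "ProtC", "status": "accessory"},
]

summary = summarize_system(rows)
for status in STATUSES:
    print(status, summary[status]["count"], summary[status]["proteins"])
```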

What is the CRISPR table?

This table displays all CRISPRs detected by CRISPRCasFinder and all Cas systems detected by DefenseFinder for the selected genome.

../../_images/crisprcasfinder4_crisprtab.png
  • System label: Identifier of the system in the organism. Clicking on it opens a page with a detailed description of the CRISPR or Cas system (see below).

  • Replicon name: Name of the replicon.

  • Replicon type: Type of the replicon (chromosome, plasmid, WGS).

  • Begin and End: Location of the system on the replicon.

  • Length: Length of the system.

  • Nb spacers / genes: Number of CRISPR spacers or number of Cas genes.

  • Consensus repeat / Present gene: Consensus repeat sequence predicted by CRISPRCasFinder, or list of mandatory Cas genes predicted by DefenseFinder.

  • Evidence level: Evidence level as computed by CRISPRCasFinder.

How to explore a CRISPR-Cas system?

The table CRISPR Sequences provides all repeats and spacers contained in the selected CRISPR.

../../_images/crisprcasfinder4_crisprseq.png
  • Sequence type: CRISPR_dr if the sequence is a direct repeat or CRISPR_spacer if the sequence is a spacer.

  • Begin / End: Location of the sequence on the replicon.

  • Length: Length of the sequence.

  • Sequence: Nucleic acid sequence.
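The rows of the CRISPR Sequences table can be related to the summary columns of the CRISPR table (number of spacers, length) with a few lines of code. The sketch below works on invented rows; the dictionary keys, the sequences, and the coordinates are assumptions for illustration, and coordinates are assumed to be 1-based and inclusive. The most frequent repeat is used as a simple stand-in for a consensus repeat and is not how CRISPRCasFinder computes it.

```python
from collections import Counter


def summarize_crispr(rows):
    """Summarize one CRISPR from rows mimicking the CRISPR Sequences table."""
    if not rows:
        return {"n_repeats": 0, "n_spacers": 0,
                "most_common_repeat": None, "length": 0}
    ordered = sorted(rows, key=lambda r: r["begin"])
    repeats = [r["sequence"] for r in ordered if r["type"] == "CRISPR_dr"]
    spacers = [r["sequence"] for r in ordered if r["type"] == "CRISPR_spacer"]
    # Most frequent repeat: a naive substitute for a real consensus.
    most_common = Counter(repeats).most_common(1)[0][0] if repeats else None
    return {
        "n_repeats": len(repeats),
        "n_spacers": len(spacers),
        "most_common_repeat": most_common,
        # Assumes 1-based, inclusive Begin / End coordinates.
        "length": ordered[-1]["end"] - ordered[0]["begin"] + 1,
    }


# Invented sequences and positions, for illustration only.
rows = [
    {"type": "CRISPR_dr", "begin": 101, "end": 110, "sequence": "GTTTTAGAGC"},
    {"type": "CRISPR_spacer", "begin": 111, "end": 122, "sequence": "ACGTACGTACGT"},
    {"type": "CRISPR_dr", "begin": 123, "end": 132, "sequence": "GTTTTAGAGC"},
    {"type": "CRISPR_spacer", "begin": 133, "end": 144, "sequence": "TTGACCATGCAA"},
    {"type": "CRISPR_dr", "begin": 145, "end": 154, "sequence": "GTTTTAGAGC"},
]

result = summarize_crispr(rows)
print(result)
```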

The table Genomic objects provides information regarding the genomic objects composing the Cas system. You can export the genes by clicking on Export to Gene Cart.

../../_images/crisprcasfinder4_GOtab.png
  • Label: Label of the genomic object. Clicking on it opens its annotation page.

  • Begin and End: Location of the genomic object on the sequence.

  • Gene: Gene name if any.

  • Product: Description of the gene product of the genomic object.

  • Protein name: Name of the protein detected by DefenseFinder.

  • Eval: E-value of the match with MacSyFinder models.

  • Status: Status of the gene in the system, as defined by MacSyFinder (mandatory, accessory, neutral).