Bioinformatic tools

On this page we present a collection of our in-house, freely available bioinformatic tools and databases. Please remember to cite the appropriate publication when using our tools. Thank you!

Literature-based, manually-curated database of PCR primers

Link: http://lcpdb.ddlemb.com/

Description:
This literature-based, manually curated database of PCR primers for the detection of antibiotic resistance genes comprises hundreds of PCR primer pairs designed for the amplification of various genes conferring resistance to antibiotics. Three parameters were assigned to each primer pair: specificity (S), efficacy (E) and taxonomic efficacy (TE). These parameters were evaluated using a novel bioinformatic tool, UniPriVal, which was used to validate each primer pair against various reference databases. Primer pairs specific for each gene were then ranked based on their model success metric (MSM) value. Note that, due to the correlation between the E and TE parameters, the MSM is biased toward primer pairs with higher E/TE values relative to S values. Despite this limitation, the internal validation system of the LCPDb application enables quantified ranking of PCR primer pairs, which assists in selecting the best primers for each application. Finally, although we performed a thorough literature review to identify PCR primers, we are aware that the database is still incomplete and needs further development. Users are therefore invited to contact the database developers directly (using the “Submit” webpage) to add missing primers as well as to test their own primer pairs with the UniPriVal tool.

Last update: 2019.06.19
Reference:
– LCPDb-ARG: Gorecki, A., Decewicz, P., Dziurzynski, M., Janeczko, A., Drewniak, L., & Dziewit, L. 2019. Literature-based, manually-curated database of PCR primers for the detection of antibiotic resistance genes in various environments. Water Research, 161, 211-221. doi:10.1016/j.watres.2019.06.009. Read me.
– LCPDb-MET: Dziurzynski M., Gorecki A., Decewicz P., Ciuchcinski K., Dabrowska M., Dziewit L. 2022. Development of the LCPDb-MET database facilitating selection of PCR primers for the detection of metal metabolism and resistance genes in bacteria. Ecological Indicators 145: 109606. Read me.


Alphaproteobacteria (pro)phages Database

Download:

Description:
This project aims to construct an interactive web database of all known and publicly available active phages, as well as manually verified and reannotated prophages, within Alphaproteobacteria.

Alphaproteobacteria, as a taxonomic clade, are well represented by bacteria inhabiting various environments and are thus metabolically diverse. The class includes phototrophic (e.g. Rhodobacter), endosymbiotic (e.g. Wolbachia), methylotrophic (e.g. Paracoccus) and parasitic (e.g. Rickettsia) bacteria. Its members are also commonly used in biotechnology (e.g. nitrogen-fixing rhizobia), in bioremediation (species degrading pollutants in and denitrifying polluted soils) and in molecular biology (Agrobacterium tumefaciens, which is able to insert exogenous DNA into plant cells). They are also diverse in genome structure: genome sizes range from 140 kb to 9000 kb, and the number and sizes of replicons vary as well, from none to as many as 9. The same holds for GC content, which ranges from 27.5% to 71.5%.

Once the database is published, a link to its site will be added here. For now, the Download section provides the raw data that were used to construct the database or that have already been published (e.g. (pro)phages infecting Sinorhizobium spp. and Paracoccus spp.).

Last update: 2019.05.20
Reference:
– Decewicz P., Radlinska M., Dziewit L. 2017. Characterization of Sinorhizobium sp. LM21 prophages and virus-encoded DNA methyltransferases in the light of comparative genomic analyses of the sinorhizobial virome. Viruses 9: 161. Read me.
– Decewicz P., Dziewit L., Golec P., Kozlowska P., Bartosik D., Radlinska M. 2019. Characterization of the virome of Paracoccus spp. (Alphaproteobacteria) by combined in silico and in vivo approaches. Scientific Reports 9: 7899. Read me.


MAISEN

Link: http://maisen.ddlemb.com/

Description:

MAISEN is a web tool dedicated to the annotation of bacterial and archaeal nucleotide sequences. It allows you to browse structural and functional annotation hits from various databases in one place. You can upload either an annotated GenBank file or a FASTA file, which will undergo a simple structural annotation (Prodigal, tRNAscan-SE, ARAGORN, PHANOTATE). Proteins encoded in your sequence will then be searched against a set of reference databases (NCBI non-redundant protein database, UniProtKB/Swiss-Prot database, Conserved Domains database) to provide you with precomputed similarity searches for curating the automatic annotations of your sequence.

Last update: 2022.04.14

Reference: Dziurzynski M., Decewicz P., Ciuchcinski K., Gorecki A., Dziewit L. 2021. Simple, reliable, and time-efficient manual annotation of bacterial genomes with MAISEN. Mengoni A., Bacci G., Fondi M. (Editors). Chapter in the book “Bacterial Pangenomics, second edition”, in the book series Methods in Molecular Biology 2242: 221-229. Springer Protocols/Humana Press. Read me.


Stress response genes database

Download:

Description:

A manually curated database of stress response genes was created to annotate genes in plasmid DNA extracted from metaplasmidomes from polar environments. Based on a literature review, we identified genes putatively involved in adaptation to cold environments, with a focus on the response to changing environmental conditions. The selected genes were named according to the UniProt and NCBI databases, and references were added where available. Next, we added a collection of antibiotic resistance genes (ARGs) originally extracted from the Comprehensive Antibiotic Resistance Database (CARD) (Jia et al., 2017) and from the heavy-metal resistance gene database (BacMet) (Pal et al., 2014). The compiled database contains 2,451 sequences of proteins putatively involved in stress response, including (i) 2,191 antibiotic resistance genes and proteins, (ii) 119 heavy-metal resistance genes, (iii) 49 reactive oxygen species protection genes, (iv) 30 UV radiation protection genes, (v) 23 cold shock genes, (vi) 13 anti-freeze genes, (vii) 13 osmoregulatory genes, (viii) 4 phasins, (ix) 3 trehalose synthetases, (x) 3 hydroxyalkanoic acid synthetases, and (xi) 3 ice nucleation genes. Finally, although we performed a thorough literature review to identify the stress response genes, we are aware that the database is still incomplete and needs further development.
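
As a quick consistency check, the category counts quoted above can be tallied against the stated total. This is only a minimal Python sketch: the dictionary keys are shorthand labels chosen here for illustration and are not identifiers taken from the database files.

```python
# Category counts as quoted in the description above.
# NOTE: the keys are illustrative labels, not names used in the database itself.
category_counts = {
    "antibiotic resistance": 2191,
    "heavy-metal resistance": 119,
    "reactive oxygen species protection": 49,
    "UV radiation protection": 30,
    "cold shock": 23,
    "anti-freeze": 13,
    "osmoregulation": 13,
    "phasins": 4,
    "trehalose synthetases": 3,
    "hydroxyalkanoic acid synthetases": 3,
    "ice nucleation": 3,
}

# Sum of all categories; the description states 2,451 sequences in total.
total = sum(category_counts.values())
print(total)  # 2451
```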

Last update: 2020.03.24

Reference: Gorecki A., Holm S., Dziurzynski M., Winkel M., Yang S., Liebner S., Wagner D., Dziewit L., Horn F. 2021. Metaplasmidome-encoded functions of Siberian low-centered polygonal tundra soils. ISME Journal 15: 3258–3270. Read me.


SigMa

Link: https://github.com/pdec/SigMa

Description:

SigMa was designed for iterative prophage identification in bacterial genomes. It is meant to allow validated (manually or automatically) prophage predictions from the analyzed set of genomes to be incorporated prior to the next iteration(s). By default, it searches GenBank genome files against sets of reference nucleotide, protein and protein profile sequences, which are then treated as a reference signal that is mapped onto the query sequence.

Last update: 2023.02.16


Mobile genetic elements of psychrotolerant prokaryotes

Download:

Description:

Here, we present a set of manually curated datasets of prokaryotic psychrotolerant genomes, their plasmids, and metagenomes from cold environments, recovered from public repositories, mainly PATRICBRC (currently BV-BRC), GenBank and ENA Metagenomics. The datasets consist of 3978 psychrotolerant genomes (Bacteria: 3822, Archaea: 156), 484 plasmids of psychrotolerants (PsychroPlasDb) and 2831 metagenomes.

We further analyzed these datasets to distinguish plasmid-like contigs, phage genomes (including prophages), and transposases. Briefly, metagenomes were quality-filtered with fastp and then assembled with MEGAHIT. Genes in the assembled contigs were called with prodigal-gv, and the predicted proteins were then searched with MMseqs2 against the following reference databases to identify contigs possibly carrying plasmids (MOB-suite's MOB and REP reference protein sequences; E-value 1e-5, 75% sequence identity, 95% query coverage), phages (PHROGs head and packaging, connector, tail, and lysis protein profiles; E-value 1e-5, 70% query coverage), and transposases (TnCentral's transposase proteins; E-value 1e-5, 75% sequence identity, 95% query coverage). Plasmid and phage candidate contigs were then annotated with Bakta to remove contaminants (e.g. contigs encoding rRNA genes). Additionally, phage contigs were analyzed with CheckV to determine the quality of the recovered phage genomes and their contamination with host sequences (so far, only genomes of at least medium quality and shorter than 200 kb have been considered). Psychrotolerant genomes were also searched with PhiSpy and SigMa to identify prophages, and the predictions for Psychrobacter representatives were additionally curated manually. As a result, we present sets of 14523 plasmids, 23608 phages, 6979 MOB and TRA proteins, 7770 REP proteins, 4554 phage TerL and 9899 phage MCP proteins, as well as 129905 transposases.
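
The threshold-based selection of candidate contigs described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline actually used: it assumes the search results were exported as a tab-separated table with a header line naming the columns query, target, fident, evalue and qcov (identity and coverage as fractions), it interprets the 1e-5 cutoff as a maximum E-value, and it assumes Prodigal-style protein identifiers of the form contig_genenumber. Whether the cutoffs were applied during the search itself or afterwards is not stated in the description.

```python
import csv
import io

# Cutoffs quoted in the description: (max E-value, min identity, min query coverage).
# None means the description gives no identity cutoff for that class.
THRESHOLDS = {
    "plasmid": (1e-5, 0.75, 0.95),
    "phage": (1e-5, None, 0.70),
    "transposase": (1e-5, 0.75, 0.95),
}


def passes(hit, mge_class):
    """Return True if one hit (a dict of column -> string) meets the cutoffs."""
    max_evalue, min_ident, min_qcov = THRESHOLDS[mge_class]
    if float(hit["evalue"]) > max_evalue:
        return False
    if min_ident is not None and float(hit["fident"]) < min_ident:
        return False
    return float(hit["qcov"]) >= min_qcov


def candidate_contigs(tsv_handle, mge_class):
    """Collect contigs with at least one protein hit passing the cutoffs."""
    contigs = set()
    for hit in csv.DictReader(tsv_handle, delimiter="\t"):
        if passes(hit, mge_class):
            # ASSUMPTION: protein IDs look like "<contig>_<gene number>".
            contigs.add(hit["query"].rsplit("_", 1)[0])
    return contigs


# Made-up demonstration table; the IDs and values are placeholders, not real results.
demo_hits = (
    "query\ttarget\tfident\tevalue\tqcov\n"
    "contigA_1\tref1\t0.80\t1e-10\t0.97\n"
    "contigB_3\tref2\t0.60\t1e-20\t0.99\n"
    "contigC_2\tref3\t0.90\t1e-3\t0.99\n"
)
print(sorted(candidate_contigs(io.StringIO(demo_hits), "plasmid")))
print(sorted(candidate_contigs(io.StringIO(demo_hits), "phage")))
```

In this sketch a single passing hit is enough to flag a contig; requiring several independent marker hits per contig would be a stricter alternative.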

We are aware that the database is still incomplete and that the predictions need further curation. To achieve that, we are currently working on a major release update of MAISEN that will incorporate SigMa and allow for manual curation of the above-mentioned predicted MGEs.

Last update: 2023.05.06


Plasmids of psychrotolerant bacteria

Download:

Description:

Sequences of plasmids from cold-active bacteria, retrieved from the NCBI Nucleotide database, in FASTA format.
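
For readers who want a quick look at such a file, a FASTA file can be summarised with a few lines of plain Python. The sketch below is a generic reader, not a tool shipped with the dataset; the demonstration records are made up, and any real file name would have to be substituted by the user.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(sequence):
    """Percentage of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Made-up demonstration records, not sequences from the dataset.
demo = io.StringIO(">plasmid_A example\nATGC\nGGCC\n>plasmid_B\nATAT\n")
for header, seq in read_fasta(demo):
    print(header, len(seq), round(gc_percent(seq), 1))
```

To run it on a downloaded file, replace the `io.StringIO` demonstration handle with an opened file, e.g. `open("plasmids.fasta")`, where the file name is hypothetical.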

Last update: 2023.08.11


LCPDb-MGE – database of PCR primers for identification of mobile genetic elements

Download:

Description:

The database of PCR primers is based on a systematic literature review of articles in PubMed. It contains information about primer names, sequences, overhangs (if any), product sizes, references (PMID), the keywords used in the search, and the MGE name. Each primer pair is also assigned to three categories according to the identified MGE. Category 1 assigns primers to transposable elements, plasmids, phages, or integrons. Category 2 is a subcategory specifying the type of MGE or its fragment. Category 3 is used only for phages, to indicate that the primers are specific to a particular phage.
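
One possible way to represent a single entry of this database in code is sketched below. The field names, the category labels and the validation rules are assumptions derived from the description above, not the actual column names or format of the downloadable file; the example values are placeholders rather than a real primer pair.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# ASSUMPTION: labels for category 1, taken from the four MGE groups named above.
CATEGORY_1 = {"transposable element", "plasmid", "phage", "integron"}


@dataclass
class PrimerPair:
    """Hypothetical record layout for one primer pair entry."""
    forward_name: str
    forward_sequence: str
    reverse_name: str
    reverse_sequence: str
    mge_name: str
    category_1: str                      # broad MGE group
    category_2: str                      # type of MGE or its fragment
    category_3: Optional[str] = None     # only for phage-specific primers
    forward_overhang: Optional[str] = None
    reverse_overhang: Optional[str] = None
    product_size_bp: Optional[int] = None
    pmid: Optional[str] = None
    keywords: List[str] = field(default_factory=list)

    def __post_init__(self):
        if self.category_1 not in CATEGORY_1:
            raise ValueError(f"unknown category 1: {self.category_1!r}")
        if self.category_3 is not None and self.category_1 != "phage":
            raise ValueError("category 3 is used only for phages")


# Placeholder entry for illustration only (not a primer pair from the database).
example = PrimerPair(
    forward_name="demo_F",
    forward_sequence="ACGTACGTACGTACGTAC",
    reverse_name="demo_R",
    reverse_sequence="TGCATGCATGCATGCATG",
    mge_name="demo element",
    category_1="integron",
    category_2="integrase gene",
    keywords=["integron", "PCR primer"],
)
print(example.mge_name, example.category_1, example.category_2)
```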

Last update: 2023.08.11