Integration of bioinformatics to biodegradation
© Arora and Bae; licensee BioMed Central Ltd. 2014
Received: 21 March 2014
Accepted: 19 April 2014
Published: 27 April 2014
Bioinformatics and biodegradation are two primary scientific fields in applied microbiology and biotechnology. The present review describes the development of various bioinformatics tools that may be applied in the field of biodegradation. Several databases, including the University of Minnesota Biocatalysis/Biodegradation database (UM-BBD), a database of biodegradative oxygenases (OxDBase), the Biodegradation Network-Molecular Biology Database (Bionemo), MetaCyc, and BioCyc, have been developed to enable access to information related to the biochemistry and genetics of microbial degradation. In addition, several bioinformatics tools for predicting the toxicity and biodegradation of chemicals have been developed. Furthermore, the whole genomes of several potential degrading bacteria have been sequenced and annotated using bioinformatics tools.
Millions of toxic chemicals have been produced for use in a variety of industries. These chemicals have often been released into the environment due to anthropogenic activities, where they contaminate soil and water. Furthermore, many chemicals persist in the environment, causing severe problems to living organisms; accordingly, it is crucial that these compounds be removed from the environment.
Biodegradation is the breakdown of chemicals or xenobiotic compounds by microbes and plants. Biodegrading microbes degrade toxic chemicals via either mineralization or co-metabolism. In mineralization, microbes completely degrade toxic chemicals by utilizing them as carbon and energy sources, whereas co-metabolism results in biotransformation of toxic compounds into less toxic compounds [4, 5].
Microbial remediation is an emerging technology for the removal of toxic chemicals from the environment [4–6]. A large number of microbes capable of utilizing toxic chemicals as their sole sources of carbon and energy have been isolated, many of which break complex chemical compounds down to carbon dioxide and water through a series of chemical reactions catalyzed by microbial enzymes [5–8], such as monooxygenases, dioxygenases, reductases, deaminases, and dehalogenases. The genes encoding these enzymes have been identified in a variety of microbes and cloned into bacteria to increase the efficiency of bioremediation. Degradation of a specific toxic chemical requires a specific microbe; which microbe is suitable depends on the structure of the chemical and on whether the microbe possesses the enzyme systems needed to degrade it. Therefore, knowledge regarding chemicals (classification, identification, environmental properties, toxicity, distribution, and associated risks) as well as their microbial biodegradation (xenobiotic-degrading bacteria, enzymes, genes, proteins) can improve the bioremediation process.
Bioinformatics, which has been incorporated into each branch of life sciences, provides a platform for researchers to develop valuable computational tools for human and environmental welfare [9, 10]. In the last few decades, bioinformatics has been integrated with biodegradation and several bioinformatics tools useful in the field of biodegradation have been developed. These include databases [11–14], chemical toxicity prediction systems [15, 16], biodegradation pathway prediction systems [17–20], and next-generation sequencing [21–24]. Here, we discuss the relationship of bioinformatics tools with biodegradation.
List of chemical databases

Databases for chemical identification, structure and classification
- Information about 370,000 chemicals.
- ECHA Classification & Labeling Inventory: Information about the classification and labeling of substances reported and registered by manufacturers and importers.
- NCLASS (the Nordic N-Class Database on Environmental Hazard Classification): Information describing chemicals that have been or are currently being considered by the European Commission on classification and labeling for environmental effects.

Databases describing environmental properties of chemicals and their toxicity, distribution, management and risk of occupational disease
- Hazardous Substances Data Bank (HSDB): Toxicology information for 5,000 chemicals.
- Toxicology Literature Online (TOXLINE): References derived from toxicology literature.
- Chemical Carcinogenesis Research Information System (CCRIS): Carcinogenicity and mutagenicity tests for 8,000 chemicals.
- Developmental and Reproductive Toxicology Database (DART): References related to developmental and reproductive toxicology literature.
- Genetic Toxicology Data Bank (GENE-TOX): Data related to genetic toxicology for 3,000 chemicals.
- Integrated Risk Information System (IRIS): Data describing hazard identification and dose–response assessments of about 500 chemicals.
- International Toxicity Estimates for Risk (ITER): Risk information for 600 chemicals from authoritative groups worldwide.
- A cluster of databases on toxicology, hazardous chemicals, environmental health, and toxic releases.
- A comprehensive database of about 60,000 toxic compounds.
- This innovative database may be used for in vitro acute toxicity studies.
- Comparative Toxicogenomics Database (CTD): This database describes the genetic bases by which environmental chemicals affect human diseases.
- Carcinogenic Potency Database: This database contains the results of 6540 chronic, long-term animal cancer tests on 1547 chemicals.
- International Uniform Chemical Information Database (IUCLID): Physico-chemical properties, environmental fate, toxicity and ecotoxicity of 2,600 chemicals.
- An occupational health database that provides information on chemicals and related occupational diseases.
- A Geographic Information System that provides the amount and location of toxic chemicals released into the environment using maps of the United States.
- Toxics Release Inventory (TRI): Data focused on specific toxic chemicals and their management as waste.
- The Household Products Database: Information on the health effects of 13,000 consumer brands.
- European Chemical Substances Information System (ESIS): Information about chemicals covering a variety of aspects.
- ECOTOX (AQUIRE, PHYTOTOX, TERRETOX): Chemical toxicity data for aquatic life, terrestrial plants and wildlife.
- Information on properties of chemicals including toxicity, ecotoxicity, environmental fate and behavior and physical chemical properties.
- Environmental properties of chemicals.
- Aggregated Computational Toxicology Resource (ACToR): All publicly available chemical toxicity data.
- EPA Human Health Benchmarks for Pesticides (HHBP): Information describing human health benchmarks for pesticides to determine whether the detection of a pesticide in drinking water or source waters for drinking water indicates potential health risks.
- EPA Office of Pesticide Programs’ Aquatic Life Benchmarks (OPPALB): Aquatic ecotoxicity benchmark values from risk assessments developed by the EPA for individual pesticides.
- Chemical Safety Information from Intergovernmental Organizations (INCHEM): Internationally peer-reviewed information derived from intergovernmental organizations describing chemicals commonly used throughout the world.
- JECDB (Japan Existing Chemical Data Base): Toxicity test reports from Japan's existing chemicals safety program.
- Substances in Preparations In the Nordic countries (SPIN): Provides information regarding chemicals in the products of Nordic countries.
- US EPA Substance Registry Services (SRS): A central system of the US EPA and the portal for discovering chemical information at the EPA.
Biodegradative databases store information related to biodegradation of chemicals including xenobiotics-degrading bacteria, metabolic degradation pathways of toxic chemicals, enzymes and genes involved in the biodegradation. These databases include the University of Minnesota Biocatalysis/Biodegradation database (UM-BBD), a database of biodegradative oxygenases (OxDBase), Biodegradation Network-Molecular Biology database (Bionemo), MetaCyc, and BioCyc.
Another database, OxDBase (http://www.imtech.res.in/raghava/oxdbase/), which was developed by the CSIR-Institute of Microbial Technology, Chandigarh, India, stores information regarding oxygenases derived from published literature and databases. Oxygenases are the most important enzymes involved in aerobic degradation of aromatic compounds. There are two types of oxygenases, monooxygenases and dioxygenases. Monooxygenases catalyze incorporation of one atom of molecular oxygen into the substrate, whereas dioxygenases catalyze incorporation of two atoms of molecular oxygen. Dioxygenases are further divided into aromatic ring-hydroxylating dioxygenases (ARHDs) and aromatic ring-cleavage dioxygenases (ARCDs). ARHDs catalyze hydroxylation of aromatic rings, whereas ARCDs catalyze their cleavage. ARCDs are further divided into extradiol and intradiol enzymes. Intradiol ARCDs cleave the aromatic ring between the two hydroxylated carbons, whereas extradiol ARCDs cleave it between a hydroxylated carbon and an adjacent non-hydroxylated carbon. OxDBase provides information about 237 distinct oxygenases, including 118 monooxygenases and 119 dioxygenases (ARHDs and intradiol and extradiol ARCDs). All enzyme entries contain information about (a) the reaction(s) in which the enzymes are involved, (b) their common names and synonyms, (c) structures and gene links, (d) families and subfamilies, (e) literature citations and (f) links to several external databases including the Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg/), UM-BBD, BRENDA, and ENZYME. This database is user-friendly and increases our understanding of aerobic degradation of aromatic compounds.
The Bionemo database (http://bionemo.bioinfo.cnio.es) was developed by the Structural Computational Biology Group at the Spanish National Cancer Research Center. Bionemo is a manually curated database that provides information regarding proteins and genes involved in biodegradation metabolism. The protein information covers sequences, domains and structures, whereas the genomic information covers sequences, regulatory elements and transcription units. Bionemo complements UM-BBD, which focuses on the biochemical aspects of biodegradation. Bionemo has been developed by manually associating sequence database entries with biodegradation reactions based on information extracted from published articles. Information on the transcription units of biodegradation genes and their regulation is linked to the underlying biochemical network. This database is composed of (i) 145 biochemical pathways, (ii) 945 reactions, 342 of which have associated complexes, (iii) 537 enzymatic complexes, (iv) 1107 proteins, (v) 234 microbial species, (vi) 212 transcription units, (vii) 90 transcription factors, (viii) 90 effectors, (ix) 128 TF DNA binding sites and (x) 100 promoters. Like other databases, Bionemo is cross-linked to the following databases: (i) UM-BBD for metabolic reactions; (ii) GenBank for DNA sequences; (iii) UniProt for proteins; (iv) NCBI Taxonomy for microbial species and (v) PubMed for references. The information provided by Bionemo may be helpful for cloning, primer design and directed evolution experiments. The full database is downloadable as a PostgreSQL dump.
MetaCyc is a database of metabolic pathways derived from the scientific experimental literature that comprises more than 2097 experimentally determined metabolic pathways from more than 2460 different organisms. It is the largest curated database of metabolic pathways from all domains of life. This database provides information regarding the metabolic pathways involved in primary and secondary metabolism with associated compounds, enzymes and genes. This database is freely available at http://metacyc.org/. MetaCyc can be used for multiple scientific applications. Specifically, it can (i) provide reference data for computational prediction of the metabolic pathways of organisms from their sequenced genomes, (ii) support metabolic engineering, (iii) facilitate comparison of biochemical networks, and (iv) serve as an encyclopedia of metabolism. This database was developed and is curated by the BioCyc group at SRI International.
BioCyc (http://biocyc.org/) is a collection of more than 2988 organism-specific Pathway/Genome Databases (PGDBs). Each PGDB contains the full genome and predicted metabolic pathways of a single organism. The Pathway Tools software predicts pathways using MetaCyc as a reference database. The predicted metabolic pathways contain information about metabolites, enzymes, and reactions. In addition, BioCyc PGDBs contain information about predicted operons, transport systems and pathway-hole fillers. Pathway Tools-based BioCyc web sites offer multiple tools for querying and analysis of PGDBs, including analysis of gene expression, metabolomics, and other large-scale datasets. This database was developed by the Bioinformatics Research Group at SRI International.
Pathway prediction systems
- Predicts microbial degradation pathways for xenobiotic compounds based on biotransformation rules.
- Predicts pathways for microbial biodegradation of environmental compounds and biosynthesis of plant secondary metabolites.
- Biochemical Network Integrated Computational Explorer (BNICE): Predicts novel thermodynamically feasible pathways on the basis of reaction rules of the Enzyme Commission classification system.
- A Monte Carlo algorithm that identifies metabolic pathways from target compounds using a database of known enzymatic reactions. Also provides amino acid sequences of corresponding enzymes from phylogenetically closely related organisms.
- From Metabolite to Metabolite (FMM): Online tool that predicts the pathway between two compounds based on the KEGG database.
- Algorithm that identifies pathways within existing metabolic networks by tracking the conservation of atoms moving through them.
- Computational framework that advises on optimization of the host’s metabolic network to add a particular metabolic pathway by adding or deleting reactions.
- Predicts all paths between two compounds.
The UM-BBD Pathway Prediction System (PPS) is a part of UM-BBD that may be accessed at http://umbbd.ethz.ch/predict/. The PPS can be used to predict metabolic pathways for microbial degradation of chemical compounds. Predictions are based on biotransformation rules derived from reactions found in the UM-BBD database or in the scientific literature. Users can predict both aerobic and anaerobic degradation pathways of chemicals and can select whether to view all or only the more likely aerobic transformations. Predictions are most accurate for compounds that are similar to compounds whose biodegradation pathways have been reported in the scientific literature. For example, the degradation pathways of 4-nitrophenol have been thoroughly investigated, while those of 2-fluoro-4-nitrophenol and 2-bromo-4-nitrophenol have not. However, the structures of 2-fluoro-4-nitrophenol and 2-bromo-4-nitrophenol are similar to that of 4-nitrophenol; therefore, the PPS can provide very accurate predictions for degradation of 2-fluoro-4-nitrophenol and 2-bromo-4-nitrophenol. For the prediction, users may enter a compound into the system either by drawing the structure and generating SMILES or by entering SMILES directly.
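To make the structural-similarity argument concrete, the following minimal sketch writes the three nitrophenols as SMILES strings and compares them with a deliberately crude measure, a Tanimoto coefficient over character bigrams of the SMILES text. This is an illustrative assumption and not the method used by the PPS, which matches biotransformation rules; the SMILES were written by hand for this example, and the function names (bigrams, tanimoto, rank_by_similarity) are hypothetical.

```python
# Hand-written SMILES for the compounds discussed above (assumed, not taken from UM-BBD).
# Ethanol is included only as a structurally unrelated control.
COMPOUNDS = {
    "4-nitrophenol": "Oc1ccc(cc1)[N+](=O)[O-]",
    "2-fluoro-4-nitrophenol": "Oc1ccc(cc1F)[N+](=O)[O-]",
    "2-bromo-4-nitrophenol": "Oc1ccc(cc1Br)[N+](=O)[O-]",
    "ethanol": "CCO",
}


def bigrams(smiles):
    """Return the set of overlapping two-character fragments of a SMILES string."""
    return {smiles[i:i + 2] for i in range(len(smiles) - 1)}


def tanimoto(smiles_a, smiles_b):
    """Tanimoto coefficient |A & B| / |A | B| on character bigrams (a toy text measure)."""
    set_a, set_b = bigrams(smiles_a), bigrams(smiles_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def rank_by_similarity(reference, compounds):
    """Rank all other compounds by similarity to the reference, most similar first."""
    ref_smiles = compounds[reference]
    scores = [
        (name, tanimoto(ref_smiles, smiles))
        for name, smiles in compounds.items()
        if name != reference
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by_similarity("4-nitrophenol", COMPOUNDS):
        print(f"{name}: {score:.2f}")
```

A real workflow would use a cheminformatics toolkit and proper molecular fingerprints; the text-based measure here only illustrates why the halogenated analogues are expected to sit close to 4-nitrophenol while an unrelated compound does not.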
Another pathway prediction system, PathPred (http://www.genome.jp/tools/pathpred/), is a knowledge-based prediction system that uses data derived from the Kyoto Encyclopedia of Genes and Genomes (KEGG) in the form of the KEGG REACTION database and the KEGG RPAIR database. The KEGG REACTION database contains not only all known enzymatic reactions taken from the IUBMB enzyme nomenclature, but also additional reactions taken from the KEGG metabolic pathways. KEGG RPAIR is a collection of biochemical structure transformation patterns (RDM patterns) for substrate–product pairs (reactant pairs) in KEGG REACTION. PathPred is a web-based server that predicts plausible enzyme-catalyzed reaction pathways from a query compound using information regarding RDM patterns and chemical structure alignments of substrate-product pairs. This server provides plausible reactions and transformed compounds and displays all predicted reaction pathways in a tree-shaped graph. PathPred-based predictions are very accurate for compounds that have biochemical similarity to KEGG compounds. PathPred contains reference pathways (i) for microbial biodegradation of environmental compounds and (ii) for biosynthesis of plant secondary metabolites. Users can select one of the reference pathways according to their purpose. A query compound can be input in several user-friendly ways: (i) in MDL mol file format, (ii) as a SMILES representation, or (iii) as a KEGG compound identifier. In the case of the xenobiotics biodegradation reference pathway, the query should be the compound to be degraded, while in the case of the reference pathway for biosynthesis of secondary metabolites the query should be the end product of biosynthesis. The prediction results are linked to genomic information. The PathPred server provides new and alternative reactions, regardless of whether enzymes for these reactions are known or not.
If the enzyme is unknown, users can use the E-zyme tool (http://www.genome.jp/tools/e-zyme/) to assign a possible EC number (up to the EC sub-subclass). After assigning EC numbers, it is also possible to search for putative genes in the genome based on sequence similarity to known genes with the same EC sub-subclass.
Biochemical Network Integrated Computational Explorer (BNICE) is a computational approach for development of novel pathways based on the reaction rules of the Enzyme Commission classification system. BNICE generates all possible pathways from a given target or starting molecule. In the next step, BNICE screens all generated pathways for thermodynamic feasibility based on the Gibbs free energies of the reactions and selects the novel pathways that are thermodynamically feasible. Soh and Hatzimanikatis suggested that the pathways generated by BNICE can be further evaluated using established pathway analysis approaches, such as thermodynamics-based flux balance analysis (FBA) and GrowMatch, which allow investigation of the overall effects of these novel pathways on metabolic network performance in host organisms. FBA can help predict maximum yield, phenotypic changes, effects of gene knockouts, and changes in the bioenergetics of the system for metabolic engineering, synthetic biology, and biodegradation of xenobiotics. BNICE can be applied in multiple areas: (i) to discover novel pathways for metabolic engineering; (ii) for ‘retrosynthesis’ of metabolic chemicals; (iii) to investigate evolution between metabolic pathways of various organisms; (iv) to analyze metabolic pathways; (v) for mining of omics data; (vi) to select targets for enzyme engineering; and (vii) for analysis of degradation pathways of xenobiotic compounds.
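The screening step described above can be sketched as a simple filter over candidate pathways. This is a minimal illustration under stated assumptions, not BNICE's implementation: the pathway names, reaction labels and Gibbs free energy values are invented for the example, and the feasibility criterion (every step must have an estimated free energy change at or below a threshold) is one plausible choice among several.

```python
from typing import Dict, List, Tuple

# A reaction is (label, estimated Gibbs free energy change in kJ/mol).
Reaction = Tuple[str, float]

# Hypothetical candidate pathways with invented free energy estimates.
CANDIDATES: Dict[str, List[Reaction]] = {
    "route_A": [("hydroxylation", -35.0), ("ring cleavage", -60.5), ("hydrolysis", -12.0)],
    "route_B": [("reduction", -20.0), ("deamination", 18.0)],
    "route_C": [("dehalogenation", -48.0), ("oxidation", -5.5)],
}


def is_feasible(pathway: List[Reaction], max_step_dg: float = 0.0) -> bool:
    """A pathway passes if every step has a free energy change <= max_step_dg."""
    return all(delta_g <= max_step_dg for _, delta_g in pathway)


def screen_pathways(
    candidates: Dict[str, List[Reaction]], max_step_dg: float = 0.0
) -> Dict[str, float]:
    """Return the feasible pathways mapped to their overall free energy change."""
    feasible = {}
    for name, pathway in candidates.items():
        if is_feasible(pathway, max_step_dg):
            feasible[name] = sum(delta_g for _, delta_g in pathway)
    return feasible


if __name__ == "__main__":
    for name, total in screen_pathways(CANDIDATES).items():
        print(f"{name}: overall change {total:.1f} kJ/mol")
```

With the invented values above, route_B is discarded because its deamination step is assumed to be endergonic. Relaxing max_step_dg would admit pathways with mildly unfavourable steps, which may be reasonable when free energy estimates carry large uncertainty.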
A recently developed web tool, Metabolic Tinker (http://osslab.ex.ac.uk/tinker.aspx), can be used to design synthetic metabolic pathways between user-defined target and source compounds. Metabolic Tinker uses a tailored heuristic search strategy to search for thermodynamically feasible paths in the entire known metabolic universe. The program contains a directed graph known as the Universal Reaction Network (URN), which represents the entire set of known reactions and compounds from the Rhea database. Nodes and edges on this graph represent metabolites and reactions, respectively, and thus the entire graph represents the currently known metabolic universe. Metabolic Tinker searches for possible biochemical paths between two compounds within this URN using standard search algorithms developed in computer science and graph theory. The Rhea/ChEBI identification codes of both the source and target compounds are needed to complete the search.
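The graph representation described above can be illustrated with a small directed network in which nodes are metabolites and edges are reactions. The sketch below uses plain breadth-first search to find a shortest path from a source to a target; this is a simpler stand-in for the tailored heuristic search used by Metabolic Tinker, and the compound and reaction identifiers are placeholders rather than real Rhea or ChEBI codes.

```python
from collections import deque

# Toy directed reaction network: metabolite -> list of (reaction, product).
# All identifiers are placeholders invented for this example.
URN = {
    "compound_A": [("rxn_1", "compound_B"), ("rxn_2", "compound_C")],
    "compound_B": [("rxn_3", "compound_D")],
    "compound_C": [("rxn_4", "compound_D"), ("rxn_5", "compound_E")],
    "compound_D": [("rxn_6", "compound_F")],
    "compound_E": [],
    "compound_F": [],
}


def find_path(network, source, target):
    """Breadth-first search for a shortest reaction path from source to target.

    Returns a list of (substrate, reaction, product) steps, an empty list if
    source equals target, or None if the target cannot be reached.
    """
    if source == target:
        return []
    came_from = {source: None}  # product -> (substrate, reaction) used to reach it
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for reaction, product in network.get(current, []):
            if product in came_from:
                continue  # already reached by a path that is at least as short
            came_from[product] = (current, reaction)
            if product == target:
                # Walk back from the target to the source to rebuild the path.
                steps = []
                node = target
                while came_from[node] is not None:
                    substrate, rxn = came_from[node]
                    steps.append((substrate, rxn, node))
                    node = substrate
                return list(reversed(steps))
            queue.append(product)
    return None


if __name__ == "__main__":
    for substrate, rxn, product in find_path(URN, "compound_A", "compound_F"):
        print(f"{substrate} --{rxn}--> {product}")
```

Because edges are directed, a path from compound_A to compound_F does not imply a path in the opposite direction. A thermodynamics-aware search would additionally weight or prune edges, which this sketch does not attempt.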
Computational methods for predicting chemical toxicity
Computational methods for estimating chemical toxicity are evolving rapidly. In recent years, several models have been developed in which computational programs are used to predict the toxicity of chemical compounds [22–24, 67, 68]. Quantitative structure-activity relationship (QSAR) models use mathematical algorithms to calculate toxicity from molecular descriptors, that is, physical characteristics of the chemical structure such as the molecular weight or the number of benzene rings. The following are some examples of commercial and publicly available models:
Sarah Nexus for prediction of the mutagenicity of chemicals.
VirtualToxLab for prediction of the toxic potential (endocrine and metabolic disruption, some aspects of carcinogenicity and cardiotoxicity) of drugs, chemicals and natural products.
Toxicity Estimation Software Tool (TEST) for prediction of the acute toxicity of organic chemicals based on their molecular structures.
TOPKAT for prediction of the ecotoxicity, mutagenicity, and reproductive/developmental toxicity of chemicals.
Ecological Structure Activity Relationships (ECOSAR) for estimation of the acute (short-term) toxicity and chronic (long-term or delayed) toxicity of industrial chemicals to aquatic organisms such as fish, aquatic invertebrates, green algae and aquatic plants using computerized structure activity relationships.
Estimation Programs Interface (EPI) suite for prediction of physical/chemical properties and environmental fate (eco-toxicity). The software calculates chemical property data using programs including KOWWIN, AOPWIN, HENRYWIN, MPBPWIN, BIOWIN, KOCWIN, WSKOWWIN, WATERNT, BCFBAF, HYDROWIN and ECOSAR.
CAESAR for assessment of chemical toxicity under REACH.
ToxiPred, a server for prediction of aqueous toxicity of small chemical molecules in Tetrahymena pyriformis.
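The QSAR idea described above, relating a molecular descriptor to a toxicity value through a fitted mathematical model, can be sketched with a one-descriptor linear regression. This is a minimal illustration only: the training data are synthetic numbers invented for the example (not measured toxicities), molecular weight is used as the single descriptor, and none of the tools listed above is claimed to work this way. Real QSAR models typically combine many descriptors and require careful validation.

```python
# Synthetic training set: (molecular weight, toxicity score). Invented values.
TRAINING_DATA = [
    (100.0, 1.5),
    (150.0, 2.0),
    (200.0, 2.5),
    (250.0, 3.0),
]


def fit_linear_qsar(data):
    """Ordinary least squares fit of toxicity = intercept + slope * descriptor."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    s_xx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_toxicity(model, descriptor_value):
    """Apply a fitted (intercept, slope) model to a new descriptor value."""
    intercept, slope = model
    return intercept + slope * descriptor_value


if __name__ == "__main__":
    model = fit_linear_qsar(TRAINING_DATA)
    print("intercept, slope:", model)
    print("predicted score at MW 300:", predict_toxicity(model, 300.0))
```

Predictions for descriptor values far outside the training range, or for compounds structurally unlike the training set, should be treated with caution.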
Genome sequences of xenobiotic degrading bacteria
The automated Sanger method for sequencing is known as first generation sequencing, whereas newer methods developed for sequencing are considered next generation sequencing (NGS). Commercially available NGS technologies include Roche/454, Illumina/Solexa, SOLiD/Life/APG, Helicos BioSciences, and the Polonator Instrument.
The initial steps of NGS data analysis involve generation of short reads and their subsequent alignment to a reference genome or assembly into contigs. This step is crucial for NGS technologies, and a variety of computational tools have been applied for genome sequence assembly, including SSAKE, SOAPdenovo, ABySS, and Velvet. Once the sequence reads are assembled into contigs, the next steps are gene prediction and functional annotation. The most common gene prediction system for microbial genomes is GLIMMER (Gene Locator and Interpolated Markov ModelER), which identifies coding regions in the microbial genome based on interpolated Markov models [83, 84]. The predicted coding region sequences may be analyzed and evaluated manually or by automatic annotation software to identify homologous genes. A variety of automatic pipelines are available for bacterial annotation, including online tools such as RAST, BASys, WeGAS and MaGe/Microscope, as well as offline tools such as AGeS, DIYA and PIPA. Furthermore, MICheck may be used to check for syntactic errors in annotated sequences.
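As a rough illustration of the gene prediction step, the sketch below scans the forward strand of an assembled sequence for open reading frames (a start codon followed in frame by a stop codon). This is a much simpler technique than the interpolated Markov models used by GLIMMER and is named here only as a stand-in; the function name, the minimum-length parameter and the example sequence are assumptions made for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Find forward-strand open reading frames in all three reading frames.

    Returns a list of (start, end, orf_sequence) tuples, where start is the
    0-based index of the ATG and end is the index just past the stop codon.
    ORFs shorter than min_codons codons (stop codon included) are skipped.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG of the ORF currently open
        for pos in range(frame, len(seq) - 2, 3):
            codon = seq[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                end = pos + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # close the ORF and look for the next start codon
    return orfs


if __name__ == "__main__":
    contig = "CCATGAAATTTGGGTAACC"  # short made-up sequence
    for start, end, orf in find_orfs(contig):
        print(start, end, orf)
```

A practical gene finder would also scan the reverse complement, handle alternative start codons, and score candidate regions statistically rather than accepting every ORF above a length cutoff.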
NGS ignited a revolution in biodegradation and bioremediation with the concept of “from genomics to metabolomics.” Bacterial genomics is the study of the whole genomes of bacteria, in which genes involved in biodegradation and other metabolic processes can be predicted. The whole genomes of several xenobiotic-degrading bacteria have been sequenced using NGS technology, and several xenobiotic-degrading genes have been identified through gene prediction and annotation of the bacterial genomes [93–97]. In silico analysis of a bacterial genome leads to prediction of metabolic pathways for the biodegradation of xenobiotics and gives a holistic view of the metabolic network of that bacterium. Several metabolic pathways may be predicted from the genomes of xenobiotic-degrading bacteria [99, 100]. For example, the whole genome of Cupriavidus necator JMP134 (previously known as Ralstonia eutropha strain JMP134), which utilizes a variety of aromatic and chloroaromatic compounds as sole carbon and energy sources, was sequenced, and several genes coding for enzymes involved in the degradation of various xenobiotic compounds were identified [100, 101]. The genome of strain JMP134 comprises four replicons (two chromosomes and two plasmids) with a total of 6631 protein-coding genes. The C. necator JMP134 genome contains 300 genes putatively involved in central ring-cleavage pathways of various aromatic compounds.
In silico analysis of the genome of Pseudomonas putida KT2440 showed the presence of the following pathways for degradation of aromatic compounds: (i) the ortho pathway for the catabolism of protocatechuate (pca genes) and catechol (cat genes), (ii) the phenylacetate pathway (pha genes), and (iii) the homogentisate pathway (hmg genes). Additionally, gene clusters for catabolism of N-heterocyclic aromatic compounds (nic cluster) and for a central meta-cleavage pathway (pcm genes) were also identified in the genome of this microorganism.
Whole-genome sequences are useful not only for prediction of genes and their functions, but also for identification of novel biocatalysts. Combining the genomic approach with proteomic approaches will lead to new insights into metabolism at the organism level. Kim et al. used metabolic, genomic and proteomic approaches to construct a complete and integrated pathway for pyrene degradation in Mycobacterium vanbaalenii PYR-1, identifying 27 enzymes involved in the pathway on the basis of genomic and proteomic data.
Several databases have been developed to provide information on chemicals and their biodegradation, and users can query them according to their research interests. For example, chemical databases provide information on the toxicity, risk assessment, and environmental properties of chemicals. Several bioinformatics tools have also been developed for predicting the toxicity of chemicals. In addition, several pathway prediction systems are available for predicting degradation pathways of chemicals whose degradation has not been reported in the literature. The UM-BBD PPS and PathPred are well-known pathway prediction systems for biodegradation. Using these systems, users can not only predict degradation pathways but also identify the enzymes involved in them. This approach would be very useful for metabolic engineering and for developing bioremediation strategies. The major limitation of pathway prediction is that the predicted pathways have not yet been experimentally verified; future experimental studies should be carried out to verify them. Furthermore, the genomes of several xenobiotic-degrading bacteria have been sequenced using NGS, and the genes and enzymes involved in biodegradation have been identified through gene annotation. In the future, molecular techniques combined with bioinformatics tools may provide new insights into the genetics of biodegradation.
This work was supported by a Grant from the Next-Generation Biogreen 21 Program (PJ00806302), Rural Development Administration, Republic of Korea.
- Ellis LB, Wackett LP: Use of the University of Minnesota Biocatalysis/Biodegradation Database for study of microbial degradation. Microb Inform Exp. 2012, 2: 1-10.1186/2042-5783-2-1.PubMed CentralView ArticlePubMedGoogle Scholar
- Arora P, Shi W: Tools of bioinformatics in biodegradation. Rev Environ Sci Biotechnol. 2010, 9: 211-213. 10.1007/s11157-010-9211-x.View ArticleGoogle Scholar
- Andrady AL: Biodegradation of plastics: monitoring what happens. Plastics Additives. 1998, 1: 32-40. 10.1007/978-94-011-5862-6_5. Springer NetherlandsView ArticleGoogle Scholar
- Arora PK, Sasikala C, Ramana CV: Degradation of chlorinated nitroaromatic compounds. Appl Microbiol Biotechnol. 2012, 93 (6): 2265-2277. 10.1007/s00253-012-3927-1.View ArticlePubMedGoogle Scholar
- Arora PK, Srivastava A, Singh VP: Bacterial degradation of nitrophenols and their derivatives. J Hazard Mater. 2014, 266: 42-59.View ArticlePubMedGoogle Scholar
- Arora PK, Bae H: Bacterial degradation of chlorophenols and their derivatives. Microb Cell Fact. 2014, 13: 31-10.1186/1475-2859-13-31.PubMed CentralView ArticlePubMedGoogle Scholar
- Karigar CH, Rao SS: Role of microbial enzymes in the bioremediation of pollutants: a review. Enzyme Res. 2011, 2011: 11-View ArticleGoogle Scholar
- Arora PK, Srivastava A, Singh VP: Application of monooxygenases in dehalogenation, desulphurization, denitrification and hydroxylation of aromatic compounds. J Bioremed Biodegrad. 2010, 1: 112-View ArticleGoogle Scholar
- Katara P: Role of bioinformatics and pharmacogenomics in drug discovery and development process. Netw Modeling Anal Health Inform Bioinforma. 2013, 2 (4): 225-230. 10.1007/s13721-013-0039-5.View ArticleGoogle Scholar
- Debes JD, Urrutia R: Bioinformatics tools to understand human diseases. Surgery. 2004, 135: 579-585. 10.1016/j.surg.2003.11.010.View ArticlePubMedGoogle Scholar
- Ellis LBM, Roe D, Wackett LP: The University of Minnesota Biocatalysis/Biodegradation Database: the first decade. Nucleic Acids Res. 2006, 34: D517-D521. 10.1093/nar/gkj076.PubMed CentralView ArticlePubMedGoogle Scholar
- Arora PK, Kumar M, Chauhan A, Raghava GP, Jain RK: OxDBase: a database of oxygenases involved in biodegradation. BMC Res Notes. 2009, 2: 67-10.1186/1756-0500-2-67.PubMed CentralView ArticlePubMedGoogle Scholar
- Carbajosa G, Trigo A, Valencia A, Cases I: Bionemo: molecular information on biodegradation metabolism. Nucleic Acids Res. 2009, 37 (Database issue): D598-602.PubMed CentralView ArticlePubMedGoogle Scholar
- Caspi R, Altman T, Dreher K, Fulcher CA, Subhraveti P, Keseler IM, Kothari A, Kubo A, Krummenacker M, Latendresse M, Mueller LA, Ong Q, Paley S, Subhraveti P, Weaver DS, Weerasinghe D, Zhang P, Karp PD: The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res. 2012, 40 (D1): D742-D753. 10.1093/nar/gkr1014.PubMed CentralView ArticlePubMedGoogle Scholar
- Greene N: Computer systems for the prediction of toxicity: an update. Adv Drug Deliv Rev. 2002, 54 (3): 417-431. 10.1016/S0169-409X(02)00012-1.View ArticlePubMedGoogle Scholar
- Mohan CG, Gandhi T, Garg D, Shinde R: Computer-assisted methods in chemical toxicity prediction. Mini Rev Med Chem. 2007, 7 (5): 499-507. 10.2174/138955707780619554.View ArticlePubMedGoogle Scholar
- Gao J, Ellis LB, Wackett LP: The University of Minnesota pathway prediction system: multi-level prediction and visualization. Nucleic Acids Res. 2011, 39 (Suppl 2): W406-W411.
- Moriya Y, Shigemizu D, Hattori M, Tokimatsu T, Kotera M, Goto S, Kanehisa M: PathPred: an enzyme-catalyzed metabolic pathway prediction server. Nucleic Acids Res. 2010, 38: W138-W143. 10.1093/nar/gkq318.
- Finley SD, Broadbelt LJ, Hatzimanikatis V: Computational framework for predictive biodegradation. Biotechnol Bioeng. 2009, 104: 1086-1097. 10.1002/bit.22489.
- Chou CH, Chang WC, Chiu CM, Huang CC, Huang HD: FMM: a web server for metabolic pathway reconstruction and comparative analysis. Nucleic Acids Res. 2009, 37: W129-W134. 10.1093/nar/gkp264.
- McClymont K, Soyer OS: Metabolic tinker: an online tool for guiding the design of synthetic metabolic pathways. Nucleic Acids Res. 2013, 41 (11): e113. 10.1093/nar/gkt234.
- Zheng M, Liu Z, Xue C, Zhu W, Chen K, Luo X, Jiang H: Mutagenic probability estimation of chemical compounds by a novel molecular electrophilicity vector and support vector machine. Bioinformatics. 2006, 22: 2099-2106. 10.1093/bioinformatics/btl352.
- Wang Y, Lu J, Wang F, Shen Q, Zheng M, Luo X, Zhu W, Jiang H, Chen K: Estimation of carcinogenicity using molecular fragments tree. J Chem Inf Model. 2012, 52: 1994-2003. 10.1021/ci300266p.
- Chen L, Lu J, Zhang J, Feng KR, Zheng MY, Cai YD: Predicting chemical toxicity effects based on chemical-chemical interactions. PLoS One. 2013, 8 (2): e56517. 10.1371/journal.pone.0056517.
- The ChemIDplus. [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CHEM]
- Schöning G: Classification & labelling inventory: role of ECHA and notification requirements. Ann Ist Super Sanita. 2011, 47 (2): 140-145.
- The NCLASS (the Nordic N-Class Database on Environmental Hazard Classification). [http://apps.kemi.se/nclass/default.asp]
- The Hazardous Substances Data Bank (HSDB). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?HSDB]
- The Toxicology Literature Online (TOXLINE). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?TOXLINE]
- The Chemical Carcinogenesis Research Information System (CCRIS). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CCRIS]
- The Developmental and Reproductive Toxicology Database (DART). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?DARTETIC]
- The Genetic Toxicology Data Bank (GENE-TOX). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?GENETOX]
- The Integrated Risk Information System (IRIS). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?IRIS]
- Wullenweber A, Kroner O, Kohrman M, Maier A, Dourson M, Rak A, Wexler P, Tomljanovic C: Resources for global risk assessment: The International Toxicity Estimates for Risk (ITER) and Risk Information Exchange (RiskIE) databases. Toxicol Appl Pharmacol. 2008, 233: 45-53. 10.1016/j.taap.2007.12.035.
- Wexler P: TOXNET: an evolving web resource for toxicology and environmental health information. Toxicology. 2001, 157: 3-10. 10.1016/S0300-483X(00)00337-1.
- Schmidt U, Struck S, Gruening B, Hossbach J, Jaeger IS, Parol R, Lindequist U, Teuscher E, Preissner R: SuperToxic: a comprehensive database of toxic compounds. Nucleic Acids Res. 2009, 37 (Database issue): D295-D299.
- Kinsner-Ovaskainen A, Rzepka R, Rudowski R, Coecke S, Cole T, Prieto P: Acutoxbase, an innovative database for in vitro acute toxicity studies. Toxicol In Vitro. 2009, 23: 476-485. 10.1016/j.tiv.2008.12.019.
- The CTD (Comparative Toxicogenomics Database). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CTD]
- The Carcinogenic Potency Database. [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?CPDB.htm]
- The IUCLID - International Uniform Chemical Information Database. [http://iuclid.eu/]
- The Haz Map. [http://hazmap.nlm.nih.gov/]
- Hochstein C, Szczur M: TOXMAP: a GIS-based gateway to environmental health resources. Med Ref Serv Q. 2006, 25 (3): 13-31. 10.1300/J115v25n03_02.
- The Toxics Release Inventory (TRI). [http://toxnet.nlm.nih.gov/cgi-bin/sis/htmlgen?TRI]
- The Household Products Database. [http://hpd.nlm.nih.gov/]
- The ESIS, European chemical Substances Information System. [http://esis.jrc.ec.europa.eu/]
- The ECOTOX (AQUIRE, PHYTOTOX, TERRETOX). [http://cfpub.epa.gov/ecotox/]
- The eChemPortal. [http://www.echemportal.org/echemportal/index?pageID=0&request_locale=en]
- The EnviChem. [http://www.echemportal.org/echemportal/participant/participantinfo.action?participantID=5&pageID=2]
- The ACToR (Aggregated Computational Toxicology Resource). [http://actor.epa.gov/actor/faces/BasicInfo.jsp]
- The EPA Human Health Benchmarks for Pesticides (HHBP). [http://iaspub.epa.gov/apex/pesticides/f?p=HHBP:home]
- The EPA Office of Pesticide Programs’ Aquatic Life Benchmarks (OPPALB). [http://www.epa.gov/oppefed1/ecorisk_ders/aquatic_life_benchmark.htm]
- The Chemical Safety Information from Intergovernmental Organizations-INCHEM. [http://www.inchem.org/pages/about.html]
- The JECDB: Japan Existing Chemical Data Base. [http://dra4.nihs.go.jp/mhlw_data/jsp/SearchPageENG.jsp]
- The SPIN (Substances in Preparations In the Nordic countries). [http://www.spin2000.net/]
- The US EPA: Substance Registry Services (SRS). [http://iaspub.epa.gov/sor_internet/registry/substreg/home/overview/home.do]
- Medema MH, van Raaphorst R, Takano E, Breitling R: Computational tools for the synthetic design of biochemical pathways. Nat Rev Microbiol. 2012, 10 (3): 191-202. 10.1038/nrmicro2717.
- Soh KC, Hatzimanikatis V: DREAMS of metabolism. Trends Biotechnol. 2010, 28 (10): 501-508. 10.1016/j.tibtech.2010.07.002.
- Dale JM, Popescu L, Karp PD: Machine learning methods for metabolic pathway prediction. BMC Bioinformatics. 2010, 11 (1): 15. 10.1186/1471-2105-11-15.
- Green ML, Karp PD: A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases. BMC Bioinformatics. 2004, 5 (1): 76. 10.1186/1471-2105-5-76.
- Piškur J, Schnackerz KD, Andersen G, Björnberg O: Comparative genomics reveals novel biochemical pathways. Trends Genet. 2007, 23 (8): 369-372. 10.1016/j.tig.2007.05.007.
- Cheng Q, Harrison R, Zelikovsky A: MetNetAligner: a web service tool for metabolic network alignments. Bioinformatics. 2009, 25 (15): 1989-1990. 10.1093/bioinformatics/btp287.
- Osterman A, Overbeek R: Missing genes in metabolic pathways: a comparative genomics approach. Curr Opin Chem Biol. 2003, 7 (2): 238-251. 10.1016/S1367-5931(03)00027-9.
- Hatzimanikatis V, Li C, Ionita JA, Henry CS, Jankowski MD, Broadbelt LJ: Exploring the diversity of complex metabolic networks. Bioinformatics. 2005, 21 (8): 1603-1609. 10.1093/bioinformatics/bti213.
- Rodrigo G, Carrera J, Prather KJ, Jaramillo A: DESHARKY: automatic design of metabolic pathways for optimal cell growth. Bioinformatics. 2008, 24 (21): 2554-2556. 10.1093/bioinformatics/btn471.
- Heath AP, Bennett GN, Kavraki LE: Finding metabolic pathways using atom tracking. Bioinformatics. 2010, 26: 1548-1555. 10.1093/bioinformatics/btq223.
- Pharkya P, Burgard AP, Maranas CD: OptStrain: a computational framework for redesign of microbial production systems. Genome Res. 2004, 14: 2367-2376. 10.1101/gr.2872004.
- Benfenati E: Predicting toxicity through computers: a changing world. Chem Cent J. 2007, 1 (1): 1-7. 10.1186/1752-153X-1-1.
- Mishra NK: Computational modeling of P450s for toxicity prediction. Expert Opin Drug Metab Toxicol. 2011, 7 (10): 1211-1231. 10.1517/17425255.2011.611501.
- Eriksson L, Jaworska J, Worth A, Cronin M, McDowell RM, Gramatica P: Methods for reliability, uncertainty assessment, and applicability evaluations of regression based and classification QSARs. Environ Health Perspect. 2003, 111: 1361-1375. 10.1289/ehp.5758.
- The Sarah Nexus. [http://www.lhasalimited.org/products/sarah-nexus.htm]
- Vedani A, Smiesko M, Spreafico M, Peristera O, Dobler M: Virtual ToxLab–in silico prediction of the toxic (endocrine-disrupting) potential of drugs, chemicals and natural products: two years and 2,000 compounds of experience: a progress report. ALTEX. 2009, 26 (3): 167-176.
- The Toxicity Estimation Software Tool (TEST). [http://www.epa.gov/nrmrl/std/qsar/qsar.html]
- Prival MJ: Evaluation of the TOPKAT system for predicting the carcinogenicity of chemicals. Environ Mol Mutagen. 2001, 37 (1): 55-69. 10.1002/1098-2280(2001)37:1<55::AID-EM1006>3.0.CO;2-5.
- The Ecological Structure Activity Relationships. [http://www.epa.gov/oppt/newchems/tools/21ecosar.htm]
- The Estimation Programme Interface (EPI) Suite. US EPA. [http://www.epa.gov/opptintr/exposure/pubs/episuite.htm]
- Cassano A, Manganaro A, Martin T, Young D, Piclin N, Pintore M, Bigoni D, Benfenati E: CAESAR models for developmental toxicity. Chem Cent J. 2010, 4 (Suppl 1): S4. 10.1186/1752-153X-4-S1-S4.
- Mishra NK, Singla D, Agarwal S, Consortium OSDD, Raghava GPS: ToxiPred: a server for prediction of aqueous toxicity of small chemical molecules in T. pyriformis. J Transl Toxicol. 2014, 1: 21-27.
- Metzker ML: Sequencing technologies–the next generation. Nat Rev Genet. 2010, 11: 31-46. 10.1038/nrg2626.
- Warren RL, Sutton GG, Jones SJ, Holt RA: Assembling millions of short DNA sequences using SSAKE. Bioinformatics. 2007, 23: 500-501. 10.1093/bioinformatics/btl629.
- Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, Li S, Yang H, Wang J, Wang J: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20: 265-272. 10.1101/gr.097261.109.
- Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I: ABySS: a parallel assembler for short read sequence data. Genome Res. 2009, 19: 1117-1123. 10.1101/gr.089532.108.
- Zerbino DR, Birney E: Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008, 18: 821-829. 10.1101/gr.074492.107.
- Delcher AL, Harmon D, Kasif S, White O, Salzberg SL: Improved microbial gene identification with GLIMMER. Nucleic Acids Res. 1999, 27: 4636-4641. 10.1093/nar/27.23.4636.
- Richardson EJ, Watson M: The automatic annotation of bacterial genomes. Brief Bioinform. 2013, 14 (1): 1-12. 10.1093/bib/bbs007.
- Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, Meyer F, Olsen GJ, Olson R, Osterman AL, Overbeek RA, McNeil LK, Paarmann D, Paczian T, Parrello B, Pusch GD, Reich C, Stevens R, Vassieva O, Vonstein V, Wilke A, Zagnitko O: The RAST Server: rapid annotations using subsystems technology. BMC Genomics. 2008, 9: 75. 10.1186/1471-2164-9-75.
- Van Domselaar GH, Stothard P, Shrivastava S, Cruz JA, Guo A, Dong X, Lu P, Szafron D, Greiner R, Wishart DS: BASys: a web server for automated bacterial genome annotation. Nucleic Acids Res. 2005, 33: W455-W459. 10.1093/nar/gki593.
- Lee D, Seo H, Park C, Park K: WeGAS: a web-based microbial genome annotation system. Biosci Biotechnol Biochem. 2009, 73: 213-216. 10.1271/bbb.80567.
- Vallenet D, Labarre L, Rouy Z, Barbe V, Bocs S, Cruveiller S, Lajus A, Pascal G, Scarpelli C, Médigue C: MaGe: a microbial genome annotation system supported by synteny results. Nucleic Acids Res. 2006, 34: 53-65. 10.1093/nar/gkj406.
- Kumar K, Desai V, Cheng L, Khitrov M, Grover D, Satya RV, Yu C, Zavaljevski N, Reifman J: AGeS: a software system for microbial genome sequence annotation. PLoS One. 2011, 6: e17469. 10.1371/journal.pone.0017469.
- Stewart AC, Osborne B, Read TD: DIYA: a bacterial annotation pipeline for any genomics lab. Bioinformatics. 2009, 25: 962-963. 10.1093/bioinformatics/btp097.
- Yu C, Zavaljevski N, Desai V, Johnson S, Stevens FJ, Reifman J: The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation. BMC Bioinformatics. 2008, 9: 52. 10.1186/1471-2105-9-52.
- Cruveiller S, Le Saux J, Vallenet D, Lajus A, Bocs S, Médigue C: MICheck: a web tool for fast checking of syntactic annotations of bacterial genomes. Nucleic Acids Res. 2005, 33: W471-W479. 10.1093/nar/gki498.
- Lee SH, Jin HM, Lee HJ, Kim JM, Jeon CO: Complete genome sequence of the BTEX-degrading bacterium Pseudoxanthomonas spadix BD-a59. J Bacteriol. 2012, 194 (2): 544. 10.1128/JB.06436-11.
- Köhler KA, Rückert C, Schatschneider S, Vorhölter FJ, Szczepanowski R, Blank LM, Niehaus K, Goesmann A, Pühler A, Kalinowski J, Schmid A: Complete genome sequence of Pseudomonas sp. strain VLB120 a solvent tolerant, styrene degrading bacterium, isolated from forest soil. J Biotechnol. 2013, 168 (4): 729-730. 10.1016/j.jbiotec.2013.10.016.
- Schneiker S, Martins dos Santos VA, Bartels D, Bekel T, Brecht M, Buhrmester J, Chernikova TN, Denaro R, Ferrer M, Gertler C, Goesmann A, Golyshina OV, Kaminski F, Khachane AN, Lang S, Linke B, McHardy AC, Meyer F, Nechitaylo T, Pühler A, Regenhardt D, Rupp O, Sabirova JS, Selbitschka W, Yakimov MM, Timmis KN, Vorhölter FJ, Weidner S, Kaiser O, Golyshin PN: Genome sequence of the ubiquitous hydrocarbon-degrading marine bacterium Alcanivorax borkumensis. Nat Biotechnol. 2006, 24: 997-1004. 10.1038/nbt1232.
- Vikram S, Kumar S, Vaidya B, Pinnaka AK, Raghava GPS: Draft genome sequence of the 2-chloro-4-nitrophenol-degrading bacterium Arthrobacter sp. strain SJCon. Genome Announc. 2013, 1 (2): e0005813.
- Kumar S, Vikram S, Raghava GPS: Genome sequence of the nitroaromatic compound-degrading bacterium Burkholderia sp. strain SJ98. J Bacteriol. 2012, 194 (12): 3286. 10.1128/JB.00497-12.
- Vilchez-Vargas R, Junca H, Pieper DH: Metabolic networks, microbial ecology and ‘omics’ technologies: towards understanding in situ biodegradation processes. Environ Microbiol. 2010, 12 (12): 3089-3104. 10.1111/j.1462-2920.2010.02340.x.
- Romero-Silva MJ, Méndez V, Agulló L, Seeger M: Genomic and functional analyses of the gentisate and protocatechuate ring-cleavage pathways and related 3-hydroxybenzoate and 4-hydroxybenzoate peripheral pathways in Burkholderia xenovorans LB400. PLoS One. 2013, 8 (2): e56038. 10.1371/journal.pone.0056038.
- Pérez-Pantoja D, De la Iglesia R, Pieper DH, González B: Metabolic reconstruction of aromatic compounds degradation from the genome of the amazing pollutant-degrading bacterium Cupriavidus necator JMP134. FEMS Microbiol Rev. 2008, 32: 736-794. 10.1111/j.1574-6976.2008.00122.x.
- Lykidis A, Pérez-Pantoja D, Ledger T, Mavromatis K, Anderson IJ, Ivanova NN, Hooper SD, Lapidus A, Lucas S, González B, Kyrpides NC: The complete multipartite genome sequence of Cupriavidus necator JMP134, a versatile pollutant degrader. PLoS One. 2010, 5 (3): e9729. 10.1371/journal.pone.0009729.
- Jiménez JI, Miñambres B, Garcia JL, Díaz E: Genomic analysis of the aromatic catabolic pathways from Pseudomonas putida KT2440. Environ Microbiol. 2002, 4 (12): 824-841. 10.1046/j.1462-2920.2002.00370.x.
- Kim SJ, Kweon O, Jones RC, Freeman JP, Edmondson RD, Cerniglia CE: Complete and integrated pyrene degradation pathway in Mycobacterium vanbaalenii PYR-1 based on systems biology. J Bacteriol. 2007, 189: 464-472. 10.1128/JB.01310-06.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.