- PerturbationAnalyzer: A tool for investigating the effects of concentration perturbation on protein interaction networks.
- Publication Date: 2009 Nov 13 PMID: 19914922
Authors: Li, F. - Li, P. - Xu, W. - Peng, Y. - Bo, X. - Wang, S. Journal: Bioinformatics
SUMMARY: The propagation of perturbations in protein concentration through a protein interaction network can shed light on network dynamics and function. In order to facilitate this type of study, PerturbationAnalyzer, which is an open-source plugin for Cytoscape, has been developed. PerturbationAnalyzer can be used in manual mode for simulating user-defined perturbations, as well as in batch mode for evaluating network robustness and identifying significant proteins that cause large propagation effects in the PINs when their concentrations are perturbed. Results from PerturbationAnalyzer can be represented in an intuitive and customizable way and can also be exported for further exploration. PerturbationAnalyzer has great potential in mining the design principles of protein networks, and may be a useful tool for identifying drug targets. AVAILABILITY: PerturbationAnalyzer can be accessed from the Cytoscape website http://www.cytoscape.org/plugins/index.php or website http://biotech.bmi.ac.cn/PerturbationAnalyzer. CONTACT: boxc@bmi.ac.cn; sqwang@bmi.ac.cn.
- PyNAST: A flexible tool for aligning sequences to a template alignment.
- Publication Date: 2009 Nov 13 PMID: 19914921
Authors: Caporaso, J. G. - Bittinger, K. - Bushman, F. D. - Desantis, T. Z. - Andersen, G. L. - Knight, R. Journal: Bioinformatics
MOTIVATION: The Nearest Alignment Space Termination (NAST) tool is commonly used in sequence-based microbial ecology community analysis, but due to the limited portability of the original implementation, it has not been as widely adopted as possible. PyNAST is a complete re-implementation of NAST, which includes three convenient interfaces: a Mac OS X GUI, a command line interface, and a simple API. RESULTS: The availability of PyNAST will make the popular NAST algorithm more portable and thereby applicable to data sets orders of magnitude larger by allowing users to install PyNAST on their own hardware. Additionally because users can align to arbitrary template alignments, a feature not available via the original NAST web interface, the NAST algorithm will be readily applicable to novel tasks outside of microbial community analysis. AVAILABILITY: PyNAST is available at http://pynast.sourceforge.net. CONTACT: rob.knight@colorado.edu.
- edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.
- Publication Date: 2009 Nov 11 PMID: 19910308
Authors: Robinson, M. D. - McCarthy, D. J. - Smyth, G. K. Journal: Bioinformatics
SUMMARY: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An over-dispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of over-dispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. AVAILABILITY: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org). CONTACT: mrobinson@wehi.edu.au.
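The over-dispersed count model mentioned in the abstract can be illustrated with a short sketch. It assumes the common negative binomial parameterisation, in which the variance is `mu + dispersion * mu^2`, and uses a plain method-of-moments estimate of the dispersion. This is not edgeR's estimator (the abstract describes empirical Bayes moderation across transcripts, which is omitted here); the function names and the replicate counts are hypothetical.

```python
from statistics import mean, variance


def nb_variance(mu, dispersion):
    """Variance of an over-dispersed count with mean mu: mu + dispersion * mu^2.

    With dispersion == 0 this reduces to the Poisson variance (equal to mu).
    """
    return mu + dispersion * mu ** 2


def moment_dispersion(counts):
    """Method-of-moments dispersion estimate from replicate counts.

    Solves sample_variance = m + phi * m^2 for phi and floors the result at 0.
    """
    m = float(mean(counts))
    v = float(variance(counts))  # sample variance (n - 1 denominator)
    return max(0.0, (v - m) / m ** 2)


# Hypothetical counts for one transcript across four replicates.
replicates = [90, 110, 130, 70]
phi = moment_dispersion(replicates)
print(f"estimated dispersion: {phi:.4f}")
print(f"implied variance at mean 100: {nb_variance(100, phi):.1f}")
```

A per-transcript estimate like this is very noisy with few replicates, which is the motivation the abstract gives for moderating the dispersion across transcripts.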
- Automatic clustering of docking poses in virtual screening process using self-organising map.
- Publication Date: 2009 Nov 12 PMID: 19910307
Authors: Bouvier, G. - Evrard-Todeschi, N. - Girault, J. P. - Bertho, G. Journal: Bioinformatics
MOTIVATION: Scoring functions provided by docking software are still a major limiting factor when classifying compounds in the virtual screening process. Score analysis of the docking cannot identify all active compounds, because ligand binding energies are poorly estimated. Assuming that active compounds must make specific contacts with their target to display activity, it should be possible to discriminate active compounds from inactive ones by careful analysis of the interatomic contacts between the molecule and the target. However, compound clustering is very tedious due to the large number of contacts extracted from the different conformations proposed by docking experiments. RESULTS: Structural analysis of docked structures is processed in three steps: (1) a Kohonen self-organising map (SOM) training phase using drug-protein contact descriptors, followed by (2) an unsupervised cluster analysis and (3) generation of a Newick file for visualising the results as a tree. The docking poses are then analysed and classified quickly and automatically by AuPosSOM (Automatic analysis of Poses using SOM). AuPosSOM can be integrated into virtual screening strategies currently in use. We demonstrate that it is possible to discriminate active compounds from inactive ones using only the mean protein contact footprints calculated from the multiple conformations given by the docking software. The chemical structure of the compound and information on key binding residues are not necessary to identify active molecules. Thus, contact activity relationship (CAR) can be employed as a new virtual screening process. AVAILABILITY: AuPosSOM is available at http://www.aupossom.com. CONTACT: contact@aupossom.com SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- Identification of substrates for Ser/Thr kinases using residue based statistical pair potentials.
- Publication Date: 2009 Nov 12 PMID: 19910306
Authors: Kumar, N. - Mohanty, D. Journal: Bioinformatics
MOTIVATION: In silico methods are being widely used for identifying substrates for various kinases and deciphering cell signaling networks. However, most of the available phosphorylation site prediction methods use motifs or profiles derived from a known data set of kinase substrates, and hence their applicability is limited to those kinase families for which experimental substrate data are available. This prompted us to develop a novel multi-scale structure-based approach which does not require training using experimental substrate data. RESULTS: In this work, for the first time, we have used residue-based statistical pair potentials for scoring the binding energy of various substrate peptides in complex with kinases. Extensive benchmarking on the Phospho.ELM data set indicates that our method outperforms other structure-based methods and has a prediction accuracy comparable to available sequence-based methods. We also demonstrate that the rank of the true substrate can be further improved if the high-scoring candidate substrates that are shortlisted based on pair potential score are modeled using an all-atom force field and the MM/PBSA approach. CONTACT: deb@nii.res.in.
- Target prediction and a statistical sampling algorithm for RNA-RNA interaction.
- Publication Date: 2009 Nov 12 PMID: 19910305
Authors: Huang, F. W. - Qin, J. - Reidys, C. M. - Stadler, P. F. Journal: Bioinformatics
MOTIVATION: It has been proven that the accessibility of the target sites has a critical influence on RNA-RNA binding in general and the specificity and efficiency of miRNAs and siRNAs in particular. Recently, O(N^6) time and O(N^4) space dynamic programming algorithms have become available that compute the partition function of RNA-RNA interaction complexes, thereby providing detailed insights into their thermodynamic properties. RESULTS: Modifications to the grammars underlying earlier approaches enable the calculation of interaction probabilities for any given interval on the target RNA. The computation of the "hybrid probabilities" is complemented by a stochastic sampling algorithm that produces a Boltzmann-weighted ensemble of RNA-RNA interaction structures. The sampling of k structures requires only negligible additional memory resources and runs in O(k · N^3). AVAILABILITY: The algorithms described here are implemented in C as part of the rip package. The source code of rip2 can be downloaded from http://www.combinatorics.cn/cbpc/rip2.html and http://www.bioinf.uni-leipzig.de/Software/rip.html. CONTACT: Christian Reidys, duck@santafe.edu.
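To make the idea of a Boltzmann-weighted ensemble concrete, the sketch below draws structures with probability proportional to exp(-E/RT) from an explicit toy list. It is not the rip2 algorithm, which is described as working from the partition function rather than an enumerated list; the structure labels, the energies, and the value of `RT` (about 0.616 kcal/mol near 37 °C) are assumptions made for illustration.

```python
import math
import random

RT = 0.616  # kcal/mol at roughly 37 degrees C (assumed units and temperature)


def boltzmann_weights(energies, rt=RT):
    """Return probabilities proportional to exp(-E / RT), normalised to sum to 1."""
    e_min = min(energies)  # shift by the minimum energy for numerical stability
    weights = [math.exp(-(e - e_min) / rt) for e in energies]
    z = sum(weights)  # partition function of the toy ensemble (after the shift)
    return [w / z for w in weights]


def sample_structures(structures, energies, k, seed=0):
    """Draw k structures (with replacement) according to their Boltzmann weights."""
    rng = random.Random(seed)
    probs = boltzmann_weights(energies)
    return rng.choices(structures, weights=probs, k=k)


# Hypothetical interaction structures with free energies in kcal/mol.
toy_structures = ["s1", "s2", "s3"]
toy_energies = [-10.0, -9.0, -5.0]
print(boltzmann_weights(toy_energies))
print(sample_structures(toy_structures, toy_energies, k=5))
```

Lower-energy structures dominate the sample, but higher-energy alternatives still appear at a rate set by their weight, which is what distinguishes ensemble sampling from reporting only the minimum free energy structure.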
- Co-expression Networks: Graph Properties and Topological Comparisons.
- Publication Date: 2009 Nov 12 PMID: 19910304
Authors: Xulvi-Brunet, R. - Li, H. Journal: Bioinformatics
MOTIVATION: Microarray-based gene expression data have been generated widely to study different biological processes and systems. Gene co-expression networks are often used to extract information about groups of genes that are "functionally" related or co-regulated. However, the structural properties of such co-expression networks have not been rigorously studied and fully compared with known biological networks. In this paper, we aim to investigate the structural properties of co-expression networks inferred for the species S. cerevisiae and to compare them with the topological properties of the known, well-established transcriptional network, MIPS physical network and protein-protein interaction network of yeast. RESULTS: These topological comparisons indicate that co-expression networks are not distinctly related to either the protein-protein interaction or the MIPS physical interaction networks, showing important structural differences between them. When focusing on a more literal comparison, vertex by vertex and edge by edge, the conclusion is the same: the fact that two genes exhibit a high degree of gene expression correlation does not seem to obviously correlate with the existence of a physical binding between the proteins produced by these genes or the existence of a MIPS physical interaction between the genes. The comparison of the yeast regulatory network with inferred yeast co-expression networks would suggest, however, that they could somehow be related. CONCLUSIONS: We conclude that the gene expression-based co-expression networks reflect more on the gene regulatory networks but less on the protein-protein interaction or MIPS physical interaction networks. CONTACT: Hongzhe Li, email: hongzhe@upenn.edu.
- Quantifying Uncertainty in Genotype Calls.
- Publication Date: 2009 Nov 11 PMID: 19906825
Authors: Carvalho, B. - Louis, T. A. - Irizarry, R. A. Journal: Bioinformatics
MOTIVATION: Genome-wide association studies (GWAS) are used to discover genes underlying complex, heritable disorders for which less powerful study designs have failed in the past. The number of GWAS has skyrocketed recently with findings reported in top journals and the mainstream media. Microarrays are the genotype calling technology of choice in GWAS as they permit exploration of more than a million single nucleotide polymorphisms (SNPs) simultaneously. The starting point for the statistical analyses used by GWAS to determine association between loci and disease is making genotype calls (AA, AB, or BB). However, the raw data, microarray probe intensities, are heavily processed before arriving at these calls. Various sophisticated statistical procedures have been proposed for transforming raw data into genotype calls. We find that variability in microarray output quality across different SNPs, different arrays, and different sample batches has a substantial influence on the accuracy of genotype calls made by existing algorithms. Failure to account for these sources of variability can adversely affect the quality of findings reported by the GWAS. RESULTS: We developed a method based on an enhanced version of the multi-level model used by CRLMM version 1 (Carvalho et al., 2007). Two key differences are that we now account for variability across batches and improve the call-specific assessment of each call. The new model permits the development of quality metrics for SNPs, samples, and batches of samples. Using three independent datasets we demonstrate that CRLMM version 2 outperforms CRLMM version 1 and the algorithm provided by Affymetrix, Birdseed. The main advantage of the new approach is that it enables the identification of low quality SNPs, samples, and batches. AVAILABILITY: Software implementing the method described in this paper is available as free and open source code in the crlmm R/BioConductor package. CONTACT: rafa@jhu.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- hPDI: a database of experimental human protein-DNA interactions.
- Publication Date: 2009 Nov 9 PMID: 19900953
Authors: Xie, Z. - Hu, S. - Blackshaw, S. - Zhu, H. - Qian, J. Journal: Bioinformatics
SUMMARY: The hPDI (human protein DNA Interactome) database holds experimental protein-DNA interaction data for humans identified by protein microarray assays. The unique characteristic of hPDI is that it contains consensus DNA-binding sequences not only for nearly 500 human TFs but also for more than 500 unconventional DNA-binding proteins, which were previously completely uncharacterized. Users can browse, search, and download a subset of the data or the entire data set via a web interface. This database is freely accessible for any academic purpose. AVAILABILITY: http://bioinfo.wilmer.jhu.edu/PDI/ CONTACT: jiang.qian@jhmi.edu.
- Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments.
- Publication Date: 2009 Nov 6 PMID: 19897565
Authors: McGuffin, L. J. - Roche, D. B. Journal: Bioinformatics
MOTIVATION: The accurate prediction of the quality of 3D models is a key component of successful protein tertiary structure prediction methods. Currently, clustering or consensus based Model Quality Assessment Programs (MQAPs) are the most accurate methods for predicting 3D model quality; however they are often CPU intensive as they carry out multiple structural alignments in order to compare numerous models. In this study, we describe ModFOLDclustQ - a novel MQAP that compares 3D models of proteins without the need for CPU intensive structural alignments by utilising the Q measure for model comparisons. The ModFOLDclustQ method is benchmarked against the top established methods in terms of both accuracy and speed. In addition, the ModFOLDclustQ scores are combined with those from our older ModFOLDclust method to form a new method, ModFOLDclust2, that aims to provide increased prediction accuracy with negligible computational overhead. RESULTS: The ModFOLDclustQ method is competitive with leading clustering based MQAPs for the prediction of global model quality, yet it is up to 150 times faster than the previous version of the ModFOLDclust method at comparing models of small proteins (<60 residues) and over 5 times faster at comparing models of large proteins (>800 residues). Furthermore, a significant improvement in accuracy can be gained over the previous clustering based MQAPs by combining the scores from ModFOLDclustQ and ModFOLDclust to form the new ModFOLDclust2 method, with little impact on the overall time taken for each prediction. AVAILABILITY: The ModFOLDclustQ and ModFOLDclust2 methods are available to download from: http://www.reading.ac.uk/bioinf/downloads/ CONTACT: l.j.mcguffin@reading.ac.uk.
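The abstract does not define the Q measure, so the following is only a sketch of one commonly used alignment-free form: for every residue pair, compare the intra-model distance in model A with the same distance in model B, and average exp(-(d_A - d_B)^2) over all pairs. Because only internal distances are used, no structural superposition is needed. The function name and the toy coordinates are hypothetical, and the exact scoring used by ModFOLDclustQ may differ.

```python
import math
from itertools import combinations


def q_score(coords_a, coords_b):
    """Alignment-free similarity of two models of the same sequence.

    coords_a and coords_b are equal-length lists of (x, y, z) positions,
    one per residue (e.g. C-alpha atoms). Returns a value in (0, 1];
    1.0 means all pairwise distances agree.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("models must have the same number of residues")
    if len(coords_a) < 2:
        raise ValueError("at least two residues are required")
    pairs = list(combinations(range(len(coords_a)), 2))
    total = 0.0
    for i, j in pairs:
        d_a = math.dist(coords_a[i], coords_a[j])  # distance within model A
        d_b = math.dist(coords_b[i], coords_b[j])  # same pair within model B
        total += math.exp(-(d_a - d_b) ** 2)
    return total / len(pairs)


# Hypothetical three-residue models (coordinates in angstroms).
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 10.0, y + 5.0, z - 2.0) for x, y, z in model]  # rigid translation
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]     # different shape
print(q_score(model, shifted))  # unaffected by the translation
print(q_score(model, bent))     # lower, because one distance changed
```

The cost is one pass over the residue pairs per model comparison, with no superposition search, which is consistent with the speed argument made in the abstract.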
- Identification of microRNA activity by Targets' Reverse EXpression.
- Publication Date: 2009 Nov 6 PMID: 19897564
Authors: Volinia, S. - Visone, R. - Galasso, M. - Rossi, E. - Croce, C. M. Journal: Bioinformatics
MOTIVATION: Non-coding miRNAs act as regulators of global protein output. While their major effect is on protein levels of target genes, it has been proven that they also specifically affect the messenger RNA level of targets. Prominent interest in microRNAs strongly motivates the need for increasing the options available to detect their cellular activity. RESULTS: We used the effect of miRNAs on their targets to detect miRNA activity from mRNA expression profiles. Here we describe the method, called T-REX (from Targets' Reverse EXpression), compare it to other similar applications, show its effectiveness and apply it to build activity maps. We used six different target predictions from each of four algorithms: TargetScan, PicTar, DIANA-microT and DIANA Union. First, we proved the sensitivity and specificity of our technique in miRNA over-expression and knock-out animal models. Then, we used whole transcriptome data from acute myeloid leukemia to show that we could identify critical miRNAs in a real life, complex, clinically relevant dataset. Finally, we studied sixty-six different cellular conditions to confirm and extend the current knowledge on the role of miRNAs in cellular physiology and in cancer.
- GATE: Software for the Analysis and Visualization of High-Dimensional Time-series Expression Data.
- Publication Date: 2009 Nov 5 PMID: 19892805
Authors: Macarthur, B. D. - Lachmann, A. - Lemischka, I. R. - Ma'ayan, A. Journal: Bioinformatics
SUMMARY: We present Grid Analysis of Time-series Expression (GATE), an integrated computational software platform for the analysis and visualization of high-dimensional bio-molecular time-series. GATE uses a correlation-based clustering algorithm to arrange molecular time-series on a two-dimensional hexagonal array and dynamically colors individual hexagons according to the expression level of the molecular component to which they are assigned, to create animated movies of systems-level molecular regulatory dynamics. In order to infer potential regulatory control mechanisms from patterns of correlation, GATE also allows interactive interrogation of movies against a wide variety of prior knowledge datasets. GATE movies can be paused and are interactive, allowing users to reconstruct networks and perform functional enrichment analyses. Movies created with GATE can be saved in Flash format and can be inserted directly into PDF manuscript files as interactive figures. AVAILABILITY: GATE is available for download and is free for academic use from http://amp.pharm.mssm.edu/maayan-lab/gate.htm CONTACT: avi.maayan@mssm.edu.
- Methods for combining peptide intensities to estimate relative protein abundance.
- Publication Date: 2009 Nov 5 PMID: 19892804
Authors: Carrillo, B. - Yanofsky, C. - Laboissiere, S. - Nadon, R. - Kearney, R. E. Journal: Bioinformatics
MOTIVATION: Labeling techniques are being used increasingly to estimate relative protein abundances in quantitative proteomic studies. These techniques require the accurate measurement of correspondingly labeled peptide peak intensities to produce high-quality estimates of differential expression ratios. In mass spectrometers with counting detectors, the measurement noise varies with intensity and consequently accuracy increases with the number of ions detected. Consequently, the relative variability of peptide intensity measurements varies with intensity. This effect must be accounted for when combining information from multiple peptides to estimate relative protein abundance. RESULTS: We examined a variety of algorithms that estimate protein differential expression ratios from multiple peptide intensity measurements. Algorithms that account for the variation of measurement error with intensity were found to provide the most accurate estimates of differential abundance. A simple Sum-of-Intensities algorithm provided the best estimates of true protein ratios of all algorithms tested.
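The abstract names a Sum-of-Intensities algorithm without defining it. A minimal sketch, assuming it means the ratio of summed peptide intensities between the two labeled samples, is shown below next to a plain mean of per-peptide ratios for contrast. The function names and the intensity values are hypothetical.

```python
def sum_of_intensities_ratio(sample_a, sample_b):
    """Protein ratio as (sum of peptide intensities in A) / (sum in B).

    Intense peptides contribute more to both sums, so noisy
    low-intensity peptides have little influence on the estimate.
    """
    return sum(sample_a) / sum(sample_b)


def mean_of_ratios(sample_a, sample_b):
    """Protein ratio as the unweighted mean of per-peptide ratios."""
    ratios = [a / b for a, b in zip(sample_a, sample_b)]
    return sum(ratios) / len(ratios)


# Hypothetical peptide intensities for one protein with a true ratio of 2.
# The third peptide is near the noise floor and gives a poor ratio (6.0).
heavy = [2000.0, 1000.0, 30.0]
light = [1000.0, 500.0, 5.0]
print(sum_of_intensities_ratio(heavy, light))  # close to 2
print(mean_of_ratios(heavy, light))            # pulled upward by the noisy peptide
```

Summing before dividing implicitly weights each peptide by its intensity, which is one way to account for measurement error that shrinks as more ions are detected.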
- Modeling the interplay of single-stranded binding proteins and nucleic acid secondary structure.
- Publication Date: 2009 Nov 4 PMID: 19889798
Authors: Forties, R. A. - Bundschuh, R. Journal: Bioinformatics
MOTIVATION: There are many important proteins which bind single-stranded nucleic acids, such as the nucleocapsid protein in HIV and the RecA DNA repair protein in bacteria. The presence of such proteins can strongly alter the secondary structure of the nucleic acid molecules. Therefore, accurate modeling of the interaction between single-stranded nucleic acids and such proteins is essential to fully understanding many biological processes. RESULTS: We develop a model for predicting nucleic acid secondary structure in the presence of single-stranded binding proteins, and implement it as an extension of the Vienna RNA Package. All parameters needed to model nucleic acid secondary structures in the absence of proteins have been previously determined. This leaves the footprint and sequence-dependent binding affinity of the protein as adjustable parameters of our model. Using this model we are able to predict the probability of the protein binding at any position in the nucleic acid sequence, the impact of the protein on nucleic acid base pairing, the end-to-end distance distribution for the nucleic acid, and FRET distributions for fluorophores attached to the nucleic acid. AVAILABILITY: Source code for our modified version of the Vienna RNA package is freely available at http://bioserv.mps.ohio-state.edu/Vienna+P, implemented in C and running on Linux. CONTACT: bundschuh@mps.ohio-state.edu.
- ARH: Predicting Splice Variants from Genome-wide Data with Modified Entropy.
- Publication Date: 2009 Nov 4 PMID: 19889797
Authors: Rasche, A. - Herwig, R. Journal: Bioinformatics
MOTIVATION: Exon arrays allow the quantitative study of alternative splicing on a genome-wide scale. A variety of splicing prediction methods has been proposed for Affymetrix exon arrays, mainly focusing on geometric correlation measures or analysis of variance. In this paper we introduce an information theoretic concept that is based on modification of the well-known entropy function. RESULTS: We have developed an alternative splicing robust prediction method based on entropy (ARH). We show that this measure copes with bias inherent in the analysis of alternative splicing, such as the dependency of prediction performance on the number of exons or variable exon expression. In order to judge the performance of ARH, we have compared it with eight existing splicing prediction methods using experimental benchmark data and demonstrate that ARH is a well-performing new method for the prediction of splice variants. AVAILABILITY AND IMPLEMENTATION: ARH is implemented in R and provided in the supplementary material. CONTACT: rasche@molgen.mpg.de SUPPLEMENTARY INFORMATION: The supplementary material provides additional figures and tables, the R implementation of ARH, a basic implementation for the method comparison and the AEdb true positive set.
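The abstract does not say how the entropy function is modified, so the sketch below shows only the unmodified Shannon entropy that such a measure would start from. It assumes, purely for illustration, that each exon of a gene is given a non-negative deviation score which is normalised into a probability distribution; an even distribution gives maximal entropy, while a single outlying exon lowers it. The scores and function names are hypothetical, and this is not the ARH statistic itself.

```python
import math


def normalise(values):
    """Scale non-negative values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def shannon_entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical per-exon deviation scores for two four-exon genes.
even_gene = normalise([1, 1, 1, 1])      # no exon stands out
outlier_gene = normalise([1, 1, 1, 13])  # one exon deviates strongly

print(shannon_entropy(even_gene))     # maximal for four exons
print(shannon_entropy(outlier_gene))  # lower: deviation concentrated in one exon
```

Because the maximum possible entropy grows with the number of exons (log2 of the exon count), a raw entropy value is not comparable across genes, which may be one of the biases the abstract says the modified measure is designed to handle.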