Protein analysis tools and resources provide useful information for research in molecular biology, structural biology, and computational biology. These tools improve our understanding of the distinguishing features of proteins, such as 3D structures, sequences and functions, and domains and domain mapping, and they provide large, comprehensive databases.
Protein Data Bank (PDB)
The Protein Data Bank (PDB) archives information about the 3D shapes of proteins, nucleic acids, and complex assemblies.
UniProt is a freely accessible resource for protein sequence and functional information. It offers tools for BLAST searches, sequence alignment, ID mapping, and peptide search; core data such as the protein knowledgebase, sequence clusters, and sequence archive; and supporting data such as literature citations, taxonomy, and keywords.
The ExPASy Bioinformatics Resources Portal is an extensible and integrative portal that provides access to many scientific resources, databases, and software tools in different areas of the life sciences. ExPASy was launched by the SIB Swiss Institute of Bioinformatics. It has a visual guidance interface for selecting elements such as DNA, RNA, a protein, a cell, an organism, or a population.
SMART (Simple Modular Architecture Research Tool) allows the identification and annotation of genetically mobile domains and the analysis of domain architectures. More than five hundred domain families are covered and extensively annotated with respect to phylogenetic distribution, functional class, tertiary structure, and functionally important residues. Each domain found in a non-redundant protein database, together with search parameters and taxonomic information, is stored in a relational database system. User interfaces allow searches for proteins containing specific combinations of domains.
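The idea of searching for proteins that contain a specific combination of domains can be pictured with a minimal sketch. The protein names, the domain architectures, and the function `proteins_with_domains` are hypothetical placeholders chosen for illustration; they are not SMART records or part of the SMART interface.

```python
from typing import Dict, Iterable, List

# Hypothetical domain architectures (protein -> ordered list of domains).
# The entries are illustrative placeholders, not records taken from SMART.
ARCHITECTURES: Dict[str, List[str]] = {
    "protA": ["SH3", "SH2", "TyrKc"],
    "protB": ["SH2", "SH3"],
    "protC": ["PDZ", "SH3", "GuKc"],
    "protD": ["TyrKc"],
}


def proteins_with_domains(
    architectures: Dict[str, List[str]], required: Iterable[str]
) -> List[str]:
    """Return the proteins whose architecture contains every required domain."""
    wanted = set(required)
    return sorted(
        name
        for name, domains in architectures.items()
        if wanted.issubset(domains)
    )


if __name__ == "__main__":
    # Proteins that carry both an SH2 and an SH3 domain, in any order.
    print(proteins_with_domains(ARCHITECTURES, ["SH2", "SH3"]))
```

A real domain database would answer such a query with an indexed lookup rather than a scan over all proteins; the scan is used here only to keep the sketch short.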
InterPro classifies proteins into families and predicts domains and important sites, providing a functional analysis of proteins. To classify proteins, InterPro uses predictive models, known as signatures, from several different databases that make up the InterPro consortium.
NGS Data Analysis
Next Interactions provides a tailored analysis platform with custom scripts written in Perl and R to characterize interaction sites from NGS-Y2H readout data. Other tools can also be applied for this purpose, and for analyses that go beyond what we can provide.
The main Galaxy site (http://usegalaxy.org) is an installation of the Galaxy software combined with many common tools, and it can be used to analyze data free of charge. Researchers can either use the public server or download a copy of the software to run on their own infrastructure. The computing and disk capacity of the public site make it possible to analyze large datasets.
FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses that you can use to give a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.
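One of the simplest quality checks of this kind is the average base quality at each read position. The sketch below is not FastQC code; it only illustrates the idea on two made-up reads, and it assumes four-line FASTQ records with Phred+33 quality encoding. The function name `mean_quality_per_position` is hypothetical.

```python
from typing import List

# Two made-up reads in FASTQ format (four lines per record).
# Quality characters are assumed to use the Phred+33 encoding.
EXAMPLE_FASTQ = """@read1
ACGT
+
IIII
@read2
ACGA
+
II5!
"""


def mean_quality_per_position(fastq_text: str, offset: int = 33) -> List[float]:
    """Average the Phred score at each read position across all records."""
    lines = fastq_text.strip().splitlines()
    totals: List[int] = []
    counts: List[int] = []
    # The quality string is the fourth line of every record.
    for quality in lines[3::4]:
        for pos, char in enumerate(quality):
            if pos == len(totals):
                totals.append(0)
                counts.append(0)
            totals[pos] += ord(char) - offset
            counts[pos] += 1
    return [total / count for total, count in zip(totals, counts)]


if __name__ == "__main__":
    print(mean_quality_per_position(EXAMPLE_FASTQ))
```

A drop in mean quality toward the 3' end of the reads, as in the toy data, is the kind of pattern that would prompt trimming before further analysis.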
Integrative Genomics Viewer (IGV)
The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations. IGV is applied in our pipeline to visualize NGS reads that map to the protein binding sites uncovered in the screen. An automated procedure extracts the enrichment of paired-end NGS reads.
Network and Pathway Analysis
PPI networks have been used to further the study of molecular evolution, to gain insight into the robustness of cells to perturbation, and for assignment of new protein functions.
QiSampler is an R script that systematically evaluates scoring schemes for high-throughput experiments against given gold standards, using a repetitive sampling strategy. Its modular input format allows the application to be used with various dataset types, such as protein-protein interactions (PPIs), gene-expression microarray, or deep sequencing datasets.
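The general idea of evaluating a scoring scheme against gold standards by repeated sampling can be sketched as follows. This is not QiSampler's implementation (which is an R script); it is a simplified Python illustration under stated assumptions: balanced samples are drawn from a positive and a negative gold set, and a score at or above a fixed threshold counts as a positive call. All names and data below are hypothetical.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence

# Hypothetical gold standards and two hypothetical scoring schemes.
GOLD_POSITIVES = ["a-b", "c-d", "e-f"]
GOLD_NEGATIVES = ["a-x", "c-y", "e-z"]
SCHEME_1 = {"a-b": 0.9, "c-d": 0.8, "e-f": 0.7, "a-x": 0.2, "c-y": 0.1, "e-z": 0.3}
SCHEME_2 = dict.fromkeys(GOLD_POSITIVES + GOLD_NEGATIVES, 0.5)


def sampled_accuracy(
    scores: Dict[str, float],
    positives: Sequence[str],
    negatives: Sequence[str],
    threshold: float,
    sample_size: int,
    repeats: int = 100,
    seed: int = 0,
) -> float:
    """Mean accuracy of `score >= threshold` over balanced random samples."""
    rng = random.Random(seed)
    accuracies: List[float] = []
    for _ in range(repeats):
        # Draw equally sized samples from both gold-standard sets.
        pos = rng.sample(list(positives), sample_size)
        neg = rng.sample(list(negatives), sample_size)
        correct = sum(scores[item] >= threshold for item in pos)
        correct += sum(scores[item] < threshold for item in neg)
        accuracies.append(correct / (2 * sample_size))
    return mean(accuracies)


if __name__ == "__main__":
    for name, scheme in (("scheme 1", SCHEME_1), ("scheme 2", SCHEME_2)):
        acc = sampled_accuracy(scheme, GOLD_POSITIVES, GOLD_NEGATIVES, 0.5, 2)
        print(name, acc)
```

Sampling equal numbers of positives and negatives keeps the estimate from being dominated by whichever gold set happens to be larger, which is one common motivation for a sampling-based evaluation.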
STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they stem from computational prediction, from knowledge transfer between organisms, and from interactions aggregated from other (primary) databases. The STRING database aims to provide a critical assessment and integration of protein-protein interactions.
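To illustrate what integrating evidence from several sources can look like, the sketch below combines per-source confidence values under the assumption that the sources are independent, using 1 - prod(1 - s_i). This is a generic, naive combination rule offered for illustration; it is not presented as STRING's actual scoring procedure, and the function name `combine_evidence` is hypothetical.

```python
from math import prod
from typing import Iterable


def combine_evidence(channel_scores: Iterable[float]) -> float:
    """Combine per-source confidences as 1 - prod(1 - s_i).

    Assumes the sources are independent and each score is a probability
    in [0, 1] that the interaction is real.
    """
    scores = list(channel_scores)
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("scores must lie in [0, 1]")
    return 1.0 - prod(1.0 - s for s in scores)


if __name__ == "__main__":
    # Hypothetical scores from three evidence sources for one protein pair.
    print(combine_evidence([0.5, 0.4, 0.2]))
```

Under this rule, several weak but independent lines of evidence can add up to a higher combined confidence than any single one of them.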
The Gene Ontology (GO) knowledgebase is the world’s largest source of information on the functions of genes and constitutes a foundation for computational analysis of large-scale molecular biology and genetics experiments in biomedical research.
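A typical computational use of GO annotations is an enrichment test: given a set of genes from an experiment, is a particular GO term annotated to more of them than expected by chance? The sketch below computes a one-sided hypergeometric p-value with the standard library only. The function name and the numbers in the usage example are hypothetical, and real analyses would also correct for testing many terms.

```python
from math import comb


def enrichment_p_value(population: int, annotated: int, sample: int, hits: int) -> float:
    """Return P(X >= hits) for a hypergeometric random variable X.

    population: number of genes in the background set
    annotated:  background genes annotated with the GO term
    sample:     number of genes in the experimental gene set
    hits:       genes in the experimental set annotated with the term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass for `hits` or more annotated genes.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 20 background genes, 5 carry the term,
    # and 3 of the 4 genes in the experimental set carry it.
    print(enrichment_p_value(population=20, annotated=5, sample=4, hits=3))
```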
KEGG Pathway Database
KEGG (Kyoto Encyclopedia of Genes and Genomes) is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances.
BioCyc is a collection of 14728 Pathway/Genome Databases (PGDBs), plus software tools for exploring them. Its data is curated from tens of thousands of publications and includes curated databases for E. coli, B. subtilis, H. sapiens, and S. cerevisiae. BioCyc also contains computationally predicted metabolic pathways and operons.
IntAct provides an open-source database system and analysis tools for molecular interaction data. All interactions are derived from literature curation or direct user submissions and are freely available.
Consult our Protein Interaction Screening Experts.
Contact Next Interactions for advice from our scientists about the best approach for elucidating the protein interactions you need to understand for your research. We can advise you on methods such as the NGS-yeast two-hybrid system, the yeast two-hybrid system, ways to complement interactome coverage with mass spectrometry, and more.
Next-Generation Sequencing for Binary Protein-Protein Interactions.
Suter B, Zhang X, Pesce CG, Mendelsohn AR, Dinesh-Kumar SP, Mao JH. Front Genet. 2015 Dec 17;6:346. doi: 10.3389/fgene.2015.00346. PMID: 26734059
Development and application of a DNA microarray-based yeast two-hybrid system.
Suter B, Fontaine JF, Yildirimman R, Raskó T, Schaefer MH, Rasche A, Porras P, Vázquez-Álvarez BM, Russ J, Rau K, Foulle R, Zenkner M, Saar K, Herwig R, Andrade-Navarro MA, Wanker EE. Nucleic Acids Res. 2013 Feb 1;41(3):1496-507. PMID: 23275563
QiSampler: evaluation of scoring schemes for high-throughput datasets using a repetitive sampling strategy on gold standards.
Fontaine JF, Suter B, Andrade-Navarro MA. BMC Res Notes. 2011 Mar 9;4:57. PMID: 21388526
Two-hybrid technologies in proteomics research.
Suter B, Kittanakom S, Stagljar I. Curr Opin Biotechnol. 2008 Aug;19(4):316-23. Review. PMID: 18619540
QTY code enables design of detergent-free chemokine receptors that retain ligand-binding activities.
Zhang S, Tao F, Qing R, Tang H, Skuhersky M, Corin K, Tegler L, Wassie A, Wassie B, Kwon Y, Suter B, Entzian C, Schubert T, Yang G, Labahn J, Kubicek J, Maertens B. Proc Natl Acad Sci U S A. 2018 Sep 11;115(37):E8652-E8659. doi: 10.1073/pnas.1811031115. Epub 2018 Aug 28.
An effector from the Huanglongbing-associated pathogen targets citrus proteases.
Clark K, Franco JY, Schwizer S, Pang Z, Hawara E, Liebrand TWH, Pagliaccia D, Zeng L, Gurung FB, Wang P, Shi J, Wang Y, Ancona V, van der Hoorn RAL, Wang N, Coaker G, Ma W. Nat Commun. 2018 Apr 30;9(1):1718. doi: 10.1038/s41467-018-04140-9.
Effects of Acetylation and Phosphorylation on Subunit Interactions in Three Large Eukaryotic Complexes.
Šoštarić N, O’Reilly FJ, Giansanti P, Heck AJR, Gavin AC, van Noort V. Mol Cell Proteomics. 2018 Dec;17(12):2387-2401. doi: 10.1074/mcp.RA118.000892. Epub 2018 Sep 4.
A user-friendly platform for yeast two-hybrid library screening using next generation sequencing.
Erffelinck ML, Ribeiro B, Perassolo M, Pauwels L, Pollier J, Storme V, Goossens A. PLoS One. 2018 Dec 21;13(12):e0201270. doi: 10.1371/journal.pone.0201270. eCollection 2018.
Development and application of a recombination-based library versus library high-throughput yeast two-hybrid (RLL-Y2H) screening system.
Yang F, Lei Y, Zhou M, Yao Q, Han Y, Wu X, Zhong W, Zhu C, Xu W, Tao R, Chen X, Lin D, Rahman K, Tyagi R, Habib Z, Xiao S, Wang D, Yu Y, Chen H, Fu Z, Cao G. Nucleic Acids Res. 2018 Feb 16;46(3):e17. doi: 10.1093/nar/gkx1173.
CrY2H-seq: a massively multiplexed assay for deep-coverage interactome mapping.
Trigg SA, Garza RM, MacWilliams A, Nery JR, Bartlett A, Castanon R, Goubil A, Feeney J, O’Malley R, Huang SC, Zhang ZZ, Galli M, Ecker JR. Nat Methods. 2017 Aug;14(8):819-825. doi: 10.1038/nmeth.4343. Epub 2017 Jun 26.
Protein interaction perturbation profiling at amino-acid resolution.
Woodsmith J, Apelt L, Casado-Medrano V, Özkan Z, Timmermann B, Stelzl U. Nat Methods. 2017 Dec;14(12):1213-1221. doi: 10.1038/nmeth.4464. Epub 2017 Oct 16.
Pooled-matrix protein interaction screens using Barcode Fusion Genetics.
Yachie N, Petsalaki E, Mellor JC, Weile J, Jacob Y, Verby M, Ozturk SB, Li S, Cote AG, Mosca R, Knapp JJ, Ko M, Yu A, Gebbia M, Sahni N, Yi S, Tyagi T, Sheykhkarimli D, Roth JF, Wong C, Musa L, Snider J, Liu YC, Yu H, Braun P, Stagljar I, Hao T, Calderwood MA, Pelletier L, Aloy P, Hill DE, Vidal M, Roth FP. Mol Syst Biol. 2016 Apr 22;12(4):863. doi: 10.15252/msb.20156660.
A Y2H-seq approach defines the human protein methyltransferase interactome.
Weimann M, Grossmann A, Woodsmith J, Özkan Z, Birth P, Meierhofer D, Benlasfer N, Valovka T, Timmermann B, Wanker EE, Sauer S, Stelzl U. Nat Methods. 2013 Apr;10(4):339-42. doi: 10.1038/nmeth.2397. Epub 2013 Mar 3.
Quantitative Interactor Screening with next-generation Sequencing (QIS-Seq) identifies Arabidopsis thaliana MLO2 as a target of the Pseudomonas syringae type III effector HopZ2.
Lewis JD, Wan J, Ford R, Gong Y, Fung P, Nahal H, Wang PW, Desveaux D, Guttman DS. BMC Genomics. 2012 Jan 9;13:8. doi: 10.1186/1471-2164-13-8.
Next-generation sequencing to generate interactome datasets.
Yu H, Tardivo L, Tam S, Weiner E, Gebreab F, Fan C, Svrzikapa N, Hirozane-Kishikawa T, Rietman E, Yang X, Sahalie J, Salehi-Ashtiani K, Hao T, Cusick ME, Hill DE, Roth FP, Braun P, Vidal M. Nat Methods. 2011 Jun;8(6):478-80. doi: 10.1038/nmeth.1597. Epub 2011 Apr 24.