Abstract

Protein modification is an extremely important post-translational regulation that adjusts the physical and chemical properties, conformation, stability and activity of a protein, thus altering protein function. Owing to the high throughput of mass spectrometry (MS)-based methods in identifying site-specific post-translational modifications (PTMs), dbPTM (http://dbPTM.mbc.nctu.edu.tw/) has been updated to integrate experimental PTMs obtained from public resources as well as manually curated MS/MS peptides associated with PTMs from research articles. Version 3.0 of dbPTM aims to be an informative resource for investigating the substrate specificity of PTM sites and the functional association of PTMs between substrates and their interacting proteins. To investigate the substrate specificity of modification sites, a newly developed statistical method has been applied to identify significant substrate motifs for each PTM type with sufficient experimental data. According to the data statistics in dbPTM, >60% of PTM sites are located in the functional domains of proteins. It is known that most PTMs can create binding sites for specific protein-interaction domains that work together for cellular function. Thus, this update integrates protein–protein interaction and domain–domain interaction data to determine the functional association of PTM sites located in protein-interacting domains. Additionally, information on the structural topologies of transmembrane (TM) proteins is integrated in dbPTM in order to delineate the structural correlation between the reported PTM sites and TM topologies. To facilitate the investigation of PTMs on TM proteins, the PTM substrate sites and the structural topology are graphically represented. Literature information related to PTMs, orthologous conservation and substrate motifs of PTMs is also provided in the resource. Finally, this version features an improved web interface to facilitate convenient access to the resource.

INTRODUCTION

Protein post-translational modification (PTM) plays an essential role in various cellular processes by adjusting the physical and chemical properties, folding, conformation, stability and activity of proteins, thus altering protein function (1). More than 200 different types of PTMs have been identified by mass spectrometry (MS)-based proteomics (2). The biological functions of these ubiquitous regulatory mechanisms include phosphorylation for signal transduction, attachment of fatty acids for membrane anchoring and association, glycosylation for changing protein half-life, targeting substrates and promoting cell–cell and cell–matrix interactions, acetylation and methylation of histones for gene regulation, and ubiquitylation for protein degradation (3). With the advent of high-throughput MS- or MS/MS-based methods in proteomics, several databases dedicated to a specific modification type have been established. Phospho.ELM (4), Phosphorylation Site Database (5), PhosphoSitePlus (6), PHOSIDA (7) and PhosPhAt (8) were developed for accumulating experimentally verified phosphorylation sites. NetworKIN (9) and RegPhos (10) use integrative methods to identify kinase-substrate phosphorylation networks. O-GLYCBASE (11) and dbOGAP (12) are databases of glycoproteins, most of which include experimentally verified O-linked glycosylation sites. UbiProt (13) stores experimentally identified ubiquitylated proteins and ubiquitylation sites, which are implicated in protein degradation through an intracellular ATP-dependent proteolytic system. PupDB (14) is a prokaryotic ubiquitin-like protein (Pup) database that stores a collection of experimentally identified pupylated proteins and pupylation sites from published articles. It also integrates the information on pupylated proteins with corresponding structures and functional annotations. An increasing number of proteomic studies have suggested that protein S-nitrosylation plays an important role in the nitric oxide (NO)-related redox pathway. Accordingly, a new database named dbSNO (15) was established by manually curating S-nitrosylation peptides from research articles.

With regard to public resources of multiple PTM types currently available, UniProtKB/Swiss-Prot (2,16) includes as much information of PTMs as is available with functional and structural annotations. SysPTM (17) has designed a systematic platform for multi-type PTM research and data mining. Additionally, Human Protein Reference Database (HPRD) (18) contains a wealth of information relevant to the function of human proteins in health and disease, as well as the annotation of PTMs. With the importance of protein modifications in biological processes, we have previously proposed dbPTM (19) which integrates published databases in order to obtain experimentally validated protein modifications, as well as putative PTM substrate sites predicted by a series of accurate computational tools (20–22). Version 2.0 of dbPTM was extended to a knowledge base comprising the modified sites, solvent accessibility of substrate, protein secondary and tertiary structures, protein domains and protein variations (23).

Owing to the high throughput of MS/MS-based methods in identifying site-specific PTMs, this version (dbPTM 3.0) not only integrates experimental PTMs from public resources but also includes MS/MS peptides associated with PTMs that were manually curated from research articles with the help of a text-mining approach. dbPTM 3.0 aims to be an informative resource for investigating the substrate specificity of PTM sites and the functional association of PTMs between substrates and their interacting proteins. To investigate the substrate specificity of modification sites, a newly developed method, MDDLogo (24), has been applied to identify significant substrate motifs for each PTM type. According to the data statistics in dbPTM, >60% of PTM sites are located in protein functional domains. Many PTMs can create binding sites for specific protein-interaction domains that work together for cellular function and relay the state of the proteome to cellular organization (25). Thus, this update integrates both protein–protein interaction (PPI) and domain–domain interaction information to determine the functional association of PTM sites located in protein-interacting domains. Additionally, in order to delineate the structural correlation between the reported PTM sites and transmembrane (TM) topologies, information on the structural topologies of TM proteins is integrated in dbPTM 3.0. To facilitate the investigation of PTMs on TM proteins, PTM sites as well as the structural topology of TM proteins are graphically represented. Furthermore, the web interface is enhanced to facilitate access to the resource, which is now freely accessible at http://dbPTM.mbc.nctu.edu.tw/.

IMPROVEMENTS

The highlighted improvements and advances in dbPTM 3.0 are presented in Figure 1 including data integration from public PTM resources and research articles, investigation of PTM substrate site specificity, investigation of PTM-associated protein interactions, as well as the investigation of the effects of PTM on TM proteins. To facilitate the study of PTMs and their functions, the web interface is redesigned and enhanced. Published literature information related to PTMs, orthologous conservations and substrate motifs of PTM sites are also provided in this online resource. The details of each improved process are depicted as follows.

Figure 1. The highlighted improvements and advances in dbPTM 3.0.

Data integration from public PTM resources and research articles

Supplementary Figure S1 shows the detailed system flow of the construction of dbPTM 3.0. Because the contents of several online PTM resources are not accessible, a total of 11 biological databases related to PTMs are integrated in dbPTM, including UniProtKB/Swiss-Prot (2), version 9.0 of Phospho.ELM (4), PhosphoSitePlus (6), PHOSIDA (26), version 6.0 of O-GLYCBASE (11), dbOGAP (12), dbSNO (15), version 1.0 of UbiProt (13), PupDB (14), version 1.1 of SysPTM (17) and release 9.0 of HPRD (27). A brief description and the data statistics of the integrated databases are given in Supplementary Table S1. To resolve the heterogeneity among the data collected from different sources, the reported modification sites are mapped to the UniProtKB protein entries using sequence comparison. Given the high throughput of MS-based methods in post-translational proteomics, this update also includes manually curated MS/MS-identified peptides associated with PTMs, collected from research articles through a literature survey. First, a list of PTM-related keywords is constructed by referring to the UniProtKB/Swiss-Prot PTM list (http://www.uniprot.org/docs/ptmlist.txt) and the annotations of RESID (28). Then, all fields in the PubMed database are searched using the keywords in this list, and the full text of the matching research articles is downloaded. Because the articles describe a variety of proteomic identification experiments, a text-mining system is developed to survey the full-text literature for articles that potentially describe the site-specific identification of modified sites. Approximately 800 original and review articles associated with MS/MS proteomics and protein modifications are retrieved from PubMed (July 2012). Next, the full-length articles are manually reviewed to precisely extract the MS/MS peptides along with the modified sites. Furthermore, in order to determine the locations of PTMs on a full-length protein sequence, the experimentally verified MS/MS peptides are mapped to UniProtKB protein entries based on their database identifier (ID) and sequence identity. In the process of data mapping, MS/MS peptides that cannot be aligned exactly to a protein sequence are discarded. Finally, each mapped PTM site is annotated with its corresponding literature reference (PubMed ID).
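The peptide-mapping step described above can be illustrated with a minimal Python sketch. It is not the dbPTM pipeline itself: the function name `map_peptide_site`, the toy protein sequence and the rule that peptides matching more than one location are discarded are all assumptions made for illustration. Only the exact-match requirement comes from the text.

```python
from typing import Optional


def map_peptide_site(protein_seq: str, peptide: str, site_offset: int) -> Optional[int]:
    """Return the 1-based protein position of a modified residue, or None.

    site_offset is the 0-based index of the modified residue within the peptide.
    Peptides without an exact match in the protein sequence are discarded (None).
    """
    if not 0 <= site_offset < len(peptide):
        raise ValueError("site_offset must index a residue inside the peptide")
    start = protein_seq.find(peptide)
    if start == -1:
        return None  # no exact alignment: discard, as described in the text
    if protein_seq.find(peptide, start + 1) != -1:
        return None  # ambiguous placement: discarded here (assumption, not from the source)
    return start + site_offset + 1


# Hypothetical protein sequence and peptides, for illustration only.
protein = "MKTAYIAKQRCISFVKSHFSRQ"
print(map_peptide_site(protein, "AKQRCISFVK", 4))  # 11 (the cysteine)
print(map_peptide_site(protein, "AKQRCLSFVK", 4))  # None (one mismatch)
```

In practice the protein sequence would first be selected through the database identifier reported with the peptide, so the string search only confirms the sequence identity and fixes the site position.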

Detection of PTM substrate site specificities

Because conserved motifs are difficult to detect for a specific PTM with a large data size, MDDLogo (24) was used to identify the substrate motifs for each PTM type containing >500 modified peptides. MDDLogo exploits maximal dependence decomposition (MDD) in order to discover conserved motifs from a group of aligned signal sequences. MDD groups a set of aligned signal sequences into subgroups that capture the most significant dependencies between positions. MDD adopts the chi-squared test to evaluate the dependence of amino acid occurrence between two positions, Ai and Aj, that surround the PTM substrate sites. MDDLogo has demonstrated its effectiveness in identifying substrate motifs of plant and virus phosphorylation (29,30), as well as mouse S-nitrosylation (31). In order to extract motifs in which the biochemical properties of amino acids are conserved, MDD categorizes the 20 types of amino acids into five groups (aliphatic, polar and uncharged, acidic, basic and aromatic), as shown in Supplementary Figure S2. An example of MDD clustering on S-nitrosylation data shows that position −7 has the maximal dependence with the occurrence of basic amino acids, including lysine (K), arginine (R) and histidine (H). Subsequently, all data can be divided into two subgroups: one with basic amino acids at position −7 and the other without basic amino acids at position −7. MDD clustering is a recursive process that divides the data sets into tree-like subgroups.
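A single MDD-style split can be sketched in Python as follows. This is a simplified illustration, not the MDDLogo implementation. The assignment of residues to the five groups is an assumption (the actual grouping is given in Supplementary Figure S2), as is the rule of splitting on the most frequent group at the selected position. The toy 5-mers are invented.

```python
from collections import Counter
from itertools import combinations

# Assumed residue grouping; the source's grouping is in Supplementary Figure S2.
GROUPS = {
    "aliphatic": "AGVLIMP",
    "polar": "STCNQ",
    "acidic": "DE",
    "basic": "KRH",
    "aromatic": "FWY",
}
AA_TO_GROUP = {aa: group for group, residues in GROUPS.items() for aa in residues}


def chi_squared(seqs, i, j):
    """Pearson chi-squared statistic for the group contingency table of columns i and j."""
    pairs = Counter((AA_TO_GROUP[s[i]], AA_TO_GROUP[s[j]]) for s in seqs)
    row, col = Counter(), Counter()
    for (a, b), n in pairs.items():
        row[a] += n
        col[b] += n
    total = len(seqs)
    stat = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / total
            stat += (pairs[(a, b)] - expected) ** 2 / expected
    return stat


def split_once(seqs, center):
    """Pick the flanking column with the largest summed dependence and split on it."""
    positions = [p for p in range(len(seqs[0])) if p != center]
    score = {p: 0.0 for p in positions}
    for i, j in combinations(positions, 2):
        stat = chi_squared(seqs, i, j)
        score[i] += stat
        score[j] += stat
    best = max(positions, key=lambda p: score[p])
    # Splitting on the most frequent group at that column is an assumption.
    group = Counter(AA_TO_GROUP[s[best]] for s in seqs).most_common(1)[0][0]
    with_group = [s for s in seqs if AA_TO_GROUP[s[best]] == group]
    without = [s for s in seqs if AA_TO_GROUP[s[best]] != group]
    return best, group, with_group, without


# Invented 5-mers centred on a cysteine (index 2).
toy = ["KACDF", "RACEW", "HACDY", "KACEF", "RACDA",
       "KACGF", "AACDL", "GACVF", "VACLA", "LACAG"]
best, group, with_group, without = split_once(toy, center=2)
print(best - 2, group, len(with_group), len(without))
```

Applying `split_once` again to each subgroup, until a subgroup falls below a chosen minimum size, would give the tree-like recursion described above. The stopping rule is left out because the text does not specify it.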

Integration of protein domains, domain–domain interactions and PPIs

Protein-interaction domains usually recognize short peptide motifs of a target protein but do not bind stably until the peptides carry the appropriate PTMs; PTMs can thus create binding sites for specific protein-interaction domains that work together for cellular function and relay the state of the proteome to cellular organization (25). For instance, the SH2 domain can bind to phosphotyrosine (pTyr)-associated peptides in a manner that depends on ligand phosphorylation and the motif of the flanking amino acids (32,33). Thus, this update integrates the information on protein functional domains and PPIs to infer PTM-dependent protein interactions. To investigate the preference of functional domains for PTMs, this study refers to the domain annotations in InterPro (34). InterPro is an integrated resource, developed initially as a means of rationalizing the complementary efforts of the PROSITE (35), PRINTS (36), Pfam (37) and ProDom (38) databases, that provides protein ‘signatures’ such as protein families, domains and functional sites. For the information on experimentally verified PPIs, five databases, namely DIP (39), MINT (40), IntAct (41), HPRD (18) and STRING (42), are integrated in dbPTM (see Supplementary Table S2). Additionally, the domain–domain interactions of InterDom (43) are integrated to determine the functional association of the PTM sites located in protein-interacting domains.

Integration of TM proteins with structural topology

TM proteins play crucial roles in various cellular processes (44). A genome-wide study has discovered that ∼20–30% of the proteins encoded by a typical genome are TM proteins (45). However, owing to the experimental difficulties in obtaining high-quality structures, TM proteins are notably under-represented in the Protein Data Bank (46). The biological roles of PTMs on TM proteins include phosphorylation for signal transduction and ion transport, acetylation for structural stability, attachment of fatty acids for membrane anchoring and association, as well as glycosylation for receptor targeting, cell–cell interactions and virus infection (44,47). Given the importance of PTMs on TM proteins, the experimentally curated information on membrane topologies is collected from TMPad (48), TOPDB (49), PDB_TM (50) and OPM (51). In order to provide a comprehensive investigation of TM proteins, a potential set of TM proteins is extracted from UniProtKB (52) by choosing protein entries that contain the keyword ‘TRANSMEM’ in the feature (‘FT’) line, the localization ‘membrane’ and information on TM topology. The potential TM proteins are further filtered using the TM prediction program MEMSAT (53) to determine their membrane topologies. As shown in Supplementary Table S3, the filtering process resulted in 2216 experimental and 43 142 potential TM proteins with membrane topologies. To facilitate the investigation of PTMs on TM proteins, the structural topology of TM proteins, together with the PTM substrate sites, is graphically represented using the PHP GD library. Moreover, the tertiary structures of TM proteins and PTM sites are visualized using the Jmol program (54).

Integration of external biological databases

For a given protein, the basic biological functions can be obtained from the annotations of UniProtKB. To provide more information about protein functional and structural annotations relevant to the modified proteins and the PTM substrate sites, the data contents of Gene Ontology (GO) (55), Protein Data Bank (PDB) (46) and Clusters of Orthologous Groups (COGs) (56) have been integrated in dbPTM. In this study, the information regarding the molecular function, cellular components and biological process for a modified protein can be accessed by a crosslink that refers to the corresponding entry in QuickGO (57) via a UniProtKB accession number. In order to facilitate the investigation of structural characteristics surrounding the PTM substrate sites, the protein tertiary structure obtained from PDB is graphically presented by the Jmol program. For proteins with tertiary structures (5% of UniProtKB/Swiss-Prot proteins), the protein structural properties, such as solvent accessibility and secondary structure of residues, are calculated by DSSP (58). Following previous studies that investigated the structural characteristics of PTMs (59–61), for proteins without known tertiary structures, two effective tools, RVP-net (62) and PSIPRED (63), are used to predict the solvent accessibility and secondary structure, respectively. In order to observe whether a PTM site is located in a conserved region among orthologous protein sequences, the COGs of proteins are integrated and the ClustalW (64) program is adopted to align the multiple protein sequences in each COG cluster.

DATA CONTENT AND UTILITY

Data statistics of the integrated PTM sites

In order to provide the most comprehensive data of PTMs, this update not only integrates experimental PTMs from 11 external PTM-related resources but also manually curates MS/MS peptides associated with PTMs from ∼800 research articles. After removing redundant data among these heterogeneous resources, dbPTM 3.0 contains a total of 208 521 experimental PTM sites. All the experimental PTM sites are further categorized by PTM type and the number of non-redundant PTM sites is calculated. As shown by the data statistics of representative PTM types in Table 1, protein phosphorylation has the most abundant experimentally verified substrate sites. Owing to the high throughput of MS/MS-based proteomics in the site-specific identification of modified peptides, the amount of experimental data has increased significantly for several PTMs, including protein ubiquitylation, acetylation, methylation, N-linked and O-linked glycosylation, as well as the emerging S-nitrosylation. In addition to the experimental PTM sites, UniProtKB/Swiss-Prot provides putative PTM sites inferred from sequence similarity or evolutionary potential, which are annotated as ‘by similarity’, ‘potential’ or ‘probable’ in the ‘MOD_RES’ fields. A total of 226 122 putative sites for all PTM types are integrated in dbPTM. Moreover, a KinasePhos-like method (19–22) has been adopted to construct profile hidden Markov models (HMMs) for 18 types of PTM. For protein phosphorylation in particular, >70 kinase-specific prediction models are constructed and used to identify putative phosphorylation sites together with their kinases. These models were applied to search for potential PTM sites in UniProtKB/Swiss-Prot protein sequences. As given in Table 1, a total of 2 509 267 putative sites for all PTM types are detected by HMMs with 90% predictive specificity. All the experimental and putative PTM sites are available for download from the web interface.

Table 1. Data statistics of experimental and putative PTM sites in dbPTM

| PTM type | Number of experimental substrate sites | Number of putative substrate sites from UniProtKB/Swiss-Prot | Number of HMM-predicted sites |
| --- | --- | --- | --- |
| Phosphorylation | 142 446 | 74 174 | 1 414 879 |
| Ubiquitylation | 23 647 | 1702 | 8865 |
| N-linked glycosylation | 15 242 | 87 529 | 418 253 |
| Acetylation | 9683 | 19 981 | 1156 |
| O-linked glycosylation | 3508 | 3695 | 373 758 |
| Amidation | 2533 | 1445 | 114 034 |
| Hydroxylation | 1629 | 1274 | 9743 |
| Methylation | 1585 | 5479 | 22 332 |
| Pyrrolidone carboxylic acid | 829 | 742 | 12 322 |
| Sumoylation | 725 | 800 | 13 042 |
| Gamma-carboxyglutamic acid | 448 | 814 | 1942 |
| Palmitoylation | 312 | 5252 | 33 830 |
| Sulfation | 207 | 800 | 70 005 |
| Myristoylation | 178 | 1275 | 988 |
| C-linked glycosylation | 156 | 99 | 3923 |
| Prenylation | 130 | 1327 | 6741 |
| Nitration | 80 | 93 | 1432 |
| Deamidation | 52 | 165 | 2022 |
| S-nitrosylation | 3096 | 170 | - |
| Oxidation | 333 | 180 | - |
| ADP-ribosylation | 140 | 164 | - |
| N6-succinyllysine | 88 | 69 | - |
| Formylation | 56 | 125 | - |
| GPI anchoring | 34 | 849 | - |
| Bromination | 33 | 56 | - |
| N6-malonyllysine | 33 | 167 | - |
| Citrullination | 32 | 110 | - |
| N6-carboxylysine | 30 | 1566 | - |
| Glutathionylation | 19 | 32 | - |
| FAD | 19 | 163 | - |
| Others | 1218 | 15 825 | - |
| Total | 208 521 | 226 122 | 2 509 267 |

Enhanced web interface

To facilitate the use of the dbPTM resource, the web interface has been redesigned and enhanced to allow efficient access to the protein of interest. Supplementary Figure S3 shows the content of a typical dbPTM query: (i) quick search by IDs and keywords, (ii) basic information, (iii) graphical visualization of PTM sites with structural characteristics and functional domains, (iv) table of experimental PTM sites with reported literature, (v) orthologous conservation of PTM substrate sites, (vi) PPIs and domain–domain interactions and (vii) literature related to PTMs. The combined visualization of PTM sites and functional domains for a protein sequence can help users to understand the functional associations of PTM substrate sites. Using the multiple sequence alignment of orthologous proteins, users can investigate whether a PTM site is located in an evolutionarily conserved region, which would suggest that the orthologous sites in other species could be involved in the same modification. Additionally, this update incorporates the protein functional domains and domain–domain interactions to infer PTM-dependent protein interactions. Moreover, the literature associated with PTMs is categorized by modification type.

In addition to the database query by protein name, gene name, UniProtKB ID or accession, a protein sequence can be submitted for a homology search against the UniProtKB protein sequence database using the BLAST (65) program. For the browse function of the dbPTM web site, a summary table of PTM types and their modified residues allows users to efficiently access the number of data entries for a specific modified amino acid of a PTM type. The annotations of PTM types refer to the UniProtKB/Swiss-Prot PTM list (http://www.uniprot.org/docs/ptmlist.txt). As depicted in Supplementary Figure S4, choosing the acetylation of lysine (K) gives more detailed information, such as the location of the modification in the protein sequence, the modified chemical formula, the mass difference and the substrate site specificity, that is, the preference of amino acids surrounding the modification sites. The structural characteristics, such as solvent accessibility and secondary structure surrounding the PTM substrate sites, are also provided. Additionally, the substrate site specificity of the acetylated lysines is investigated in detail with reference to the subcellular localizations of acetylated proteins. Previous work has demonstrated that the co-localization of acetyltransferases and substrate proteins could be a promising means of investigating substrate site specificities and could be adopted to improve the computational identification of protein acetylation sites (66).

Investigation of PTM substrate site specificities

Given a window length, n, the fragment of 2n + 1 residues centered on the PTM site (position 0) is extracted and the positional frequencies of amino acids are calculated and presented as sequence logos by WebLogo (67). Supplementary Figure S5 shows the substrate motif and structural characteristics of experimental phosphorylation sites. According to the kinase classification extracted from KinBase (http://kinase.com/) and RegPhos (10), the substrate site specificity of protein phosphorylation can be further categorized into >200 kinase groups. As given in Supplementary Figure S5, most of the kinase-specific substrate motifs have conserved amino acids surrounding the phosphorylation sites. For PTMs other than phosphorylation, there are no annotations of catalytic enzymes or transferases, owing to the experimental difficulty of identifying the catalytic enzymes for a specific PTM. Based on the basic concept of sequence conservation, a sequence logo can display the substrate motif for each PTM type from a group of aligned sequences. However, it is difficult to explore conserved motifs in large-scale sequence data; for instance, a sequence logo built from all phosphorylation data, which involve various catalytic kinases, fails to clearly present the kinase-specific substrate specificity. Thus, for each PTM with sufficient experimental substrate sites, MDDLogo was used to cluster a group of aligned substrate sequences into subgroups containing statistically significant motifs. In the example of protein S-nitrosylation presented in Figure 2, 10 sequence logos, which were identified from 3095 S-nitrosylated peptides with a 13-mer window length, contain a conserved motif of positively charged amino acids (K, R and H) surrounding the S-nitrosocysteine. Interestingly, the first and sixth groups contain conserved motifs of negatively charged amino acids (D and E) accompanied by positively charged amino acids at two specific positions. Consistent with previous studies (68–73), the S-nitrosylated cysteines may be located within an acid-base motif flanked by acidic and basic amino acids.
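The window extraction and positional frequency calculation described at the start of this section can be sketched in Python as follows. The function names, the use of "-" to pad windows that run past a protein terminus and the example sequences are assumptions for illustration. The resulting frequencies are the kind of input from which a sequence logo is drawn.

```python
from collections import Counter


def extract_window(seq, site, n, pad="-"):
    """Return the 2n + 1 fragment centred on a 1-based site, padded at the termini."""
    idx = site - 1
    left = seq[max(0, idx - n):idx]
    right = seq[idx + 1:idx + 1 + n]
    return pad * (n - len(left)) + left + seq[idx] + right + pad * (n - len(right))


def positional_frequencies(windows, pad="-"):
    """Per-column amino acid frequencies over equally long windows, ignoring padding."""
    freqs = []
    for p in range(len(windows[0])):
        counts = Counter(w[p] for w in windows if w[p] != pad)
        total = sum(counts.values())
        freqs.append({aa: c / total for aa, c in counts.items()} if total else {})
    return freqs


# Hypothetical sequence; n = 3 gives a 7-mer (n = 6 would give the 13-mer used for Figure 2).
print(extract_window("MKTAYCAKQR", 6, 3))  # TAYCAKQ
print(extract_window("MKTAYCAKQR", 2, 3))  # --MKTAY
print(positional_frequencies(["AKCDE", "RKCDA"])[0])
```

Column index n of each window corresponds to position 0 (the modified residue), so index n − 7 would be position −7 in a window with n ≥ 7.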

Figure 2. The MDDLogo-identified substrate motifs of protein S-nitrosylation sites.

Investigation of PTM-associated domains and protein interactions

According to the data statistics in dbPTM, >60% of experimentally verified PTM sites are located in the functional domains of proteins. Such statistics can be analyzed in detail for each PTM type. For instance, for protein S-nitrosylation, an emerging PTM that plays a crucial role in the regulation of NO-related cellular processes, the statistics show that ∼70% of the reported S-nitrosylation sites are located within functional domains. Furthermore, the detailed distribution of functional domains covering S-nitrosylation sites is given in Supplementary Table S4. The most preferred functional domain is the ‘nucleotide-binding alpha–beta plait’ (InterPro ID: IPR012677), which covers 47 S-nitrosylation sites. Another preferred functional domain is the ‘RNA recognition motif, RNP-1’ domain (InterPro ID: IPR000504), which covers 46 S-nitrosylation sites. This investigation indicates that these S-nitrosylation sites may play important roles in the domains of proteins involved in DNA or RNA binding (74). In addition, Supplementary Table S5 shows the distribution of functional domains covering substrate sites for several representative PTMs, including acetylation, methylation, hydroxylation, N-linked and O-linked glycosylation, phosphorylation and ubiquitylation.
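The domain-coverage statistics described above amount to an interval lookup, sketched below in Python. The accessions, domain coordinates and site positions are invented placeholders, and the function names are assumptions. Only the two InterPro IDs are taken from the text.

```python
from collections import Counter


def domains_covering(site, domains):
    """Return IDs of domains whose 1-based inclusive [start, end] range contains site."""
    return [dom_id for dom_id, start, end in domains if start <= site <= end]


def domain_statistics(sites, domains):
    """Return (fraction of sites inside any domain, per-domain site counts)."""
    per_domain = Counter()
    inside = 0
    for protein, pos in sites:
        hits = domains_covering(pos, domains.get(protein, []))
        if hits:
            inside += 1
            per_domain.update(hits)
    fraction = inside / len(sites) if sites else 0.0
    return fraction, per_domain


# Placeholder annotations: accession -> [(InterPro ID, start, end)]; coordinates are invented.
domains = {
    "P_EXAMPLE1": [("IPR012677", 10, 90)],
    "P_EXAMPLE2": [("IPR000504", 5, 60)],
}
sites = [("P_EXAMPLE1", 42), ("P_EXAMPLE1", 120), ("P_EXAMPLE2", 30), ("P_EXAMPLE3", 7)]
fraction, per_domain = domain_statistics(sites, domains)
print(fraction, dict(per_domain))
```

Sorting `per_domain` by count would give a ranking of the kind reported in Supplementary Table S4.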

Many PTMs provide binding sites for specific protein-interaction domains, which often contain a conserved structure for the modified site and a more flexible surface for the flanking amino acids and which synergize to regulate cellular processes (75–78). In order to investigate PTM-associated protein interactions, the information on domain–domain interactions collected from InterDom is adopted in this study. In the case study of ‘Histone H3’ (UniProtKB ID: H31_HUMAN) presented in Figure 3, ‘Heterochromatin protein 1 homolog alpha’ (‘HP1’, UniProtKB ID: CBX5_HUMAN) and ‘WD repeat-containing protein 5’ (‘WDR5’, UniProtKB ID: WDR5_HUMAN) interact with ‘Histone H3’. A detailed look at the protein interaction between ‘HP1’ and ‘Histone H3’ reveals a domain–domain interaction between ‘Chromodomain’ (InterPro ID: IPR000953) and ‘Histone H3’ (InterPro ID: IPR000164). Among the PTMs located in the ‘Histone H3’ domain, a previous study has demonstrated that the ‘HP1 chromodomain’ can bind to ‘Histone H3’ methylated at lysine 10 (79). The other protein interaction involves a domain–domain interaction between the ‘WD40 Repeat’ (InterPro ID: IPR001680) and ‘Histone Core’ (InterPro ID: IPR007125). It has been proposed that the structural motif for the specific recognition of methylated ‘Histone H3’ lysine 5 by the ‘WD40 Repeat’ of ‘WDR5’ is essential to vertebrate development (80,81). This investigation indicates that the other PTM sites could be potential binding sites for protein-interaction domains.
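One way to combine the three data types used in this case study (PPIs, domain–domain interactions and PTM sites) is sketched below in Python. The join logic and the function name are assumptions, not the dbPTM implementation. The domain coordinates are invented placeholders. The identifiers and the lysine 10 methylation site are taken from the text.

```python
def ptm_associated_interactions(ppis, ddis, domains, ptm_sites):
    """List PTM sites lying in a domain that interacts with a domain of a PPI partner."""
    ddi_set = {frozenset(pair) for pair in ddis}
    results = []
    for a, b in ppis:
        # Consider each protein of the pair as the modified substrate in turn.
        for substrate, partner in ((a, b), (b, a)):
            for dom_s, start, end in domains.get(substrate, []):
                for dom_p, _, _ in domains.get(partner, []):
                    if frozenset((dom_s, dom_p)) not in ddi_set:
                        continue
                    for pos, ptm in ptm_sites.get(substrate, []):
                        if start <= pos <= end:
                            results.append((substrate, pos, ptm, dom_s, partner, dom_p))
    return results


ppis = [("H31_HUMAN", "CBX5_HUMAN")]
ddis = [("IPR000953", "IPR000164")]
# Domain coordinates below are invented placeholders, not database values.
domains = {
    "H31_HUMAN": [("IPR000164", 1, 136)],
    "CBX5_HUMAN": [("IPR000953", 20, 79)],
}
ptm_sites = {"H31_HUMAN": [(10, "Methylation")]}

for hit in ptm_associated_interactions(ppis, ddis, domains, ptm_sites):
    print(hit)
```

Each hit is only a candidate PTM-associated interaction. Whether the modification actually mediates binding, as reported for the HP1 chromodomain, still requires experimental support.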

Figure 3. A case study of domain–domain interactions and PTM-associated protein interactions on Histone H3 (UniProtKB ID: H31_HUMAN).

Investigation of PTM sites on TM proteins

According to the data statistics of PTM sites and TM proteins in dbPTM, a total of 9644 and 68 775 PTM substrate sites are located on the 2088 experimental and 33 747 potential TM proteins, respectively. In order to investigate the structural distribution of PTM sites on TM proteins, the structural topologies of a TM protein are categorized into four types: extracellular, cytoplasmic, TM and unknown regions. Supplementary Table S6 provides the structural distribution of PTMs containing >10 substrate sites on experimental TM proteins. Interestingly, when substrate sites located in unknown regions are excluded, all of the N-linked (GlcNAc …) glycosylation sites are located in the extracellular region, as are the O-linked and C-linked glycosylation sites. This observation is consistent with the biological role of glycosylation on TM proteins in receptor targeting and cell–cell interactions (47). In contrast, the phosphorylation sites are mainly located in cytoplasmic regions, where they induce signal transduction and ion transport. The structural distribution of PTM sites can thus be a means of inferring the potential roles of PTMs on TM proteins. Indeed, previous work has demonstrated that incorporating membrane topology can improve the performance of predicting O-linked glycosylation sites on TM proteins (82). Supplementary Figure S6 shows a graphical visualization of the PTMs and membrane topology of the human Beta-2 adrenergic receptor (ADRB2). Furthermore, two modification sites, Tyr141 (pTyr) and Cys341 (S-palmitoyl cysteine), are highlighted in red on the tertiary structure (PDB ID: 2R4R) using the Jmol viewer, which indicates their solvent accessibility and the distance between them.
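The assignment of PTM sites to the four topology categories can be sketched in Python as follows. The topology segments and site positions are invented for a hypothetical protein, and the function names are assumptions. Positions outside every annotated segment are counted as unknown.

```python
from collections import Counter


def region_of(site, topology):
    """Return the topology label of a 1-based position, or 'unknown' if unannotated."""
    for region, start, end in topology:
        if start <= site <= end:
            return region
    return "unknown"


def topology_distribution(sites, topology):
    """Count PTM sites per (PTM type, topology region)."""
    return Counter((ptm, region_of(pos, topology)) for pos, ptm in sites)


# Invented topology for a hypothetical single-pass TM protein.
topology = [("extracellular", 1, 30), ("TM", 31, 53), ("cytoplasmic", 54, 120)]
sites = [
    (15, "N-linked glycosylation"),
    (80, "Phosphorylation"),
    (95, "Phosphorylation"),
    (200, "Phosphorylation"),
]
for (ptm, region), count in topology_distribution(sites, topology).items():
    print(ptm, region, count)
```

Summing such counts over all TM proteins, per PTM type, would give a distribution of the kind shown in Supplementary Table S6.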

CONCLUSION

The expansion of the dbPTM database increases its usefulness for researchers investigating the impact of PTMs on protein function and cellular processes. Additionally, the enhanced web interface enables both wet-lab biologists and bioinformatics researchers to efficiently explore detailed information about protein PTMs. Table 2 summarizes the advancements and new features supported in dbPTM 3.0. In the future, we expect dbPTM to continue to grow with the increasing availability of data in resources such as Phospho.ELM, PhosphoSitePlus and UniProtKB. One area in which we envision dbPTM improving greatly is the implementation of a more accurate method for discovering PTM substrate motifs. In addition, enhancements to the text-mining algorithm will enable the system to select MS/MS peptides associated with protein modifications from research articles with higher confidence. To provide more adequate information on PTM function, descriptions of the biological function of PTMs will be extracted from research articles using an information retrieval system. Moreover, the thermodynamic parameters for proteins (83), PPIs (84) and protein–nucleic acid interactions (85) could be integrated for the investigation of PTM-associated protein stability.

Table 2.

Advances and improvements in this update (dbPTM 3.0)

| Features | dbPTM 1.0 | dbPTM 2.0 | dbPTM 3.0 |
| --- | --- | --- | --- |
| Protein entry | UniProtKB/Swiss-Prot (release 46) | UniProtKB/Swiss-Prot (release 55) | UniProtKB release 2012-04 |
| Experimental PTM resource | UniProtKB/Swiss-Prot, Phospho.ELM and O-GLYCBASE | UniProtKB/Swiss-Prot, Phospho.ELM, PHOSIDA, HPRD, O-GLYCBASE and UbiProt | UniProtKB/Swiss-Prot, HPRD, SysPTM, Phospho.ELM, PhosphoSitePlus, PHOSIDA, O-GLYCBASE, dbOGAP, dbSNO, UbiProt and PupDB |
| Literature survey of PTMs | – | – | >5000 modified peptides extracted from ∼800 articles |
| Literature related to PTMs | – | Yes | Yes (categorized by PTM types) |
| Computationally predicted PTMs | Phosphorylation, glycosylation and sulfation | 20 types of PTM | 18 types of PTM |
| Protein tertiary structure | Protein Data Bank (PDB) | Protein Data Bank (PDB) | Protein Data Bank (PDB) |
| Structural properties of PTM sites | Amino acid frequency | Amino acid frequency, solvent accessibility and secondary structure | Amino acid frequency, solvent accessibility, secondary structure and intrinsic disorder region |
| PTM annotation | RESID (373 PTM annotations) | RESID (431 PTM annotations) | RESID (431 PTM annotations) |
| Kinase family annotation | – | KinBase | KinBase and RegPhos |
| Protein functional domain | InterPro | InterPro | InterPro and InterProScan |
| Protein–protein interaction | – | – | DIP, MINT, IntAct, HPRD and STRING |
| Domain–domain interaction | – | – | InterDom |
| Functional association of PTM | – | – | PTM-associated domains and PTM-dependent protein interactions |
| PTM substrate motif | – | WebLogo | WebLogo and MDDLogo |
| Evolutionary conservation of PTM sites | – | ClustalW | ClustalW and COG |
| Transmembrane topology | – | – | TMPad, PDBTM, TOPDB and OPM |
| Graphical visualization | PTM, solvent accessibility, protein variation and protein domain | PTM, solvent accessibility, secondary structure, protein variation, protein domain, tertiary structure, orthologous conservation and sequence logo | PTM, solvent accessibility, secondary structure, protein variation, protein domain, tertiary structure, orthologous conservation, sequence logo, PTM substrate motifs, domain–domain interaction, protein–protein interaction, transmembrane topology and tertiary structure of PTMs |

AVAILABILITY

The data content of dbPTM will be maintained regularly and updated semiannually. The resource is now available at http://dbPTM.mbc.nctu.edu.tw/.

FUNDING

National Science Council of the Republic of China financial support, [contract no. 101-2628-E-155-002-MY2, NSC 101-2311-B-009-003-MY3, NSC 100-2627-B-009-002, NSC 101-2911-I-009-101 and NSC 101-2319-B-400-001]. Funding for open access charge: National Science Council of Taiwan.

Conflict of interest statement. None declared.

REFERENCES

1. Mann M, Jensen ON. Proteomic analysis of post-translational modifications. Nat. Biotechnol., 2003, vol. 21 (pg. 255-261).
2. Farriol-Mathis N, Garavelli JS, Boeckmann B, Duvaud S, Gasteiger E, Gateau A, Veuthey AL, Bairoch A. Annotation of post-translational modifications in the Swiss-Prot knowledge base. Proteomics, 2004, vol. 4 (pg. 1537-1550).
3. Seo J, Lee KJ. Post-translational modifications and their biological functions: proteomic analysis and systematic approaches. J. Biochem. Mol. Biol., 2004, vol. 37 (pg. 35-44).
4. Dinkel H, Chica C, Via A, Gould CM, Jensen LJ, Gibson TJ, Diella F. Phospho.ELM: a database of phosphorylation sites—update 2011. Nucleic Acids Res., 2011, vol. 39 (pg. D261-D267).
5. Wurgler-Murphy SM, King DM, Kennelly PJ. The Phosphorylation Site Database: a guide to the serine-, threonine-, and/or tyrosine-phosphorylated proteins in prokaryotic organisms. Proteomics, 2004, vol. 4 (pg. 1562-1570).
6. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, Latham V, Sullivan M. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res., 2012, vol. 40 (pg. D261-D270).
7. Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol., 2007, vol. 8 pg. R250.
8. Heazlewood JL, Durek P, Hummel J, Selbig J, Weckwerth W, Walther D, Schulze WX. PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor. Nucleic Acids Res., 2008, vol. 36 (pg. D1015-D1021).
9. Linding R, Jensen LJ, Pasculescu A, Olhovsky M, Colwill K, Bork P, Yaffe MB, Pawson T. NetworKIN: a resource for exploring cellular phosphorylation networks. Nucleic Acids Res., 2008, vol. 36 (pg. D695-D699).
10. Lee TY, Bo-Kai Hsu J, Chang WC, Huang HD. RegPhos: a system to explore the protein kinase-substrate phosphorylation network in humans. Nucleic Acids Res., 2011, vol. 39 (pg. D777-D787).
11. Gupta R, Birch H, Rapacki K, Brunak S, Hansen JE. O-GLYCBASE version 4.0: a revised database of O-glycosylated proteins. Nucleic Acids Res., 1999, vol. 27 (pg. 370-372).
12. Wang J, Torii M, Liu H, Hart GW, Hu ZZ. dbOGAP—an integrated bioinformatics resource for protein O-GlcNAcylation. BMC Bioinformatics, 2011, vol. 12 pg. 91.
13. Chernorudskiy AL, Garcia A, Eremin EV, Shorina AS, Kondratieva EV, Gainullin MR. UbiProt: a database of ubiquitylated proteins. BMC Bioinformatics, 2007, vol. 8 pg. 126.
14. Tung CW. PupDB: a database of pupylated proteins. BMC Bioinformatics, 2012, vol. 13 pg. 40.
15. Lee TY, Chen YJ, Lu CT, Ching WC, Teng YC, Huang HD. dbSNO: a database of cysteine S-nitrosylation. Bioinformatics, 2012, vol. 28 (pg. 2293-2295).
16. Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, et al. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res., 2004, vol. 32 (pg. D115-D119).
17. Li H, Xing X, Ding G, Li Q, Wang C, Xie L, Zeng R, Li Y. SysPTM: a systematic resource for proteomic research on post-translational modifications. Mol. Cell Proteomics, 2009, vol. 8 (pg. 1839-1849).
18. Keshava Prasad TS, Goel R, Kandasamy K, Keerthikumar S, Kumar S, Mathivanan S, Telikicherla D, Raju R, Shafreen B, Venugopal A, et al. Human Protein Reference Database—2009 update. Nucleic Acids Res., 2009, vol. 37 (pg. D767-D772).
19. Lee TY, Huang HD, Hung JH, Huang HY, Yang YS, Wang TH. dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res., 2006, vol. 34 (pg. D622-D627).
20. Huang HD, Lee TY, Tzeng SW, Wu LC, Horng JT, Tsou AP, Huang KT. Incorporating hidden Markov models for identifying protein kinase-specific phosphorylation sites. J. Comput. Chem., 2005, vol. 26 (pg. 1032-1041).
21. Huang HD, Lee TY, Tzeng SW, Horng JT. KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites. Nucleic Acids Res., 2005, vol. 33 (pg. W226-W229).
22. Wong YH, Lee TY, Liang HK, Huang CM, Wang TY, Yang YH, Chu CH, Huang HD, Ko MT, Hwang JK. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res., 2007, vol. 35 (pg. W588-W594).
23. Lee TY, Hsu JB, Chang WC, Wang TY, Hsu PC, Huang HD. A comprehensive resource for integrating and displaying protein post-translational modifications. BMC Res. Notes, 2009, vol. 2 pg. 111.
24. Lee TY, Lin ZQ, Hsieh SJ, Bretana NA, Lu CT. Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences. Bioinformatics, 2011, vol. 27 (pg. 1780-1787).
25. Seet BT, Dikic I, Zhou MM, Pawson T. Reading protein modifications with interaction domains. Nat. Rev. Mol. Cell Biol., 2006, vol. 7 (pg. 473-483).
26. Gnad F, Gunawardena J, Mann M. PHOSIDA 2011: the posttranslational modification database. Nucleic Acids Res., 2011, vol. 39 (pg. D253-D260).
27. Mishra GR, Suresh M, Kumaran K, Kannabiran N, Suresh S, Bala P, Shivakumar K, Anuradha N, Reddy R, Raghavan TM, et al. Human protein reference database—2006 update. Nucleic Acids Res., 2006, vol. 34 (pg. D411-D414).
28. Garavelli JS. The RESID Database of Protein Modifications as a resource and annotation tool. Proteomics, 2004, vol. 4 (pg. 1527-1533).
29. Lee TY, Bretana NA, Lu CT. PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity. BMC Bioinformatics, 2011, vol. 12 pg. 261.
30. Bretana NA, Lu CT, Chiang CY, Su MG, Huang KY, Lee TY, Weng SL. Identifying protein phosphorylation sites with kinase substrate specificity on human viruses. PLoS One, 2012, vol. 7 pg. e40694.
31. Lee TY, Chen YJ, Lu TC, Huang HD. SNOSite: exploiting maximal dependence decomposition to identify cysteine S-nitrosylation with substrate site specificity. PLoS One, 2011, vol. 6 pg. e21849.
32. Bradshaw JM, Waksman G. Molecular recognition by SH2 domains. Adv. Protein Chem., 2002, vol. 61 (pg. 161-210).
33. Verkhivker GM, Bouzida D, Gehlhaar DK, Rejto PA, Schaffer L, Arthurs S, Colson AB, Freer ST, Larson V, Luty BA, et al. Hierarchy of simulation models in predicting molecular recognition mechanisms from the binding energy landscapes: structural analysis of the peptide complexes with SH2 domains. Proteins, 2001, vol. 45 (pg. 456-470).
34. Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Das U, Daugherty L, Duquenne L, et al. InterPro: the integrative protein signature database. Nucleic Acids Res., 2009, vol. 37 (pg. D211-D215).
35. Bairoch A. PROSITE: a dictionary of sites and patterns in proteins. Nucleic Acids Res., 1991, vol. 19 Suppl. (pg. 2241-2245).
36. Attwood TK, Beck ME, Bleasby AJ, Parry-Smith DJ. PRINTS—a database of protein motif fingerprints. Nucleic Acids Res., 1994, vol. 22 (pg. 3590-3596).
37. Sonnhammer EL, Eddy SR, Durbin R. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins, 1997, vol. 28 (pg. 405-420).
38. Corpet F, Gouzy J, Kahn D. The ProDom database of protein domain families. Nucleic Acids Res., 1998, vol. 26 (pg. 323-326).
39. Xenarios I, Salwinski L, Duan XJ, Higney P, Kim SM, Eisenberg D. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res., 2002, vol. 30 (pg. 303-305).
40. Chatr-Aryamontri A, Ceol A, Palazzi LM, Nardelli G, Schneider MV, Castagnoli L, Cesareni G. MINT: the Molecular INTeraction database. Nucleic Acids Res., 2006, vol. 35 (pg. D572-D574).
41. Kerrien S, Alam-Faruque Y, Aranda B, Bancarz I, Bridge A, Derow C, Dimmer E, Feuermann M, Friedrichsen A, Huntley R, et al. IntAct—open source resource for molecular interaction data. Nucleic Acids Res., 2007, vol. 35 (pg. D561-D565).
42. von Mering C, Jensen LJ, Kuhn M, Chaffron S, Doerks T, Kruger B, Snel B, Bork P. STRING 7—recent developments in the integration and prediction of protein interactions. Nucleic Acids Res., 2007, vol. 35 (pg. D358-D362).
43. Ng SK, Zhang Z, Tan SH, Lin K. InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes. Nucleic Acids Res., 2003, vol. 31 (pg. 251-254).
44. Vinothkumar KR, Henderson R. Structures of membrane proteins. Q. Rev. Biophys., 2010, vol. 43 (pg. 65-158).
45. Wallin E, von Heijne G. Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci., 1998, vol. 7 (pg. 1029-1038).
46. Rose PW, Beran B, Bi C, Bluhm WF, Dimitropoulos D, Goodsell DS, Prlic A, Quesada M, Quinn GB, Westbrook JD, et al. The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res., 2011, vol. 39 (pg. D392-D401).
47. Ackers GK, Smith FR. Effects of site-specific amino acid modification on protein interactions and biological function. Annu. Rev. Biochem., 1985, vol. 54 (pg. 597-629).
48. Lo A, Cheng CW, Chiu YY, Sung TY, Hsu WL. TMPad: an integrated structural database for helix-packing folds in transmembrane proteins. Nucleic Acids Res., 2011, vol. 39 (pg. D347-D355).
49. Tusnady GE, Kalmar L, Simon I. TOPDB: topology data bank of transmembrane proteins. Nucleic Acids Res., 2008, vol. 36 (pg. D234-D239).
50. Tusnady GE, Dosztanyi Z, Simon I. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res., 2005, vol. 33 (pg. D275-D278).
51. Lomize MA, Lomize AL, Pogozheva ID, Mosberg HI. OPM: orientations of proteins in membranes database. Bioinformatics, 2006, vol. 22 (pg. 623-625).
52. Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, et al. The Universal Protein Resource (UniProt). Nucleic Acids Res., 2005, vol. 33 (pg. D154-D159).
53. Nugent T, Jones DT. Transmembrane protein topology prediction using support vector machines. BMC Bioinformatics, 2009, vol. 10 pg. 159.
54. Herraez A. Biomolecules in the computer: Jmol to the rescue. Biochem. Mol. Biol. Educ., 2006, vol. 34 (pg. 255-261).
55. The Gene Ontology Consortium. The Gene Ontology: enhancements for 2011. Nucleic Acids Res., 2011, vol. 40 (pg. D559-D564).
56. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics, 2003, vol. 4 pg. 41.
57. Binns D, Dimmer E, Huntley R, Barrell D, O’Donovan C, Apweiler R. QuickGO: a web-based tool for Gene Ontology searching. Bioinformatics, 2009, vol. 25 (pg. 3045-3046).
58. Kabsch W, Sander C. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 1983, vol. 22 (pg. 2577-2637).
59. Shien DM, Lee TY, Chang WC, Hsu JB, Horng JT, Hsu PC, Wang TY, Huang HD. Incorporating structural characteristics for identification of protein methylation sites. J. Comput. Chem., 2009, vol. 30 (pg. 1532-1543).
60. Lu CT, Chen SA, Bretana NA, Cheng TH, Lee TY. Carboxylator: incorporating solvent-accessible surface area for identifying protein carboxylation sites. J. Comput. Aided Mol. Des., vol. 25 (pg. 987-995).
61. Lee TY, Hsu JB, Lin FM, Chang WC, Hsu PC, Huang HD. N-Ace: using solvent accessibility and physicochemical properties to identify protein N-acetylation sites. J. Comput. Chem., vol. 31 (pg. 2759-2771).
62. Ahmad S, Gromiha MM, Sarai A. RVP-net: online prediction of real valued accessible surface area of proteins from single sequences. Bioinformatics, 2003, vol. 19 (pg. 1849-1851).
63. McGuffin LJ, Bryson K, Jones DT. The PSIPRED protein structure prediction server. Bioinformatics, 2000, vol. 16 (pg. 404-405).
64. Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res., 1994, vol. 22 (pg. 4673-4680).
65. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 1997, vol. 25 (pg. 3389-3402).
66. Lee TY, Hsu JB, Lin FM, Chang WC, Hsu PC, Huang HD. N-Ace: using solvent accessibility and physicochemical properties to identify protein N-acetylation sites. J. Comput. Chem., 2010, vol. 31 (pg. 2759-2771).
67. Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Res., 2004, vol. 14 (pg. 1188-1190).
68. Hao G, Derakhshan B, Shi L, Campagne F, Gross SS. SNOSID, a proteomic method for identification of cysteine S-nitrosylation sites in complex protein mixtures. Proc. Natl Acad. Sci. USA, 2006, vol. 103 (pg. 1012-1017).
69. Greco TM, Hodara R, Parastatidis I, Heijnen HF, Dennehy MK, Liebler DC, Ischiropoulos H. Identification of S-nitrosylation motifs by site-specific mapping of the S-nitrosocysteine proteome in human vascular smooth muscle cells. Proc. Natl Acad. Sci. USA, 2006, vol. 103 (pg. 7420-7425).
70. Lane P, Hao G, Gross SS. S-nitrosylation is emerging as a specific and fundamental posttranslational protein modification: head-to-head comparison with O-phosphorylation. Sci STKE, 2001, vol. 2001 pg. re1.
71. Stamler JS, Toone EJ, Lipton SA, Sucher NJ. (S)NO signals: translocation, regulation, and a consensus motif. Neuron, 1997, vol. 18 (pg. 691-696).
72. Greco TM, Hodara R, Parastatidis I, Heijnen HG, Dennehy MK, Liebler DC, Ischiropoulos H. Identification of S-nitrosylation motifs by site-specific mapping of the S-nitrosocysteine proteome in human vascular smooth muscle cells. Proc. Natl Acad. Sci. USA, 2006, vol. 103 (pg. 7420-7425).
73. Chen Y-J, Ku W-C, Lin P-Y, Chou H-C, Khoo K-H, Chen Y-J. S-alkylating labeling strategy for site-specific identification of the s-nitrosoproteome. J. Proteome Res., 2010, vol. 9 (pg. 6417-6439).
74. delaTorre A, Schroeder RA, Bartlett ST, Kuo PC. Differential effects of nitric oxide-mediated S-nitrosylation on p50 and c-jun DNA binding. Surgery, 1998, vol. 124 (pg. 137-141; discussion 141–132).
75. Su D, Hu Q, Li Q, Thompson JR, Cui G, Fazly A, Davies BA, Botuyan MV, Zhang Z, Mer G. Structural basis for recognition of H3K56-acetylated histone H3-H4 by the chaperone Rtt106. Nature, 2012, vol. 483 (pg. 104-107).
76. Umehara T, Nakamura Y, Jang MK, Nakano K, Tanaka A, Ozato K, Padmanabhan B, Yokoyama S. Structural basis for acetylated histone H4 recognition by the human BRD2 bromodomain. J. Biol. Chem., 2010, vol. 285 (pg. 7610-7618).
77. Owen DJ, Ornaghi P, Yang JC, Lowe N, Evans PR, Ballario P, Neuhaus D, Filetici P, Travers AA. The structural basis for the recognition of acetylated histone H4 by the bromodomain of histone acetyltransferase gcn5p. EMBO J., 2000, vol. 19 (pg. 6141-6149).
78. Durocher D, Taylor IA, Sarbassova D, Haire LF, Westcott SL, Jackson SP, Smerdon SJ, Yaffe MB. The molecular basis of FHA domain:phosphopeptide binding specificity and implications for phospho-dependent signaling mechanisms. Mol. Cell, 2000, vol. 6 (pg. 1169-1182).
79. Nielsen PR, Nietlispach D, Mott HR, Callaghan J, Bannister A, Kouzarides T, Murzin AG, Murzina NV, Laue ED. Structure of the HP1 chromodomain bound to histone H3 methylated at lysine 9. Nature, 2002, vol. 416 (pg. 103-107).
80. Wysocka J, Swigut T, Milne TA, Dou Y, Zhang X, Burlingame AL, Roeder RG, Brivanlou AH, Allis CD. WDR5 associates with histone H3 methylated at K4 and is essential for H3 K4 methylation and vertebrate development. Cell, 2005, vol. 121 (pg. 859-872).
81. Han Z, Guo L, Wang H, Shen Y, Deng XW, Chai J. Structural basis for the specific recognition of methylated histone H3 lysine 4 by the WD-40 protein WDR5. Mol. Cell, 2006, vol. 22 (pg. 137-144).
82. Chen SA, Lee TY, Ou YY. Incorporating significant amino acid pairs to identify O-linked glycosylation sites on transmembrane proteins and non-transmembrane proteins. BMC Bioinformatics, 2010, vol. 11 pg. 536.
83. Gromiha MM, An J, Kono H, Oobatake M, Uedaira H, Sarai A. ProTherm: Thermodynamic Database for Proteins and Mutants. Nucleic Acids Res., 1999, vol. 27 (pg. 286-288).
84. Kumar MD, Gromiha MM. PINT: protein-protein Interactions Thermodynamic Database. Nucleic Acids Res., 2006, vol. 34 (pg. D195-D198).
85. Prabakaran P, An J, Gromiha MM, Selvaraj S, Uedaira H, Kono H, Sarai A. Thermodynamic database for protein-nucleic acid interactions (ProNIT). Bioinformatics, 2001, vol. 17 (pg. 1027-1034).

Author notes

The authors wish it to be known that, in their opinion, the first three authors should be regarded as joint First Authors.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial reuse, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.
