dbPTM 3.0: an informative resource for investigating substrate site specificity and functional association of protein post-translational modifications

Data statistics of experimental and putative PTM sites in dbPTM

PTM types	Number of experimental substrate sites	Number of putative substrate sites from UniProtKB/Swiss-Prot	Number of HMM-predicted sites
Phosphorylation	142 446	74 174	1 414 879
Ubiquitylation	23 647	1702	8865
N-linked glycosylation	15 242	87 529	418 253
Acetylation	9683	19 981	1156
O-linked glycosylation	3508	3695	373 758
Amidation	2533	1445	114 034
Hydroxylation	1629	1274	9743
Methylation	1585	5479	22 332
Pyrrolidone carboxylic acid	829	742	12 322
Sumoylation	725	800	13 042
Gamma-carboxyglutamic acid	448	814	1942
Palmitoylation	312	5252	33 830
Sulfation	207	800	70 005
Myristoylation	178	1275	988
C-linked glycosylation	156	99	3923
Prenylation	130	1327	6741
Nitration	80	93	1432
Deamidation	52	165	2022
S-nitrosylation	3096	170	–
Oxidation	333	180	–
ADP-ribosylation	140	164	–
N6-succinyllysine	88	69	–
Formylation	56	125	–
GPI anchoring	34	849	–
Bromination	33	56	–
N6-malonyllysine	33	167	–
Citrullination	32	110	–
N6-carboxylysine	30	1566	–
Glutathionylation	19	32	–
FAD	19	163	–
Others	1218	15 825	–
Total	208 521	226 122	2 509 267

Table 1.

Enhanced web interface

To facilitate the use of the dbPTM resource, the web interface has been redesigned and enhanced to allow efficient access to the protein of interest. Supplementary Figure S3 shows the content of a typical dbPTM query: (i) quick search by IDs and keywords, (ii) basic information, (iii) graphical visualization of PTM sites with structural characteristics and functional domains, (iv) table of experimental PTM sites with reported literature, (v) orthologous conservation of PTM substrate sites, (vi) PPIs and domain–domain interactions and (vii) literature related to PTMs. The combined visualization of PTM sites and functional domains for a protein sequence can help users understand the functional associations of PTM substrate sites. Using the multiple sequence alignment of orthologous proteins, users can investigate whether a PTM site is located in an evolutionarily conserved region, which would suggest that the orthologous sites in other species could be involved in the same modification. Additionally, this update incorporates protein functional domains and domain–domain interactions to infer PTM-dependent protein interactions. Moreover, the literature associated with PTMs is categorized by modification type.
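The conservation check described above can be illustrated with a minimal Python sketch. It assumes the orthologous sequences are already aligned (gaps written as `-`) and simply reports how many orthologs carry the same residue in the alignment column of the PTM site. The function names and the toy alignment are hypothetical and are not taken from dbPTM.

```python
def alignment_column(aligned_query, site):
    """Map a 1-based ungapped position in the query to its 0-based alignment column."""
    seen = 0
    for col, ch in enumerate(aligned_query):
        if ch != "-":
            seen += 1
            if seen == site:
                return col
    raise ValueError("site lies outside the query sequence")


def site_conservation(aligned_query, aligned_orthologs, site):
    """Return the modified residue and the fraction of orthologs sharing it."""
    col = alignment_column(aligned_query, site)
    residue = aligned_query[col]
    if not aligned_orthologs:
        return residue, 0.0
    matches = sum(1 for seq in aligned_orthologs if seq[col] == residue)
    return residue, matches / len(aligned_orthologs)


# Toy alignment (invented): the query is modified at ungapped position 3.
query = "MK-SPLR"
orthologs = ["MKASPLR", "MR-SPIR", "MK-TPLR"]
residue, fraction = site_conservation(query, orthologs, site=3)
```

In this toy case the serine at position 3 is shared by two of the three orthologs. A real analysis would also need a rule for gapped columns and for conservative substitutions (for example S versus T), which this sketch does not attempt.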

In addition to database queries by protein name, gene name, UniProtKB ID or accession, a protein sequence can be submitted for a homology search against the UniProtKB protein sequence database using the BLAST program (65). For the browse function of the dbPTM web site, a summary table of PTM types and their modified residues lets users efficiently access the number of entries for a specific modified amino acid of a PTM type. The annotations of PTM types follow the UniProtKB/Swiss-Prot PTM list (http://www.uniprot.org/docs/ptmlist.txt). As depicted in Supplementary Figure S4, selecting the acetylation of lysine (K) yields more detailed information, such as the location of the modification in the protein sequence, the chemical formula of the modification, the mass difference and the substrate site specificity, that is, the preference of amino acids surrounding the modification sites. The structural characteristics surrounding the PTM substrate sites, such as solvent accessibility and secondary structure, are also provided. Additionally, the substrate site specificity of the acetylated lysines is investigated in detail with reference to the subcellular localizations of acetylated proteins. Previous work has demonstrated that the co-localization of acetyltransferases and substrate proteins could be a promising way to investigate substrate site specificities and could be adopted to improve the computational identification of protein acetylation sites (66).

Investigation of PTM substrate site specificities

Given a window length n, the fragment of 2n + 1 residues centered on the PTM site (position 0) is extracted, and the positional frequencies of amino acids are calculated and presented as sequence logos by WebLogo (67). Supplementary Figure S5 shows the substrate motif and structural characteristics of experimental phosphorylation sites. According to the kinase classification extracted from KinBase (http://kinase.com/) and RegPhos (10), the substrate site specificity of protein phosphorylation can be further categorized into >200 kinase groups. As shown in Supplementary Figure S5, most of the kinase-specific substrate motifs have conserved amino acids surrounding the phosphorylation sites. For PTMs other than phosphorylation, there are no annotations of catalytic enzymes or transferases because of the experimental difficulty of identifying the catalytic enzymes for a specific PTM. Based on the concept of sequence conservation, a sequence logo can display the substrate motif for each PTM type from a group of aligned sequences. However, conserved motifs are difficult to discern in large-scale sequence data; for instance, a single sequence logo built from all phosphorylation data, which involve various catalytic kinases, fails to reveal the kinase-specific substrate specificity. Thus, for PTMs with sufficient experimental substrate sites, MDDLogo was applied to cluster a group of aligned substrate sequences into subgroups containing statistically significant motifs. In the example of protein S-nitrosylation presented in Figure 2, 10 sequence logos, identified from 3095 S-nitrosylated peptides with a 13-mer window length, contain a conserved motif of positively charged amino acids (K, R and H) surrounding the S-nitrosocysteine. Interestingly, the first and sixth groups contain conserved motifs of negatively charged amino acids (D and E) accompanied by positively charged amino acids at two specific positions. Consistent with previous studies (68–73), the S-nitrosylated cysteines may be located within an acid-base motif flanked by acidic and basic amino acids.
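The window extraction and positional frequency calculation described at the start of this section can be sketched as follows. This is an illustrative Python sketch, not the dbPTM implementation: the function names, the `-` padding used when a window runs past a terminus, and the toy sequences are all assumptions. A 13-mer window corresponds to n = 6.

```python
from collections import Counter


def extract_window(sequence, site, n, pad="-"):
    """Return the 2n + 1 residues centered on a 1-based site position.

    Positions outside the sequence are filled with `pad`; this padding rule
    is an assumption of the sketch, not something stated in the paper.
    """
    center = site - 1  # convert to a 0-based index
    if not 0 <= center < len(sequence):
        raise ValueError("site lies outside the sequence")
    residues = []
    for i in range(center - n, center + n + 1):
        residues.append(sequence[i] if 0 <= i < len(sequence) else pad)
    return "".join(residues)


def positional_frequencies(windows):
    """Return one {residue: relative frequency} dict per window position."""
    if not windows:
        return []
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    table = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        total = sum(counts.values())
        table.append({aa: c / total for aa, c in counts.items()})
    return table


# Toy data (invented): (sequence, 1-based modified position).
examples = [("MKRCDEAGH", 4), ("AAKCEEL", 4), ("RCD", 2)]
windows = [extract_window(seq, pos, n=2) for seq, pos in examples]
freqs = positional_frequencies(windows)
```

Here `freqs[n]` describes position 0 (the modified residue) and `freqs[n - 1]` describes position -1. A sequence logo additionally converts such frequencies into information content per position, a step this sketch leaves to WebLogo.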

Figure 2.

The MDDLogo-identified substrate motifs of protein S-nitrosylation sites.

Investigation of PTM-associated domains and protein interactions

According to the data statistics in dbPTM, >60% of experimentally verified PTM sites are located in the functional domains of proteins. Such statistics can be analyzed in detail for each type of PTM. For instance, for protein S-nitrosylation, an emerging PTM that plays a crucial role in the regulation of NO-related cellular processes, the statistics show that ∼70% of the reported S-nitrosylation sites are located within functional domains. Furthermore, the detailed distribution of functional domains covering S-nitrosylation sites is given in Supplementary Table S4. The most preferred functional domain is the ‘nucleotide-binding alpha–beta plait’ (InterPro ID: IPR012677), which covers 47 S-nitrosylation sites. Another preferred functional domain is the ‘RNA recognition motif, RNP-1’ domain (InterPro ID: IPR000504), which covers 46 S-nitrosylation sites. This observation indicates that these S-nitrosylation sites may play important roles in the domains of proteins involved in DNA or RNA binding (74). In addition, Supplementary Table S5 shows the distribution of functional domains covering substrate sites for several representative PTMs, including acetylation, methylation, hydroxylation, N-linked and O-linked glycosylation, phosphorylation and ubiquitylation.
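Statistics of this kind reduce to an interval-containment test between site positions and domain boundaries. The following Python sketch shows one way to compute them for a single protein. The function names are hypothetical, and although the two InterPro IDs are those mentioned above, the domain coordinates and site positions are invented for illustration.

```python
def sites_in_domains(sites, domains):
    """Map each domain ID to the PTM sites it covers.

    sites:   iterable of 1-based positions
    domains: list of (domain_id, start, end), 1-based and inclusive
    """
    sites = list(sites)
    covered = {}
    for domain_id, start, end in domains:
        hits = [s for s in sites if start <= s <= end]
        if hits:
            covered.setdefault(domain_id, []).extend(hits)
    return covered


def fraction_in_any_domain(sites, domains):
    """Fraction of sites that fall inside at least one annotated domain."""
    sites = list(sites)
    if not sites:
        return 0.0
    inside = sum(
        1 for s in sites
        if any(start <= s <= end for _, start, end in domains)
    )
    return inside / len(sites)


# Invented coordinates for a hypothetical protein with two overlapping domains.
domains = [("IPR012677", 10, 90), ("IPR000504", 15, 85)]
sites = [12, 50, 120]
coverage = sites_in_domains(sites, domains)
fraction = fraction_in_any_domain(sites, domains)
```

Because InterPro entries can overlap, a site may be counted under several domain IDs in `coverage`, while `fraction_in_any_domain` counts each site once.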

Many PTMs provide binding sites for specific protein-interaction domains, which often contain a conserved structure for the modified site and a more flexible surface for the flanking amino acids; together, these features synergize to regulate cellular processes (75–78). To investigate PTM-associated protein interactions, the domain–domain interactions collected from InterDom are adopted in this study. In the case study of ‘Histone H3’ (UniProtKB ID: H31_HUMAN) presented in Figure 3, ‘Heterochromatin protein 1 homolog alpha’ (‘HP1’, UniProtKB ID: CBX5_HUMAN) and ‘WD repeat-containing protein 5’ (‘WDR5’, UniProtKB ID: WDR5_HUMAN) interact with ‘Histone H3’. A closer look at the interaction between ‘HP1’ and ‘Histone H3’ reveals a domain–domain interaction between ‘Chromodomain’ (InterPro ID: IPR000953) and ‘Histone H3’ (InterPro ID: IPR000164). Among the PTMs located in the domain of ‘Histone H3’, a previous study has demonstrated that the ‘HP1 chromodomain’ can bind to ‘Histone H3’ methylated at lysine 10 (79). For the other interaction, there is a domain–domain interaction between the ‘WD40 Repeat’ (InterPro ID: IPR001680) and ‘Histone Core’ (InterPro ID: IPR007125). It has been proposed that the structural motif for the specific recognition of ‘Histone H3’ methylated at lysine 5 by the ‘WD40 Repeat’ of ‘WDR5’ is essential to vertebrate development (80,81). This investigation indicates that other PTM sites could also be potential binding sites for protein-interaction domains.
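The inference outlined above can be sketched as a join between protein interactions, domain annotations, domain–domain interactions and PTM sites. The Python sketch below is an illustration under stated assumptions rather than the dbPTM pipeline: the function name and data layout are hypothetical, and while the protein and InterPro identifiers are those of the case study, the domain coordinates are invented.

```python
def ptm_associated_interactions(ppis, domains, ddis, ptms):
    """List interacting domain pairs that carry PTM sites.

    ppis:    list of (protein_a, protein_b)
    domains: {protein: [(domain_id, start, end), ...]}, 1-based inclusive
    ddis:    set of frozenset({domain_id_a, domain_id_b})
    ptms:    {protein: [(position, ptm_type), ...]}
    """
    results = []
    for a, b in ppis:
        for dom_a, start_a, end_a in domains.get(a, []):
            for dom_b, start_b, end_b in domains.get(b, []):
                if frozenset((dom_a, dom_b)) not in ddis:
                    continue
                sites_a = [(p, t) for p, t in ptms.get(a, [])
                           if start_a <= p <= end_a]
                sites_b = [(p, t) for p, t in ptms.get(b, [])
                           if start_b <= p <= end_b]
                if sites_a or sites_b:
                    results.append((a, dom_a, sites_a, b, dom_b, sites_b))
    return results


# Identifiers follow the Histone H3 case study; coordinates are invented.
ppis = [("CBX5_HUMAN", "H31_HUMAN")]
domains = {
    "CBX5_HUMAN": [("IPR000953", 20, 80)],
    "H31_HUMAN": [("IPR000164", 1, 136)],
}
ddis = {frozenset(("IPR000953", "IPR000164"))}
ptms = {"H31_HUMAN": [(10, "methylation")]}
hits = ptm_associated_interactions(ppis, domains, ddis, ptms)
```

Each hit is only a candidate: a PTM site lying inside an interacting domain suggests, but does not establish, that the modification mediates the interaction.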

Figure 3.

A case study of domain–domain interactions and PTM-associated protein interactions on Histone H3 (UniProtKB ID: H31_HUMAN).

Investigation of PTM sites on TM proteins

According to the data statistics of PTM sites and TM proteins in dbPTM, a total of 9644 and 68 775 PTM substrate sites are located on the 2088 experimental and 33 747 potential TM proteins, respectively. To investigate the structural distribution of PTM sites on TM proteins, the structural topology of a TM protein is categorized into four types of region: extracellular, cytoplasmic, TM and unknown. Supplementary Table S6 provides the structural distribution of PTMs containing >10 substrate sites on experimental TM proteins. Interestingly, when substrate sites located in unknown regions are excluded, all of the N-linked (GlcNAc …) glycosylation sites are located in the extracellular region, as are the O-linked and C-linked glycosylation sites. This distribution is consistent with the biological role of glycosylation on TM proteins in receptor targeting and cell–cell interactions (47). In contrast, the phosphorylation sites are mainly located in cytoplasmic regions, where they induce signal transduction and ion transport. The structural distribution of PTM sites could therefore be a means to infer the potential roles of PTMs on TM proteins. Indeed, previous work has demonstrated that the incorporation of membrane topology could improve the performance of predicting O-linked glycosylation sites on TM proteins (82). Supplementary Figure S6 shows a graphical visualization of the PTMs and membrane topology of the human Beta-2 adrenergic receptor (ADRB2). Furthermore, two modification sites, Tyr141 (pTyr) and Cys341 (S-palmitoyl cysteine), are highlighted in red on the tertiary structure (PDB ID: 2R4R) using the Jmol viewer, which indicates their solvent accessibility and the distance between them.
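The four-way classification of sites by membrane topology can be sketched in Python as follows. This is a minimal illustration with hypothetical function names; the topology segments and PTM positions are invented and do not describe ADRB2 or any other real protein.

```python
from collections import Counter


def region_of(site, topology):
    """Return the topological region containing a 1-based site position.

    topology: list of (region, start, end), 1-based and inclusive, where
    region is 'extracellular', 'cytoplasmic' or 'TM'. Sites not covered
    by any annotated segment are reported as 'unknown'.
    """
    for region, start, end in topology:
        if start <= site <= end:
            return region
    return "unknown"


def region_distribution(ptm_sites, topology):
    """Count sites per region for each PTM type.

    ptm_sites: list of (ptm_type, position)
    """
    dist = {}
    for ptm_type, pos in ptm_sites:
        dist.setdefault(ptm_type, Counter())[region_of(pos, topology)] += 1
    return dist


# Invented single-pass topology and PTM sites for a hypothetical TM protein.
topology = [("extracellular", 1, 30), ("TM", 31, 55), ("cytoplasmic", 56, 100)]
ptm_sites = [
    ("N-linked glycosylation", 6),
    ("N-linked glycosylation", 15),
    ("Phosphorylation", 70),
    ("Phosphorylation", 140),
]
dist = region_distribution(ptm_sites, topology)
```

Summing such per-protein counters over all TM proteins would give a table of the same shape as the distribution described above, with the 'unknown' column kept separate so that it can be excluded from comparisons.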

CONCLUSION

The expansion of the dbPTM database increases its usefulness for researchers investigating the impact of PTMs on protein function and cellular processes. Additionally, the enhanced web interface enables both wet-lab biologists and bioinformatics researchers to efficiently explore further information about protein PTMs. Table 2 summarizes the advancements and new features supported in dbPTM 3.0. In the future, we expect dbPTM to continue to grow with the increasing availability of data in resources such as Phospho.ELM, PhosphoSitePlus and UniProtKB. One area in which we envision dbPTM improving greatly in future work is the implementation of a more accurate method for the discovery of PTM substrate motifs. Also, enhancements to the text mining algorithm will enable the system to select MS/MS peptides from research articles associated with protein modifications with higher confidence. To provide more adequate information on PTM function, descriptions of the biological function of PTMs will be extracted from research articles using an information retrieval system. Moreover, the thermodynamic parameters for proteins (83), PPIs (84) and protein–nucleic acid interactions (85) could be integrated for the investigation of PTM-associated protein stability.

Table 2.

Advances and improvements in this update (dbPTM 3.0)

Features	dbPTM 1.0	dbPTM 2.0	dbPTM 3.0
Protein entry	UniProtKB/Swiss-Prot (release 46)	UniProtKB/Swiss-Prot (release 55)	UniProtKB release 2012-04
Experimental PTM resource	UniProtKB/Swiss-Prot, Phospho.ELM and O-GLYCBASE	UniProtKB/Swiss-Prot, Phospho.ELM, PHOSIDA, HPRD, O-GLYCBASE and UbiProt	UniProtKB/Swiss-Prot, HPRD, SysPTM, Phospho.ELM, PhosphoSitePlus, PHOSIDA, O-GLYCBASE, dbOGAP, dbSNO, UbiProt and PupDB
Literature survey of PTMs	–	–	>5000 modified peptides extracted from ∼800 articles
Literature related to PTMs	–	Yes	Yes (categorized by PTM types)
Computationally predicted PTMs	Phosphorylation, glycosylation and sulfation	20 types of PTM	18 types of PTM
Protein tertiary structure	Protein Data Bank (PDB)	Protein Data Bank (PDB)	Protein Data Bank (PDB)
Structural properties of PTM sites	Amino acid frequency	Amino acid frequency, solvent accessibility and secondary structure	Amino acid frequency, solvent accessibility, secondary structure and intrinsic disorder region
PTM annotation	RESID (373 PTM annotations)	RESID (431 PTM annotations)	RESID (431 PTM annotations)
Kinase family annotation	–	KinBase	KinBase and RegPhos
Protein functional domain	InterPro	InterPro	InterPro and InterProScan
Protein–protein interaction	–	–	DIP, MINT, IntAct, HPRD and STRING
Domain–domain interaction	–	–	InterDom
Functional association of PTM	–	–	PTM-associated domains and PTM-dependent protein interactions
PTM substrate motif	–	WebLogo	WebLogo and MDDLogo
Evolutionary conservation of PTM sites	–	ClustalW	ClustalW and COG
Transmembrane topology	–	–	TMPad, PDBTM, TOPDB and OPM
Graphical visualization	PTM, solvent accessibility, protein variation and protein domain	PTM, solvent accessibility, secondary structure, protein variation, protein domain, tertiary structure, orthologous conservation and sequence logo	PTM, solvent accessibility, secondary structure, protein variation, protein domain, tertiary structure, orthologous conservation, sequence logo, PTM substrate motifs, domain–domain interaction, protein–protein interaction, transmembrane topology and tertiary structure of PTMs

AVAILABILITY

The data content of dbPTM will be regularly maintained and semiannually updated. The resource is now available at http://dbPTM.mbc.nctu.edu.tw/.

FUNDING

National Science Council of the Republic of China financial support, [contract no. 101-2628-E-155-002-MY2, NSC 101-2311-B-009-003-MY3, NSC 100-2627-B-009-002, NSC 101-2911-I-009-101 and NSC 101-2319-B-400-001]. Funding for open access charge: National Science Council of Taiwan.

Conflict of interest statement. None declared.

REFERENCES

1. Mann, Jensen. Proteomic analysis of post-translational modifications. Nat. Biotechnol., 2003 (pg. 255–261)

2. Farriol-Mathis, Garavelli, Boeckmann, Duvaud, Gasteiger, Gateau, Veuthey, Bairoch. Annotation of post-translational modifications in the Swiss-Prot knowledge base. Proteomics, 2004 (pg. 1537–1550)

3. Seo, Lee. Post-translational modifications and their biological functions: proteomic analysis and systematic approaches. J. Biochem. Mol. Biol., 2004

4. Dinkel, Chica, Via, Gould, Jensen, Gibson, Diella. Phospho.ELM: a database of phosphorylation sites—update 2011. Nucleic Acids Res., 2011 (pg. D261–D267)

5. Wurgler-Murphy, King, Kennelly. The Phosphorylation Site Database: a guide to the serine-, threonine-, and/or tyrosine-phosphorylated proteins in prokaryotic organisms. Proteomics, 2004 (pg. 1562–1570)

6. Hornbeck, Kornhauser, Tkachev, Zhang, Skrzypek, Murray, Latham, Sullivan. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res., 2012 (pg. D261–D270)

7. Gnad, Ren, Cox, Olsen, Macek, Oroshi, Mann. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol., 2007 (pg. R250)

8. Heazlewood, Durek, Hummel, Selbig, Weckwerth, Walther, Schulze. PhosPhAt: a database of phosphorylation sites in Arabidopsis thaliana and a plant-specific phosphorylation site predictor. Nucleic Acids Res., 2008 (pg. D1015–D1021)

9. Linding, Jensen, Pasculescu, Olhovsky, Colwill, Bork, Yaffe, Pawson. NetworKIN: a resource for exploring cellular phosphorylation networks. Nucleic Acids Res., 2008 (pg. D695–D699)

10. Lee, Bo-Kai Hsu, Chang, Huang. RegPhos: a system to explore the protein kinase-substrate phosphorylation network in humans. Nucleic Acids Res., 2011 (pg. D777–D787)

11. Gupta, Birch, Rapacki, Brunak, Hansen. O-GLYCBASE version 4.0: a revised database of O-glycosylated proteins. Nucleic Acids Res., 1999 (pg. 370–372)

12. Wang, Torii, Liu, Hart. dbOGAP—an integrated bioinformatics resource for protein O-GlcNAcylation. BMC Bioinformatics, 2011

13. Chernorudskiy, Garcia, Eremin, Shorina, Kondratieva, Gainullin. UbiProt: a database of ubiquitylated proteins. BMC Bioinformatics, 2007 (pg. 126)

14. Tung. PupDB: a database of pupylated proteins. BMC Bioinformatics, 2012

15. Lee, Chen, Ching, Teng, Huang. dbSNO: a database of cysteine S-nitrosylation. Bioinformatics, 2012 (pg. 2293–2295)

16. Apweiler, Bairoch, Barker, Boeckmann, Ferro, Gasteiger, Huang, Lopez, Magrane, et al. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res., 2004 (pg. D115–D119)

17. Xing, Ding, Wang, Xie, Zeng. SysPTM: a systematic resource for proteomic research on post-translational modifications. Mol. Cell Proteomics, 2009 (pg. 1839–1849)

18. Keshava Prasad, Goel, Kandasamy, Keerthikumar, Kumar, Mathivanan, Telikicherla, Raju, Shafreen, Venugopal, et al. Human Protein Reference Database—2009 update. Nucleic Acids Res., 2009 (pg. D767–D772)

19. Lee, Huang, Hung, Huang, Yang, Wang. dbPTM: an information repository of protein post-translational modification. Nucleic Acids Res., 2006 (pg. D622–D627)

20. Huang, Lee, Tzeng, Horng, Tsou, Huang. Incorporating hidden Markov models for identifying protein kinase-specific phosphorylation sites. J. Comput. Chem., 2005 (pg. 1032–1041)

21. Huang, Lee, Tzeng, Horng. KinasePhos: a web tool for identifying protein kinase-specific phosphorylation sites. Nucleic Acids Res., 2005 (pg. W226–W229)

22. Wong, Lee, Liang, Huang, Wang, Yang, Chu, Huang, Hwang. KinasePhos 2.0: a web server for identifying protein kinase-specific phosphorylation sites based on sequences and coupling patterns. Nucleic Acids Res., 2007 (pg. W588–W594)

23. Lee, Hsu, Chang, Wang, Hsu, Huang. A comprehensive resource for integrating and displaying protein post-translational modifications. BMC Res. Notes, 2009 (pg. 111)

24. Lee, Lin, Hsieh, Bretana. Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences. Bioinformatics, 2011 (pg. 1780–1787)

25. Seet, Dikic, Zhou, Pawson. Reading protein modifications with interaction domains. Nat. Rev. Mol. Cell Biol., 2006 (pg. 473–483)

26. Gnad, Gunawardena, Mann. PHOSIDA 2011: the posttranslational modification database. Nucleic Acids Res., 2011 (pg. D253–D260)

27. Mishra, Suresh, Kumaran, Kannabiran, Suresh, Bala, Shivakumar, Anuradha, Reddy, Raghavan, et al. Human protein reference database—2006 update. Nucleic Acids Res., 2006 (pg. D411–D414)

28. Garavelli. The RESID Database of Protein Modifications as a resource and annotation tool. Proteomics, 2004 (pg. 1527–1533)

29. Lee, Bretana. PlantPhos: using maximal dependence decomposition to identify plant phosphorylation sites with substrate site specificity. BMC Bioinformatics, 2011 (pg. 261)

30. Bretana, Chiang, Huang, Lee, Weng. Identifying protein phosphorylation sites with kinase substrate specificity on human viruses. PLoS One, 2012 (pg. e40694)

31. Lee, Chen, Huang. SNOSite: exploiting maximal dependence decomposition to identify cysteine S-nitrosylation with substrate site specificity. PLoS One, 2011 (pg. e21849)

32. Bradshaw, Waksman. Molecular recognition by SH2 domains. Adv. Protein Chem., 2002 (pg. 161–210)

33. Verkhivker, Bouzida, Gehlhaar, Rejto, Schaffer, Arthurs, Colson, Freer, Larson, Luty, et al. Hierarchy of simulation models in predicting molecular recognition mechanisms from the binding energy landscapes: structural analysis of the peptide complexes with SH2 domains. Proteins, 2001 (pg. 456–470)

34. Hunter, Apweiler, Attwood, Bairoch, Bateman, Binns, Bork, Das, Daugherty, Duquenne, et al. InterPro: the integrative protein signature database. Nucleic Acids Res., 2009 (pg. D211–D215)

35. Bairoch. PROSITE: a dictionary of sites and patterns in proteins. Nucleic Acids Res., 1991, Suppl. (pg. 2241–2245)

36. Attwood, Beck, Bleasby, Parry-Smith. PRINTS—a database of protein motif fingerprints. Nucleic Acids Res., 1994 (pg. 3590–3596)

37. Sonnhammer, Eddy, Durbin. Pfam: a comprehensive database of protein domain families based on seed alignments. Proteins, 1997 (pg. 405–420)

38. Corpet, Gouzy, Kahn. The ProDom database of protein domain families. Nucleic Acids Res., 1998 (pg. 323–326)

39. Xenarios, Salwinski, Duan, Higney, Kim, Eisenberg. DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res., 2002 (pg. 303–305)

40. Chatr-Aryamontri, Ceol, Palazzi, Nardelli, Schneider, Castagnoli, Cesareni. MINT: the Molecular INTeraction database. Nucleic Acids Res., 2006 (pg. D572–D574)

41. Kerrien, Alam-Faruque, Aranda, Bancarz, Bridge, Derow, Dimmer, Feuermann, Friedrichsen, Huntley, et al. IntAct—open source resource for molecular interaction data. Nucleic Acids Res., 2007 (pg. D561–D565)

42. von Mering, Jensen, Kuhn, Chaffron, Doerks, Kruger, Snel, Bork. STRING 7—recent developments in the integration and prediction of protein interactions. Nucleic Acids Res., 2007 (pg. D358–D362)

43. Zhang, Tan, Lin. InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes. Nucleic Acids Res., 2003 (pg. 251–254)

44. Vinothkumar, Henderson. Structures of membrane proteins. Q. Rev. Biophys., 2010 (pg. 158)

45. Wallin, von Heijne. Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci., 1998 (pg. 1029–1038)

46. Rose, Beran, Bluhm, Dimitropoulos, Goodsell, Prlic, Quesada, Quinn, Westbrook, et al. The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Res., 2011 (pg. D392–D401)

47. Ackers, Smith. Effects of site-specific amino acid modification on protein interactions and biological function. Annu. Rev. Biochem., 1985 (pg. 597–629)

48. Cheng, Chiu, Sung, Hsu. TMPad: an integrated structural database for helix-packing folds in transmembrane proteins. Nucleic Acids Res., 2011 (pg. D347–D355)

49. Tusnady, Kalmar, Simon. TOPDB: topology data bank of transmembrane proteins. Nucleic Acids Res., 2008 (pg. D234–D239)

50. Tusnady, Dosztanyi, Simon. PDB_TM: selection and membrane localization of transmembrane proteins in the protein data bank. Nucleic Acids Res., 2005 (pg. D275–D278)

51. Lomize, Pogozheva, Mosberg. OPM: orientations of proteins in membranes database. Bioinformatics, 2006 (pg. 623–625)

52. Bairoch, Apweiler, Barker, Boeckmann, Ferro, Gasteiger, Huang, Lopez, Magrane, et al. The Universal Protein Resource (UniProt). Nucleic Acids Res., 2005 (pg. D154–D159)

53. Nugent

Jones

Transmembrane protein topology prediction using support vector machines

BMC Bioinformatics

2009

, vol.

pg.

159

Herraez

Biomolecules in the computer: Jmol to the rescue

Biochem. Mol. Biol. Educ.

2006

, vol.

(pg.

255

261

)

Consortium

TGO

The Gene Ontology: enhancements for 2011

Nucleic Acids Res.

2011

, vol.

(pg.

D559

D564

)

Tatusov

Fedorova

Jackson

Jacobs

Kiryutin

Koonin

Krylov

Mazumder

Mekhedov

Nikolskaya

, et al.

The COG database: an updated version includes eukaryotes

BMC Bioinformatics

2003

, vol.

pg.

Binns

Dimmer

Huntley

Barrell

O’Donovan

Apweiler

QuickGO: a web-based tool for Gene Ontology searching

Bioinformatics

2009

, vol.

(pg.

3045

3046

)

Kabsch

Sander

Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features

Biopolymers

1983

, vol.

(pg.

2577

2637

)

Shien

Lee

Chang

Hsu

Horng

Hsu

Wang

Huang

Incorporating structural characteristics for identification of protein methylation sites

J. Comput. Chem.

2009

, vol.

(pg.

1532

1543

)

Chen

Bretana

Cheng

Lee

Carboxylator: incorporating solvent-accessible surface area for identifying protein carboxylation sites

J. Comput. Aided Mol. Des.

, vol.

(pg.

987

995

)

Lee

Hsu

Lin

Chang

Hsu

Huang

N-Ace: using solvent accessibility and physicochemical properties to identify protein N-acetylation sites

J. Comput. Chem.

, vol.

(pg.

2759

2771

)

Ahmad

Gromiha

Sarai

RVP-net: online prediction of real valued accessible surface area of proteins from single sequences

Bioinformatics

2003

, vol.

(pg.

1849

1851

)

McGuffin

Bryson

Jones

The PSIPRED protein structure prediction server

Bioinformatics

2000

, vol.

(pg.

404

405

)

Thompson

Higgins

Gibson

CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice

Nucleic Acids Res.

1994

, vol.

(pg.

4673

4680

)

Altschul

Madden

Schaffer

Zhang

Miller

Lipman

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs

Nucleic Acids Res.

1997

, vol.

(pg.

3389

3402

)

Lee

Hsu

Lin

Chang

Hsu

Huang

N-Ace: using solvent accessibility and physicochemical properties to identify protein N-acetylation sites

J. Comput. Chem.

2010

, vol.

(pg.

2759

2771

)

Crooks

Hon

Chandonia

Brenner

WebLogo: a sequence logo generator

Genome Res.

2004

, vol.

(pg.

1188

1190

)

Hao

Derakhshan

Shi

Campagne

Gross

SNOSID, a proteomic method for identification of cysteine S-nitrosylation sites in complex protein mixtures

Proc. Natl Acad. Sci. USA

2006

, vol.

103

(pg.

1012

1017

)

Greco

Hodara

Parastatidis

Heijnen

Dennehy

Liebler

Ischiropoulos

Identification of S-nitrosylation motifs by site-specific mapping of the S-nitrosocysteine proteome in human vascular smooth muscle cells

Proc. Natl Acad. Sci. USA

2006

, vol.

103

(pg.

7420

7425

)

Lane

Hao

Gross

S-nitrosylation is emerging as a specific and fundamental posttranslational protein modification: head-to-head comparison with O-phosphorylation

Sci STKE

2001

, vol.

2001

pg.

re1

OpenURL Placeholder Text

Stamler

Toone

Lipton

Sucher

(S)NO signals: translocation, regulation, and a consensus motif

Neuron

1997

, vol.

(pg.

691

696

)

Greco

Hodara

Parastatidis

Heijnen

Dennehy

Liebler

Ischiropoulos

Identification of S-nitrosylation motifs by site-specific mapping of the S-nitrosocysteine proteome in human vascular smooth muscle cells

Proc. Natl Acad. Sci. USA

2006

, vol.

103

(pg.

7420

7425

)