N-linked protein glycosylation is a common post-translational modification (PTM) involved in many cellular processes. Atwood et al. (RCMS 2005) describe a tandem mass spectrometry-based methodology to analyze N-linked glycopeptides.
Enriched glycopeptides are treated with peptide N-glycosidase F, which removes the carbohydrate moieties from the peptide backbone. The deglycosylated peptides are then analyzed by tandem mass spectrometry. The resulting MS/MS spectra are searched against a modified protein sequence database that allows the PTM only on N’s within the consensus sequence N-x-y, where x is any residue other than proline, and y is either serine or threonine.
To analyze this PTM on the deglycosylated peptides on SORCERER, we need to search for a monoisotopic mass shift of 0.9840 Da on N’s only in the {N[^P][ST]} consensus sequence.
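As an illustration of what the {N[^P][ST]} pattern selects, here is a minimal Python sketch. The function name and the example sequence are made up for illustration and are not part of SORCERER. A lookahead is used so that overlapping sites (such as the two N’s in "NNST") are both reported.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead consumes only the N, so overlapping sites are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")

def sequon_positions(sequence):
    """Return 0-based indices of every N that starts an N-x-[S/T] site (x != P)."""
    return [m.start() for m in SEQUON.finditer(sequence)]

# Hypothetical sequence: N at index 6 is followed by P, so it is skipped.
print(sequon_positions("MKNGSANPTQNNST"))  # prints [2, 10, 11]
```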
To search for this PTM on SORCERER, we do the following 3 steps:
1) Create a new protein sequence database that replaces ‘N’ with ‘J’ in the consensus sequence.
2) Prepare this new sequence database for searching by defining ‘J’ to have the same mass as ‘N’ using a static modification setting on ‘J’.
3) Submit a search on SORCERER with a variable modification search on ‘J’ with a mass shift of +0.9840 Da.
Create New Protein Database
Use the MUSE script ‘nlinkglyco-fasta.mu’ (part of SORCERER PE v3.5) to create a new protein sequence database that replaces each N in the consensus sequence with J.
Log onto SORCERER, go to the directory ‘/home/sorcerer/fasta/’ where the protein sequence files are stored, and create a new fasta file from an existing one (for example, create ‘ipi.human_n2j.fasta’ from ‘ipi.HUMAN.fasta’). Then prepare this new fasta file for searching as you would any other protein sequence file.
Once you are logged onto SORCERER, type the following 2 commands (do not type the ‘sorc$’, which is the SORCERER prompt):
sorc$ cd /home/sorcerer/fasta/
sorc$ nlinkglyco-fasta.mu < ipi.HUMAN.fasta > ipi.human_n2j.fasta
The latter command literally means to run the MUSE script using “standard input” from file ipi.HUMAN.fasta (after the ‘<’ symbol) and sending the “standard output” to the new file ipi.human_n2j.fasta (after the ‘>’ symbol).
(The script may be easily copied and modified for another consensus sequence. Contact TechTeam for details.)
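For readers who want to see the kind of transformation involved, the following is a rough Python sketch of an N-to-J rewrite of a FASTA file. It is not the MUSE script ‘nlinkglyco-fasta.mu’ and may differ from it in details. The function names, the 60-column line width, and the demo record are assumptions made for illustration. Sequence lines are joined per record before substitution so that a consensus site split across a line break is still found, and header lines are left untouched.

```python
import re

# Same consensus as in the text: N, then any residue except P, then S or T.
SEQUON = re.compile(r"N(?=[^P][ST])")

def n_to_j(sequence):
    """Replace each consensus-site N with J; other N's are left alone."""
    return SEQUON.sub("J", sequence)

def _emit(header, chunks, width):
    """Yield one record: the header as-is, then the rewritten sequence, re-wrapped."""
    if header is not None:
        yield header
    seq = n_to_j("".join(chunks))
    for i in range(0, len(seq), width):
        yield seq[i:i + width]

def convert_fasta(lines, width=60):
    """Yield FASTA lines with consensus-site N's recoded as J (width is an assumption)."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            yield from _emit(header, chunks, width)
            header, chunks = line, []
        elif line.strip():
            chunks.append(line.strip())
    yield from _emit(header, chunks, width)

# Hypothetical record whose sequence is wrapped across two lines.
demo = [">TEST|demo protein N-terminal", "MKNGSANPTQ", "NNSTK"]
for out in convert_fasta(demo):
    print(out)
```

To process a real file, the same generator could be fed an open file handle (or standard input) instead of the demo list, with the output lines written to the new fasta file.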
Prepare Database for Searching
When preparing the new protein sequence database for searching, assign a static modification ‘MakeN’ of -9885.95707256 Da to ‘J’. This brings the final ‘J’ mass to 114.04292744 Da, the monoisotopic mass of N. (The normally unused codes ‘J’ and ‘U’ are set at 10,000 Da to flag any inadvertent usage.) The resulting peptide database is used for subsequent searching.
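The arithmetic behind the ‘MakeN’ value can be checked with a short Python sketch. The atomic masses below are standard monoisotopic values rounded to eight decimals, and the variable names are illustrative only. The asparagine residue composition C4H6N2O2 is the usual in-chain formula.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 8 decimals).
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}

# Asparagine residue within a peptide chain: C4 H6 N2 O2.
asn_residue = 4 * MONO["C"] + 6 * MONO["H"] + 2 * MONO["N"] + 2 * MONO["O"]

PLACEHOLDER_J = 10000.0      # default mass of the unused code 'J' (from the text)
MAKE_N = -9885.95707256      # static modification 'MakeN' (from the text)

final_j = PLACEHOLDER_J + MAKE_N

# The two values should agree to well within typical search tolerances.
print(f"final J mass: {final_j:.8f} Da")
print(f"Asn residue:  {asn_residue:.8f} Da")
```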
SORCERER Search
The search can now be submitted against the new peptide database by creating a user-defined variable modification ‘Nlinkglyco’ with a mass of +0.9840 Da on the residue ‘J’ (the recoded consensus-sequence N).
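The 0.9840 Da value can be sanity-checked with a small Python sketch. It assumes the shift reflects conversion of the formerly glycosylated asparagine to aspartic acid, the usual outcome of PNGase F treatment. The helper function and example peptide are hypothetical and only illustrate what "variable" means here: each J in a peptide may or may not carry the shift.

```python
# Monoisotopic atomic masses in Da (standard values, rounded to 8 decimals).
MONO = {"H": 1.00782503, "N": 14.00307400, "O": 15.99491462}

# Asn residue (C4H6N2O2) -> Asp residue (C4H5NO3): gain one O, lose one N and one H.
delta = MONO["O"] - MONO["N"] - MONO["H"]

def possible_shifts(peptide, shift=0.9840, residue="J"):
    """Total shifts a variable modification can contribute: 0 to k sites modified."""
    k = peptide.count(residue)
    return [round(i * shift, 4) for i in range(k + 1)]

print(f"Asn -> Asp shift: {delta:.4f} Da")
# Hypothetical peptide with three recoded sites.
print(possible_shifts("MKJGSANPTQJJSTK"))
```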
We thank Dr. Rebekah Gundry from the Van Eyk Lab at Johns Hopkins for bringing this SORCERER application to our attention!
Reference: Atwood et al (Rapid Comm Mass Spec 2005; 19: 3002-3006 DOI: 10.1002/rcm.2162)