What's New in Bottom-Up Proteomics?
Gary Kobs, Strategic Portfolio Manager, Proteomics
September 2019; tpub_217
Introduction
Bottom-up proteomics, also termed “shotgun proteomics,” consists of three steps: protein extraction, protein digestion and protein analysis.
Protein extraction is a crucial step for MS-based proteomics experiments. The extraction procedure determines the population of proteins that will be analyzed and the environment in which the sample is digested. Enzymatic digestion precedes MS analysis.
Trypsin is the most widely used enzyme for protein digestion. Trypsin cleaves amino acid sequences after lysine and arginine, generating smaller peptide fragments. The size of such peptides is appropriate for ionization and further detection by MS.
Protocols for use in protein extraction and digestion are continually being improved. Here we summarize seven recent references for insights into the latest in proteomics techniques.
Quantitation of Histone Post-Translational Modifications
Guo, Q. et al. (2017) Assessment of quantification precision of histone post-translational modifications by using an ion trap and down to 50,000 cells as starting material. J. Proteome Res. 17, 234–42.
Summary: DNA is organized by protein:DNA complexes called nucleosomes in eukaryotes. Nucleosomes are composed of 147 base pairs of DNA wrapped around a histone octamer containing two copies of each core histone protein. Histone proteins play significant roles in many nuclear processes including transcription, DNA damage repair and heterochromatin formation. Histone proteins are extensively and dynamically post-translationally modified, and these post-translational modifications (PTMs) are thought to comprise a specific combinatorial PTM profile of a histone that dictates its specific function. Abnormal regulation of PTMs may lead to developmental disorders and disease development such as cancer.
Antibodies have been widely used to characterize histones and histone PTMs. However, antibody-based techniques have several limitations. Mass spectrometry (MS) has therefore emerged as the most suitable analytical tool to quantify proteomes and protein PTMs. The most commonly used strategy is still bottom-up MS, and the most widely adopted protocol includes derivatization of lysine residues in histones to allow trypsin to generate Arg-C-like peptides (4–20 amino acids). However, samples such as primary tissues, complex model systems and biofluids are hard to retrieve in large quantities. Because of this, it is critical to know whether the amount of sample available would lead to an exhaustive analysis if subjected to MS.
This publication examined reproducibility in quantification of histone PTMs using a wide range of starting materials, from 50,000 to 5,000,000 cells. They used four different cell lines: HeLa, 293T, human embryonic stem cells (hESCs) and myoblasts. Their results demonstrated that an accurate quantification of abundant histone PTMs can be efficiently obtained by using low-resolution MS and as few as 50,000 cells. Low-abundance histone marks showed more variability in quantification when comparing different amounts of starting material, so a larger amount of starting material (at least 500,000 cells) is recommended.
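Quantification precision of this kind is commonly summarized as a coefficient of variation (CV) across measurements. The following is a minimal Python sketch of that calculation. The mark names and abundance values are hypothetical and are not taken from Guo et al., whose own statistical treatment may differ.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) as a percentage."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; CV is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical relative abundances of two histone marks, each measured
# from several different amounts of starting material.
# Illustrative numbers only, not data from the publication.
measurements = {
    "abundant_mark": [0.40, 0.42, 0.41, 0.39],
    "low_abundance_mark": [0.004, 0.007, 0.003, 0.006],
}

cv_by_mark = {
    name: coefficient_of_variation(vals) for name, vals in measurements.items()
}

for name, cv in cv_by_mark.items():
    print(f"{name}: CV = {cv:.1f}%")
```

With these made-up values the low-abundance mark shows the larger CV, which is the pattern the summary describes for low-abundance histone marks.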
Affinity Purification for Protein:Protein Interactions
Zhang, Y. et al. (2017) Quantitative assessment of the effects of trypsin digestion methods on affinity purification−mass spectrometry-based protein−protein interaction analysis. J. Proteome Res. 16, 3068–82.
Summary: Protein:protein interactions (PPIs) play a key role in regulating cellular activities including DNA replication, transcription, translation, RNA splicing, protein secretion, cell cycle control and signal transduction. A comprehensive method is needed to identify PPIs before the significance of their interactions can be characterized. Affinity purification−mass spectrometry (AP−MS) has become the method of choice for discovering PPIs under native conditions to preserve PPIs. Using this method, the protein complexes are captured by antibodies specific for the bait proteins or for tags that were introduced on the bait proteins and pulled down onto immobilized protein A/G beads. The complexes are further digested into peptides with trypsin. The protein interactions of the bait proteins are identified by quantification of the tryptic peptides via mass spectrometry.
The success of AP-MS depends on the efficiency of trypsin digestion and the recovery of the tryptic peptides for MS analysis. Several different protocols have been used for trypsin digestion of protein complexes in AP-MS studies, but no systematic studies have been conducted on the impact of trypsin digestion conditions on the identification of PPIs.
Here, Zhang et al. used NFκB/RelA and BRD4 as bait proteins and five different trypsin digestion conditions (two using “on beads” and three using “elution” digestion protocols). Although the performance of the trypsin digestion protocols changed slightly depending on the different bait proteins, antibodies and cell lines used, the authors of the paper found that elution digestion methods consistently outperformed on-beads digestion methods.
Optimized Protocols for Alternative Proteases
Giansanti, P. et al. (2016) Six alternative proteases for mass spectrometry based proteomics beyond trypsin. Nat. Protocols 11, 993–1006.
Summary: Bottom-up proteomics focuses on the analysis of protein mixtures after enzymatic digestion of the proteins into peptides. The resulting complex mixture of peptides is analyzed by reverse-phase liquid chromatography (RPLC) coupled to tandem mass spectrometry (MS/MS). Identification of peptides and subsequently proteins is completed by matching peptide fragment ion spectra to theoretical spectra generated from protein databases.
Trypsin has become the gold standard for protein digestion to peptides for shotgun proteomics. Trypsin is a serine protease. It cleaves proteins into peptides with an average size of 700–1,500 daltons, which is in the ideal range for MS. It is highly specific, cutting at the carboxyl side of arginine and lysine residues. The C-terminal arginine and lysine peptides are charged, making them detectable by MS. Trypsin is highly active and tolerant of many additives.
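The cleavage rule and mass range described above can be illustrated with a short in silico digest. This is a simplified Python sketch, not a substitute for a search engine's digestion model: it cleaves after every lysine and arginine, ignores missed cleavages and the commonly applied proline exception, and uses standard monoisotopic residue masses. The example sequence is arbitrary and chosen only for illustration.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave on the carboxyl side of every K or R (simplified rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def monoisotopic_mass(peptide):
    """Sum the residue masses and add one water."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


# Arbitrary example sequence (illustration only).
example = "MKWVTFISLLLLFSSAYSRGVFRRDTHK"
for peptide in tryptic_digest(example):
    mass = monoisotopic_mass(peptide)
    in_range = 700.0 <= mass <= 1500.0
    print(f"{peptide:<20} {mass:10.4f} Da  in 700-1,500 Da range: {in_range}")
```

Running the sketch on a real protein sequence gives a quick feel for how many tryptic peptides fall inside or outside the mass window quoted above.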
Even with these technical features, the use of trypsin in bottom-up proteomics may impose certain limits in the ability to grasp the full proteome. Tightly folded proteins can resist trypsin digestion. Post-translational modifications (PTMs) present a different challenge for trypsin because glycans often limit trypsin access to cleavage sites, and acetylation makes lysine and arginine residues resistant to trypsin digestion.
To overcome these problems, the proteomics community has begun to explore alternative proteases to complement trypsin. However, protocols, as well as expected results generated when using these alternative proteases, have not been systematically documented.
This publication presents optimized protocols for six alternative proteases that have already shown promise in proteomics: chymotrypsin, Lys-C, Lys-N, Asp-N, Glu-C and Arg-C. It also describes the appropriate MS data analysis methods and the anticipated results for the analysis of a single protein (BSA) and a more complex cellular lysate (E. coli).
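To show how these proteases differ in the peptides they produce, the following Python sketch applies simplified, textbook cleavage rules to one sequence. The rule table is an assumption made for illustration: it is not taken from the Giansanti et al. protocols, and it omits secondary specificities (for example, chymotrypsin cleaving after leucine at lower efficiency, or Glu-C cleaving after aspartate under some buffer conditions).

```python
# Simplified cleavage rules: (target residues, side).
# "C" means cleavage after the residue, "N" means cleavage before it.
# Illustrative assumptions; secondary specificities are omitted.
PROTEASE_RULES = {
    "trypsin": ("KR", "C"),
    "chymotrypsin": ("FWY", "C"),
    "Lys-C": ("K", "C"),
    "Lys-N": ("K", "N"),
    "Asp-N": ("D", "N"),
    "Glu-C": ("E", "C"),
    "Arg-C": ("R", "C"),
}


def digest(sequence, protease):
    """Return the peptides from a complete digest under the simplified rule."""
    residues, side = PROTEASE_RULES[protease]
    cut_points = []
    for i, residue in enumerate(sequence):
        if residue in residues:
            cut = i + 1 if side == "C" else i
            # Ignore cuts at the very start or end of the sequence.
            if 0 < cut < len(sequence):
                cut_points.append(cut)
    bounds = [0] + cut_points + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


# Hypothetical short sequence used only to compare the rules.
example = "AKDEFR"
for name in PROTEASE_RULES:
    print(f"{name:<13} {digest(example, name)}")
```

Because each protease cuts at different residues, the same protein yields different and partly overlapping peptide sets, which is why alternative proteases can complement trypsin.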
Filter-Aided Sample Prep Prior to Mass Spec
Nel, A. et al. (2015) Comparative reevaluation of FASP and enhanced FASP methods by LC-MS/MS. J. Proteome Res. 14, 1637–42.
Summary: The filter-aided sample preparation (FASP) method is used for the on-filter digestion of proteins prior to mass-spectrometry-based analyses. FASP was designed for the removal of detergents and chaotropes that were used for sample preparation. In addition, FASP removes components such as salts, nucleic acids and lipids. Alkylation of reduced cysteine residues is also carried out on filter, after which the protein is digested with trypsin on filter in the optimal enzyme buffer. Subsequent elution and desalting of the peptide-rich solution then provides a sample ready for LC–MS/MS analysis.
eFASP is an enhanced workflow that includes 0.2% deoxycholic acid (DCA) in the exchange, alkylation and digestion buffers, thus enhancing trypsin proteolysis and resulting in increased cytosolic and membrane protein representation. DCA has been reported to improve the efficiency of the denaturation, solubilization and tryptic digestion of proteins, particularly proteolytically resistant myoglobin and integral membrane proteins, thereby increasing the number of proteins and unique peptides identified.
Traditional FASP and eFASP were re-evaluated by ultra-high-performance liquid chromatography coupled to a quadrupole mass filter Orbitrap analyzer. The results indicate that, at the protein level, both methods extracted essentially the same number of hydrophobic transmembrane proteins as proteins associated with the cytoplasm or the cytoplasmic and outer membranes.
The LC–MS/MS results indicate that FASP and eFASP showed no significant differences at the protein level. However, because of the slight differences in selectivity at the physicochemical level of peptides, these methods can be seen to be complementary for analyses of complex peptide mixtures.
Using IMAC for Phosphopeptide Enrichment
Kanshin, E. et al. (2015) Sample collection method bias effects in quantitative phosphoproteomics. J. Proteome Res. 14, 2998–3004.
Summary: Protein phosphorylation is a very important post-translational modification that controls many cellular processes, including metabolism, transcriptional and translation regulation, degradation of proteins, cellular signaling and communication, proliferation, differentiation and cell survival. Approximately 35% of human proteins are phosphorylated. Phosphoproteins are low in abundance and are therefore challenging to detect and characterize by mass spectrometry.
Different enrichment systems have been developed to isolate phosphopeptides. Among these techniques, immobilized metal affinity chromatography (IMAC) using Fe3+ and Ga3+ has been widely used for the enrichment of phosphopeptides. Typical experimental workflows are tedious and consist of numerous steps, including sample collection and cell lysis. One of the major challenges of the process is to maintain the in vivo phosphorylation state of the proteins throughout the preparation process.
To evaluate the effect of sample collection protocols on the global phosphorylation status of the cell, this publication compared different sample workflows by metabolic labeling and quantitative mass spectrometry on Saccharomyces cerevisiae cell cultures. Three different sample collection workflows were evaluated: two workflows used denaturing conditions and involved mixing of cell cultures with an excess of either ethanol (EtOH) at −80°C or trichloroacetic acid (TCA). A third workflow was performed under nondenaturing conditions and included washing cells in PBS.
Their data suggest that either TCA or EtOH sample collection protocols introduced lower collection bias than the PBS protocol. It was also suggested that similar studies be carried out to determine the effects of sample preparation on other post-translational modifications, such as acetylation or ubiquitination.
Evaluating CSF for Potential Brain Biomarkers
Galindo, M-N. et al. (2015) Proteomics of Cerebrospinal Fluid: Throughput and Robustness Using a Scalable Automated Analysis Pipeline for Biomarker Discovery. Anal. Chem. 87, 10755–61.
Summary: Cerebrospinal fluid (CSF) is a bodily fluid present around the brain and in the spinal column. It acts as a protective cushion against shocks and participates in the immune response in the brain. Analysis of total CSF protein can be used for diagnostic purposes, for instance as the sign of a tumor, bleeding, inflammation or injury. Considering the high value of CSF as a source of potential biomarkers for brain-associated damages and pathologies, the development of a robust automated platform for CSF proteomics is of great value.
The scalable automated proteomic pipeline (ASAP2) was initially developed with the purpose of discovering protein biomarkers in plasma. A summary of the ASAP2 process is as follows: As a first step, abundant-protein immuno-affinity depletion is performed with antibody-based columns and LC systems equipped with a refrigerated autosampler and fraction collector. This block is linked to and followed by buffer exchange performed in a 96-well plate format by manual operations that require <1 hour to be completed. The rest of the process is fully automated and includes (i) reduction, alkylation, enzymatic digestion; (ii) tandem mass tag (TMT) labeling and pooling; (iii) RP solid-phase extraction (SPE) purification; and (iv) strong cation-exchange (SCX) SPE purification.
This publication validated the use of ASAP2 for sample preparation and proteomic analysis of human CSF samples. CSF samples were first depleted of abundant proteins by multiplexed immuno-affinity. Subsequently, reduction, alkylation, protein digestion (using Trypsin/Lys-C), TMTsixplex™ labeling, pooling and sample cleanup were performed in a 96-well plate format using a liquid-handling robotic platform. Ninety-six identical CSF samples were prepared using the highly automated ASAP2 procedure. Proteome coverage consistency, quantitative precision and individual protein variability were determined. Results indicated that ASAP2 is efficient in analyzing large numbers of human CSF samples and would be a valuable tool for biomarker discovery.
Discovering Biomarkers in Plasma
Proc, J.L. et al. (2010) A quantitative study of the effects of chaotropic agents, surfactants and solvents on the digestion efficiency of human plasma proteins by trypsin. J. Proteome Res. 9, 5422–37.
Summary: Biomarkers in biological fluids have the potential to indicate the risk of disease or to allow early detection for more effective treatment. Plasma/serum is considered the universal source of biomarkers. It is easily collected and, unlike other fluids such as urine or cerebrospinal fluid, it collects proteins from every tissue. Optimizing the experimental conditions (e.g., the trypsin digestion of target proteins) used to discover or monitor biomarkers in plasma is critical to their successful detection.
In this publication, plasma denaturation/digestion protocols were compared using quantitative methods. Fourteen combinations of heat, solvents (acetonitrile, methanol, trifluoroethanol), chaotropic agents (guanidine hydrochloride, urea) and surfactants (sodium dodecyl sulfate (SDS) and sodium deoxycholate (DOC)) were evaluated for their effectiveness in improving tryptic digestion. Digestion efficiency was monitored by quantitating the peptides from 45 moderate- to high-abundance plasma proteins using tandem mass spectrometry in multiple reaction monitoring mode with a mixture of stable isotope-labeled analogues of these peptides as internal standards. The results showed that use of either DOC or SDS increased the overall yield of tryptic peptides. Since SDS is not compatible with mass spectrometry and DOC can be easily removed by acid precipitation, the overall recommendation was the use of DOC with a nine-hour digestion procedure.
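Quantitation against stable isotope-labeled internal standards generally comes down to a peak-area ratio: the amount of the endogenous (light) peptide is estimated as the light/heavy area ratio multiplied by the known amount of spiked heavy standard. The Python sketch below illustrates that arithmetic and a simple comparison of digestion conditions. The condition names, peak areas and spike amount are hypothetical and are not data from Proc et al.

```python
def endogenous_amount(light_area, heavy_area, heavy_spiked_fmol):
    """Estimate the light peptide amount from the light/heavy peak-area ratio."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area * heavy_spiked_fmol


def relative_yield(amount, reference_amount):
    """Express a peptide yield relative to a reference digestion condition."""
    return amount / reference_amount


# Hypothetical peak areas (light, heavy) for one peptide under three
# digestion conditions. Illustrative numbers only.
HEAVY_SPIKED_FMOL = 50.0
peak_areas = {
    "no_additive": (1000.0, 1000.0),
    "with_DOC": (1800.0, 1000.0),
    "with_urea": (1200.0, 1000.0),
}

amounts = {
    condition: endogenous_amount(light, heavy, HEAVY_SPIKED_FMOL)
    for condition, (light, heavy) in peak_areas.items()
}

reference = amounts["no_additive"]
for condition, amount in amounts.items():
    ratio = relative_yield(amount, reference)
    print(f"{condition:<12} {amount:7.1f} fmol  ({ratio:.2f}x reference)")
```

Because the heavy standard is added at a known amount to every sample, differences in the estimated light amount between conditions reflect differences in peptide yield rather than instrument response.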
TMTsixplex is a trademark of Proteome Sciences plc.