High-throughput expression, purification, and characterization of recombinant Caenorhabditis elegans proteins

https://doi.org/10.1016/S0006-291X(03)01265-8

Abstract

Modern proteomics approaches include techniques to examine the expression, localization, modifications, and complex formation of proteins in cells. To address questions of protein function in vitro using classical biochemical and biophysical approaches, high-throughput methods for cloning the appropriate reading frames and for expressing and purifying proteins efficiently are an important goal. This process becomes more difficult as functional proteomics efforts turn to proteins from higher organisms, because correctly identifying intron–exon boundaries and efficiently expressing and solubilizing the (often) multi-domain proteins of higher eukaryotes are both challenging. Recently, 12,000 open-reading-frame (ORF) sequences from Caenorhabditis elegans have become available for functional proteomics studies [Nat. Genet. 34 (2003) 35]. We have implemented a high-throughput screening procedure to express hexa-histidine-tagged C. elegans ORFs in Escherichia coli, purify them using metal affinity ZipTips, and analyze them by mass spectrometry. We find that over 65% of the expressed proteins are of the correct mass as analyzed by matrix-assisted laser desorption/ionization MS (MALDI-MS). Many of the remaining proteins flagged as “incorrect” can be explained by high-throughput cloning or genome database annotation errors. This provides a general estimate of the error rates to expect in such high-throughput cloning projects. The ZipTip-purified proteins can be further analyzed under both native and denaturing conditions for functional proteomics efforts.

Materials and methods

Expression constructs, cell culture, and lysis. The open reading frame sequences tested in this study were identified as DNA damage response (DDR) pathway interacting proteins in C. elegans, most of them hypothetical and uncharacterized [2]. The 86 selected ORFs, encoding proteins ranging in size from 10 to 110 kDa, were subcloned into GATEWAY (Invitrogen) Entry vectors and subsequently transferred to the Destination vector (pDEST17), which carries an N-terminal 6× histidine tag [9]

Protein expression and solubility screening

The results of expressing 86 putative C. elegans DDR genes using the ZipTip/MALDI-MS method show that ∼62% of the genes were expressed and 5.4% of these were also soluble in the solvent tested (Fig. 2). An expressed gene was assigned as “correct mass” when the MALDI-MS measurement differed by less than 1% from the mass estimated from the sequence prediction (gene plus 4 kDa of extra N- and C-terminal sequence). The accuracy of the MALDI-MS measurements was determined by comparing to the mass
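The “correct mass” criterion described above can be sketched in a few lines of Python. This is only an illustration of the 1% relative-deviation test, not the authors' analysis code: the function names, the example masses, and the value used for the extra N- and C-terminal mass are all assumptions made here for demonstration.

```python
def predicted_construct_mass(orf_mass_da: float, extra_mass_da: float) -> float:
    """Predicted mass of the expressed construct: the ORF-encoded protein
    plus the extra N- and C-terminal sequence added by the vector."""
    return orf_mass_da + extra_mass_da


def is_correct_mass(measured_da: float, predicted_da: float,
                    tolerance: float = 0.01) -> bool:
    """True when the measured mass deviates from the predicted mass
    by less than `tolerance` (a fraction; 0.01 corresponds to 1%)."""
    if predicted_da <= 0:
        raise ValueError("predicted mass must be positive")
    return abs(measured_da - predicted_da) / predicted_da < tolerance


# Hypothetical example values (not taken from the paper's data).
predicted = predicted_construct_mass(orf_mass_da=30000.0, extra_mass_da=4000.0)
print(is_correct_mass(34200.0, predicted))  # deviation of about 0.6%
print(is_correct_mass(34500.0, predicted))  # deviation of about 1.5%
```

Because the tolerance is relative, the absolute window widens with protein size: under this sketch a 10 kDa construct must fall within 100 Da of its prediction, while a 110 kDa construct may deviate by up to 1.1 kDa.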

Discussion

When eukaryotic genes are expressed in prokaryotic organisms such as E. coli, they may not fold properly and may form aggregates (in the form of inclusion bodies) due to the absence of appropriate post-translational chaperones or processing [15]. However, the manipulability, short growth time, and low cost render E. coli the most widely used expression system for recombinant proteins. Therefore, many of the structural genomics projects have focused on searching for genes that express and fold

Acknowledgements

The authors thank Tim Blankenship and Elena Chernokalskaya from Millipore for their technical assistance in developing the ZipTip protein purification and Edward Nieves for his helpful suggestions on MALDI-MS methods. This research is supported in part by The Protein Structure Initiative P50-GM-62529, P41-EB-01979, and R33-CA-83179. The MALDI-MS experiments were performed in The Laboratory for Macromolecular Analysis and Proteomics (LMAP) at the Albert Einstein College of Medicine, which is

References (29)

  • A.C. Gavin et al. Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature (2002)
  • J. Reboul et al. C. elegans ORFeome version 1.1: experimental verification of the genome annotation and resource for proteome-scale protein expression. Nat. Genet. (2003)
  • S.A. Lesley et al. Structural genomics of the Thermotoga maritima proteome implemented in a high-throughput structure determination pipeline. Proc. Natl. Acad. Sci. USA (2002)
  • J.R. Hudson et al. The complete set of predicted genes from Saccharomyces cerevisiae in a readily usable form. Genome Res. (1997)
1 Present address: DNA Damage Response Lab, Cancer Research UK, London Research Institute, Clare Hall, UK.
