Modeling and prediction of protein solubility using the second osmotic virial coefficient
Introduction
Industrial protein production has attracted growing interest in industrial and academic research over the last decade. Especially in medical (red) biotechnology, where pharmaceutical proteins such as monoclonal antibodies (mAbs) are produced, the number of industrial-scale processes has been steadily increasing.
One major bottleneck in state-of-the-art industrial (pharmaceutical) protein production is downstream processing. Historically, it has often been accomplished by a series of cost-intensive chromatographic steps. The cost results from both the expensive chromatographic material and the low capacity of these workup steps. As a consequence, downstream processing alone can account for up to 80% of the total production costs. Economic and efficient processes thus demand alternative downstream processing concepts [1]. One alternative to chromatographic separation steps, already widely used in the chemical industry, is crystallization. It can be used either for initial product capture (precipitation) or for final product polishing (crystallization) [2]. In addition, protein crystals usually have a high purity and a higher stability than proteins in solution, which makes them attractive for the storage and formulation steps needed later in pharmaceutical processes [3].
For developing crystallization processes, the solubility of the protein in solution is the most important piece of information. Unfortunately, this quantity is not easily accessible. Protein solubility is influenced by the type of solvent or buffer used, the pH, the precipitating agent (e.g. salt, alcohol, polymer), and the temperature. State-of-the-art investigations of crystallization processes spend a high experimental effort on screening potential crystallization conditions. For this purpose, methods such as the sitting-drop or the hanging-drop method are often used [4], applied in well plates covering up to 384 different crystallization conditions per plate, depending on the plate used. Once a potential crystallization condition is found, the protein solubility is measured by determining the protein concentration of the mother liquor by UV absorption after equilibration [5].
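The last step described above, converting a UV absorbance of the mother liquor into a protein concentration, is commonly done with the Beer-Lambert law. The following is a minimal sketch under stated assumptions: the function name, the dilution handling, and the example extinction coefficient (an illustrative value of the order often quoted for lysozyme at 280 nm) are not taken from the original work.

```python
def protein_concentration_mg_per_ml(absorbance, ext_coeff_ml_per_mg_cm,
                                    path_cm=1.0, dilution=1.0):
    """Beer-Lambert law A = eps * c * l, solved for the concentration c.

    absorbance             -- measured absorbance of the (diluted) sample
    ext_coeff_ml_per_mg_cm -- mass extinction coefficient in mL mg^-1 cm^-1
    path_cm                -- optical path length in cm
    dilution               -- factor by which the mother liquor was diluted
    """
    if ext_coeff_ml_per_mg_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    if absorbance < 0 or dilution <= 0:
        raise ValueError("absorbance must be non-negative and dilution positive")
    return dilution * absorbance / (ext_coeff_ml_per_mg_cm * path_cm)


# Hypothetical reading: A280 = 0.66 after a 100-fold dilution,
# assumed extinction coefficient 2.64 mL mg^-1 cm^-1, 1 cm cuvette.
solubility = protein_concentration_mg_per_ml(0.66, 2.64, path_cm=1.0, dilution=100.0)
print(f"estimated solubility: {solubility:.1f} mg/mL")
```

The dilution factor is kept explicit because mother liquors near saturation usually have to be diluted into the linear range of the photometer before the reading is taken.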
In order to decrease this experimental effort and to simplify the development of crystallization processes, the prediction of protein solubility as a function of parameters such as pH, type of salt, salt concentration, or temperature by a physically based thermodynamic model is of high interest. The first approaches to modeling protein solubility in aqueous salt solutions were published in the open literature as early as 1925 by Cohn [6]. Melander and Horvath [7] developed empirical equations to correlate protein solubility with hydrophobic effects of the protein in aqueous solutions. Unfortunately, as shown by Przybycien and Bailey [8], [9], these empirical equations are only valid for conformationally robust proteins such as lysozyme or chymotrypsin. In 1998, Jenkins [10] proposed three empirical equations relating protein solubility to salt concentration in terms of salt molarity, salt activity, or water activity. As described by Naik and Bhagwat, the major drawback of these empirical equations is that protein solubility cannot be predicted for systems or conditions different from those used for parameter fitting [11]. One of the latest approaches was presented by Agena et al. [12], who used a UNIQUAC-based approach with temperature-dependent parameters to model the solubility of lysozyme and concanavalin in aqueous salt solution. Their results agreed qualitatively with experimental data, but the model describes protein solubility only as a function of temperature, not of pH or salt concentration [11].
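The empirical character of these early correlations can be illustrated with the salting-out relation usually attributed to Cohn, log10(S) = beta - K_s * I, where I is the ionic strength. The sketch below is illustrative only: the function names and the data are hypothetical, and the fit is a plain least-squares line on a logarithmic scale.

```python
import math


def cohn_solubility(ionic_strength, beta, k_s):
    """Cohn salting-out relation: log10(S) = beta - k_s * I."""
    return 10.0 ** (beta - k_s * ionic_strength)


def fit_cohn(ionic_strengths, solubilities):
    """Least-squares fit of beta and k_s to (I, S) pairs on a log10 scale."""
    n = len(ionic_strengths)
    if n != len(solubilities) or n < 2:
        raise ValueError("need at least two (I, S) pairs of equal length")
    logs = [math.log10(s) for s in solubilities]
    mean_i = sum(ionic_strengths) / n
    mean_l = sum(logs) / n
    sxx = sum((i - mean_i) ** 2 for i in ionic_strengths)
    sxy = sum((i - mean_i) * (l - mean_l)
              for i, l in zip(ionic_strengths, logs))
    slope = sxy / sxx
    beta = mean_l - slope * mean_i
    return beta, -slope  # k_s is the negative slope


# Hypothetical salting-out data generated from known parameters.
ionic = [1.0, 1.5, 2.0, 2.5]
sol = [cohn_solubility(i, beta=2.0, k_s=1.2) for i in ionic]
beta_fit, ks_fit = fit_cohn(ionic, sol)
print(f"beta = {beta_fit:.3f}, K_s = {ks_fit:.3f}")
```

Both parameters have to be refitted for every protein, salt, pH, and temperature, which is the limitation of such empirical equations noted above.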
All of these models have in common that they require a large amount of experimental data to fit the model parameters (e.g. UNIQUAC pure-component or binary interaction parameters). Furthermore, their predictive capabilities are very limited; Agena et al., for example, did not account for the influence of pH on solubility [11], [13].
In order to provide a model that can predict protein solubility from a minimal set of experimental data, the second osmotic virial coefficient (B22) is used in this work as an easily accessible property for characterizing aqueous protein solutions containing a salt. B22 is a well-suited measure because it describes the complex interactions between two solute molecules (e.g. proteins) in solution while simultaneously accounting for the influence of salt type, salt concentration, pH, and temperature. If B22 is negative, attractive interactions between the solute molecules (here, proteins) dominate in solution, favoring crystallization or precipitation [14].
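The link between B22 and the protein-protein interactions can be made explicit with the standard statistical-mechanical definition. The form below assumes a spherically symmetric (orientation-averaged) potential of mean force W(r); the symbols are chosen here for illustration and are not taken from the original work.

```latex
B_{22} \;=\; -\frac{2\pi N_\mathrm{A}}{M^{2}}
\int_{0}^{\infty}
\left[ \exp\!\left( -\frac{W(r)}{k_\mathrm{B} T} \right) - 1 \right] r^{2}\, \mathrm{d}r
```

Here N_A is the Avogadro constant, M the molar mass of the protein, r the centre-to-centre distance, k_B the Boltzmann constant, and T the temperature. With r in cm and M in g/mol, B22 is obtained in mol mL g^-2. Attractive regions (W < 0) make the bracket positive and therefore push B22 towards negative values, while the excluded volume (W very large) contributes a positive term.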
As first shown in 1999 by Haas et al., protein solubility can be modeled as a function of B22 [15]. Haas et al. developed a solubility model based on the Gibbs energy of an aqueous protein solution in equilibrium with a crystalline protein containing a considerable amount of water, derived from a simple lattice model. Using a value for B22 estimated from a square-well potential, the Gibbs energy of the liquid phase was calculated and the protein solubility was estimated [15]. Another model relating protein solubility to B22 was developed by Ruppert et al. [16]. In this model, the fugacities of the crystalline and the dissolved protein were equated and the activity coefficient of the protein was related to B22. Mehta et al. compared both models and concluded that the model of Ruppert et al. provides better results than the model of Haas et al., since the former has two fitting parameters whereas the latter has only one [13]. The major drawback of the model by Ruppert et al. is its use of two adjustable parameters, both of which depend on the solvent, the type of salt, and the protein. A prediction that transfers these parameters to different systems is therefore not possible.
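For a square-well potential, as mentioned for the model of Haas et al., B22 has a closed form: a hard core of diameter sigma, a well of depth epsilon between sigma and lambda * sigma, and no interaction beyond. The sketch below evaluates this textbook expression; the function name and the parameter values (diameter, well depth, well range) are hypothetical and are not the parameters used by Haas et al. Only the molar mass of lysozyme (14.4 kDa) is taken from the text.

```python
import math

AVOGADRO = 6.02214076e23  # 1/mol


def b22_square_well(sigma_nm, well_depth_kt, range_ratio, molar_mass_g_per_mol):
    """B22 in mol mL g^-2 for a square-well potential.

    sigma_nm      -- hard-core diameter in nm
    well_depth_kt -- well depth epsilon / (k_B * T), positive for attraction
    range_ratio   -- lambda, the well extends from sigma to lambda * sigma
    """
    if range_ratio < 1.0:
        raise ValueError("range_ratio (lambda) must be >= 1")
    sigma_cm = sigma_nm * 1.0e-7
    # Excluded-volume (hard-sphere) part, per molecule pair, in cm^3.
    hard_sphere = 2.0 * math.pi * sigma_cm ** 3 / 3.0
    # Attractive part from the well region sigma < r < lambda * sigma.
    attraction = (range_ratio ** 3 - 1.0) * math.expm1(well_depth_kt)
    b_volume = hard_sphere * (1.0 - attraction)
    # Convert from cm^3 per molecule to mol mL g^-2.
    return b_volume * AVOGADRO / molar_mass_g_per_mol ** 2


# Illustrative scan over assumed well depths for a lysozyme-sized particle.
for depth in (0.0, 1.0, 2.0, 3.0):
    value = b22_square_well(3.4, depth, 1.2, 14400.0)
    print(f"epsilon = {depth:.1f} kT -> B22 = {value:+.2e} mol mL/g^2")
```

With zero well depth only the positive excluded-volume term remains; deepening the well drives B22 negative, which is the regime associated with crystallization or precipitation.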
In general, with the solubility models from the literature, protein solubility can only be calculated for those concentrations where experimentally determined B22 data are available. This limitation arises from the method used for fitting the model parameters, which requires pairs of B22 and protein solubility data. As protein solubility can be measured well at high salt concentrations and B22 at low salt concentrations, these methods are limited to the narrow intersection of the two concentration ranges.
To avoid these problems and to enable prediction of protein solubility even for salt concentrations, pH values, and temperatures where experimental B22 data are unavailable, a new model is needed.
In this work, a new solubility model was developed based on a modified form of the solubility equation of Ruppert et al. combined with the xDLVO model of Asakura and Oosawa [17] to predict B22 data for different temperatures, salt types, salt concentrations, and pH values. The xDLVO model was used to model and predict B22 data of lysozyme and a monoclonal antibody (mAb) over a broad salt-concentration range, from salt-free to saturated salt solutions. B22 data for the mAb were also measured at different pH values. Using the B22 data from xDLVO, the protein solubility was then estimated by applying the modified solubility equation of Ruppert et al.
This approach allows the protein solubility to be predicted over a broad salt-concentration range and for different pH values. It thereby decreases the experimental effort and significantly reduces the time needed for developing protein-production processes.
Section snippets
Materials
The proteins used in this study were lysozyme from chicken egg white (14.4 kDa) and a monoclonal antibody (144.2 kDa). Lysozyme from chicken egg white (CAS: 12650-88-3) was purchased from Sigma-Aldrich (Steinheim, Germany). The monoclonal antibody, an IgG1, was supplied in an aqueous solution of PBS buffer (10 mM Na2HPO4, 1.5 mM KH2PO4, 2.7 mM KCl, 138 mM NaCl) by Bayer HealthCare (Wuppertal, Germany). Sodium chloride (NaCl, CAS: 7647-14-5), sodium p-toluenesulfonate (Na-p-Ts, CAS: 657-84-1),
Protein solubility as function of B22
The starting point for the solubility model is the equality of the chemical potentials of the pure solid protein and of the protein in the liquid phase:
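In a notation assumed here for illustration (the symbols are not taken from the original work), this equilibrium condition reads:

```latex
\mu_{\mathrm{P}}^{\mathrm{S}}(T, p) \;=\; \mu_{\mathrm{P}}^{\mathrm{L}}(T, p, x)
```

where the subscript P denotes the protein, the superscripts S and L denote the solid and the liquid phase, and x stands for the composition of the liquid phase. Only the liquid-phase side depends on composition, because the solid is treated as pure protein.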
The assumption that the protein is pure in its solid form is reasonable because the interstitial water as well as the small-molecule solutes will not significantly affect the thermodynamic properties of the solid protein [16].
The chemical potential of the protein in the liquid phase can be described by the chemical potential of
Second osmotic virial coefficients
Experimental B22 data of lysozyme at pH 4.2 in the presence of NaCl were taken from the literature [35]. B22 data of lysozyme at pH 4.6 in the presence of Na-p-Ts were measured via static light scattering (see Table 2) as described in Section 2.2.4. Additionally, B22 data were measured for the mAb in the presence of (NH4)2SO4 at different pH values (6.5–7.4) (see Table 3). The experimental B22 data marked with superscript a in Tables 2 and 3 were used to fit the parameters a through d of the xDLVO model
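Static light scattering data of dilute protein solutions are commonly evaluated with a Debye plot, K c / R = 1/M + 2 B22 c, where K is the optical constant, R the excess Rayleigh ratio, and c the protein mass concentration. The sketch below fits this line with ordinary least squares. It is a generic illustration, not necessarily the evaluation used in Section 2.2.4; the function name and the synthetic data are hypothetical, and angular dependence is neglected.

```python
def fit_debye_plot(conc_g_per_ml, kc_over_r_mol_per_g):
    """Fit K*c/R = 1/M + 2*B22*c and return (M in g/mol, B22 in mol mL g^-2)."""
    n = len(conc_g_per_ml)
    if n != len(kc_over_r_mol_per_g) or n < 2:
        raise ValueError("need at least two (c, Kc/R) pairs of equal length")
    mean_c = sum(conc_g_per_ml) / n
    mean_y = sum(kc_over_r_mol_per_g) / n
    sxx = sum((c - mean_c) ** 2 for c in conc_g_per_ml)
    sxy = sum((c - mean_c) * (y - mean_y)
              for c, y in zip(conc_g_per_ml, kc_over_r_mol_per_g))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_c
    if intercept <= 0:
        raise ValueError("non-positive intercept: molar mass is undefined")
    return 1.0 / intercept, slope / 2.0


# Synthetic, noise-free data generated from assumed values (not measurements).
true_molar_mass, true_b22 = 14400.0, -3.0e-4
conc = [0.002, 0.004, 0.006, 0.008]  # g/mL
signal = [1.0 / true_molar_mass + 2.0 * true_b22 * c for c in conc]

molar_mass, b22 = fit_debye_plot(conc, signal)
print(f"M = {molar_mass:.0f} g/mol, B22 = {b22:.2e} mol mL/g^2")
```

The intercept yields the molar mass and half the slope yields B22, so a negative slope in the Debye plot directly indicates net attractive protein-protein interactions.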
Conclusion
Within this work, a new solubility model for proteins based on the second osmotic virial coefficient B22 was developed. In contrast to previous works, in which only experimental B22 data were used, protein solubility was modeled with B22 data obtained from a potential-of-mean-force model (the xDLVO model). With this solubility model, protein solubility was modeled as a function of salt concentration, pH, and temperature.
One major drawback of models using only experimental B22
Acknowledgments
The authors kindly thank the Ministry for Innovation, Science, Research and Technology of the State of North Rhine-Westphalia (MIWF, NRW, Germany) and the European Union for funding within the project “Modulare Bioproduktion – Disposable und Kontinuierlich” (MoBiDiK, 005-1009-0053, Ziel2.NRW).
References (40)
- et al., Optimisation of aqueous two-phase extraction of human antibodies, J. Biotechnol. (2007)
- et al., Salt effects on hydrophobic interactions in precipitation and chromatography of proteins: an interpretation of the lyotropic series, Arch. Biochem. Biophys. (1977)
- et al., Solubility-activity relationships in the inorganic salt-induced precipitation of α-chymotrypsin, Enzyme Microb. Technol. (1989)
- et al., Structure-function relationships in the inorganic salt-induced precipitation of α-chymotrypsin, Biochim. Biophys. Acta (BBA) – Protein Struct. Mol. Enzym. (1989)
- et al., Second virial coefficient: variations with lysozyme crystallization conditions, J. Cryst. Growth (1999)
- et al., Calculation of protein extinction coefficients from amino acid sequence data, Anal. Biochem. (1989)
- et al., On the thermodynamics of the McMillan–Mayer state function, Fluid Phase Equilibria (2009)
- et al., The osmotic second virial coefficient and the Gibbs–McMillan–Mayer framework, Fluid Phase Equilibria (2009)
- et al., Formation dynamics of protein precrystallization fractal clusters, J. Cryst. Growth (1993)
- Amino acid and peptide net charges: a simple calculational procedure, Biochem. Educ. (1985)
- Variation of lysozyme solubility as a function of temperature in the presence of organic and inorganic salts, J. Cryst. Growth
- Correlation of second virial coefficients and solubilities useful in protein crystal growth, J. Cryst. Growth
- Crystal nucleation rates for particles experiencing short-range attractions: applications to proteins, J. Colloid Interface Sci.
- A simple empirical model describing the thermodynamics of hydration of ions of widely varying charges, sizes, and shapes, Biophys. Chem.
- Protein-protein and protein-salt interactions in aqueous protein solutions containing concentrated electrolytes, Biotechnol. Bioeng.
- Crystallization of recombinant human growth hormone at elevated pressures: pressure effects on PEG-induced volume exclusion interactions, Biotechnol. Bioeng.
- X-ray scattering studies of Aspergillus flavus urate oxidase: towards a better understanding of PEG effects on the crystallization of large proteins, Acta Crystallogr. Sect. D
- Correlation between the osmotic second virial coefficient and solubility for equine serum albumin and ovalbumin, Acta Crystallogr. Sect. D
- The physical chemistry of the proteins, Physiol. Rev.
- Three solutions of the protein solubility problem, Protein Sci.