Global profiling of lysine 2-hydroxyisobutyrylome in Toxoplasma gondii using affinity purification mass spectrometry

Lysine 2-hydroxyisobutyrylation (Khib) is a recently discovered and evolutionarily conserved protein post-translational modification (PTM) found in mammalian and yeast cells. Previous studies have shown that Khib plays a role in gene transcription and that Khib-containing proteins are closely related to cellular metabolism. In this study, a global analysis of Khib-containing proteins was performed using the latest database (ToxoDB 46, 8322 sequences, downloaded on April 16, 2020) and sensitive immunoaffinity enrichment coupled with liquid chromatography-tandem mass spectrometry. A total of 1078 Khib modification sites across 400 Khib-containing proteins were identified in tachyzoites of the Toxoplasma gondii RH strain. Bioinformatics and functional enrichment analyses showed that Khib-modified proteins were associated with various biological processes and pathways, such as ribosome, glycolysis/gluconeogenesis, and central carbon metabolism. Interestingly, many proteins of the secretory organelles (e.g., microneme, rhoptry, and dense granule) that play roles in the infection cycle of T. gondii were found to be Khib-modified, suggesting the involvement of Khib in key biological processes during T. gondii infection. We also found that histone proteins, key enzymes related to cellular metabolism, and several glideosome components had Khib sites. These results expand our understanding of the roles of Khib in T. gondii and should promote further investigations of how Khib regulates gene expression and key biological functions in T. gondii.


Introduction
Toxoplasma gondii is an obligate intracellular apicomplexan protozoan which has a worldwide distribution in humans and animals (Montoya and Liesenfeld 2004). Infection by this parasite can cause encephalitis and retinitis, and even death particularly in immunocompromised individuals (Elsheikha worldwide distribution, long-term persistent infection in the brain of the affected people (Rougier et al. 2017), a remarkable ability to cross biological barriers (Elsheikha and Khan 2010), including the blood-brain barrier, blood-retinal barrier, and blood-placental barrier, infecting the developing fetus to cause miscarriage and congenital malformations (Elsheikha 2008), and its association with neurophysiological disorders in adults.
These facts have motivated the global scientific community to seek a better understanding of the biology and pathogenesis of toxoplasmosis, and to identify factors essential for the growth and development of T. gondii. One area that has witnessed intensive effort in the last few years is protein post-translational modifications (PTMs), because they play essential roles in multiple cellular processes and can greatly expand the diversity and complexity of the proteome. PTMs are dynamic processes that change protein properties, such as physicochemical characteristics, spatial conformation, and stability, by proteolytic cleavage or by addition of a modifying group to an amino acid (Walsh et al. 2005). A number of PTMs have been identified, and several of them, such as acetylation (Xue et al. 2013; Cobbold et al. 2016), glycosylation (Fauquenoy et al. 2008; Wang et al. 2016), palmitoylation (Foe et al. 2015; Caballero et al. 2016), phosphorylation (Treeck et al. 2011), succinylation (Li et al. 2014), and ubiquitination (Silmon de Monerri et al. 2015), have been shown to function as key regulators of diverse biological processes and functions in apicomplexan parasites (Yakubu et al. 2018).
As regards acetylation, 2876 lysine acetylation sites across 1146 proteins have been identified in Plasmodium falciparum (Cobbold et al. 2016), and 411 lysine acetylation sites distributed in 274 proteins have been reported in T. gondii (Jeffers and Sullivan Jr 2012). A proteomic analysis of T. gondii confirmed that numerous N- and O-linked glycosylated sites were found in proteins of the micronemes, rhoptries, dense granules, and the components of the glideosome, which are involved in motility, invasion, and intracellular survival (Fauquenoy et al. 2008; Wang et al. 2016). More than 30% of the predicted proteome has been shown to be phosphorylated in P. falciparum and T. gondii (Treeck et al. 2011; Alam et al. 2015), and phosphorylation plays crucial regulatory roles in parasite motility, energy metabolism, and host-parasite interaction. In T. gondii, phosphorylation of the motor protein myosin A (MyoA) at two serine sites by calcium-dependent kinase 3 (CDPK3) can facilitate the initiation of parasite motility and egress (Gaji et al. 2015). A phosphorylation-null mutant of glycogen phosphorylase (GP S25A) in the T. gondii PRU strain resulted in amylopectin accumulation, showing that GP phosphorylation is a regulatory factor for amylopectin storage and digestion (Sugi et al. 2017). Additionally, T. gondii rhoptry protein 16 (ROP16) can directly phosphorylate host signal transducer and activator of transcription (STAT)-1, STAT-3, STAT-5, and STAT-6 (Yamamoto et al. 2009; Ong et al. 2010; Butcher et al. 2011; Rosowski and Saeij 2012; Jensen et al. 2013), which are critical for host defense against T. gondii.
Lysine 2-hydroxyisobutyrylation (Khib) is an evolutionarily conserved and abundant histone mark that has been detected in eukaryotic cells (Dai et al. 2014). H4K8 Khib has been shown to be involved in transcriptional activity in meiotic and post-meiotic cells (Dai et al. 2014) and in glucose homeostasis in Saccharomyces cerevisiae (Huang et al. 2017). Histone Khib has also been detected in Trypanosoma cruzi (Picchi et al. 2017). An earlier study detected Khib, along with crotonylation (Kcr), on proteins in T. gondii (Yin et al. 2019). In the present study, using the latest database (ToxoDB 46), we identified a different set of Khib proteins that play important roles in T. gondii pathobiology. The Khib proteome of T. gondii RH tachyzoites was analyzed using liquid chromatography with tandem mass spectrometry (LC-MS/MS) coupled with high-affinity purification. More than 1000 Khib sites across 400 Khib proteins were identified. These Khib proteins were mainly located in the cytoplasm, nucleus, extracellular space, and mitochondria, and were primarily related to ribosome, glycolysis/gluconeogenesis, and central carbon metabolism in cancer.

Materials and methods
Parasite and cell culture maintenance
Toxoplasma gondii RH strain was used in this study. Tachyzoites of the T. gondii RH strain were originally stored and provided by the Department of Parasitology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, Guangdong Province, China. This RH strain belongs to type I (ToxoDB #10) based on genotyping using Mn-PCR-RFLP (Liu et al. 2016). Tachyzoites were maintained in human foreskin fibroblast (HFF) cells (ATCC, Manassas, VA, USA) grown in Dulbecco's modified Eagle medium (DMEM) supplemented with 10% fetal bovine serum (Gibco, USA) and antibiotics (100 U/ml penicillin and 100 μg/ml streptomycin; Gibco) at 37°C with 5% CO2. When the infected cells were lysed (within approximately 3-4 days), the parasites and cells were harvested and passed through 25-gauge syringe needles. Tachyzoites were purified from host cell debris using 3-μm membrane filters (Millipore). The purified tachyzoites were washed with phosphate buffered saline (PBS) to remove any remaining host cell debris, and the purified parasite pellets were stored at −80°C prior to protein extraction.

Protein extraction
The frozen tachyzoite pellets were resuspended in lysis buffer (8 M urea, 2 mM ethylenediaminetetraacetic acid (EDTA), 5 mM dithiothreitol (DTT), 3 μM trichostatin A (TSA), 50 mM nicotinamide (NAM), and 1% protease inhibitor cocktail) and then sonicated on ice. The cell debris was removed by centrifugation at 20,000g for 10 min at 4°C. The proteins were precipitated with 20% TCA for 2 h at 4°C. After centrifugation at 12,000g for 3 min at 4°C, the supernatant was discarded. The remaining precipitate was desalted with cold acetone three times. The protein was dissolved in urea buffer, and the protein concentration was determined using a Bradford protein assay kit with bovine serum albumin as a standard. Protein was digested twice with trypsin, at trypsin-to-protein ratios of 1:50 and 1:100, overnight.

Western blotting
The parasite lysates were separated by SDS-PAGE and transferred to polyvinylidene difluoride (PVDF) membranes (Millipore). The Khib proteins were detected by incubating the membrane with a primary pan anti-Khib antibody (PTM Biolabs), followed by incubation with secondary antibodies coupled with horseradish peroxidase (HRP) (Thermo-Fisher Scientific, Waltham, MA). The HRP signals were detected with an enhanced chemiluminescence kit (Pierce).

Enrichment of 2-hydroxyisobutyrylated peptides
To enrich the Khib peptides, the tryptic peptides dissolved in NETN buffer (100 mM NaCl, 1 mM EDTA, 50 mM Tris-HCl, 0.5% NP-40, pH 8.0) were incubated with prewashed anti-Khib agarose-conjugated beads (PTM Biolabs, Hangzhou, China) with gentle shaking at 4°C overnight. The beads were washed four times with NETN buffer and three times with ddH2O (pH 8.0). The bound peptides were eluted with 0.1% trifluoroacetic acid (TFA) and vacuum-dried. The resulting peptides were cleaned with C18 ZipTips (Millipore Corp., Bedford, MA) according to the manufacturer's instructions, prior to LC-MS/MS analysis.

LC-MS/MS analysis
The enriched Khib peptides were reconstituted in solvent A (0.1% formic acid in water) and loaded onto a C18 reversed-phase pre-column (Thermo-Fisher Scientific, Waltham, MA) to separate the peptides. The gradient was programmed as follows: 6-23% solvent B (0.1% formic acid in 98% acetonitrile) for 26 min, 23-35% for 8 min, climbing to 80% in 3 min, then holding at 80% for the last 3 min. The eluted peptides were subjected to a NanoSpray Ionization source followed by MS/MS in a Q Exactive (Thermo-Fisher Scientific) coupled online to the UPLC. Intact peptides were detected at a resolution of 70,000 in the Orbitrap. Peptides were selected for MS/MS analysis using an NCE setting of 30; ion fragments were detected at a resolution of 17,500 in the Orbitrap. For MS scans, the m/z scan range was 350-1800.
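The gradient program described above can be written down as a small lookup, which makes the total run time (40 min) and the solvent composition at any time point explicit. The following is a minimal sketch only: it assumes that each segment is a linear ramp, which the text does not state, and the names `GRADIENT` and `percent_b` are illustrative.

```python
from bisect import bisect_right

# Breakpoints (time in min, % solvent B) read from the gradient description.
# Assumption: each segment is a linear ramp between consecutive breakpoints.
GRADIENT = [(0.0, 6.0), (26.0, 23.0), (34.0, 35.0), (37.0, 80.0), (40.0, 80.0)]


def percent_b(t_min: float) -> float:
    """Return the % solvent B at time t_min by linear interpolation."""
    times = [t for t, _ in GRADIENT]
    if t_min <= times[0]:
        return GRADIENT[0][1]
    if t_min >= times[-1]:
        return GRADIENT[-1][1]
    # Index of the first breakpoint strictly after t_min.
    i = bisect_right(times, t_min)
    (t0, b0), (t1, b1) = GRADIENT[i - 1], GRADIENT[i]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
```

For example, `percent_b(13)` gives the composition halfway through the first segment, and any time at or beyond 37 min returns the 80% hold.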

Database search
The MaxQuant search engine (v.1.5.2.8) was used to process the MS/MS data. The mass spectra were queried against the T. gondii database (ToxoDB 46, 8322 sequences, downloaded on April 16, 2020) concatenated with a reverse decoy database. Trypsin/P was specified as the cleavage enzyme, allowing up to four missed cleavages. The mass tolerance for precursor ions was set to 10 ppm. 2-Hydroxyisobutyrylation of lysine was specified as a variable modification, while carbamidomethylation of cysteine was set as a fixed modification. The false discovery rate (FDR) threshold for peptides was set to 1%. All other parameters in MaxQuant were set to default values. The MaxQuant label-free quantification (LFQ) algorithm (Cox et al. 2014) was used to perform label-free quantification. The localization probability threshold for Khib sites was set at > 0.75.
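The two acceptance criteria above (decoy-based FDR control and a site localization probability > 0.75) can be illustrated with a short sketch. It does not reproduce MaxQuant's internal procedure; the dictionary keys `reverse`, `contaminant`, and `loc_prob` are hypothetical names chosen for this example, and the FDR function is the simple decoy-to-target ratio commonly used with reversed databases.

```python
def filter_khib_sites(rows, min_loc_prob=0.75):
    """Keep target (non-decoy, non-contaminant) sites above the localization cutoff.

    Each row is a dict with hypothetical keys:
      'reverse'     -> True if the hit comes from the reversed decoy database
      'contaminant' -> True if the hit is a known contaminant
      'loc_prob'    -> site localization probability in [0, 1]
    """
    kept = []
    for row in rows:
        if row.get("reverse") or row.get("contaminant"):
            continue
        if row["loc_prob"] > min_loc_prob:
            kept.append(row)
    return kept


def decoy_fdr(n_target, n_decoy):
    """Estimate the FDR as decoy hits divided by target hits above a score cutoff."""
    return n_decoy / n_target if n_target else 0.0
```

With this estimate, a score cutoff that admits 10 decoy hits alongside 990 target hits corresponds to an FDR of roughly 1%.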

Bioinformatic analysis
Gene ontology (GO) annotation of proteins was performed to identify the enriched functional categories using UniProt-GOA (http://www.ebi.ac.uk/GOA/) and the ToxoDB 46 database. When an identified protein was not annotated by UniProt-GOA or ToxoDB, InterProScan was used to annotate the protein's GO function based on protein sequence alignment. The lysine 2-hydroxyisobutyrylated proteins were classified into three categories based on GO annotation: biological process, cellular component, and molecular function. Domains of 2-hydroxyisobutyrylated proteins were annotated by InterProScan, using the InterPro domain database, based on protein sequence alignment. The Kyoto Encyclopedia of Genes and Genomes (KEGG) was searched to identify protein pathways. Protein subcellular localization was predicted by WoLF PSORT (https://wolfpsort.hgc.jp/). Sequence models consisting of the amino acids at specific positions of modified 21-mers (10 amino acids upstream and downstream of the Khib site) were analyzed by MoMo (http://meme-suite.org/tools/momo). The T. gondii proteome database was used as the background, and other parameters were set as default. The GO, KEGG, and domain enrichment analyses of 2-hydroxyisobutyrylated proteins were performed using a two-tailed Fisher's exact test. A P value < 0.05 was considered significant. The identified 2-hydroxyisobutyrylated proteins were searched against the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (http://string-db.org/) to obtain the protein-protein interaction (PPI) network. All parameters were set as default except the interaction score, which was set at ≥ 0.7. Cytoscape (version 3.5.0) software was used to visualize the PPI network.
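To make the enrichment statistic concrete, the sketch below computes a two-tailed Fisher's exact test for a single 2x2 table from the hypergeometric distribution, using only the Python standard library. It illustrates the test named above and is not the software used in this study; the table layout in the docstring is an assumption about how the counts would be arranged for one GO term, KEGG pathway, or domain.

```python
from math import comb


def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's exact test P value for the 2x2 table [[a, b], [c, d]].

    Assumed layout for one annotation term:
      a = Khib proteins with the term      b = Khib proteins without it
      c = background proteins with it      d = background proteins without it
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x term-annotated proteins in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    # The small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)
```

A term would then be reported as enriched when the returned P value is below 0.05, matching the threshold stated above.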

Results
Proteome-wide analysis of lysine 2-hydroxyisobutyrylation sites and proteins in T. gondii
To reveal the 2-hydroxyisobutyrylated proteins present in T. gondii, western blotting analysis using a pan anti-Khib antibody was performed and showed a wide range of bands in the tachyzoite lysate (Fig. 1a). Subsequently, a proteomic analysis based on LC-MS/MS and immunoaffinity enrichment was used to identify the global Khib proteome of T. gondii. To determine the quality of the MS data, the mass error of the identified peptides was checked. The peptide mass error was < 4 ppm, supporting the accuracy of the MS data (Fig. 1b). Most of the identified peptides were 7 to 17 amino acids in length, which is consistent with the properties of tryptic peptides (Fig. 1c).
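The mass error quoted above is expressed in parts per million (ppm) relative to the theoretical value. A minimal sketch of that calculation follows; the function name and the example m/z values are illustrative and are not taken from the dataset.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million relative to the theoretical m/z."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=10.0):
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm
```

For instance, an ion observed 0.002 m/z units above a theoretical m/z of 500 deviates by about 4 ppm, well inside the 10 ppm precursor tolerance used in the database search.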
In the present study, three parallel experiments (designated Exp 1, Exp 2, and Exp 3) were performed. In Exp 1, 673 Khib sites on 297 Khib-containing proteins were identified; in Exp 2, 676 Khib sites across 301 Khib-containing proteins; and in Exp 3, 659 Khib sites distributed on 297 Khib-containing proteins. Of these Khib sites, about 47% were identified in at least two parallel experiments, indicating a high accuracy of these sites. Among the identified proteins, over 64% of the Khib-containing proteins had 1 or 2 Khib sites, and about 9% contained > 5 Khib sites (Fig. 1d).
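The reproducibility figure above (the share of sites seen in at least two of the three experiments) can be computed from the per-experiment site lists as sketched below. This is an illustration under the assumption that each site is represented as a (protein identifier, lysine position) pair; the function name is hypothetical.

```python
from collections import Counter


def fraction_in_at_least(replicates, k=2):
    """Fraction of all distinct sites that occur in at least k replicates.

    replicates: iterable of collections of hashable site keys,
                e.g. (protein_id, lysine_position) tuples.
    """
    # set(rep) ensures a site is counted at most once per replicate.
    counts = Counter(site for rep in replicates for site in set(rep))
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n >= k) / len(counts)
```

The denominator is the union of sites across all experiments, so the result is directly comparable to the percentage reported in the text.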

Functional annotation of the Khib-containing proteins of T. gondii
To better understand the putative functions of the Khib-containing proteins in T. gondii, GO functional classification of all Khib-containing proteins was determined based on their biological processes, cellular components, and molecular functions (Fig. 2a-c). Within the biological processes, most Khib-containing proteins were involved in cellular metabolic processes, organic substance metabolic processes, and primary metabolic processes, each accounting for 12% of all Khib-containing proteins (Fig. 2a). For the cellular components, the largest share of Khib-containing proteins was assigned to the intracellular category (23%) (Fig. 2b). Molecular function analysis showed that 16%, 12%, and 12% of the Khib-containing proteins were associated with protein binding, organic cyclic compound binding, and heterocyclic compound binding, respectively (Fig. 2c). For the subcellular localization, the Khib-containing proteins were mainly distributed in the cytoplasm (29%), nucleus (19%), extracellular space (18%), and mitochondria (17%) (Fig. 2d).

Motif analysis of lysine 2-hydroxyisobutyrylated peptides
To characterize the Khib-containing peptides, the amino acid biases adjacent to the Khib sites in all the identified Khib-containing peptides were analyzed with the motif-x algorithm. In total, seven conserved motifs were identified, namely KhibX1I, KhibX5K, KX7Khib, KhibX4K, KX9Khib, KX6Khib, and KhibX6K (motif score > 6.7; X represents any amino acid residue, and the number that follows X is the count of such residues) (Fig. 3a). The enriched and depleted amino acid residues surrounding the Khib site across all motifs are shown in a heatmap (Fig. 3b), in which red indicates that an amino acid is significantly enriched near the modification site and green indicates that it is significantly depleted. The I, K, M, V, and Y residues were overrepresented at most positions around the Khib site, whereas the R, S, P, G, and E residues were underrepresented at the majority of positions (Fig. 3b).
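Motif discovery operates on fixed-width sequence windows centered on the modified lysine, here 21-mers with 10 residues on each side. The sketch below shows one way such windows could be cut from a protein sequence; it is not the code used in this study. The padding character `_` for sites near a protein terminus and the function name are assumptions made for illustration.

```python
def window_21mer(sequence, site_pos, flank=10, pad="_"):
    """Return the window of 2*flank+1 residues centered on a modified lysine.

    sequence: protein sequence in one-letter code
    site_pos: 1-based position of the modified lysine
    pad:      filler used when the window runs past either terminus (assumption)
    """
    idx = site_pos - 1
    if not 0 <= idx < len(sequence) or sequence[idx] != "K":
        raise ValueError("site_pos must point to a lysine (K) in the sequence")
    left = sequence[max(0, idx - flank):idx]
    right = sequence[idx + 1:idx + 1 + flank]
    # Pad so that the lysine always sits at the central position.
    return left.rjust(flank, pad) + "K" + right.ljust(flank, pad)
```

Keeping the lysine at the center of every window means that position-specific residue frequencies, such as the isoleucine two positions downstream in KhibX1I, can be compared directly across peptides.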

Functional enrichment analysis
To reveal the biological functions of Khib-containing proteins, an enrichment analysis of the GO, KEGG, and domain databases was performed. GO enrichment analysis showed enriched terms in all three categories: cellular component, molecular function, and biological process (Fig. 4a-c). For the cellular component, the Khib-containing proteins were mainly enriched in mitochondria (Fig. 4a). For the molecular function, most Khib-containing proteins were associated with structural constituent of carbon-oxygen lyase activity, hydro-lyase activity, and box C/D snoRNA binding (Fig. 4b). For the biological processes, the majority of the Khib-containing proteins were significantly related to ADP metabolic process, nucleoside diphosphate phosphorylation, and nucleotide phosphorylation (Fig. 4c). Protein domain enrichment analysis revealed that Khib-containing proteins were enriched in thioredoxin, proteasome subunit, ribosomal protein family, and AHPC/TSA family domains (Fig. 5a). KEGG enrichment analysis indicated that most Khib-containing proteins participated in ribosome, glycolysis/gluconeogenesis, and central carbon metabolism (Fig. 5b), suggesting that Khib is involved in energy metabolism processes.

PPI network of Khib-containing proteins in T. gondii
To study the cellular processes regulated by Khib in T. gondii, the Khib PPI network was visualized with Cytoscape software. A total of 273 Khib-containing proteins were mapped to the protein interaction database (Fig. 6). The Khib-containing proteins were associated with ribosome, glycolysis/gluconeogenesis, aminoacyl-tRNA biosynthesis, and proteasome.
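Building the network amounts to keeping only interactions whose confidence score meets the cutoff (≥ 0.7 here) and collecting the proteins that remain connected. The sketch below illustrates this step on generic edge tuples; it does not call the STRING or Cytoscape software, and the function name and placeholder protein identifiers are hypothetical. Scores are assumed to be on a 0-1 scale; if an exported table reports them on a 0-1000 scale, they would need to be divided by 1000 first.

```python
def high_confidence_network(edges, min_score=0.7):
    """Build an undirected adjacency map from (protein_a, protein_b, score) tuples.

    Only interactions with score >= min_score are kept; proteins without any
    retained interaction do not appear in the result.
    """
    adjacency = {}
    for a, b, score in edges:
        if score >= min_score:
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    return adjacency
```

The number of keys in the returned map is the number of proteins placed in the network, and the size of each value set is that protein's degree.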

Discussion
Studies on PTMs in T. gondii are essential to provide valuable information on protein changes and the underlying processes that mediate the parasite's interaction with host cells. In recent years, proteomic identification of Khib on histone and non-histone proteins has been reported in many species. A total of 6548 Khib sites distributed on 1725 proteins were discovered in mammalian cells (Huang et al. 2018). In plants, 9916 Khib sites across 2512 proteins were identified in developing rice seeds (Meng et al. 2017), and 11,976 Khib sites in 3001 proteins were found in Physcomitrella patens (Yu et al. 2017). In S. cerevisiae, 1458 Khib sites on 369 proteins were identified, many of which were enriched in the ribosome and glycolysis/gluconeogenesis pathways (Huang et al. 2017). In Proteus mirabilis, 4735 Khib sites on 1051 proteins were identified, and many Khib-containing proteins were associated with metabolic pathways, such as glycolysis/gluconeogenesis (Dong et al. 2018). In T. gondii, 9502 Khib sites on 1950 proteins were identified in tachyzoites of the RH strain purified from the peritoneal fluid of mice (Yin et al. 2019).
In the present study, we determined the Khib profile of T. gondii RH tachyzoites purified from HFF monolayers and explored the potential involvement of the identified Khib-containing proteins in the infection process by analyzing the Khib proteome using high-resolution LC-MS/MS coupled with immunoaffinity purification. We searched the spectra against the latest version of the ToxoDB database for the ME49 strain (ToxoDB 46, 8322 sequences, accessed on April 16, 2020) and identified 1078 Khib sites across 400 Khib-containing proteins. For protein extraction, we used lysis buffer to lyse the tachyzoites. Some insoluble membrane proteins may not have dissolved completely and may have been removed with the cell debris; however, this is unlikely to have substantially affected the results, because previous studies found few or no modification sites on cell membrane proteins (Meng et al. 2017; Sun et al. 2017; Wu et al. 2018; Nie et al. 2020). In a recent study, 2-hydroxyisobutyrylated proteins were mostly related to fatty acid degradation (Yin et al. 2019); in our study, however, Khib-containing proteins were primarily involved in ribosome, glycolysis/gluconeogenesis, and central carbon metabolism in cancer. In that previous study, the proteins were mainly distributed in the nucleus (Yin et al. 2019), whereas in the present study they were most abundant in the cytoplasm. These differences may be caused by the updated database and the different growth conditions of the T. gondii RH strain in the two studies. Moreover, PPI analysis suggested that abundant interactions involved in important cellular processes may be regulated by Khib modification.
Comparative analysis of human cells (Huang et al. 2018), Oryza sativa (Meng et al. 2017), P. patens (Yu et al. 2017), and T. gondii showed that the Khib motif patterns differ from one another. However, the K, V, and Y residues were overrepresented at most positions around the Khib sites in both T. gondii and P. patens, and the I and V residues were overrepresented at the majority of positions in both T. gondii and O. sativa, whereas the P and S residues were underrepresented in both T. gondii and O. sativa. The sequence logos showed a strong bias for isoleucine (I) downstream of the Khib sites, which was similar to the Kmal bias for cysteine (C) detected in T. gondii (Nie et al. 2020), but different from a recent study reporting that leucine (L), lysine (K), tyrosine (Y), and valine (V) occurred upstream of the Khib sites (Yin et al. 2019). This difference may be due to the different database versions used in the two studies.
Carbohydrate metabolism, including glycolysis/gluconeogenesis, citrate cycle, glyoxylate and dicarboxylate metabolism, starch and sucrose metabolism, pyruvate metabolism, and fructose and mannose metabolism, participates in the lytic cycle of T. gondii. Our analysis of the Khib proteomic dataset indicated that some of the modified proteins participated in metabolic processes, which is consistent with a previous study (Yin et al. 2019). For example, enolase 2 (ENO2) is an essential factor for the growth of T. gondii (Mouveaux et al. 2014). Fructose-1,6-bisphosphate aldolase (ALD) is required for energy metabolism rather than host-cell invasion in T. gondii (Shen and Sibley 2014). In the glycolysis/gluconeogenesis and citrate cycle processes, there were many Khib-modified enzymes that are important for the energy supply of T. gondii, especially in the tachyzoites (fast replicating stage) and bradyzoites (slow replicating stage) (Nitzsche et al. 2017; Shukla et al. 2018). Additionally, the alpha-1,4 glucan phosphorylase, which contains Khib sites, is involved in amylopectin digestion, which is crucial for the development of T. gondii bradyzoites and latent infection (Sugi et al. 2017).
Khib-containing proteins involved in carbohydrate metabolism are also enriched in other species. Khib-containing proteins were strongly enriched in the S. cerevisiae glycolysis/gluconeogenesis pathway (Huang et al. 2017). In mammalian cells, several important enzymes required for the glycolysis pathway, such as alpha-enolase (ENO1) and fructose-bisphosphate aldolase (ALD), were heavily modified (Huang et al. 2018). In O. sativa seeds, most Khib-containing proteins were enriched in glycolysis/gluconeogenesis, citrate cycle, and starch and sucrose metabolism (Meng et al. 2017). These findings suggest that the Khib modification could play key roles in glucose metabolism.
The lytic cycle of T. gondii, including invasion, replication, and egress, is largely regulated by three secretory organelles: the microneme, rhoptry, and dense granule. Many of the proteins secreted by these organelles were identified as Khib-modified. The most heavily modified protein identified was the chaperonin protein BiP, with 26 sites, which differs from a previous report in which the rate-limiting enzyme phosphofructokinase PFKII was the most heavily modified protein (Yin et al. 2019). The second was the heat shock protein HSP70, with 25 sites, which has also been found to be malonylated at five Kmal sites in T. gondii (Nie et al. 2020). AMA1 and MIC2, which are involved in the attachment of extracellular parasites to the host membrane (Carruthers and Sibley 1997), were identified as Khib-containing proteins. Among the rhoptry proteins, RON2, which contains 5 Khib sites, can interact with AMA1 to maintain the integrity of the moving junction (MJ) and is essential for the internalization of T. gondii (Lamarque et al. 2014). The rhoptry kinase ROP17 (1 Khib site) can manipulate monocyte migration to facilitate T. gondii dissemination (Drewry et al. 2019). GRA7 (2 Khib sites) facilitates virulence in mice (Alaganan et al. 2014), and GRA12 (4 Khib sites) plays an important role in mediating parasite resistance to host gamma interferon (Fox et al. 2019). These results suggest that Khib may play key roles in the lytic cycle of T. gondii.
During T. gondii invasion and egress, the glideosome provides the power for gliding motility (Frénal et al. 2017). Several components of the glideosome were identified as Khib-containing proteins, including GAP45 (4 Khib sites), GAP50 (4 Khib sites), Myosin A (8 Khib sites), and TgMLC1 (3 Khib sites). Interestingly, changing the PTM sites of glideosome components usually impairs the invasion, egress, and motility of T. gondii. The phosphorylation of Myosin A by CDPK3 contributes to the initiation of motility during T. gondii egress (Gaji et al. 2015). Mutations in the acylation sites of GAP45 impair pellicle integrity during T. gondii invasion (Frénal et al. 2010). Thus, it will be interesting to study the Khib sites of important proteins in the lytic cycle of T. gondii in the future.
In conclusion, this study provided a new Khib proteome dataset and identified 1078 Khib modification sites across 400 Khib-containing proteins in T. gondii. These Khib-containing proteins participate in various cellular processes, such as ribosome, glycolysis/gluconeogenesis, and central carbon metabolism. These data expand our understanding of Khib and provide new resources for further investigation of the roles of lysine 2-hydroxyisobutyrylation in regulating different biological processes of T. gondii.