Experimental study and computational modelling of cruzain cysteine protease inhibition by dipeptidyl

A combined computational and experimental study aimed to gain insights into the reaction inhibition mechanism of cruzain dipeptidyl Chagas disease aﬀects millions of people in Latin America. This disease is caused by the protozoan parasite Trypanossoma cruzi . The cysteine protease cruzain is a key enzyme for the survival and propagation of this parasite lifecycle. Nitrile-based inhibitors are eﬃcient inhibitors of cruzain that bind by forming a covalent bond with this enzyme. Here, three nitrile-based inhibitors dubbed Neq0409, Neq0410 and Neq0570 were synthesized, and the thermodynamic profile of the bimolecular interaction with cruzain was determined using isothermal titration calorimetry (ITC). The result suggests the inhibition process is enthalpy driven, with a detrimental contribution of entropy. In addition, we have used hybrid Quantum Mechanical/Molecular Mechanical (QM/MM) and Molecular Dynamics (MD) simulations to investigate the reaction mechanism of reversible covalent modification of cruzain by Neq0409, Neq0410 and Neq0570. The computed free energy profile shows that the nucleophilic attack of Cys25 on the carbon C1 of inhibitiors and the proton transfer from His162 to N1 of the dipeptidyl nitrile inhibitor take place in a single step. The calculated free energy of the inhibiton reaction is in agreement with covalent experimental binding. Altogether, the results reported here suggests that nitrile-based inhibitors are good candidates for the development of reversible covalent inhibitors of cruzain and other cysteine proteases.


Funding information
Providing accurate funding information will enable us to help you comply with your funders' reporting mandates. Clear acknowledgement of funder support is an important consideration in funding evaluation and can increase your chances of securing funding in the future.
We work closely with Crossref to make your research discoverable through the Funding Data search tool (http://search.crossref.org/funding). Funding Data provides a reliable way to track the impact of the work that funders support. Accurate funder information will also help us (i) identify articles that are mandated to be deposited in PubMed Central (PMC) and deposit these on your behalf, and (ii) identify articles funded as part of the CHORUS initiative and display the Accepted Manuscript on our web site after an embargo period of 12 months.
Further information can be found on our webpage (http://rsc.li/funding-info).

What we do with funding information
We have combined the information you gave us on submission with the information in your acknowledgements. This will help ensure the funding information is as complete as possible and matches funders listed in the Crossref Funder Registry.
If a funding organisation you included in your acknowledgements or on submission of your article is not currently listed in the registry it will not appear in the table on this page. We can only deposit data if funders are already listed in the Crossref Funder Registry, but we will pass all funding information on to Crossref so that additional funders can be included in future.

Please check your funding information
The

Researcher information
Please check that the researcher information in the table below is correct, including the spelling and formatting of all author names, and that the authors' first, middle and last names have been correctly identified. Names will be indexed and cited as shown on the proof, so these must be correct.
If any authors have ORCID or ResearcherID details that are not listed below, please provide these with your proof corrections. Please ensure that the ORCID and ResearcherID details listed below have been assigned to the correct author. Authors should have their own unique ORCID iD and should not use another researcher's, as errors will delay publication.
Please also update your account on our online manuscript submission system to add your ORCID details, which will then be automatically included in all future submissions. See here for step-by-step instructions and more information on author identifiers.
Queries for the attention of the authors Journal: PCCP

Paper: c8cp03320j
Title: Experimental study and computational modelling of cruzain cysteine protease inhibition by dipeptidyl nitriles For your information: You can cite this article before you receive notification of the page numbers by using the following format: (authors), Phys. Chem. Chem. Phys., (year), DOI: 10.1039/c8cp03320j.
Editor's queries are marked on your proof like this Q1, Q2, etc. and for your convenience line numbers are indicated like this 5, 10, 15, ...
Please ensure that all queries are answered when returning your proof corrections so that publication of your article is not delayed.

Query reference
Query Remarks

Q1
Please confirm that the spelling and format of all author names is correct. Names will be indexed and cited as shown on the proof, so these must be correct. No late corrections can be made.

Q2
Text has been provided for footnote a in Table 3, but there does not appear to be a corresponding citation in the

Introduction
Human American trypanosomiasis (Chagas disease) is a protozoan infection caused by the flagellate Trypanosoma cruzi. 1 Chagas disease occurs mainly in Latin America, where it is estimated that the illness affects about 6 million to 7 million people. 2 Although discovered in 1909 by the Brazilian sanitary doctor Carlos Chagas, the disease remains a serious public health problem in Latin America. 3 The drugs most commonly used to combat the disease, nifurtimox and benznidazole, are not very efficient and cause strong side effects. 4 Therefore, there is an urgent need for developing an effective therapy against this disease. The cysteine protease cruzain is considered an attractive target for therapeutic intervention in the treatment of Chagas disease. 5,6 Recently, a series of compounds based on the dipeptidyl nitrile scaffold were synthesized and assayed for their inhibitory activity against the T. cruzi cysteine protease cruzain. 7 It is important to point out that the presence of the electrophilic nitrile 8 in a molecular structure is not necessarily incompatible with achieving pharmacokinetic and toxicological profiles that are compatible with dosing in humans. For example, the cathepsin K inhibitor odanacatib 9 has been evaluated extensively as a treatment for osteoporosis. 10 This inhibitor has a nitrile 'warhead' to form a covalent bond with the catalytic cysteine in a reversible manner. 9,11,12 Experimental data demonstrated that inhibition of cruzain by nitrile-based is both competitive and reversible. 7 Cruzain react via a nucleophilic reaction mechanism that involves a cysteine residue and the proton of a histidine. Recently, Moliner and coworkers have studied the catalytic mechanism of cruzain cysteine protease with computational techniques. 13 Using computational approaches, Moliner and coworkers also investigated the inhibition mechanism of cruzain by two different irreversible peptidyl halomethyl ketone (PHK) inhibitors. 14 Hitherto, the molecular details of the inhibition reaction mechanism of cruzain by nitrile-based inhibitors are still poorly understood.  The catalytic mechanism of the cysteine proteases depends on a Cys and His residue. It has been proposed that the imidazole group of His polarizes the SH group of the Cys and enables deprotonation of the SH group, thus forming a highly nucleophilic CysS À /HisH + ion pair. 15 The existence of the ion pair was experimentally proven by different studies [16][17][18] and confirmed by computational modelling. 19 Inhibitors that contain an electrophilic functional group, such as nitriles, covalently bind to cruzain via nucleophilic attack of the active site CysS À . 20 Experimental data suggest that dipeptidyl nitriles inhibitors bind tightly within the active site of cruzain and inhibit via the reversible formation of a covalent bond. 7 In this contribution, isothermal titration calorimetry (ITC) was used to assess experimental binding of cruzain to Neq0409, Neq0410 and Neq0570 inhibitors. To gain further insight into the inhibition of cruzain by these molecules, we also modelled the reaction of cruzain with three inhibitors (Scheme 1). Scheme 2 shows two possible mechanisms for the inhibition reaction. In the first proposed mechanism, the protonation of N1 occurs concertedly with the nucleophilic attack of the thiolate anion (see Scheme 2). The second proposed mechanism is the nucleophilic addition and proton transfer in a stepwise mechanism, where the thiolate group of Cys25 attacks the carbon atom of the nitrile group first to form a covalent bond.
Recently, we have used free energy surface and quantum mechanics/molecular mechanics (QM/MM) approach, 21 to investigate biomolecular systems with an emphasis on enzymatic reactions. [22][23][24] In the QM/MM approach a small part of the system (ligand/substrate species) is described by quantum mechanics, while MM force fields represent the protein and solvent environment. 22 Herein, we are reporting the first QM/ MM free energy surface study of the inhibition reaction of cruzain by dipeptidyl nitriles. Overall, we present a combined computational and experimental study aimed to gain insights into the mechanism whereby nitriles inhibitors bound to a cysteine protease.

Methods
2.1 Expression and purification of cruzain enzyme 2.1.1 Cell transformation procedure. 3 mL of the plasmid (80 ng mL À1 ) Epoch, were added to 100 mL of Artic Express DE3-RIL (Agilent) E. coli cells. The solution was placed in an ice bath for 30 minutes. Subsequently, the heat shock was effected at 42 1C in a water bath for 30 seconds and was again glazed in an ice bath for another 3 minutes. Then 250 mL of SOC medium (Sigma-Aldrich) was added and the solution was placed in a shaker incubator at 37 1C at 250 rpm for 60 minutes. Meanwhile, one plate with LB-Agar medium was prepared containing 20 mg mL À1 gentamicin and 100 mg mL À1 ampicillin. After 1 hour of stirring, the solution of the bacterium with the plasmid was placed on the plate and left in an oven (New Ethics) at 37 1C overnight. The next day, bacterial colonies containing the plasmid appeared on the plaque. Expression was attained with medium of high induction, which was prepared using the following reagents for 1 L of culture medium. Autoclaving the mixture: N-Z amine (Sigma Aldrich), 10 g; yeast extract (Sigma Aldrich), 5 g; glycerol (Vetec), 5 mL; deionized water (MilliQ-Millipore, Direct Q3), 900 mL. Added salts (also pre-autoclaved): Na 2 HPO 4 (Sigma Aldrich), 25 mL of 1 M (stock); KH 2 PO 4 (Alfa Aesar) 25 mL of 1 M; NH 4 Cl (Vetec) 50 mL of 1 M; Na 2 SO 4 (Sigma Aldrich) 5 mL of 1 M; MgSO 4 (Sigma Aldrich) 2 mL of 1 M. Added antibiotics: ampicillin (Sigma-Aldrich), 1 mL from 100 mg mL À1 stock solution (final concentration 100 mg mL À1 ); gentamicin (Sigma-Aldrich), 400 mL from 50 mg mL À1 stock solution (final concentration 20 mg mL À1 ). Added sugars: D-glucose (Vetec), 10 mL 5% (w/v) stock or 5 g in 100 mL of deionized water. a-Lactose (monohydrate, Sigma-Aldrich), Scheme 2 Possible reaction mechanisms catalyzed by cruzain in complex with dipeptidyl nitrile derivatives inhibitors. Path 1 corresponds to a concerted mechanism and Path 2 illustrates the stepwise mechanism. 40 mL of 5% (w/v) or 10 g in 200 mL of deionized water. A 0.2% lactose solution was used to induce the bacteria to express the desired protein.
2.1.2 Pre-inoculum. Gentamicin (20 mg mL À1 ) and ampicillin (100 mg mL À1 ) were placed in autoclaved three flasks of 150 mL culture medium (LB, Sigma Aldrich). In the laminar flow hood, a colony (the most isolated) was collected and added to each flask containing the culture medium and then placed in the shaker (Marconi, refrigerated incubator) at 37 1C overnight at 250 rpm. On the following day, the high induction medium was prepared by adding to the first autoclaved mixture (yeast extract, NZ amine, etc.), the previously autoclaved salts, sugars, antibiotics and the volume of solution with the bacterium sufficient to reach the optical density (OD) of 0.16. This preparation should be done in a sterile environment using a laminar flow hood or a Bunsen nozzle to prevent contamination. The mixture was placed in a shaker (Lab Companion Incubated Shaker model SIF600R) at 18 1C at 200 rpm for 72 hours. Subsequently, this solution was centrifuged at 5000 rpm (Hitachi Centrifuge model CR21GIII) at 4 1C and the pellet was frozen at À80 1C.
2.1.3 Purification. The following buffers are required for purification: buffer A: 50 mM Tris (Sigma-Aldrich), 300 mM NaCl (Synth), 10 mM imidazole (Sigma Aldrich) pH = 10. Buffer B: 50 mM Tris (Sigma Aldrich), 300 mM NaCl (Sigma-Aldrich), 10 mM imidazole (Sigma Aldrich) pH = 10. Lysis solution: for a 500 mL pellet of the culture medium, 60 mL of buffer A; 0.5 mM CaCl 2 (Sigma-Aldrich); 0.5 mM MgSO 4 (Sigma Aldrich), 20 mL DNase (Promega), a few milligrams of lysozyme (Sigma-Aldrich) and 100 mM PMSF (phenylmethylsulfonyl fluoride, Sigma-Aldrich). The pellet was resuspended in the lysis solution and allowed to stand for one hour. Soon thereafter, it is sonicated using a Fisher Scientific Sonic Dismembrator (model 500) with 30 s pulse on and 30 s pulse off. When the sonication was completed, the solution was centrifuged at 19 000 rpm for 30 minutes (Hitachi centrifuge model CR21GIII). The supernatant solution was mixed with 5 mL nickel resin (Ni, Sepharose 6 fast flow, GE Healthcare) and allowed to stir for approximately 3 hours at 4 1C. For this procedure, a cold chamber was used in the laboratory. At the end of this time, the solution was placed on a bench column containing Ni resin (Ni Sepharoset 6 Fast Flow -GE Healthcare) then the resin was washed with 50 mL of buffer A and shortly thereafter, the protein was eluted with 60 mL of buffer B. The eluted solution containing the protein was then placed on a cellulose dialysis tubing (Sigma Aldrich), dialysis membrane and left overnight in activation buffer. The composition of the activation buffer was: 100 mM NaAC (Sigma Aldrich), 5 mM NaAc (Sigma Aldrich) and 300 mM NaCl (Synth), pH = 5.5. The next day, the protein was placed in another liter of activation buffer for a few more hours. At the end, the pH (Qualxtron model 8010 pH) was measured, which was adjusted to 5.2 or 5.0 if the protein is precipitated. The protein concentration should be less than or equal to 0.5 mg mL to avoid precipitation during the activation step. If the concentration is higher in this step, the dilution should be done using the same buffer used in the dialysis. For the determination of total protein concentration, the Denovix DS-11 + Spectrophotometer was used.
2.1.4 Activation step. In this step, 1 mM of bmercaptoethanol (Sigma-Aldrich), which is used with cysteine reducing agent, was added and left in the Shaker (Marconi) at 37 1C under slow stirring for approximately one hour (the color of the solution passed from cloudy to milky white and then to colorless during this process). This is an important step where the enzyme loses its pro-domain. At the end of activation, the protein was concentrated using amicon Ultra, Merck-Millipore (10 kDa) and a bench centrifuge (Eppendorf centrifuge model 5804R) at 4500 RCF for 30 minutes. At the end of the process, it was aliquoted, frozen in liquid nitrogen and stored in a freezer at À80 1C (Ultra-Low, Sanyo).
2.2 PEAQ-ITC experimental procedure 2.2.1 Preparation of the sample. An aliquot of the protein (frozen at À80 1C in activation buffer pH = 5.5) was taken and a new dialysis was performed using the last buffer employed in the dialysis process during the purification of the protein.
Dialysis and concentration were performed over the 30 to 60 minute period on Amicon ultra 10 kDa membranes (Merck/ Millipore) using a 4500 RCF benchtop centrifuge (Eppendorf centrifuge model 5804R) at 4 1C. At the end, solutions from the cell and the syringe were prepared using the dialysis buffer at the concentrations desired for the assay. In that case, the inhibitor was placed in the syringe and the protein in the cell. The inhibitor solution was prepared at the concentration of 170 mM and the protein solution was 20 to 30 mM and was adjusted by the equipment software according to the actual concentration of protein that was interacting with the inhibitor. 0.001% Triton-X 100 (Sigma-Aldrich) was added to the protein solution to prevent aggregation formation. The same surfactant was added to the inhibitor solution to avoid unbalance between the syringe and the cell.

Synthesis of compounds
2.3.1 General considerations. All reagents were purchased from Sigma-Aldrich or Combi-blocks and used as received. Solvents such as hexane, ethyl acetate, chloroform, dimethylformamide, methanol, were all purchased from Synth and VETEC. N,N-Diisopropylethylamine was distilled over CaH 2 before used. N,N-Dimethylformamide (DMF) was dried over 3 Å activated molecular sieves for 72 hours. Solvents used in highperformance chromatography (HPLC) were supplied by Tedia and used without further purification. All non-aqueous Thin layer chromatography was performed on Fluka Analytical Sigma-Aldrich silica gel matrix, pre-coated plates with fluorescent indicator 254 nm and/or staining solutions. Flash column chromatography was performed on silica gel (pore size 60 Å, 70-230 mesh). 1 H and 13 C NMR spectra were recorded on HP -400 and 500 MHz instruments in CDCl 3 or DMSO-d 6  The solvents were filtered through a 0.45 mm Merck-Millipore filter before the use and degassed in an ultrasonic bath. In the established HPLC protocol, chiral analysis was carried out at 32 1C (column oven) where not otherwise specified, using analytical Diacel column (IC-chiralpak: 5 mm, 250 mm Â 4.6 mm), by isocratic elution with a flow rate of 0.5 (analytical) and 2.36 mL min À1 (semi-preparative). The mobile phase composition was acetonitrile-water (50 : 50) (v/ v). Volumes of 10 mL (analytical) was injected. Quantification was carried out at 200-800 nm and the chromatographic run time varied according to the sample.
Specific rotations ([a] T = 100a/lc, in deg mL g À1 dm À1 , but reported herein in degrees) were observed at the wavelength 589 nm, the D line of a sodium lamp. T was set to be 25 1C. Samples were weight using a precision balance (Sartorius, Model CPA26P) and were fully dissolved in methanol (HPLC grade, Panreac). The rotations were measured using a Digital Polarimeter (P2000, Jasco): a = observed rotation in degrees; l = cell path length of 0.1 decimeter in length; c = concentration in g 100 mL À1 . Values were calculated using 5 measurements for each compound. Melting points were determined by a Quimica Micro MQAPF-302 apparatus and are uncorrected. Infrared spectra were obtained from FT-IR Bomen Hartman & Braun mod MB-102.

General procedure for removal of Boc group
Compounds (4, 5, 6) were treated with formic acid (2 mL) at room temperature. The solution was stirred for 18 hours. The reaction mixture was evaporated under vacuum to get a yellowish oil. It was treated with 1 M NaOH until pH 9. The product was extracted with ethyl acetate (4 Â 20 mL). The organic phase was evaporated to obtain a colorless oil. The formation of the product was confirmed by TLC (ethyl acetate). The product was used for the next step without any further purification. Yield B90%. 2.6 General procedure for synthesis of compound 7, 8, 9 Compound 4 or 5 or 6 (2.45 mmol, 1.3 eq.) was added to a solution of benzoic acid (229.6 mg, 1.88 mmol, 1 eq.), HATU (932 mg, 2.45 mmol, 1.3 eq.) and N,N-diisopropylethylamine (0.904 mL, 4.9 mmol, 2.6 eq.) in DMF (10 mL), under argon atmosphere. The resulting solution was stirred at room temperature for 16 hours. The reaction mixture was diluted with ethyl acetate (50 mL) and washed with a saturated NaHCO 3 solution (3 Â 20 mL), HCl 1 N (2 Â 20 mL) and brine (3 Â 20 mL). The organic phase was dried over MgSO 4 and evaporated to give a crude residue that was used for the next step without further purification.

Computational model
In this report, the computational model for the QM/MM MD calculations was based on the crystal structure of the cysteine protease cruzain (PDB code: 1ME4), 25 where we have replaced a ketone-based inhibitor by the dipeptidyl nitrile inhibitors studied here (see Scheme 1). Before starting the molecular dynamic simulations an assignment of the protonation states of the residues at pH = 7 was carried out using the empirical propKa program. 26 After adding the hydrogen atoms to the full structure, series of optimization algorithms 27 were applied. All the heavy atoms of the protein and substrate were restrained by means of a Cartesian harmonic umbrella with force constant of 500 kJ mol À1 Å À2 . Afterward, the system was fully relaxed, but the peptide backbone was restrained with a lower constant of 100 kJ mol À1 Å À2 . Then, the optimized protein was placed in an octahedral box of pre-equilibrated waters (80 Å side), using the center of mass of the complex as the geometrical center. Afterward, 100 ps of hybrid QM/MM Langevin-Verlet MD at 300 K and in a canonical thermodynamic ensemble (NVT) were used to equilibrate the system. For the hybrid QM/MM calculations, the atoms of the inhibitor and the side chain of Cys25 and His162 residues were selected to be treated by QM, using a semiempirical AM1/d-phot hamiltonian. 28 The other atoms of the system, protein and water molecules, were described using the CHARMM/TIP3P 29 force fields, respectively. The number of QM atoms resulted then to be 25 for Neq0409, 31 for Neq0410 and 29 for Neq0570 while the final system contains 33 098, 33 104 and 33 102 atoms, respectively.
Due to the number of degrees of freedom, any residue 20 Å apart from any of the atoms of the inhibitors was selected to be kept frozen in the remaining calculations. Cut-offs for the nonbonding interactions were applied using a switching scheme, within a range radius from 14.5 to 16 Å. Afterward, the system was equilibrated by means of 2.0 ns of QM/MM MD at temperature of 298 K. The computed RMSD for the protein during the last 1 ns renders a value always below 0.9 Å. Furthermore, the RMS of the temperature along the different equilibration steps was always lower than 2.5 K and the variation coefficient of the potential energy during the dynamics simulations was never higher than 0.3%. 2.7.1 Free energy surface. Recently, the increased computer power, as well as parallel computing and advances in electronic structures calculations, have opened new possibilities for including ab initio QM/MM simulations upon understanding chemical reactions in a polar environment on a molecular level. 30 However, it is still extremely challenging to obtain computationally converging sampling of ab initio QM/ MM (QM(ai)/MM) free energy surfaces in condensed phases (due to large number of gradient values evaluation). In the present case, we have described quantum region using AM1/d-PhoT hamiltonian 28 to compute the activation free energies at a significantly reduced computational cost, and that incorporates d-extension for the phosphorus atom and modified AM1 parameters for oxygen and hydrogen atoms. AM1/d-PhoT potential has been designed to accurately reproduce high-level DFT results such as geometries, dipole moments, proton affinities, and relative energies for a broad set of molecules, complexes, and chemical reactions relevant to biological transfers. 28 To study the mechanism, we constructed 2D-PMF using the combination of R1-R2 antisymmetric distance that corresponds to the proton transfer from His162 to N1 of the dipeptidyl nitrile inhibitor and R3 distance that corresponds to the nucleophilic attack of the negatively charged Cys25 on the carbon C1 of inhibitors (see Scheme 4). It is important to mention that the initial structures for 2D-PMF calculations were obtained from the last point of 2 ns QM/MM MD calculations for each system. All PMFs have been calculated using the weighted histogram analysis method (WHAM) combined with the umbrella sampling approach 31,32 as implemented in the pDynamo program. 33 The PMF calculation requires series of molecular dynamics simulations in which the distinguished reaction coordinate variable, x, is constrained around particular values. 34 The values of the variables sampled during the simulations are then pieced together to construct a distribution function from which the PMF is obtained as a function of the distinguished reaction coordinate (W(x)). The PMF is related to the normalized probability of finding the system at a particular value of the chosen coordinate by eqn (1): The activation free energy can be then expressed as: 34 where the superscripts indicate the value of the reaction coordinate at the reactants and transition state and G x (x R ) are the free energy associated with setting the reaction coordinate to a specific value at the reactant state. Usually, this last term makes a small contribution and the activation free energy is directly estimated from the PMF change between the maximum of the profile and the reactant's minimum: A total of 30 simulations was performed at different values of R1-R2 antisymmetric combination of distances (ranging from À1.5 Å to 1.5 Å), with an umbrella force constant of 500 kJ mol À1 Å À2 applied to this distinguished reaction coordinate. In addition, 36 simulations were performed at different values of R3 (ranging from 1.2 Å to 3.0 Å), also with an umbrella force constant of 500 kJ mol À1 Å À2 on this combination of distances. Consequently, 1080 simulation windows were needed to obtain the 2D-PMF for the Neq0409 inhibition mechanism. The values of the variables sampled during the simulations were then pieced together to construct a full distribution function from which the 2D-PMF was obtained. In each window, 10 ps of relaxation were followed by 15 ps of production with a time step of 0.5 fs due to the nature of the chemical step involving a hydrogen transfer. The Verlet algorithm was used to update the velocities. The same protocol of the simulation was used to describe the 2D-PMF of the Neq0410 and Neq0570 inhibition mechanism, using the R2-R1 and R3 distances, where also 1080 simulation windows were needed to obtain the 2D-PMFs.
2.7.2 Activation free energy at M06-2X/6-31+G(d)/MM level. In the study, we have used AM1/d-PhoT/MM as the reference potential and M06-2X/6-31+G(d)/MM as target potential in order to explore the challenge of evaluation QM(ai)/MM activation free energies of a reaction catalyzed by cysteine protease cruzain. We have used the same protocol previously, 22 where we have obtained single-point at M062X/6-31+G(d)/MM potential using representative snapshots from RS and TS ensembles from 2D-PMF QM/MM simulations computed at AM1/d-PhoT potential, where a total of 2000 representatives' snapshots for each state were used in the simulations. In this way, the extensive sampling required to calculate the QM(ai)/MM free energy barrier can be done on a computationally inexpensive reference potential rather than directly on the high target potential. 35 3 Results and discussion

Thermodynamic signature of ligand binding to cruzain
We started the work with isothermal titration calorimetry ITC to assess binding of cruzain to Neq0409, Neq0410 and Neq0570. Following experimental data, the binding enthalpy was exothermic for all ligands, Fig. 1. However, the three different cruzain inhibitors are not equipotent: Neq0570 (DG bind = À9.0 kcal mol À1 ) and Neq0409 (DG bind = À8.9 kcal mol À1 ) showed better binding affinities than Neq0410 (DG bind = À7.5 kcal mol À1 ). The structural variation of the inhibitors studied is shown in Scheme 1. The affinities of cruzain inhibitors are factored differently in their enthalpic and entropic contributions. The detrimental entropic contribution of Neq0409 as a whole is more effective than that observed for Neq0570, being Neq0410 the worst case. The introduction of the geminal methyl group in Neq0410 (see Scheme 1) causes the most significant detrimental entropic contribution to the binding free energy ( Fig. 1 and Table 1). Fig. 2 shows the thermodynamic signatures of cruzain-ligand complexes as histograms (expressed in kcal mol À1 ). Although there is no solution to the enthalpy-entropy compensation, there is a direct reflection on the affinity of Neq0410 which is advocated by the introduction of the geminal dimethyl moiety.
In addition, there is the danger that irreversibly modified proteins could cause an immune response. The precise effects of irreversible protein modifications are difficult to predict, and the risks are most relevant for the long-term treatment of chronic diseases. Therefore, the development of targeted covalent inhibitors would benefit from strategies that minimize the occurrence of such protein modifications.
When comparing and contrasting molecules of similar structure to the same target, the differences in DG (dDG), DH (dDH) and ÀTDS (ÀTdDS) are more informative because the effect of removing solvent from the internal surfaces during complex formation will be canceled.
Matched Molecular Pair Analysis (MMPa) provides the opportunity to evaluate molecular modifications in which the thermodynamic contributions are rationalized according to the individual contributions of their unique elements. The analysis is performed in molecular pairs that differ only in positions where there is a modification in molecular structure. Transformations occur by identifying the change from one functional group to another in which its regiochemistry is also taken into account. Herein, an example of transformation refers to the exchange of a functional group (a fragment) by another whose significant fragmental conversion reflects the change in the thermodynamic properties. MMPa may be useful in interpreting the bimolecular recognition process and the individual contributions of each alteration in that process. One of our objectives was to identify possible small molecular modifications that included simple fragments, but that resulted in some significant modification in the properties of interest. The subtle changes in the thermodynamic signature upon binding can be observed in Fig. 3.
The modifications implemented in the inhibitors (see Scheme 1) impact in the binding affinity and the thermodynamic profile of the ligands when co-complexed with cruzain. For comparisons between the pairs to be significant, the thermodynamic parameters were evaluated concerning the Neq0409. The pair Neq0410-Neq0409 presents a thermodynamic signature in which the introduction of the geminal methylene results in a total detrimental contribution towards binding and the three thermodynamic parameters are unfavorable to the spontaneity of the bimolecular interaction process. On the other hand, the introduction of the cyclopropyl moiety in Neq0409 (see Scheme 1) thereby yielding Neq0570 contributes favorably to the decrease in the detrimental entropic contribution that makes the spontaneous process to happen  Compound TD parameters DG (kcal mol À1 ) DH (kcal mol À1 ) ÀTDS (kcal mol À1 ) Neq0409 À8.9 À14.9 6.0 Neq0410 À7.5 À13.9 6.3 Neq0570 À9.0 À14.6 5.6  even at the expense of the detrimental contribution of its net enthalpy. Nonetheless, values result in a small difference between dDG, dDH and ÀTdDS. The thermodynamic profile of the pair Neq0570-Neq0410 has the same signal for the contributions of dDH and ÀTdDS, both favorable to the change in the spontaneity, dDG, of such bimolecular interaction.
At this point, it is important to clarify that in the processes of discovery of new bioactive chemical entities, it is accepted that molecules that occupy the same (or similar) chemical space within a set of properties exhibit potencies in similar magnitudes. Yet, it is quite common to find structurally similar molecules with high capacity prediction of the potency within such a chemical space, which, however, present great differences in potency. These discontinuities are known as ''activity cliffs'', 36 and it is exactly this phenomenon that we are describing in this work, where the compounds selected for this study contain in their structures a fundamental characteristic that is related to their peculiarly similar structure with discrete modifications. They range from a methylene group to cyclopropyl through the geminal methylene group (see Scheme 1). The change of affinity in the magnitude improves from 25.1 mM for Neq0410 to 0.25 mM (or 251 nM) for Neq0570. 7 In other words, Neq0570 is the most potent compound in the series with 2 log units in relation to the Neq0409. This observation is in agreement with our ITC results described above which show that inhibition of cruzain by Neq0570 is thermodynamically more favourable than Neq0410 (Fig. 3).

Computational modelling
Computational descriptions of chemical reactions in enzymatic environments are usually based on the selection of a distinguished reaction coordinates. 37 Recently, Moliner and coworkers have performed a computational study of the inhibition of cruzain by two irreversible peptidyl halomethyl ketones inhibitors using hybrid AM1d/MM and M06-2X/MM potentials. 38 More recently, Chatterjee and coworkers 38 proposed an approach that can utilize relative FEP calculations to minimize the amount of QM/MM calculation needed to estimate the thermodynamics and kinetics of reversible covalent inhibitors. In this study, we are using free energy surface simulations to explore and test the cruzain inhibition reaction by dipeptidyl nitrile inhibitors Neq0409, Neq0410 and Neq0570.
The covalent reversible inhibition mechanism of a dipeptidyl nitrile inhibitor in cruzain requires the covalent link of Cys25 on the carbon C1 of the inhibitor, as shown by Avelar and coworkers 7 (Scheme 2). We have traced the 2D-PMFs using the combination of R1-R2 antisymmetric distance and R3 distance (see Scheme 4), where R1-R2 corresponds to the proton transfer from His162 to N1 of the dipeptidyl nitrile inhibitor, while R3 corresponds to the nucleophilic attack of negatively charged Cys25 on the carbon C1 of Neq0409, Neq0410 or Neq0570. Fig. 4a shows the 2D-PMF obtained for reaction inhibition of cruzain by Neq0409. The mechanism is described at AM1/d-PhoT/MM level, as a function of the combination of the distances (R1-R2, and R3, see Scheme 4) defining the breaking and forming bonds. In this PMF, the RS corresponds to the non-covalent cruzain-Neq0409 complex, a structure where the inhibitor binds tightly within the active site of the protein. In the TS (the transition state for both proton transfer and nucleophilic attack), the bond distance for R1 corresponds to 1.23 Å, whereas R2 corresponds to 1.33 Å, indicating a very advanced stage for the proton transfer, concomitantly the bond distance for R3 is verified as 2.21 Å, indicating the C-O bond formation (see Table 2 and Fig. 4).
Simulation of the reaction inhibition of cruzain by Neq0409 reveals that reaction proceeds via a concerted mechanism, where the proton transfer from His162 to N1 of Neq0409 co- occurs with a nucleophilic attack from negatively charged Cys25 at C1 of Neq0409. The calculated activation free energy at AM1/d-PhoT level is 5.1 kcal mol À1 (see Table 3), and the calculated reaction free energy is moderately exergonic (À7.5 kcal mol À1 ). It is known from experiment that this reaction is reversible, where the covalent adduct can revert to the noncovalent protein-inhibitor complex, which means that the computed values at the AM1/d-PhoT level are in very good agreement with experimental data.
As commented in the Method section, we have used AM1/d-PhoT/MM as the reference potential and M06-2X/6-31+G(d,p)/ MM as target potential to determine the higher level energy corrections for QM/MM activation free energies of cruzain inhibition mechanism by Neq0409. The representative snapshots from RS, TS and PS ensembles from AM1/d-PhoT 2D-PMF were used as start point for evaluating the activation free energy of cruzain at M06-2X/6-31+G(d,p)/MM level following the approach described in methods (see ref. 22 for more details). It is worth to mention that estimating the average difference between target potential and reference potential allows us to compute the activation free energies at a significantly reduced computational cost, using 2 trajectories: one at the reactants and one at the transition state. Therefore, the reference potential idea seems to be a feasible approach regarding computational cost in calculating the M06-2X/6-31+G(d,p)/ MM free-energy barrier for an enzymatic reaction. After M06-2X/6-31+G(d,p)/MM corrections, the resulting activation barrier at M06-2X/6-31+G(d,p)/MM level is 14.8 kcal mol À1 for the transfer of a proton from His162 to N1 and nucleophilic attack from negatively charged Cys25 at C1 of Neq0409 (concerted step). Computational work on the reference solution reaction for studies of cysteine protease found a barrier of 24 kcal mol À1 for nucleophilic attack of thiomethanol S À on amide, 39 which suggest that for the corresponding reaction in water, the barrier can be about higher by 10 kcal mol À1 . It is also important to point out that there is a considerable difference between values computed at AM1d and M06-2X level (see Table 3). The convergence of one-step perturbation can be problematic when the difference between the reference potential (RP) and target potential (TP) is significant. 40 The full FEP treatment should be used to reduce the source of the error. Olsson and Ryde have proposed the use of reference-potential approach, converting the ligands with FEP at the molecular mechanics (MM) level and then perform also MM -QM/MM FEP for each ligand. 41 However, the one-step perturbation approach drastically reduces the computational cost of implementation of the perturbation from reference to target potential. The values reported in Table 3      shows that energies have a consistent trend for each inhibitor of reaction which suggests that the perturbation was performed in the right direction. Here, we are assuming that AM1-d reproduces DFT geometry accurately and consequently we may be matching the correct stationary points. It is important to note that the nucleophilic addition barrier calculated at the M06-2X/6-31+G(d,p)/MM level is in reasonable agreement with that found in previous studies for inhibition of cathepsin K by nitrile-based inhibitors, for which a barrier of 19.9 kcal mol À1 was obtained from DFT/MM calculations Q3 (Fig. 5). 42 In the present study, we have also explored the molecular mechanism of the cruzain inhibition by dipeptidyl nitrile inhibitors Neq0410 and Neq0570, where we have traced a 2D-FES using the same reaction coordinates described for Neq0409 (R1-R2, and R3, see Scheme 4). The results show a concerted mechanism for both Neq0410 and Neq0570 (Fig. 4). Furthermore, FES results suggest that the activation barrier for Neq0410 is more significant than the barrier observed for Neq0409 (see Fig. 4 and Table 3). Fig. 4b shows the 2D Free energy surface (2D-FES) for Neq0410 using the combination of distances R1-R2 and R3, as described above. The TS1 observed on the FES for inhibition by Neq0410 corresponds to the transition state for both proton transfer and nucleophilic attack. The bond distance for R1 and R2 corresponds to 1.12 and 1.51 Å, respectively (see Table 2 and Fig. 6), and the R3 is 2.32 Å, indicating a concomitant step of both proton transfer and nucleophilic attack.
The FES for Neq0570 (Fig. 4c) suggests that its activation barrier is also lower than that barrier for Neq0410 (Table 3). Fig. 4c shows the 2D Free energy surface (2D-FES) for Neq0570 using the combination of distances R1-R2 and R3, as described. The TS1 observed on the FES corresponds to the transition state for both proton transfer and nucleophilic attack, where bond distance for R1 and R2 corresponds to 1.38 and 1.33 Å, respectively (see Table 1 and Fig. 7), and the R3 is 2.46 Å, also indicating a concerted process.
The reactions of irreversible inhibitors were proposed to be more exothermic than À22 kcal mol À1 (see ref. 43). Analysis of the reaction energies obtained from QM/MM FES and ITC data reported here for the three inhibitors studied shows that the process is exergonic and reversible since the DG obtained from ITC and computational calculations are less exothermic than À22 kcal mol À1 (see Table 3), which is agree with previous experimental work 7 that have shown that inhibition process is reversible for inhibition reactions of cruzain by nitrile-based inhibitors. Therefore, the Neq0570 and Neq0409 and Neq0410 inhibitors binding covalently and in a reversible fashion. It is important to point out that our QM/MM result is consistent with the ITC data that demonstrates Neq0570 and Neq0409 are more potent inhibitors of cruzain from Trypanosoma cruzi than Neq0410. Finally, we hope this protocol used here may be helpful for the development of reversible covalent inhibitors of cruzain.

Conclusions
In the present study, ITC analysis shows the cruzain inhibition by Neq0570 and Neq0409 and Neq0410 is enthalpy driven with a clear detrimental contribution of the entropy difference to the free binding energy. Furthermore, an important conclusion from the hybrid QM/MM approach and FESs obtained is that the inhibition reaction of cruzain by Neq0409, Neq0410 and Neq0570 follows a concerted mechanism. A proton transfer (PT) from His162 to N1 co-occurs with the nucleophilic attack of the negatively charged Cys25 on the C1 atom of inhibitors. The barrier for a forward reaction involving Neq0409 and Neq0570 inhibitors corresponds to 5.1 and 5.2 kcal mol À1 , respectively at AM1-d/MM level. The barrier for Neq0409 and Neq0570 is lower than the barrier for Neq0410 (6.9 kcal mol À1 ). It is also worth to mention that results from the AM1-d/MM calculated activation free energy are in agreement with activation free energy derived from the M06-2X/6-31(d)/MM estimations. Our experimental and computational results also demonstrate that inhibition of cruzain by Neq0570 and Neq0409 is thermodynamically more favourable than Neq0410. Finally, we hope these results reported here may provide valuable information for designing new inhibitors of cruzain from Trypanosoma cruzi with known mode of action (MoA).

Conflicts of interest
There are no conflicts to declare.