Coronaviruses (CoVs) are large, enveloped, positive-sense single-stranded RNA viruses that can infect both animals and humans1. CoVs belong to the family Coronaviridae, which, along with Arteriviridae and Roniviridae, is placed in the order Nidovirales. They infect various hosts, including birds, swine and humans. Human coronaviruses (HCoVs) represent a major group of CoVs associated with respiratory diseases ranging from the common cold to serious pneumonia and bronchiolitis2. Today, HCoVs are recognised as among the fastest-evolving viruses, owing to their characteristically high genomic nucleotide substitution rates and recombination3. Severe Acute Respiratory Syndrome (SARS), first identified as an atypical pneumonia in China's Guangdong province, spread to several countries. The most common symptoms of SARS include coughing, high fever (>38 °C), chills, convulsions, headaches, dizziness, progressive radiographic changes of the chest and lymphopenia4. The death rate is about 3% to 6%, although it can rise to 43% to 55% in patients older than 60 years5. The primary SARS epidemic was eventually controlled, but a SARS-CoV-like virus was detected in Chinese bats6,7. In addition, a recent outbreak of Middle East respiratory syndrome (MERS), caused by the novel coronavirus MERS-CoV, raises fear of a possible recurrence of SARS or related dangerous diseases8,9. Since there is no vaccine or effective therapy for these viral infections, developing anti-SARS drugs against future outbreaks remains a formidable challenge.
The SARS- and MERS-CoV genomes contain two open reading frames, ORF1a and ORF1b, which are translated by host ribosomes into two viral polyproteins, pp1a and pp1ab, respectively. ORF1a encodes two cysteine proteases, a papain-like protease (PLpro) and a 3C-like protease (3CLpro). While PLpro cuts the first three cleavage sites of the polyprotein, 3CLpro cleaves the remaining 11 sites, releasing a total of 16 non-structural proteins (nsps) in both SARS- and MERS-CoVs. The homodimeric form of 3CLpro is active in the presence of substrates. The crystal structures of both 3CLpros showed that each monomer is composed of three structural domains: domains I and II form a chymotrypsin-like architecture with a catalytic cysteine and are connected to a third, C-terminal domain via a long loop10. In the proteolytic site, all 3CLpros prefer glutamine at the P1 position and leucine, basic residues and small hydrophobic residues at the P2, P3 and P4 positions, respectively11. At the P1′ and P2′ positions, small residues are required, whereas the P3′ position shows no strong preference. Since the autocleavage process is essential for viral propagation, 3CLpro is a good drug target against coronaviral infection.
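The substrate preference described above can be illustrated with a short sketch. It applies a deliberately simplified rule (Gln at P1, Leu at P2, a small residue at P1′ and P2′) to the peptide sequence used for the FRET assay in this study. The function name, the choice of Ser, Ala and Gly as "small" residues, and the strictness of the rule are assumptions made for illustration; this is not a validated cleavage-site predictor.

```python
from typing import List

# Assumed set of "small" residues allowed at P1' and P2' (illustrative choice).
SMALL = set("SAG")

def find_cleavage_sites(peptide: str) -> List[int]:
    """Return indices i where cleavage is predicted between peptide[i-1] and peptide[i].

    Simplified rule: P2 = Leu, P1 = Gln, P1' and P2' are small residues.
    """
    sites = []
    for i in range(2, len(peptide) - 1):
        p2, p1 = peptide[i - 2], peptide[i - 1]
        p1_prime, p2_prime = peptide[i], peptide[i + 1]
        if p1 == "Q" and p2 == "L" and p1_prime in SMALL and p2_prime in SMALL:
            sites.append(i)
    return sites

# Peptide portion of the fluorogenic substrate used in this study.
substrate = "KTSAVLQSGFRKME"
sites = find_cleavage_sites(substrate)
for i in sites:
    print(substrate[:i] + "|" + substrate[i:])
```

Under this rule, the only predicted site in the substrate lies between Gln and Ser, which is consistent with the Q↓S scissile bond expected for a 3CLpro substrate.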
In this study, we employed a proteolytic method to probe SARS-CoV 3CLpro inhibitory compounds. A synthetic peptide labelled with an Edans-Dabcyl FRET (fluorescence resonance energy transfer) pair12 was used to screen a flavonoid library for SARS-CoV 3CLpro inhibitory compounds. Recent studies showed that flavonoids have antiviral activity against some viruses, including SARS-CoV13–17. However, a molecular-level study has not been reported for SARS-CoV. Therefore, we performed the proteolytic assay with flavonoids, followed by an induced-fit docking experiment with the top hits. From the results, we sought to deduce the structural and functional features of flavonoids that are crucial to binding with SARS-CoV 3CLpro. This information can be applied to develop synthetic compounds with better affinities.
Materials and methods
Protein expression and purification
The coding sequence of SARS-CoV nsp5-pp1a/pp1ab 3CL (chymotrypsin-like) protease (3CLpro) was obtained from the NCBI database (NP_828863.1). The catalytic domain of 3CLpro (M1∼T196) was synthesised chemically by Bioneer (Daejeon, Korea) and cloned into a bacteriophage T7-based expression vector, pBT7. The plasmid DNA was transformed into E. coli BL21 (DE3) for protein expression. E. coli BL21 (DE3) cells were grown on Luria–Bertani (LB) agar plates containing 150 μg ml−1 ampicillin. Several colonies were picked and grown in capped test tubes with 10 ml LB broth containing 150 μg ml−1 ampicillin. A cell stock composed of 0.85 ml culture and 0.15 ml glycerol was prepared and frozen at 193 K for use in a large culture. The frozen cell stock was grown in 5 ml LB medium and diluted into 1000 ml fresh LB medium. The culture was incubated at 310 K with shaking until an OD600 of 0.6–0.8 was reached. At this point, the expression of the SARS-CoV 3CLpro was induced using isopropyl-β-d-1-thiogalactopyranoside (IPTG) at a final concentration of 1 mM. The culture was further grown at 310 K for 3 h in a shaking incubator. Cells were harvested by centrifugation at 7650g (6500 rev min−1) for 10 min in a high-speed refrigerated centrifuge at 277 K. The cultured cell paste was resuspended in 25 ml of a buffer consisting of 50 mM Tris–HCl pH 8, 100 mM NaCl, 10 mM imidazole, 1 mM phenylmethylsulfonyl fluoride (PMSF) and 10 μg ml−1 DNase I. The cell suspension was disrupted using an ultrasonic cell disruptor (Digital Sonifier 450, Branson, USA). Cell debris was pelleted by centrifugation at 24,900g (15,000 rev min−1) for 30 min in a high-speed refrigerated ultra-centrifuge at 277 K.
The protein was purified by cation-exchange chromatography using a 5 ml Hi-Trap SP column (GE Healthcare, Piscataway, New Jersey, USA). The column was equilibrated with a buffer consisting of 50 mM MES pH 6.5 and the pooled fractions were loaded. The column was developed with a linear NaCl gradient to 1 M NaCl and the protein eluted at 0.23 M NaCl. SDS-PAGE showed one band around 22 kDa (21895.09 Da), corresponding to the molecular weight of the catalytic domain of SARS-CoV 3CLpro. The protein was concentrated to 16 mg ml−1 for the protease assay in a buffer consisting of 0.23 M NaCl and 50 mM MES pH 6.5.
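As a worked conversion, the mass concentration of the stock can be expressed in molar terms from the two figures given above (16 mg ml−1 and 21895.09 Da). The sketch below is only an arithmetic illustration; the function and variable names are hypothetical, and the 1 μM target is the working concentration quoted elsewhere in this paper.

```python
MW_DA = 21895.09          # molecular weight of the catalytic domain (g/mol), from the text
STOCK_MG_PER_ML = 16.0    # concentrated protein stock, from the text
ASSAY_UM = 1.0            # working protease concentration used in the assay

def molar_conc_uM(mg_per_ml: float, mw_da: float) -> float:
    """Convert mg/ml (numerically equal to g/L) to micromolar."""
    return mg_per_ml / mw_da * 1e6

stock_uM = molar_conc_uM(STOCK_MG_PER_ML, MW_DA)
fold_dilution = stock_uM / ASSAY_UM

print(f"stock concentration: {stock_uM:.1f} uM")
print(f"dilution needed to reach {ASSAY_UM} uM: about {fold_dilution:.0f}-fold")
```

The stock therefore corresponds to roughly 0.73 mM protein, so reaching the 1 μM working concentration requires a dilution of several hundred-fold.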
FRET protease assays with the SARS-CoV 3CLpro
The custom-synthesised fluorogenic substrate, DABCYL-KTSAVLQSGFRKME-EDANS (ANYGEN, Gwangju, Korea), was used as a substrate for the proteolytic assay using the SARS-CoV 3CLpro18. This substrate contains the nsp4/nsp5 cleavage sequence, GVLQ↓SG19, and works as a generic peptide substrate for many coronavirus 3CLpros, including the SARS-CoV 3CLpro. The peptide was dissolved in distilled water and incubated with each protease. A SpectraMax i3x Multi-mode microplate reader (Molecular Devices) was used to measure spectral-based fluorescence. The proteolytic activity was determined at 310 K by following the increase in fluorescence (λexcitation = 340 nm, λemission = 490 nm, bandwidths = 9, 15 nm, respectively) of EDANS upon peptide hydrolysis as a function of time. Assays were conducted in black 96-well plates (Nunc) in 300 μl assay buffers containing protease and substrate as follows. For the SARS-CoV 3CLpro assay, 4.05 μl of 0.074 mM protease containing 50 mM Tris pH 6.5 was incubated with 7.5 μl of 0.1 mM substrate at 310 K for 2 h before measuring Relative Fluorescence Units (RFU). Before the assay, the emission spectra of the 64 flavonoids were surveyed after illumination at 340 nm to rule out overlap with the emission spectrum of EDANS. Every compound was suitable for testing. The final concentrations of the protease, peptide and chemical in the assay were 1, 2.5 and 20 μM, respectively. First, the SARS-CoV 3CLpro and chemical were mixed and pre-incubated at room temperature for 1 h. The reaction was initiated by the addition of the substrate and each well was incubated at 310 K for 16 h. We then measured the fluorescence of the mixture on the black 96-well plate using the endpoint mode of the SpectraMax i3x, with the excitation wavelength fixed at 340 nm and the emission wavelength set to 490 nm, using 9 and 15 nm bandwidths, respectively. All reactions were carried out in triplicate.
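The stock volumes and the final concentrations quoted above can be cross-checked with the standard dilution relation C_final = C_stock × V_stock / V_total. The sketch below assumes that the total well volume is the 300 μl stated for the assay buffer; the function name is hypothetical.

```python
WELL_VOLUME_UL = 300.0  # assumed total assay volume per well (from the text)

def final_conc_uM(stock_mM: float, stock_volume_ul: float,
                  total_volume_ul: float = WELL_VOLUME_UL) -> float:
    """Final concentration in micromolar after diluting a stock into the well."""
    return stock_mM * 1000.0 * stock_volume_ul / total_volume_ul

protease_uM = final_conc_uM(0.074, 4.05)   # 4.05 ul of 0.074 mM protease
substrate_uM = final_conc_uM(0.1, 7.5)     # 7.5 ul of 0.1 mM substrate

print(f"protease:  {protease_uM:.3f} uM")
print(f"substrate: {substrate_uM:.3f} uM")
```

Under this assumption the volumes give about 1 μM protease and 2.5 μM substrate, in agreement with the final concentrations reported for the assay.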
Among the first 64 flavonoids (Supplementary Table 1), three were selected for further assay over a concentration range of 2–320 μM. The IC50 value, defined as the inhibitor concentration causing 50% inhibition of the catalytic activity of the SARS-CoV 3CLpro, was calculated by nonlinear regression analysis using GraphPad Prism 7.03 (GraphPad Software, San Diego, CA, USA).
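For readers unfamiliar with dose-response fitting, the following is a minimal sketch of how an IC50 can be estimated from percent-inhibition data. It is not the GraphPad Prism procedure used in this study: it substitutes a two-parameter Hill model with a simple grid-search least-squares fit, and the data points are synthetic values generated from a hypothetical IC50 of 30 μM and Hill slope of 1, not measurements from this work.

```python
def hill_inhibition(conc_uM: float, ic50_uM: float, hill: float) -> float:
    """Percent inhibition predicted by a two-parameter Hill model (0 to 100%)."""
    return 100.0 / (1.0 + (ic50_uM / conc_uM) ** hill)

def fit_ic50(concs_uM, inhibitions):
    """Grid-search least-squares fit; returns (ic50_uM, hill)."""
    best = (float("inf"), None, None)
    for i in range(301):                 # log10(IC50) scanned from 0.00 to 3.00
        ic50 = 10 ** (i / 100)
        for j in range(51):              # Hill slope scanned from 0.50 to 3.00
            hill = 0.5 + 0.05 * j
            sse = sum((hill_inhibition(c, ic50, hill) - y) ** 2
                      for c, y in zip(concs_uM, inhibitions))
            if sse < best[0]:
                best = (sse, ic50, hill)
    return best[1], best[2]

# Synthetic, noise-free data over the 2-320 uM range (hypothetical IC50 = 30 uM).
concs = [2, 5, 10, 20, 40, 80, 160, 320]
synthetic = [hill_inhibition(c, 30.0, 1.0) for c in concs]

ic50_fit, hill_fit = fit_ic50(concs, synthetic)
print(f"fitted IC50 = {ic50_fit:.1f} uM, Hill slope = {hill_fit:.2f}")
```

A dedicated regression package would normally also fit the top and bottom plateaus and report confidence intervals; the grid search here only conveys the idea of choosing the curve that minimises the squared residuals.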
FRET protease assays with the SARS-CoV 3CLpro in the presence of Triton X-100
The proteolytic assay using the SARS-CoV 3CLpro was also performed in the presence of Triton X-100 to identify artefactual inhibition caused by non-specific binding of chemicals to the protease through aggregation or complexation. Triton X-100 was used at a concentration of 0.01%.
Absorption spectroscopic studies based on tryptophans of the SARS-CoV 3CLpro
To confirm the feasibility of the assay method independently, the fluorescence spectra from the tryptophan of the SARS-CoV 3CLpro in the presence of candidate inhibitors were investigated20. The fluorescence measurements were recorded with a SpectraMax i3x Multi-mode microplate reader (Molecular Devices) at an excitation wavelength of 290 nm and emission wavelengths of 300–500 nm. The optimal excitation and emission wavelengths were determined by SoftMax Pro. The single tryptophan of the SARS-CoV 3CLpro showed a fluorescence emission with a peak at 350 nm after excitation at 290 nm. In contrast, the flavonoids were almost non-fluorescent under the same experimental conditions. Each chemical (40 μM) was incubated with 1 μM SARS-CoV 3CLpro for 1 h and the fluorescence intensity of the mixture was measured.
Ligand preparation, target preparation and induced-fit docking
All docking and scoring calculations were performed using the Schrödinger software suite (Maestro, version 11.8.012). The compounds were extracted from the PubChem database in SDF format and combined in one file. The file was then imported into Maestro and prepared for docking using LigPrep. The atomic coordinates of the crystal structure of SARS-CoV 3CLpro (4WY3) were retrieved from the Protein Data Bank and prepared with the Protein Preparation Wizard by removing all solvent and adding hydrogens, followed by minimal minimisation in the presence of the bound ligand. Ioniser was used to generate the ionised states of all compounds at the target pH of 7 ± 2. These prepared low-energy conformers of the ligands were taken as the input for induced-fit docking. The induced-fit docking protocol21 was run from the graphical user interface accessible within Maestro 11.8.012. Receptor sampling and refinement were performed on residues within 5 Å of each ligand for each of the ligand–protein complexes. With Prime22, a side-chain sampling and prediction module, the side chains as well as the backbone of the target protein were energy minimised. A set of induced-fit receptor conformations was generated for each of the ligands. Re-docking of the test ligands was performed into their respective structures that were within 30 kcal/mol of their lowest-energy structure. Finally, the ligand poses were scored using a combination of the Prime and GlideScore scoring functions23.
The yield of cells harvested for purification of the catalytic domain of SARS-CoV 3CLpro was 2.69 g per 1000 ml of E. coli culture. The amount of purified, tag-free protein was 16 mg. For storage and assay, the protein solution was concentrated to 16 mg ml−1. The concentrated solution was diluted to 1 μM for the inhibitory assays.
A flavonoid library consisting of 10 different scaffolds was also built (Figure 1). It contains five isoflavones, one isoflavane, 17 flavones, 11 flavonols, seven flavanols, seven flavanones, four flavanonols, one prenylflavonoid, nine chalcones and two unclassified flavonoids (Supplementary Table 1). We applied the library to the SARS-CoV 3CLpro assay. The inhibitory effect of each of the 64 flavonoids was tested at 20 μM. Among them, herbacetin (3,4′,5,7,8-pentahydroxyflavone), rhoifolin (apigenin-7-O-rhamnoglucoside) and pectolinarin (5,7-dihydroxy 4′,6-dimethoxyflavone 7-rutinoside) were found to have prominent inhibitory activity (Figure 2). The binding affinity data were plotted as log inhibitor concentration versus percent fluorescence inhibition (Figure 2). The three compounds markedly reduced the fluorescence intensity, reflecting their SARS-CoV 3CLpro inhibitory activity. The IC50 values were calculated from the dose-dependent inhibitory curves of herbacetin, rhoifolin and pectolinarin. The measured values were 33.17, 27.45 and 37.78 μM, respectively. Since flavonoids are known to aggregate through complexation and thus non-specifically inhibit various proteases, the assay was also performed in the presence of Triton X-10024. Before the examination, we tested the effect of Triton X-100 on the catalytic activity of the SARS-CoV 3CLpro. As shown in Supplementary Figure 1, only a slight increase in catalytic activity was observed up to 0.1% Triton X-100. Therefore, the assay was performed at a concentration of 0.01% Triton X-100, at which no significant interference was detected.
Figure 1. The basic skeleton structures of flavonoids and their scaffolds. Basic representative structures of the most common flavonoids classified in this study were drawn with rings and numbered positions.
Figure 2. Results from the FRET method. Each data point represents the effect of each inhibitory compound against SARS-CoV 3CLpro compared to the control. The RFU are plotted against the log-concentration of inhibitory compounds. Each dot is expressed as the mean ± standard error of the mean (n = 3). RFU: Relative Fluorescence Units.
To independently confirm the inhibitory activity of the flavonoids, a general tryptophan-based assay method was employed. Tryptophan is well known to fluoresce. Therefore, if a tryptophan is suitably positioned in a protein, the change in its fluorescence intensity can reflect the binding state of chemicals and be used to judge the interaction between the protein and chemicals. The SARS-CoV 3CLpro contains one tryptophan residue (Trp31) in the catalytic domain. Its backbone lines the entrance of the active site. Therefore, a change in its fluorescence can reflect environmental variation of the active site. The catalytic domain of SARS-CoV 3CLpro used in this study displays a fluorescence peak at 340 nm after tryptophan excitation at 290 nm. We monitored the change in fluorescence intensity in the presence and absence of each flavonoid. Since each compound in the flavonoid library was almost non-fluorescent under the experimental conditions, a change in fluorescence intensity reflects interactions between the catalytic domain and the chemicals. Intriguingly, the three inhibitory compounds markedly reduced the fluorescence intensity of the catalytic domain compared with the others (Figure 3). The decreased emission intensity is consistent with complex formation between the catalytic domain and the inhibitory compounds.
Figure 3. Fluorescence quenching spectra of SARS-CoV 3CLpro. A solution containing 1 μM SARS-CoV 3CLpro showed a strong fluorescence emission (solid line) with a peak at 340 nm at the excitation wavelength of 290 nm. Fluorescence quenching spectra were obtained after adding each inhibitory compound at 40 μM: herbacetin (dashed line), rhoifolin (dash-dotted line) and pectolinarin (dotted line).
To deduce the binding modes of the inhibitory flavonoids at the molecular level, an in-depth theoretical investigation through an induced-fit docking study was carried out. The interactions between SARS-CoV 3CLpro and the three inhibitory flavonoids were analysed to predict their binding affinities. The top-ranked structures (according to the Glide gscores) from the induced-fit docking results for herbacetin (–9.263), rhoifolin (–9.565) and pectolinarin (–8.054) were selected and hypothesised to be the biological complexes. To identify the structural factors underlying the good affinity of herbacetin, the poses of kaempferol (–8.526) and morin (–8.930), its two closest homologues, were also calculated and compared. The predicted complex structures and their 2D schematic representations are illustrated in Figure 4.
Figure 4. Predicted complexes of flavonoids in the catalytic site of SARS-CoV 3CLpro. Docking poses of (A) herbacetin, kaempferol and morin and (B) rhoifolin and pectolinarin were depicted on the electrostatic surface potential of SARS-CoV 3CLpro (red, negative; blue, positive; white, uncharged). Flavonoids were predicted to occupy the active site of SARS-CoV 3CLpro. The 2D schematic representations of the interactions of five flavonoids were also drawn. Figures were created with Maestro v11.5.011. S1 represents the polar S1 site of SARS-CoV 3CLpro, S2 for the hydrophobic S2 site, and the S3′ site with no strong tendency. The pink arrows represent hydrogen bond interaction.
Flavonoids are an important class of natural products. In particular, they are plant secondary metabolites with a polyphenolic structure and are widely found in fruits and vegetables. They have diverse biochemical and antioxidant effects associated with various diseases such as cancer, Alzheimer's disease and atherosclerosis25–27. These effects are attributed to their antioxidant, anti-inflammatory, anti-mutagenic and anti-carcinogenic properties, combined with the ability to modulate major cellular enzyme functions15. Intriguingly, some flavonoids also have antiviral activity17. Specifically, apigenin, luteolin, quercetin, amentoflavone28, quercetin, daidzein, puerarin, epigallocatechin, epigallocatechin gallate, gallocatechin gallate29 and kaempferol30 were reported to inhibit the proteolytic activity of SARS-CoV 3CLpro. Therefore, the antiviral effect is presumed to be directly linked to suppression of SARS-CoV 3CLpro activity in some cases. However, a systematic analysis using various flavonoid scaffolds has not been reported. To find the best scaffold for inhibiting SARS-CoV 3CLpro, an assay with various flavonoid derivatives classified into 10 scaffolds was designed and performed. Among them, the best compounds were one flavonol, herbacetin, and two flavones, rhoifolin and pectolinarin. In the current assay method, which includes 0.01% Triton X-100, the three flavonoids showed better inhibitory activity than the previously reported flavonoids mentioned above28–30. The aggregating tendency of flavonoids frequently leads to false bioassay results, which can be avoided by adding 0.01% Triton X-10024. It is worth noting that amentoflavone, reported as the most effective flavonoid inhibitor of SARS-CoV 3CLpro28, did not function effectively in the presence of 0.01% Triton X-100 (Figure 5).
Figure 5. The effect of Triton X-100 on flavonoids. Each pair of bars represents the inhibitory activity of a compound with and without 0.01% Triton X-100. The first bar (shaded) represents the control. Inhibitory compounds were used at a concentration of 40 μM. Each bar is expressed as the mean ± standard error of the mean (n = 3). RFU: Relative Fluorescence Units.
To elucidate the relationship between binding mode and binding affinity, a docking study of the herbacetin homologues kaempferol and morin was also performed. The glide scores of herbacetin, kaempferol and morin were –9.263, –8.526 and –8.930, respectively. The trend of the glide scores matches the binding affinity of the compounds. Comparison of the predicted binding modes of the three complex structures revealed critical factors governing their binding affinity. The three compounds share the kaempferol motif. As shown in Figure 4, the phenyl moiety of kaempferol occupies the S1 site of SARS-CoV 3CLpro through a hydrogen bond with Glu166. In contrast, the chromen-4-one scaffold of kaempferol is located in the S2 site. In herbacetin, a total of four hydrogen bonds are formed within a distance of 2.33 Å in the S2 site. In particular, binding is driven mainly by the additional 8-hydroxyl group, which plays a critical role in binding with Glu166 and Gln189. These interactions are predicted to confer the good glide score of herbacetin. In contrast, the binding modes of morin and kaempferol differ because the lack of the 8-hydroxyl group removes these two hydrogen bonds (Figure 4). The absence of this group also abolishes the hydrogen bond formed by the 5-hydroxyl group of the chromen-4-one scaffold with Asp187 observed in herbacetin. Although a new hydrogen bond is formed through the 3-hydroxyl group, it may not be enough to compensate for the loss of the binding capacity provided by the 8-hydroxyl group. This analysis shows the importance of the 8-hydroxyl group for the strong binding affinity of herbacetin.
The glide scores of rhoifolin and pectolinarin were –9.565 and –8.054, respectively. Both belong to the flavone family. Interestingly, their binding modes are different from those of the above three flavonols. Rhoifolin and pectolinarin possess α-l-rhamnopyranosyl β-d-glucopyranoside and l-mannopyranosyl β-d-glucopyranoside, respectively. These additional bulky carbohydrate groups are attached to the 7-position of the chromen-4-one scaffold. As a result, the hydrogen bond formed between the 7-hydroxyl group and the backbone of Ile188 is abolished. In addition, the bulky groups require a large space to reside in. Therefore, the carbohydrate groups occupy the S1 and S2 sites and the chromen-4-one moieties are located in the S2 and S3′ sites, unlike in the above three flavonols. The better affinity of rhoifolin may be due to orchestrated binding through the S1, S2 and S3′ sites.
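As an illustration of how the docking scores relate to the measured inhibition, the sketch below compares the rank order of the three inhibitory flavonoids by IC50 and by glide score, using only values reported in this paper. The dictionary layout and variable names are hypothetical, and with three data points the comparison is a consistency check, not statistical evidence of a correlation.

```python
# IC50 values (uM) and glide scores as reported in this paper.
compounds = {
    "herbacetin":   {"ic50_uM": 33.17, "glide": -9.263},
    "rhoifolin":    {"ic50_uM": 27.45, "glide": -9.565},
    "pectolinarin": {"ic50_uM": 37.78, "glide": -8.054},
}

# Lower IC50 means stronger inhibition; more negative glide score means better predicted binding.
by_ic50 = sorted(compounds, key=lambda name: compounds[name]["ic50_uM"])
by_glide = sorted(compounds, key=lambda name: compounds[name]["glide"])

print("ranked by IC50:       ", by_ic50)
print("ranked by glide score:", by_glide)
print("same order:", by_ic50 == by_glide)
```

For these three compounds the two orderings coincide, with rhoifolin ranked first and pectolinarin last by both measures.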
The assay and docking results point to three conclusions. First, flavonoids have a wide range of binding affinities for SARS-CoV 3CLpro owing to their hydrophobic aromatic rings and hydrophilic hydroxyl groups. Second, the presence of carbohydrate groups strongly influences the binding affinity and binding mode of the chromen-4-one moiety. Third, Triton X-100 is critical for reducing false positives and overestimates. Based on these results, new strategies against SARS that target SARS-CoV 3CLpro can be designed. A further study is under way to make derivatives that lead to better inhibitory compounds based on this study.
Inhibition of SARS-CoV 3CL protease by flavonoids
Journal of Enzyme Inhibition and Medicinal Chemistry, Volume 35, 2020, Issue 1, Pages 145-151
The authors declare no conflict of interest.