Spatial distribution of the different strains of the distinct coconut lethal yellowing-type phytoplasma species associated with the syndrome in Tanzania

Phytoplasmas are associated with the lethal disease of coconut palms in Tanzania (LDT). It is a destructive lethal yellowing-type syndrome (LYTS) that causes different levels of losses in the southern and northern districts. To explain these differences, the existence of variable pathogenic strains of the LDT phytoplasma was investigated using ribosomal RNA gene PCR primers. A total of 84 samples were collected from 67 palms in 14 coastal districts of Tanzania, including the low, moderate, and high incidence areas. Of these, 38 samples were studied in detail. Detected phytoplasma rDNA was characterized by sequencing of the PCR products and/or by restriction fragment length polymorphism (RFLP) analysis. Sequence analysis of the P1/P7-primed PCR products revealed several variable positions, making it possible to distinguish two main geographical clusters. The northern cluster included samples from Tanga region only and was associated with low/moderate disease incidence. A second, larger cluster included samples from the rest of the coastline between Bagamoyo and Mtwara. Five genotypes could be identified based on mutations/deletions in the P1/P7 PCR product, two within the northern cluster and three within the southern cluster. The geographical distribution of the two clusters and the genotypes could be related to the history of coconut introductions in Tanzania. The sequences obtained also confirm that the phytoplasmas associated with LDT are significantly different from all of the other phytoplasmas associated with coconut lethal yellowing-type syndromes worldwide, and it is proposed that these phytoplasmas should be classified into their own 16Sr group.


Introduction
Coconut palm (Cocos nucifera L.) is the most important perennial oil crop that supports the livelihood of most farmers in the coastal areas of Tanzania. Currently, 25 million palms are grown on 262,000 ha along the coastal belt of mainland Tanzania and the islands of Zanzibar, Pemba, and Mafia, where more than 95% of palms are cultivated by smallholder farmers (Kullaya and Mpunami 2008). The crop is, however, threatened by a destructive lethal yellowing-type syndrome (LYTS) known as "lethal disease" or "lethal disease Tanzania" (LD or LDT) or even "Tanzanian lethal decline" (Fig. 1). The symptoms have been described previously (Steiner 1978; Schuiling et al. 1981). Since 1965, LDT has killed more than nine million palms or 40% of the groves. Phytoplasmas are consistently associated with the disease (Schuiling et al. 1981; Nienhaus et al. 1982; Mpunami et al. 1999). LDT was first reported in the country in 1905 near Bagamoyo, then in 1912 in the south, close to the Mozambique border, and at Kunduchi, close to Dar-es-Salaam. However, the most serious outbreaks occurred independently in the 1940s in the districts of Kilwa and Rufiji (south of the Pwani region) (Fig. 2), and in the 2000s in Mkuranga (center of Pwani).
The epidemiology of LDT differs profoundly between affected areas. In the northern districts of Tanga region and Bagamoyo district, the incidence of disease was low to moderate in the early 2000s (average losses of 0.6% per year). However, in the southern coastal districts of Kilwa, Rufiji, and Mkuranga, the disease was very severe and losses to LDT were over four times higher than in the northern districts (Mpunami et al. 2008). These differences have been presumed to be related to the diversity in the coconut germplasm planted in these different regions.
PCR assays using primer pairs based on the 16S rRNA gene, and later on the 16S-23S intergenic spacer region, have been developed and used extensively for detection of LDT phytoplasmas in both plant and insect host tissues (Rohde et al. 1993; Tymon et al. 1998; Mpunami et al. 1999, 2000). Molecular studies based on the ribosomal operon have demonstrated considerable diversity among the phytoplasmas associated with the various LYTS in the world (Harrison et al. 1994; Tymon et al. 1998; Harrison et al. 2002, 2008; Marinho et al. 2008; Martinez et al. 2008; Dollet et al. 2009; Harrison et al. 2014; Pilet et al. 2019). Harrison et al. (2014) noted that the 16SrIV phytoplasmas of the Americas should be classified as "Candidatus Phytoplasma palmae," the 16SrXXII-A type of Nigeria and Mozambique as "Ca. Phytoplasma palmicola," and the 16SrXXII-B type found in Ghana and Côte d'Ivoire as "Ca. Phytoplasma palmicola"-related. They also noted that the LDT phytoplasma from Tanzania, which has historically been classified as 16SrIV-C, is distinct from both of the other two subclades, and it had been informally proposed that these subclades represented three separate candidate species of "Ca. Phytoplasma." In the same article, a "provisional candidate species" name was proposed for the phytoplasma associated with the LYTS in Tanzania: "Ca. Phytoplasma cocostanzaniae." However, despite the differences in sequences and epidemiology, these three distinct subclades are often mistakenly lumped together and referred to as lethal yellowing disease (originally, "lethal yellowing" is the name used for the disease in the Americas) rather than as distinct "lethal yellowing-type syndromes" (LYTS).
In the present study, phytoplasmas associated with 67 LDT-diseased coconut palms in 14 districts throughout coastal Tanzania were analyzed in order to determine whether differences in disease incidence were associated with genetic variability among phytoplasma populations. We report the detection of five different genotypes/strains whose distribution parallels the variability in the spread of the disease. In addition, the 16S rRNA gene sequences obtained have been used to re-emphasize that the LDT phytoplasmas are different from those associated with LY of the Americas (16SrIV) and from "Ca. Phytoplasma palmicola" (16SrXXII) from West Africa and Mozambique.
Fig. 1 Symptoms of lethal decline, a lethal yellowing-type syndrome of coconut in Tanzania (Michel Dollet)

Plant material
Eighty-four samples of immature inflorescences and/or leaflets of spear leaves from 67 East African Tall (EAT) coconut palms displaying typical LDT symptoms were collected. Palms for sampling were selected from farmers' fields in different villages within the 14 coastal districts that were affected by LDT. The districts sampled included Mkinga and Tanga in the north (low incidence areas); Pangani, Bagamoyo, Kinondoni, Temeke, and Mkuranga in north and central Tanzania (moderate incidence); and Rufiji, Kilwa, and Lindi in the south (high incidence areas) (Fig. 2). The collection also included the districts of Mtwara, Tandahimba, and Newala, further south, in an area of previously low incidence in which the incidence of disease increased in the early 2000s. Only samples from diseased trees that gave a strong electrophoresis band usable for direct sequencing or RFLP are listed (Table 1).
Leaflets were excised from the immature spear leaves and immediately placed in plastic bags for transportation to the laboratory at Mikocheni Agricultural Research Institute (MARI) in Dar-es-Salaam. For inflorescences, whole unopened inflorescences were excised from affected palms and transported to the laboratory. On arrival, the inflorescences were split open, and flowers were removed from the rachillas. The rachillas were cut off and crushed for DNA extraction, or frozen at −20°C until DNA was extracted. Similarly, the leaflets were immediately crushed for DNA extraction, or frozen at −20°C.

DNA extraction
DNA was extracted from the spear leaves or rachillas using a modification of the CTAB extraction protocol of Doyle and Doyle (1990). Five grams of freshly harvested or frozen tissue were used. Nucleic acid pellets were resuspended in 1 ml of 1× TE buffer, pH 8.0.

PCR analysis
Two different primer combinations were used to amplify phytoplasma rDNA from LDT-affected palm tissues. First, part of the 16S rRNA gene was amplified using the primer pair Rohde forward and Rohde reverse (Rohde et al. 1993). The second primer pair was the phytoplasma universal primer pair P1/P7 (Deng and Hiruki 1991; Smart et al. 1996), which covers most of the 16S rRNA gene, the 16S-23S rRNA intergenic region, and the 5′ end of the 23S rRNA gene (Fig. 3).
For each PCR, a 25 μL reaction mixture contained about 50 ng template DNA, 150 mM mixed deoxynucleotide triphosphates (dNTP), 50 ng of each primer, 1 unit of Taq polymerase, and PCR buffer 1× (PCR Master Kit, Qiagen). For the Rohde's primer pair, the mixture was subjected to 36 cycles (Gene Amp PCR System 9700, Applied Biosystems) with the following parameters: 94°C for 60 s (after a first denaturation of 60 s), 57°C for 80 s, 72°C for 130 s, followed by a final extension of 72°C for 5 min.
The PCR parameters for the P1/P7 primer pair were as follows: 94°C for 90 s followed by 35 cycles of 94°C for …
PCR products were analyzed by electrophoresis through a 1% agarose gel and visualized in the gel by UV transillumination after staining with ethidium bromide.

DNA sequencing and sequence analysis
Twelve PCR products using Rohde's primers and 15 samples amplified using P1/P7 were sequenced (Beckman Coulter Genomics). Rohde's PCR products were sequenced using Rohde forward primer only. For the P1/P7 PCR, the two strands were sequenced using both P1 and P7. Accession numbers of the sequenced products are listed in Table 1.
Sequences were edited and aligned along with those of LDT phytoplasma accessions X80117 and EU168773, and sequences of phytoplasmas from other 16Sr groups obtained from GenBank. Phylogenetic analysis was performed in MEGA v. 6.06 (Tamura et al. 2013): sequences were aligned with ClustalW, and the phylogeny was reconstructed by neighbor-joining with maximum composite likelihood as the model, using the bootstrap method (1000 replications) as a test of phylogeny.
Based on the variable regions identified in the aligned sequences, putative endonuclease recognition sites and the relevant restriction enzymes that were likely to show differences between LDT samples were identified using the Vector NTI software (Invitrogen).

RFLP analysis of PCR products
Variability within the 16S rRNA gene amplified by primer pair P1/P7 was identified by digestion of the P1/P7 product with BstUI endonuclease, which was the only enzyme among the 17 commonly used for phytoplasmas (Lee et al. 1998) able to differentiate our samples. Six microliters (6 μL) of the P1/P7 PCR product (for selected samples) was digested with 5 U of BstUI (New England Biolabs) at 60°C overnight in 10 μL reaction volumes. Digests were separated on 2% agarose gels in 1× TAE buffer, stained with ethidium bromide, and visualized under UV light.
In silico digestions of the F2nR2 region with 17 restriction enzymes were performed using pDRAW32 software (AcaClone Software, http://www.acaclone.com) as previously described (Wei et al. 2007, 2008). Similarity coefficients were derived from the relationship F = 2Nxy/(Nx + Ny) (Nei and Li 1979), in which x and y are the two strains under investigation, Nx and Ny are the total numbers of fragments resulting from digestion by the 17 enzymes in strains x and y, respectively, and Nxy is the number of fragments shared by the two strains.
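The similarity coefficient defined above can be checked with a few lines of code. The following is a minimal sketch, not the software used in the study; the function name is hypothetical, and the band counts are those reported in the Results of this study for the two RFLP groups (46 and 45 fragments, 44 of them shared).

```python
def nei_li_similarity(n_x: int, n_y: int, n_xy: int) -> float:
    """Nei and Li (1979) coefficient F = 2*Nxy / (Nx + Ny).

    n_x, n_y: total fragment counts for strains x and y.
    n_xy: number of fragments shared by both strains.
    """
    if n_x + n_y == 0:
        raise ValueError("at least one strain must have fragments")
    if n_xy > min(n_x, n_y):
        raise ValueError("shared fragments cannot exceed either total")
    return 2 * n_xy / (n_x + n_y)


# Band counts reported for the RFLP-A (46) and RFLP-B (45) groups, 44 shared.
f_ab = nei_li_similarity(46, 45, 44)
print(round(f_ab, 3))  # 0.967
```

With these counts the coefficient is 88/91, which rounds to the value of 0.967 given in the Results.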

Results
Detection of LDT phytoplasma using the primer pair Rohde forward/Rohde reverse
The Rohde forward/Rohde reverse primer pair, designed to detect the LDT phytoplasma, was used to analyze the DNA extracted from all 84 collected palm samples. A 450-bp band was resolved by agarose gel electrophoresis from the positive control LDT DNA and from all but three of the 84 samples. One of the three palms that did not amplify a product with this LDT-specific primer pair was probably suffering from drought rather than LDT.

Detection of LDT phytoplasma using the phytoplasma universal primer pair P1/P7
In order to study a larger fragment of the phytoplasma 16S rDNA gene and the 16S-23S rDNA intergenic region, DNA from 38 samples representative of the low, medium, and high disease incidence areas was amplified using the phytoplasma universal primer pair P1/P7. A 1.8-kb PCR product was amplified from all 38 samples and from the clover phyllody DNA (positive control), but not from the water control (Fig. 4).

Analysis of genetic variation between isolates by sequencing
Amplification products obtained with the Rohde primer pair and with P1/P7 were both used to search for genetic variation between the different isolates.
Products obtained with Rohde's primers from 12 EAT coconuts were sequenced. The alignment of the 12 sequences did not reveal any variability throughout the entire 500 bp of the sequenced regions, except for the isolate Tanz 08-06, which showed one mutation (R1) (Fig. 3) at position 1002 (Table 2). The first position has been arbitrarily defined as the first base of the P1/P7 theoretical product (A₁A₂G₃A₄G₅T₆T₇T₈G₉A₁₀…).
Further analysis was done on a selection of PCR products amplified from 17 geographically representative isolates using the primer pair P1/P7. Sequence alignment revealed a number of mutations and deletions that grouped the LDT samples into five genotypes, named TZ-I to TZ-V (Table 2). The first genotype (TZ-I) differs from the second one (TZ-II), which is characterized by a double mutation (M1:451; M2:460) inside the 16S rDNA corresponding to dimers in the secondary structure of the 16S rRNA (Table 2). A third mutation (M3:1242) was found inside the 16S rDNA and defined the genotype TZ-III. This mutation occurred in the middle of the sequence of the Rohde reverse primer. No mutation was observed inside the Ile tDNA. Two deletions were found. The first one (M4:1583) was located inside the internal transcribed spacer region (ITS1) and defined the TZ-IV genotype, whereas the second one (M5:1698-1699) corresponds to a double deletion localized in the ITS2 and separated out genotype TZ-V.
The sequence X80117, which was derived from a sample collected in a 1993 survey from the Chambezi/Kifumangao region close to Bagamoyo and covers a large part of the 16S rDNA gene, showed 100% identity with the samples of genotypes TZ-III, TZ-IV, and TZ-V. The sequence EU168773 (Hodgetts et al. 2008), also derived from a sample from the same 1993 survey but covering only the 16S-23S ISR, matched genotypes TZ-I to TZ-III at 100% (Table 2).
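The comparison of aligned sequences described above can be illustrated with a short script that lists the alignment columns at which isolates differ. This is only a sketch under stated assumptions: the function name, the isolate labels, and the 10-base toy sequences are all hypothetical and are not the LDT sequences or the alignment software used in the study. Positions are numbered from 1, following the convention used here for the P1/P7 product.

```python
def variable_positions(aligned: dict[str, str]) -> dict[int, dict[str, str]]:
    """Return {1-based position: {isolate: base}} for every alignment
    column in which the isolates do not all carry the same character.
    A '-' character stands for a deletion relative to the other sequences."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = {}
    for i in range(lengths.pop()):
        column = {name: seq[i].upper() for name, seq in aligned.items()}
        if len(set(column.values())) > 1:
            result[i + 1] = column
    return result


# Hypothetical toy alignment (not real LDT data).
toy_alignment = {
    "isolate_a": "AAGAGTTTGA",
    "isolate_b": "AAGAGTCTGA",  # substitution at position 7
    "isolate_c": "AAGAG-TTGA",  # deletion at position 6
}

for position, column in variable_positions(toy_alignment).items():
    print(position, column)
```

Grouping isolates by the combination of bases they carry at the variable positions is then a matter of comparing these columns, which is how a table of genotypes versus mutations can be assembled.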
One last mutation could be observed at the position 1759. However, this mutation was not considered because of the low number of sequences workable at this position.
Although grouped as genotype TZ-III, the isolate Tanz 08-78 is considered a variant because it carries the same mutation (R1) as observed for the isolate Tanz 08-06 in the Rohde primer sequences. According to this classification, the samples with accessions X80117 and EU168773 probably belong to the genotype TZ-III and indicate that this sequence type was present in the central region in the early 1990s.
One isolate (Tanz 08-18) presented ambiguous results. The sequence chromatogram displayed the mutations M1 and M2, but for each of those positions, a double signal of equal intensity was shown. At the position 457 (M1), both G and A are present whereas both T and C can be observed at position 460 (M2).
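Double signals of this kind are conventionally recorded with IUPAC ambiguity codes when a consensus sequence is written down. The sketch below, with a hypothetical function name, shows the mapping for two-base mixtures; it is offered as an illustration of the notation, not as a description of how the sequences in this study were edited.

```python
# IUPAC codes for two-base mixtures at a single position.
IUPAC_TWO_BASE = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def ambiguity_code(bases: str) -> str:
    """Return the IUPAC symbol for one base or a mixture of two bases."""
    observed = frozenset(bases.upper())
    if len(observed) == 1:
        return next(iter(observed))
    return IUPAC_TWO_BASE[observed]


print(ambiguity_code("GA"))  # R: both G and A present, as at M1
print(ambiguity_code("TC"))  # Y: both T and C present, as at M2
```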
Because the two sequences of the sample Tanz 08-74 were not contiguous, we were not able to assemble them into a single sequence. The two sequences have been submitted for accession numbers independently. However, according to the mutations observed, this sample has been grouped as genotype TZ-III, the missing 56 bp corresponding to an invariant sequence.
Mapping of the different genotypes revealed a gradient from the north to the south of Tanzania (Fig. 2). The genotypes TZ-I and TZ-II were present in the northern districts of Mkinga, Tanga, and Pangani only. The genotype TZ-III was observed in the Pwani (Coast) region and in the north of the Lindi region (Kilwa district). The genotype TZ-IV was present in the center and the south of the Lindi region (districts of Kilwa and Lindi). The last genotype, TZ-V, was observed exclusively in the Mtwara region.

Analysis of variation between isolates by restriction fragment length polymorphisms
To be able to evaluate more LDT isolates, the enzyme BstUI was used to digest the PCR products amplified from 38 LDT samples with P1/P7. Two patterns of restriction fragments were obtained (Fig. 5). The first pattern (RFLP-A) is characterized by one band of 816 bp whereas the pattern RFLP-B displays one band of 728 bp. The sizes of the three following bands are too close to be differentiated on agarose gel. The last band of 88 bp can be observed only for the pattern RFLP-B.
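The fragment sizes are consistent with one additional BstUI site in RFLP-B samples, since 728 + 88 = 816. A minimal in silico digest makes the arithmetic explicit. BstUI recognizes CGCG and cuts in the middle of the site (CG/CG), leaving blunt ends. Everything else in this sketch is an assumption made for illustration: the function name, the two toy sequences, and the choice to put the 88-bp piece at the 3′ end. They are not the LDT sequences.

```python
import re

BSTUI_SITE = "CGCG"  # BstUI recognition sequence
BSTUI_CUT_OFFSET = 2  # blunt cut between CG and CG


def digest(seq: str, site: str = BSTUI_SITE,
           offset: int = BSTUI_CUT_OFFSET) -> list[int]:
    """Return fragment lengths (5' to 3') of a linear sequence cut at every
    occurrence of `site`. The lookahead pattern also finds overlapping sites."""
    seq = seq.upper()
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Hypothetical 816-bp toy fragments differing by a single base.
without_site = "A" * 726 + "CACG" + "A" * 86  # no CGCG: stays intact
with_site = "A" * 726 + "CGCG" + "A" * 86     # one CGCG: cut once

print(digest(without_site))  # [816]
print(digest(with_site))     # [728, 88]
```

In this toy case a single base change creates the site and splits the 816-bp fragment into the 728-bp and 88-bp fragments, which is the kind of difference the two observed patterns would reflect.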
Of the 38 digested samples, 10 belong to the group RFLP-A (Table 1) and are categorized as either genotype TZ-I or TZ-II. Those 10 samples originated from the north of the country and correspond to the low to moderate disease incidence areas (Fig. 2). The remaining 28 samples fall in the RFLP-B pattern and correspond to the genotypes TZ-III, TZ-IV, and TZ-V. They originated from the regions of Pwani, Lindi, and Mtwara (Fig. 2). Virtual RFLP analysis with the 17 restriction enzymes revealed differences only with the BstUI enzyme, which gave two patterns. A total of 45 bands were generated for the isolates of the RFLP-B group, and 46 for the RFLP-A group, with 44 bands being common to both groups, giving a similarity coefficient F = 0.967.

Phylogenetic relationship of LDT to other coconut lethal yellowing-type syndrome phytoplasmas
To confirm the relationship of the current LDT strains to historic strains of LDT and to the LYTS phytoplasmas from the Americas and Africa, a phylogenetic tree was constructed using the 16S rRNA sequences for the 15 samples from this study for which P1/P7 sequences were obtained (Table 1), together with sequences retrieved from GenBank. The phylogenetic tree (Fig. 6) confirms the position of the LDT subclade as being distinct from the 16SrIV subgroups A, B, D, E, and F subclade of the Americas, and from the "Ca. Phytoplasma palmicola" 16SrXXII subgroups A and B subclade from West Africa and Mozambique.

Discussion
This study is the first to show a regional distribution of different LDT phytoplasma genotypes. We demonstrated that the different LDT phytoplasma genotypes were not randomly distributed, but rather localized in spatially distributed ecological niches. The results, therefore, confirmed the hypothesis about local adaptation of the LDT pathogen to existing environmental conditions including the host population. The existence of different strains of the LDT phytoplasma was one hypothesis to explain observed differences in disease incidence between various regions of Tanzania. However, genotype differences were not evident using Rohde's primers (Mpunami et al. 1999). For the first time, we report here a strong correlation between the different genotypes of the LDT phytoplasma and the different levels of disease incidence in Tanzania.
Sequence analysis of the 450-bp rDNA sequences from 12 representative LDT samples amplified with Rohde's primers revealed no variability and confirms the previous observation of Mpunami et al. (1999) on nine samples from different areas amplified with these primers. Only one sample, from Kilwa district, displayed a single base substitution. However, using the P1/P7-primed 1.8-kb PCR products from 15 representative LDT samples in the current study, the LDT samples could be divided into five phytoplasma genotypes.
This study has revealed one mutation (M3) inside the Rohde reverse primer. This mutation may have previously affected PCR assays and resulted in false negatives. Moreover, these primers are not able to distinguish the two subgroups present in Tanzania. For this, it would be necessary to define a new primer pair enclosing the M3 mutation to generate a large PCR product that can be used for restriction enzyme digestion analysis.
Direct sequencing of PCR products without a selective cloning step revealed two sequences for one of our samples (Tanz 08-18). This result could be due to the presence of two heterogeneous rRNA operons in the strain, as previously described for several phytoplasma strains (Harrison et al. 2004; Liefting et al. 1996; Jomantiene et al. 2002). However, the single 16S-23S sequence for the other samples and the presence of the genotypes TZ-I and TZ-II in the Tanga region (Fig. 2) may also suggest a mixed infection.
Two genotype clusters were identified using the BstUI restriction enzyme, the first (RFLP-A) grouping two genotypes, TZ-I and TZ-II, and the second (RFLP-B) grouping the genotypes TZ-III to TZ-V. Samples of the subgroup A are located in the north of Tanzania and are associated with low to moderate disease incidence. According to Schuiling and Harries (1994), the Tanga region is an area where coconuts were first introduced in Tanzania several centuries ago, probably from India. Furthermore, experience gained from years of field experimentation with coconut materials collected from these districts and other parts of the country has demonstrated that some populations of East African Tall palms from Tanga region are more tolerant to LDT even when planted in areas of high disease incidence (Kullaya et al. 1995; Mpunami et al. 2008). This suggests that some introductions of the EAT coconut in the area possess disease resistance genes, considering that the introductions were made gradually over an extended period of time and from different sources, hence the heterogeneous nature of the population.
The B subgroup samples occur in regions in which the incidence of disease ranges from moderate to very high. In most of these regions, coconuts were introduced fairly recently compared with Tanga region. In the area covered by the subgroup, it is possible to distinguish three regions with three different levels of disease incidence. In the Bagamoyo district (north of the subgroup), where LDT was first diagnosed in 1905, the coconut industry is comparatively older, and the incidence of disease is moderate. While no P1/P7 sequences were derived from samples from this region in the current study, the historic sample that was used to generate the LDT sequences X80117 and EU168773 in the GenBank database came from this area and appears to be of genotype TZ-III. The samples from the area between Bagamoyo in the Coastal region and Kilwa district in Lindi region have this same genotype TZ-III but are associated with a very high incidence of disease, which has caused complete devastation during the past 40 years. In the south, in the districts of Mkuranga, Rufiji, and Kilwa, where coconut growing was intensified more recently during the German colonial era (1885-1920) using uniform planting material from Mafia Island, extensive areas have been devastated by the disease. The genotype TZ-V from the southern regions of Lindi and Mtwara corresponds to areas where rapid disease spread is a recent phenomenon. Although LDT was first reported in the area of Mtwara nearly a century ago (1912), the disease did not spread in these districts until 1998, when it was first observed to be spreading rapidly.
Interestingly, there is a close correlation between the distribution of the LDT phytoplasma strains identified and the distribution of the EAT coconut genotypes found along the coast of Tanzania. In a study conducted by Kullaya et al. (2001), two main clusters of palm populations were revealed, with one cluster of sub-populations from Tanga region only (northern part of the coastal belt), and another cluster composed of subpopulations from Bagamoyo (central part) and the southern districts of Lindi and Mtwara regions. The studies thus confirmed that the EAT is genetically a very heterogeneous population, even if globally, the coconuts in Tanzania result from the introgression of germplasm from South-East Asia into a genetic background that originated from South Asia (Gunn et al. 2009).
In addition to the data on the separation of the LDT population into the A and B subgroups in this study, we have also re-analyzed the relationship of these phytoplasmas to the others that have been reported as being associated with the LYTS. In the Americas, phytoplasmas of group 16SrIV occur in coconut and several other palm species. In this group it is possible to distinguish at least five subgroups (A, B, D, E, and F), and the LY disease of coconut is mainly associated with subgroup 16SrIV-A. The phytoplasmas of subgroup 16SrIV-D have also been detected in coconut but are mainly detected in other palms such as Phoenix spp., in which they cause a decline called Texas Phoenix palm decline (TPPD). The LDT phytoplasma, which was previously classified as 16SrIV-C within this system, is clearly in a distinct subclade (Fig. 6), as has been suggested before (Tymon et al. 1998; Mpunami et al. 1999; Marinho et al. 2008; Harrison et al. 2014). We advocate that the designation 16SrIV-C should no longer be used for this phytoplasma, and that it should be given a new 16Sr group designation. The candidate species could be the one already suggested, "Ca. Phytoplasma cocostanzaniae." However, as the same phytoplasma occurs in diseased coconuts in Kenya and in the north of Mozambique, "cocostanzaniae" may be too restrictive. The 16SrXXII-A "Ca. Phytoplasma palmicola" and 16SrXXII-B "Ca. Phytoplasma palmicola"-related strains from West Africa and Mozambique clearly form a third distinct subclade, as previously reported by Harrison et al. (2014).
Additional phylogenetic studies on both the LDT phytoplasma and the coconut host from Tanzania, Kenya, and Mozambique, using more polymorphic genes and microsatellites to increase precision and involving more samples, could help to understand the relationship between the phytoplasmas, the different levels of resistance or tolerance to LDT, and their co-evolution in East Africa.