21 Nov Genetic diversity and genomic epidemiology of SARS-CoV-2 during the first 3 years of the pandemic in Morocco: comprehensive sequence analysis, including the unique lineage B.1.528 in Morocco
S. Djorwé, A. Malki, N. Nzoyikorera, J. Nyandwi, S. P. Zebsoubo, K. Bellamine and A. Bousfiha
Methods
Study design: This study was conducted and reported in accordance with the STROBE (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines [10].
Sequence data acquisition: The complete genomic sequences of SARS-CoV-2 isolates collected in Morocco from 2020 to 2023 were downloaded in FASTA format from the GISAID EpiCoV database (https://gisaid.org/, accessed on 23 January 2024) [11]. The genomic sequences of the Moroccan isolates were compared with the Wuhan-Hu-1 reference genome (GenBank accession number NC_045512.2).
File S1 contains the digital object identifier (DOI) and EPI_SET identifier of the 2274 SARS-CoV-2 genomic sequences used in this study. The collection dates range from 2 February 2020 to 3 November 2023.
Sequence alignment and phylogenetic analysis of Moroccan genomes: In this study, we used standard dynamic classification systems to assign genetic lineages and viral clades. Genomic sequences were classified into lineages using the Pangolin COVID-19 lineage assigner version 4.3, a tool for the Phylogenetic Assignment of Named Global Outbreak LINeages developed by the Centre for Genomic Pathogen Surveillance (https://cov-lineages.org/resources/pangolin.html, accessed on 23 January 2024), and/or the Nextstrain web tool version 3.3.1 (https://clades.nextstrain.org/, accessed on 23 January 2024) [12–14]. Quality checks and the assignment of viral clades were performed using Nextclade Web and GISAID. The phylogenetic tree was generated using the UCSC UShER web interface (https://genome.ucsc.edu/util.html, accessed on 25 January 2024), Microreact (https://microreact.org/upload, accessed on 3 February 2024), and Nextclade. Viral clades were defined based on shared mutation profiles among the analysed genomic sequences [14, 15].
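As an illustration of how lineage assignments can be tabulated once a classifier has been run, the following minimal Python sketch counts sequences per lineage from a Pangolin-style report. It assumes a CSV file with `taxon` and `lineage` columns, as in the report Pangolin writes; the sample rows and sequence names are invented for illustration and are not data from this study.

```python
import csv
import io
from collections import Counter

# Hypothetical report excerpt. The column names are assumed to match
# a Pangolin-style CSV report; the rows are made up for illustration.
SAMPLE_REPORT = """taxon,lineage
seq_A,B.1.528
seq_B,B.1.1.7
seq_C,B.1.528
seq_D,BA.2
"""


def count_lineages(handle):
    """Return a Counter mapping lineage name -> number of sequences."""
    reader = csv.DictReader(handle)
    return Counter(row["lineage"] for row in reader)


lineage_counts = count_lineages(io.StringIO(SAMPLE_REPORT))

# Print lineages from most to least frequent.
for lineage, n in lineage_counts.most_common():
    print(f"{lineage}\t{n}")
```

For a real report, `io.StringIO(SAMPLE_REPORT)` would be replaced by an open file handle, for example `open(path, newline="")`.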
Analysis of mutation profiles and assignment of lineages and clades: The GISAID database and the Nextclade and Coronapp (http://giorgilab.unibo.it/coronannotator/, accessed on 20 February 2024) [16] web tools were used to detect and annotate all mutations, thus establishing the single nucleotide polymorphism (SNP) profile of the 2274 genomic sequences. This was achieved by identifying amino acid substitutions, deletions, and insertions (indels) in the structural protein regions, as well as in some regions of the non-structural proteins encoded by ORF1ab (NSP1 to NSP16). Several international reference tools and platforms were used to assign SARS-CoV-2 genomic lineages and clades. The GISAID platform was used to assign lineages and clades. The Nextclade web tool was used to align sequences, to identify specific mutations in comparison with the Wuhan-Hu-1 reference sequence, to place lineages phylogenetically, and to assign sequences to lineages and clades according to their mutational characteristics. Pangolin, which is available both as a web application and as a command-line tool from the Cov-Lineages website, was used to assign SARS-CoV-2 lineages according to the PANGO nomenclature. Together, these approaches provided a detailed analysis of the 2274 genomic sequences, including their phylogenetic placement and assignment to clades [11, 13, 14].
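To make the idea of comparing a sequence with the reference concrete, the sketch below lists nucleotide substitutions between two pre-aligned sequences. It is a simplified illustration under stated assumptions, not the procedure implemented by Nextclade, Coronapp, or GISAID: the function name `call_substitutions` is hypothetical, the inputs are toy strings rather than SARS-CoV-2 genomes, positions are counted in reference coordinates, and gaps and ambiguous bases are skipped, so indels and amino acid changes are not reported.

```python
def call_substitutions(reference, query):
    """List nucleotide substitutions in `query` relative to `reference`.

    Both strings must already be aligned to the same length ('-' marks a gap).
    Labels follow the pattern <reference base><1-based reference position><query base>.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")

    calls = []
    ref_pos = 0  # position along the ungapped reference
    for ref_base, qry_base in zip(reference.upper(), query.upper()):
        if ref_base == "-":
            continue  # insertion relative to the reference; not reported here
        ref_pos += 1
        if qry_base in "-N":
            continue  # deletion or ambiguous base; not reported here
        if ref_base != qry_base:
            calls.append(f"{ref_base}{ref_pos}{qry_base}")
    return calls


# Toy example: one substitution at reference position 4.
toy_calls = call_substitutions("ATGGCC", "ATGACC")
print(toy_calls)
```

Counting positions along the ungapped reference keeps the labels comparable between sequences even when a query carries an insertion.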
Genomic diversity and demographic distribution of SARS-CoV-2 sequences: A set of 2274 SARS-CoV-2 genomic sequences collected in Morocco during the first 3 years of the pandemic was analysed, revealing several variants and lineages. Table 1 shows the temporal distribution of variants and lineages among the 2274 sequences. Of these, 3.9% (89/2274) were sequenced in 2020, 22.4% (511/2274) in 2021, 53.5% (1217/2274) in 2022, and 20.1% (457/2274) in 2023. Overall, 20.2% (460/2274) of sequences were assigned to lineages other than the Alpha, Beta, Delta, Eta, Kappa, Mu, and Omicron variants. Of these 460 sequences, 19.3% (89/460) were identified in 2020, 40% (184/460) in 2021, 4.1% (19/460) in 2022, and 36.5% (168/460) in 2023. The Alpha variant had a prevalence of 7.7% (176/2274) among all sequences analysed, of which 81.2% (143/176) were detected in 2021, 5.1% (9/176) in 2022, and 13.6% (24/176) in 2023. The Delta variant and its subvariants accounted for 11.3% (257/2274) of the analysed sequences, of which 62.6% (161/257) were identified in 2021, 28.4% (73/257) in 2022, and 8.9% (23/257) in 2023. The Omicron variant and its subvariants were identified in 59.5% (1353/2274) of all sequences analysed, of which 1.1% (15/1353) were identified in 2021, 82.3% (1114/1353) in 2022, and 16.5% (224/1353) in 2023 (Table 1). The other variants (Beta, Eta, Kappa, and Mu) were less common. These findings illustrate how SARS-CoV-2 variants and lineages evolved during the pandemic and show which variants and lineages predominated in each period.
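The proportions above can be rechecked from the stated counts. The short Python sketch below recomputes each share from the numerators and denominators given in the text; the group labels and the `percent` helper are illustrative names, and the printed values may differ from the reported ones in the last digit because of rounding. The final line derives, by subtraction, the number of sequences not covered by the four groups listed; this figure is not stated in the text and corresponds to Beta, Eta, Kappa, and Mu combined.

```python
TOTAL = 2274  # all sequences analysed

# Counts per year as stated in the text; group labels are illustrative.
COUNTS = {
    "All sequences": {2020: 89, 2021: 511, 2022: 1217, 2023: 457},
    "Other lineages": {2020: 89, 2021: 184, 2022: 19, 2023: 168},
    "Alpha": {2021: 143, 2022: 9, 2023: 24},
    "Delta": {2021: 161, 2022: 73, 2023: 23},
    "Omicron": {2021: 15, 2022: 1114, 2023: 224},
}


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


for group, by_year in COUNTS.items():
    group_total = sum(by_year.values())
    print(f"{group}: {group_total}/{TOTAL} = {percent(group_total, TOTAL):.1f}%")
    for year, n in sorted(by_year.items()):
        print(f"  {year}: {n}/{group_total} = {percent(n, group_total):.1f}%")

# Derived by subtraction (not reported in the text): sequences outside the
# "Other lineages", Alpha, Delta, and Omicron groups.
remaining = TOTAL - sum(
    sum(by_year.values())
    for group, by_year in COUNTS.items()
    if group != "All sequences"
)
print(f"Remaining sequences: {remaining}")
```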
Conclusion: This study provided a detailed analysis of the genomic epidemiology and genetic diversity of SARS-CoV-2 lineages identified in Morocco during the first 3 years of the pandemic, enabling a better understanding of the evolution of the different lineages and the phylogenetic relationships among them. With the exception of lineage B.1.528, several lineages identified in Morocco were closely related to those observed worldwide before their local spread, highlighting the impact of human mobility on the introduction and spread of these lineages during the pandemic. Viral dynamics in Morocco, characterized by a predominance of the Alpha, Delta, and Omicron variants and their subvariants, reflected global trends in their evolution. However, the epidemiological trends of some Delta and Omicron subvariants differed from those observed in other countries. Additionally, several key mutations identified within the lineages analysed were correlated with variations in transmissibility, pathogenicity and antigenicity, which could have affected vaccine efficacy and pandemic management. Nevertheless, the establishment of the SARS-CoV-2 genomic surveillance consortium in Morocco and the vaccination campaigns have contributed to controlling and reducing infection rates and severe forms of COVID-19, thus mitigating the impact of infections at the national level.