Abstract
Human SARS-CoV-2 spike protein sequences from Asia, Africa, Europe, North America, South America and Oceania were analyzed by comparison with the reference SARS-CoV-2 protein sequence from Wuhan-Hu-1, China. Of the 10,333 spike protein sequences analyzed, 8,155 contained one or more mutations. A total of 9,654 mutations were observed, corresponding to 400 distinct mutation sites. The receptor binding domain (RBD), which interacts with the human ACE-2 receptor to cause the infection that leads to COVID-19, contained 44 mutations, including mutations at residues within the 3.2 Å interacting distance of the ACE-2 receptor. The mutations observed in the spike proteins are discussed in terms of their geographical distribution, mutation sites, mutation types, the number of mutations at each mutation site, and mutations at glycosylation sites. The density of mutations in different regions of the spike protein sequence and the location of the RBD mutations in the three-dimensional protein structure are also discussed. The mutations identified in the present work are important considerations for antibody, vaccine and drug development.
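The comparison against a reference sequence described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the sequences are already aligned to the reference at equal length, it counts substitutions only (gap and ambiguous `X` positions are skipped), and the function names `find_substitutions` and `summarize`, as well as the short toy sequences, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def find_substitutions(reference, sequence):
    """List substitutions as (position, reference_residue, observed_residue).

    Positions are 1-based. Assumption: both sequences are pre-aligned and of
    equal length; gaps ('-') and ambiguous residues ('X') are ignored.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be pre-aligned to equal length")
    substitutions = []
    for position, (ref, obs) in enumerate(zip(reference, sequence), start=1):
        if ref == obs:
            continue
        if ref == "-" or obs in ("-", "X"):
            continue  # not treated as a substitution in this sketch
        substitutions.append((position, ref, obs))
    return substitutions


def summarize(reference, sequences):
    """Count mutated sequences, total mutations, and distinct mutation sites."""
    site_counts = Counter()
    mutated_sequences = 0
    total_mutations = 0
    for sequence in sequences:
        substitutions = find_substitutions(reference, sequence)
        if substitutions:
            mutated_sequences += 1
        total_mutations += len(substitutions)
        site_counts.update(position for position, _, _ in substitutions)
    return {
        "mutated_sequences": mutated_sequences,
        "total_mutations": total_mutations,
        "distinct_sites": len(site_counts),
        "site_counts": site_counts,
    }


# Toy input for illustration only; these are not data from the study.
toy_reference = "MFVFLVLLPL"
toy_sequences = ["MFVFLVLLPL", "MFVFFVLLPL", "MFLFFVLLPL"]

if __name__ == "__main__":
    print(summarize(toy_reference, toy_sequences))
```

The three summary counts mirror the kinds of totals reported in the abstract (sequences with at least one mutation, total mutations, and distinct mutation sites), and the per-site counter gives the number of mutations observed at each site.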