Molecular Evolution of Type I Collagen (COL1a1) and Its Relationship to Human Skeletal Diseases

Description
Skeletal diseases related to reduced bone strength, like osteoporosis, vary in frequency and severity among human populations due in part to underlying genetic differentiation. With >600 disease-associated mutations (DAMs), COL1a1, which encodes the primary subunit of type I collagen, the main structural protein in bone, is most commonly associated with this phenotypic variation. Although numerous studies have explored genotype-phenotype relationships with COL1a1, surprisingly, no study has undertaken an evolutionary approach to determine how changes in constraint over time can be modeled to help predict bone-related disease factors. Here, molecular population and comparative species genetic analyses were conducted to characterize the evolutionary history of COL1a1. First, nucleotide and protein sequences of COL1a1 in 14 taxa representing ~450 million years of vertebrate evolution were used to investigate constraint across gene regions. Protein residues of historically high conservation are significantly correlated with disease severity today, providing a highly accurate model for disease prediction, yet interestingly, intron composition also exhibits high conservation, suggesting strong historical purifying selection. Second, a human population genetic analysis of 192 COL1a1 nucleotide sequences representing 10 ethnically and geographically diverse samples was conducted. This random sample of the population shows surprisingly high numbers of amino acid polymorphisms (albeit rare in frequency), suggesting that not all protein variants today are highly deleterious. Further, an unusual haplotype structure was identified across populations, but it is associated only with noncoding variation in the 5' region of COL1a1, where gene expression alteration is most likely.
Finally, a population genetic analysis of 40 chimpanzee COL1a1 sequences shows no amino acid polymorphism, yet does reveal an unusual haplotype structure with significantly extended linkage disequilibrium >30 kilobases away, as well as a surprisingly common exon duplication that is generally highly deleterious in humans. Altogether, these analyses indicate a history of temporally and spatially varying purifying selection on not only coding but also noncoding COL1a1 regions, which is also reflected in population differentiation. In contrast to clinical studies, this approach reveals potentially functional variation, which in future analyses could explain the bone strength variation observed not only within humans but also in other closely related primates.
Date Created
2010
Agent