Structure of the PTH mRNA
The 5’ Untranslated Region
The 5’ untranslated sequence of the longer forms of the human and bovine mRNAs and rat PTH mRNA contains about 120 nucleotides, and the shorter bovine and human cDNAs contain about 100 nucleotides in the 5’ noncoding region.
The average length of the 5' UTR in eukaryotic mRNAs is 80-120 nucleotides.
As a result, the m7G cap at the 5’ terminus of the mRNA is a considerable distance from the initiator codon.
In the bovine sequence, a possible hairpin loop may bring the 5’ end closer to the initiator codon. However, in the human and rat sequences, deletions of 11 and 16 nucleotides, respectively, largely eliminate the sequences involved in the stem of the loop.
Thus, there seems to be little functional significance related to the bovine secondary structure.
In the rat PTH mRNA, the first 19 nt at the 5' terminus may form a stable stem-loop structure that could affect PTH mRNA translation, but its function has not been determined (T. Naveh-Many, unpublished data).
However, the most striking conclusion from a comparative analysis of the sequences is that no region with any known functional significance is conserved in the 5’ untranslated region.
The Coding Region
The actual initiator ATG codons for the human and bovine PTH mRNAs have been identified by sequencing in vitro translation products of the mRNAs. In the bovine sequence, the first ATG codon is the initiator codon, in accord with many other eukaryotic mRNAs.
The human and rat sequences have ATG triplets prior to the probable initiator ATG, which are present ten nucleotides before the initiation codon and are immediately followed by a termination codon.
In the rat, another ATG is present 115 nucleotides before the initiator codon.
The designation of the third ATG codon of the rat sequence as the initiator codon is based on indirect evidence, primarily by comparison with the bovine and human cDNAs.
Regardless, the presence of termination codons in phase with the earlier ATG prohibits the synthesis of a long protein initiated at these codons, as is the case in some other genes with premature ATG codons.
However, in some systems, small peptides that are translational products of upstream ATGs have been shown to have regulatory functions.
Whether this is the case in the rat PTH mRNA is not known.
The most stringent requirement for optimal initiation of synthesis is for a purine at the –3 position.
Since none of the premature ATG codons in the rat and human PTH mRNAs has a purine at the –3 position, they are likely to be weak initiators.
In contrast, the probable initiator ATG codon has an A at the –3 position in each sequence.
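The –3 rule described above can be checked with a few lines of Python. This is only a sketch: the function name and the two toy sequences are hypothetical, and the check covers the –3 position alone rather than the full initiation context.

```python
def has_purine_at_minus3(seq, atg_index):
    """Return True if the base at position -3 relative to the ATG is A or G.

    `atg_index` is the 0-based index of the A of the ATG (position +1),
    so position -3 is three bases upstream, at atg_index - 3.
    """
    if atg_index < 3:
        raise ValueError("fewer than three bases precede this ATG")
    return seq[atg_index - 3].upper() in "AG"

# Hypothetical examples, not PTH sequences.
strong_context = "GCCACCATGG"   # A at -3
weak_context = "TTTTTTATGG"     # T at -3
print(has_purine_at_minus3(strong_context, strong_context.find("ATG")))
print(has_purine_at_minus3(weak_context, weak_context.find("ATG")))
```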
The 3’ Untranslated Region
As noted above, the 3’ untranslated region is the most variable region of the cDNAs, requiring significant gaps to maximize homology.
The termination codon in all species except for the rat and mouse is TGA and is followed closely by a second in-phase termination codon.
In the rat and mouse TAA is the termination codon and no following termination codon is present.
Conservation of Protein Binding Elements in the PTH mRNA 3'-UTR
We have defined the cis sequence in the rat PTH mRNA 3’-UTR that determines the stability of the PTH mRNA and its regulation by calcium and phosphate (P). PTH gene expression is regulated post-transcriptionally by Ca2+ and P, with diet-induced hypocalcemia increasing and diet-induced hypophosphatemia decreasing PTH mRNA levels.
This regulation of PTH mRNA stability correlates with differences in binding of trans acting cytosolic proteins to a cis acting instability element in the PTH mRNA 3'-UTR.
There is no parathyroid (PT) cell line; therefore, to study PTH mRNA stability, we performed in vitro degradation assays. The labeled PTH transcript was incubated with cytosolic PT proteins from rats on the different diets, and the amount of intact transcript remaining was measured over time.
PT proteins from low Ca2+ rats stabilized, and PT proteins from low P rats destabilized, the PTH transcript compared with PT proteins from control rats.
The rapid degradation by low P PT proteins was dependent upon the presence of the terminal 60 nt protein binding region of the PTH mRNA.
We have defined the cis sequence in the rat PTH mRNA 3’-UTR that determines the stability of the PTH transcript and to which the trans acting PT proteins bind.
A minimum sequence of 26 nt was sufficient for RNA-protein binding.
One of the trans acting proteins that binds and prevents degradation of the PTH mRNA was identified by affinity purification.
This protein is AU-rich element binding protein 1 (AUF1), which is also involved in determining the half-life of other mRNAs.
To study the functionality of the cis sequence in the context of another RNA, a 63 bp PTH cDNA sequence consisting of the 26 nt element and flanking regions was fused to the growth hormone (GH) cDNA.
Since there is no parathyroid (PT) cell line, an in vitro degradation assay was used to determine the effect of PT cytosolic proteins from rats fed the different diets on the stability of RNA transcripts for GH and for the chimeric GH-PTH 63 nt construct.
The GH transcript was more stable than PTH RNA and was not affected by PT proteins from the different diets.
The chimeric GH-PTH 63 nt transcript, like the full-length PTH transcript, was stabilized by PT proteins from rats fed a low calcium diet and destabilized by proteins from rats fed a low phosphate diet. Therefore, the 63 nt protein binding region of the PTH mRNA 3’-UTR is both necessary and sufficient to regulate RNA stability and to confer responsiveness to changes in PT proteins by calcium and phosphate.
Sequence analysis of the PTH mRNA 3’-UTR of different species revealed a preservation of the 26 nt core protein-binding element in rat, mouse, human, cat and canine 3’-UTRs.
The cis acting element identified is at the 3' distal end in all species that express it and is therefore designated the distal functional cis element.
The conservation of the sequence suggests that the binding element represents a functional unit that has been evolutionarily conserved.
Protein binding experiments by UV cross linking and RNA electrophoretic mobility shift assays showed that there is specific binding of rat and human parathyroid extracts to an in vitro transcribed probe for the rat and human PTH mRNA 3'-UTR 26 nt elements.
In contrast, the 26 nt distal cis element was not present in the 3'-UTR of bovine, porcine and gallus PTH mRNA.
To determine the protein binding pattern of the bovine PTH mRNA, binding experiments were performed with bovine parathyroid gland extracts and RNA probes for different regions of the bovine PTH mRNA.
Binding and competition experiments revealed a 22 nt minimal protein binding element in the bovine PTH mRNA 3'-UTR that was sufficient for protein binding.
The 22 nt element is in the 5' portion of the 3'-UTR and is designated the proximal cis element. Interestingly, this element was also present in the 3'-UTRs of human, dog, cat, non-human primate, horse and porcine PTH mRNA.
Therefore, the PTH mRNA 3'-UTRs of man, dog and cat have both sequences: the distal functional cis element of 26 nt that has been characterized in rat PTH mRNA, and the 22 nt proximal protein binding element initially characterized in bovine PTH mRNA.
The bovine and porcine mRNAs only have the 22 nt element and the gallus PTH mRNA has neither of the elements.
It is not known if the 26 nt element is present in the horse and macaca because there is only partial sequencing of these mRNAs. Though the 22 nt sequence is a protein binding element, its functionality remains to be determined.
The Polyadenylation Signal
Another region that is well conserved in the PTH mRNA 3'-UTR is the AATAAA polyadenylation signal.
In the bovine sequence, only a single AATAAA has been detected in the 3’ noncoding region, whereas in the human and rat sequences two potential polyadenylation sites are found.
The second AATAAA region in the human sequence is about 60 bases upstream from the first and has been suggested to have resulted from a gene duplication; however, other than the AATAAA regions, there is little homology surrounding the two sites.
Sequences analogous to the human upstream AATAAA are missing in both the bovine and rat sequences.
No cDNAs were detected in which the upstream AATAAA was utilized as a polyadenylation signal; however, the possibility that these sites function as polyadenylation signals cannot be ruled out.
The rat sequence also has a second AATAAA site about 115 nucleotides earlier than the functional one.
A single rat PTH mRNA was detected by Northern blot analysis, suggesting that only one polyadenylation site is used, and the size of the mRNA was consistent with the second AATAAA being the site.
There is no direct evidence for the location of the 3’ end by analysis of the rat PTH mRNA or cDNA.
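Locating candidate polyadenylation signals of the kind discussed in this section amounts to a simple motif scan. The following Python sketch finds every AATAAA hexamer in a sequence; the function name and the toy sequence are hypothetical, and a hit only marks a potential signal, not a site shown to be used.

```python
def find_polya_signals(seq, signal="AATAAA"):
    """Return the 0-based start positions of every occurrence of `signal`."""
    seq = seq.upper()
    hits = []
    pos = seq.find(signal)
    while pos != -1:
        hits.append(pos)
        # Resume one base later so that overlapping matches are not missed.
        pos = seq.find(signal, pos + 1)
    return hits

# Hypothetical 3' noncoding sequence with two potential signals.
toy_utr = "GG" + "AATAAA" + "CCCCCC" + "AATAAA" + "TT"
sites = find_polya_signals(toy_utr)
print(sites)                 # start positions of the two hexamers
print(sites[1] - sites[0])   # spacing between the two sites
```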
The PTH Gene
The genes for human, bovine, rat and mouse PTH have each been cloned and characterized from genomic libraries in lambda phage.
The human gene was isolated from a total human fetal DNA library prepared in λ phage Charon 4A.
The library was screened initially by filter hybridization with human cloned cDNAs as a probe and later by the recombination selection method.
The structure of the human gene was determined by the analysis of two overlapping clones.
For the bovine gene, Southern analysis of the total bovine DNA showed that the PTH gene was present on an 8000 bp EcoR1 fragment.
To clone the gene, bovine liver DNA was digested with EcoR1 and fragments in the range of 5,000 to 10,000 bp were isolated by sucrose gradient centrifugation.
A partial library was then constructed by ligating the EcoR1 fragments to λ phage Charon 31 arms, which had been isolated after digestion with EcoR1.
Several independent clones were isolated by plaque filter hybridization using cloned bovine PTH cDNA as probes.
The rat gene was isolated from a λ phage Charon 4A rat liver DNA library produced by partial EcoR1 digestion of the rat DNA.
Two independent positive plaques were obtained.
The insert of each of the two phages contained the entire rat PTH gene.
The sequence of the mouse gene was determined from a mouse genomic library.
One recombinant clone contained 14 kb of DNA, encompassing the entire PTH gene.
The transcriptional unit spans 3.2 kb of genomic DNA, analogous to the human PTH gene.
Structure of the Gene
All sequenced genes contain two introns.
In the human gene, the larger intron A is approximately twice as long as the corresponding intron in the bovine and rat genes.
The exact location of the bovine and human gene introns was determined by comparing the sequence of the gene to the previously determined cDNA structure.
The location of intron A of the rat was determined by comparing the gene sequence with the sequence of cDNA to the 5’ end of the mRNA.
The cDNA was synthesized with reverse transcriptase using a synthetic pentadecamer as primer.
Intron B in the rat gene was determined indirectly by homology of the sequence to the human and bovine cDNA sequences.
The locations of the introns are identical in each case, as has been found with most other genes.
Intron A splits the 5’ untranslated sequence five nucleotides before the initiator methionine codon.
Intron B splits the fourth codon of the region that codes for the pro sequence of preProPTH.
The three resulting exons thus correspond roughly to three functional domains.
Exon I, 95 to 121 nucleotides long, contains the 5’ untranslated region.
Exon II has 91 nucleotides and codes for the pre sequence, or signal peptide, and exon III, 375 to 486 nucleotides long, codes for PTH as well as the 3’ untranslated region.
The structure of the PTH gene is thus consistent with the proposal that exons represent functional domains of the mRNA. Although the introns are at the same locations, intron A in the human gene is about twice as large as in the rat and bovine genes.
It is of interest that the human gene is considerably longer in both intron A and the 3’ untranslated region of the cDNA compared to the bovine, rat and mouse.
Knowledge of the structures of other PTH genes from other species will be necessary in order to determine whether the extra sequence was inserted or is less susceptible to deletion in the human gene. Both introns have the characteristic splice site elements.
They have the GT-AG nucleotides at the 5' and 3' ends of the intron and the pyrimidine tract at the 3' end of the intron.
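The splice site features just described can be expressed as a small check. The Python sketch below tests an intron sequence for the GT and AG boundary dinucleotides and for a pyrimidine-rich stretch just upstream of the 3' AG. The function name, the 15 nt window, the 70% threshold and the toy intron are all assumptions made for illustration; they are not taken from the PTH gene analysis.

```python
def check_intron(intron, tract_window=15, min_pyrimidine_fraction=0.7):
    """Report basic splice-site features of an intron sequence (5' to 3')."""
    intron = intron.upper()
    # Bases immediately upstream of the terminal AG dinucleotide.
    tract = intron[-(tract_window + 2):-2]
    pyrimidines = sum(base in "CT" for base in tract)
    return {
        "gt_start": intron.startswith("GT"),
        "ag_end": intron.endswith("AG"),
        "pyrimidine_rich_3prime": (
            len(tract) > 0
            and pyrimidines / len(tract) >= min_pyrimidine_fraction
        ),
    }

# Hypothetical intron: GT at the 5' end, a pyrimidine tract, then AG.
toy_intron = "GTAAGT" + "ACGTACGTAC" + "TTTCTCTTTCCTTTC" + "AG"
print(check_intron(toy_intron))
```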
The large intron A has about 75% homology between the bovine and human PTH genes in over 200 bp of the intron, similar to the homology in the other non-translated regions of the genes.
The rat intron A is only 55-57% homologous to that of the other species. Intron sequences are generally conserved only at the cis elements essential for splicing, and the relatively high homology for the PTH genes suggests that there may be some constraints on base changes at some distance from the intron/exon border.
The second intron, containing 106 and 121 nt in the human and bovine pre-mRNA, is much smaller and more similar in size among the genes than intron A.
The sequence of intron B is well conserved with homology of 74 and 68% of bovine/human and human/rat, respectively, but is relatively poorly conserved between the rat and bovine genes, with a homology of 49%.
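Homology percentages such as those quoted above are computed from aligned sequences. The Python sketch below shows one common way to do this: count identical positions over the aligned columns, skipping any column that contains a gap. How gaps were treated in the original comparisons is not stated, so the gap handling here is an assumption, and the function name and toy alignment are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared

# Hypothetical alignment: one gapped column and one mismatch.
print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```

Scoring gapped columns as mismatches instead would lower the reported value, which is one reason homology figures from different studies are not always directly comparable.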
In each of the species, only a single PTH gene appears to be present.
Extensive Southern blot analysis of bovine DNA with cloned PTH cDNA as probe produced single hybridizing bands for restriction enzymes that do not cut within the probe sequence.
The restriction map determined from the Southern analysis of bovine DNA was consistent with that of the cloned gene.
With the exception of a single nucleotide in the 3’ untranslated region, the sequence of the cloned cDNA was identical to the sequence of the exons in the gene.
Less extensive Southern blot analyses of the human and rat genes were also consistent with a single gene per haploid genome.
Furthermore, in the human studies, probes from the 5’ and 3’ ends of the cDNA both hybridized to the same sized fragment, and the strength of the signal from the genomic DNA was about the same as one gene-equivalent of the gene cloned in λ phage. Thus, the PTH gene is a single gene. The genes for PTH and PTHrP (PTH-related protein) are located in similar positions on sibling chromosomes 11 and 12.
It is therefore likely that they arose from a common precursor by chromosomal duplication.
Initiation Site for RNA Transcription
As noted above in the discussion on the cDNA, the 5’ termini of bovine PTH mRNA are heterogeneous.
The large mRNAs contain a TATA sequence in the appropriate location to direct the synthesis of the smaller mRNAs. It was postulated that a second TATA would be found in the gene sequence 5’ of the first one.
In both the human and bovine gene sequences, a second TATA sequence is present in the 5’ flanking region about 25 base pairs from the first one in the appropriate position to direct the synthesis of the larger mRNAs.
The heterogeneity of the 5’ end of the bovine PTH mRNA, originally detected by reverse transcription of the mRNA, was confirmed by S1 nuclease mapping.
The initiation sites for human PTH mRNA have not been determined directly, but were proposed on the basis of analogy with the bovine sequence and the consensus TATA sequences.
The presence of multiple functional TATA sequences has been reported for several other eukaryotic genes.
The rat mRNA appears to be relatively homogeneous at the 5’ terminus on the basis of both primed reverse transcriptions near the 5’ end of the mRNA and S1 nuclease mapping.
The single initiation site for the rat mRNA can be explained by the changes in the rat sequence which alter the second downstream TATA sequence.
The sequence, TATATATAAAA, in the human and bovine genes, is changed to TGCATATGAAA in the rat gene, which is no longer a consensus TATA sequence.
While this change seems the most likely explanation for the difference in length at the 5’ termini between the mRNAs, there are other changes that also occur in this region of the gene and may play a role.
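The two sequences quoted above can be compared position by position to see exactly where the rat sequence departs from the human and bovine one. The short Python sketch below does this, assuming an ungapped alignment of the two 11 nt strings as written in the text; the function name is hypothetical.

```python
def mismatches(a, b):
    """Return (position, base_in_a, base_in_b) for each differing position.

    Positions are 1-based; the sequences must be the same length.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

human_bovine = "TATATATAAAA"  # sequence quoted for the human and bovine genes
rat = "TGCATATGAAA"           # corresponding rat sequence

for position, hb_base, rat_base in mismatches(human_bovine, rat):
    print(position, hb_base, "->", rat_base)
```

Under this alignment, the sequences differ at positions 2, 3 and 8, and two of the three changes fall in the first TATA motif.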
The smaller bovine PTH mRNAs are also heterogeneous with initiation occurring over a range of about eight nucleotides at the 5’ terminus.
The second TATA sequence in the bovine sequence is unusual since the sequence TA is repeated five times, and thus the TATA-like sequence is spread over 12 base pairs.
This may result in a less rigorous delineation of the appropriate start site.
The heterogeneity of the 5’ end of the bovine mRNA has not been conclusively proven.
Both the S1 nuclease mapping and the primed reverse transcriptase techniques require that the mRNA is intact and not degraded.
Since in the studies described above, it was not demonstrated that all the mRNA had a 5’ methylguanosine cap and thus was intact, the possibility that heterogeneity was introduced during isolation of the mRNA cannot be excluded.
However, the additional indirect evidence provided by the presence of two TATA sequences considerably strengthens the theory that two regions are utilized for initiation of transcripts.
The 5’ Flanking Region
The three PTH genes of human, bovine and rat show homology in the first 200 bp upstream of the RNA initiation site of the 5’ flanking region.
The homology in this region is similar to that in the 5’ untranslated region of the mRNAs.
There are few stretches of sequence in the 5’ flanking region that are completely conserved in all three sequences except for the TATA sequences.
A C-rich sequence, GCACCGCCC, about 75 bp to the 5’ side of the upstream TATA sequence is present in all three sequences, and an AT-rich region of about 25 bp immediately prior to this C-rich region is strongly conserved.
A sequence, CAGAGAA, about 25 bp to the 5’ side of the TATA sequence, is also present in all three sequences.
No CAAT sequence is present 5’ of the TATA sequences. In the bovine gene, an extraordinary stretch of almost 150 nucleotides, located from 250 to 400 nucleotides before the transcript initiator, consists primarily of alternating AT.
A similar region is not present in the rat gene, suggesting it is not critical for the function of the gene.
There are defined functional response elements in the 5'-flanking region that regulate PTH gene transcription, such as the vitamin D response element (VDRE) and the cyclic AMP response element (CRE).
The 3' Flanking Region
In the 3’ flanking region, there is again considerable homology between the bovine and human sequences.
A small inverted repeat region that could form a hairpin loop in the transcript is followed by a stretch of 7 Ts.
There is no direct evidence that this region serves as a transcriptional stop signal in the PTH genes.
A difference in the stem in the human compared to bovine is matched by a second change in the human that maintains the base pairing in the stem.
A similar sequence is not present in the approximately 110 bp of 3’ flanking sequence reported for the rat sequence.
The rat in fact has little homology with either of the other two sequences beyond the polyadenylation signal.
This is surprising in view of the homology retained between the rat PTH gene and the other genes in the 5’ flanking and intron regions.
Perhaps the polyadenylation signal for the rat sequence is derived from a different region of the gene, which was moved into its present position by a deletion of sequence or translocation.
Large gaps must be introduced into the bovine and rat sequences just prior to the polyadenylation signal, supporting the idea that this may be a relatively unstable region of the gene.
Overall, the PTH genes are typical eukaryotic genes that contain the consensus sequences for initiation of RNA synthesis, RNA splicing, and polyadenylation.
The PTH genes appear to be represented only once in the haploid genome.
Perhaps the most striking characteristic of the DNA in the region of the genes is its stability.
In addition, regions that diverge rapidly in other genes are relatively stable in the PTH genes, particularly between the human and bovine sequences and to a lesser extent with the rat sequence.
Thus, considerable homology is observed between 5’ and 3’ flanking and untranslated regions, internal regions of introns, and potential sites for silent changes in the coding region.
Since these regions that do not change the amino acid sequence have been estimated to diverge at a rate of 1% per 10^6 years, relatively low homologies would be expected for these sequences, which diverged about 60 to 80 x 10^6 years ago.
Whether this conservation of sequence occurs because the genes happen to be present in a region of the chromosome that is unusually stable, or reflects some functional constraints inherent in the PTH gene, remains to be elucidated.
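The expectation of low homology follows from simple arithmetic on the quoted rate and divergence times. The Python sketch below makes the calculation explicit. It treats the 1% per 10^6 years figure as the rate at which the two sequences diverge from each other, which is an assumption, and it optionally applies the Jukes-Cantor formula to allow for repeated changes at the same site, which is a standard model introduced here for illustration and not a method used in the text. The function name is hypothetical.

```python
import math

def expected_identity(rate_percent_per_myr, divergence_myr, correct_saturation=True):
    """Expected percent identity for unconstrained sites.

    rate_percent_per_myr: divergence rate in percent per million years.
    divergence_myr: time since divergence in millions of years.
    """
    # Accumulated substitutions per site (can exceed what is observable).
    d = rate_percent_per_myr / 100.0 * divergence_myr
    if correct_saturation:
        # Jukes-Cantor: observed fraction of differing sites for distance d.
        p = 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))
    else:
        # Naive linear accumulation, capped at complete divergence.
        p = min(d, 1.0)
    return 100.0 * (1.0 - p)

for myr in (60, 80):
    print(myr,
          round(expected_identity(1.0, myr, correct_saturation=False), 1),
          round(expected_identity(1.0, myr), 1))
```

Under these assumptions the naive estimate gives 20 to 40% identity and the saturation-corrected estimate roughly 50 to 60%, both below the homologies of about 75% reported here between the human and bovine non-coding regions.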
The rat and mouse sequences are considerably less homologous to the human and bovine sequences than these sequences are to each other.
This observation is difficult to explain, since evolutionarily each of the sequences is about equidistant from another.
Potentially, differences in the physiology or nutrition of calcium in the rat and mouse compared to the other two species may have resulted in increased acceptance of mutations in the rat PTH gene.
Chromosomal Location of the Human PTH Gene
The location of the human PTH gene on chromosome 11 has been determined independently by two groups. The assignments were made by screening panels of human-mouse (46) or human-mouse and human-Chinese hamster (47) cell hybrids with a human cDNA clone or a cloned fragment of human genomic DNA.
The PTH gene was further localized to the short arm of chromosome 11 by analysis of human-mouse hybrids with various translocations.
The short arm of chromosome 11 contains several other polymorphic genes including the β globin gene cluster, insulin, and the human oncogene Harvey ras (C-Ha-ras-1).
The polymorphisms in these genes and PTH were used to determine whether the genes are genetically linked and their order on the chromosome.
In addition to these genes, the gene for calcitonin has also been mapped to the short arm of chromosome 11.
Thus, the short arm contains genes for both of the polypeptide hormones that regulate calcium metabolism.
Whether this is a mere coincidence or is somehow related to the evolution or regulation of these calcium regulating genes remains a matter of speculation.
The porcine PTH gene was localized to chromosome 14q25-q28 by in situ hybridization, and the equine gene is on chromosome 11p15.3.
Summary
The PTH genes and cDNAs have been isolated and characterized in 10 species. The gene contains two introns, which are in the same position in each species, and dissect the gene into 3 exons that code, respectively, for the 5’ untranslated region, the signal peptide, and PTH plus the 3’ untranslated region.
The mRNAs contain a 7-methylguanosine cap at the 5’ terminus and a polyadenylation signal at the 3’ terminus.
They are about twice as long as necessary to code for preProPTH.
The 5’ termini of the bovine and human mRNAs are heterogeneous, the basis of which is the presence of two TATA sequences in the 5’ flanking regions of the genes.
In contrast, the mouse and rat genes contain a single TATA sequence and their mRNAs have a single 5’ terminus.
The initial translational product of the mRNA is preProPTH, and the pre-peptide of 25 amino acids and the pro sequence of 6 amino acids are removed by two proteolytic cleavages.
The mRNAs are very homologous in the region that codes for preProPTH.
But substantial homology is also retained in the mRNA untranslated regions and flanking regions and introns, where sequences are available.
The gallus PTH mRNA is the most distant sequence of the PTH mRNAs.
In the PTH mRNA, the 3'-UTR is the least conserved region among species. However, two protein binding elements in the 3'-UTR were identified and show high homology.
One of these elements is the distal 26 nt cis acting functional element that has been shown to mediate the regulation of PTH mRNA stability in response to changes in serum calcium and phosphate.
This element is present in the 3'-UTR of rat, man, dog, cat and mouse.
An additional proximal element of 22 nt is present in the 3'-UTR of bovine, pig, macaca and horse, and also in man, cat and dog.
This element binds cytosolic proteins but its function has not been demonstrated.
The conservation of such elements in the 3'-UTR suggests that they represent an evolutionarily conserved function.
PTH is central to normal calcium homeostasis and bone strength and the PTH peptide is highly conserved amongst species apart from Gallus.
This conservation is evident in the coding sequence but also, to a lesser extent, in the 5'- and 3'-UTRs.