Particular amino acid sequences mark a protein for secretion, binding to heparin, uptake and internalization into the nucleus.
You can tell a lot about a protein from the sequence of its amino acids. For example, basic amino acids (arginine and lysine) arranged in groups usually mean that a protein, if it is extracellular, binds to heparan sulfate proteoglycans.
It seemed strange to me that heparin binding was so simple when I tried to determine its rules by looking at the structures of several hundred proteins known to bind heparin. Since heparin is heavily sulfated and the sulfates are negatively charged, at first I just color-coded the positively charged, basic amino acids (blue) to look for oppositely charged heparin-binding sites on the surface of the proteins. Obvious blue patches were found on the surfaces of all of the proteins that bound to heparin, whereas other proteins had only scattered blue spots. Moreover, similarly color-coded amino acid sequences showed that the blue patches almost always had pairs of basic amino acids flanked within six amino acids by a third basic amino acid, i.e. BBxxxxB, where B is either arginine (R) or lysine (K) and x is a hydrophobic amino acid. It was surprisingly simple.
I was shocked at the simplicity, because most binding sites are made up of parts of regular secondary structures of helices or pleated sheets. If there were basic amino acids on these structures, which bound to heparin on one side, then the R/K would be repeated at specific intervals. For a helix, for example, the repeat would be BxxBxxB, because it takes three amino acids to return to the same side as the amino acids wind around in the helix. For the pleated sheet, the amino acids alternate on each side of the sheet, so the pattern is BxBxB. I found these kinds of heparin-binding domains also. The hardest patterns to find from sequences are groups formed as R/K’s on neighboring helices or sheets are brought together in the final folding of the protein.
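The three sequence patterns described so far (BBxxxxB, BxxBxxB and BxBxB) can be searched for with regular expressions. The following is only a minimal sketch, not the method I used: the function name `find_motifs` is made up for illustration, and it assumes that x can be any residue other than R or K, which is looser than the hydrophobic x described above.

```python
import re

B = "[RK]"    # basic residue: arginine (R) or lysine (K)
X = "[^RK]"   # assumption: any non-basic residue is accepted as x

# One regular expression per pattern named in the text.
PATTERNS = {
    "pair plus one (BBxxxxB)": f"{B}{B}{X}{{4}}{B}",
    "helix (BxxBxxB)": f"{B}{X}{{2}}{B}{X}{{2}}{B}",
    "sheet (BxBxB)": f"{B}{X}{B}{X}{B}",
}


def find_motifs(sequence):
    """Return (pattern name, start index, matched text) for every hit."""
    sequence = sequence.upper()
    hits = []
    for name, pattern in PATTERNS.items():
        # The lookahead lets overlapping matches be reported as well.
        for match in re.finditer(f"(?=({pattern}))", sequence):
            hits.append((name, match.start(), match.group(1)))
    return hits


# KKLVGTK is one of the heparanase fragments listed at the end of the post.
for hit in find_motifs("KKLVGTK"):
    print(hit)
```

A scan like this cannot find the hardest case mentioned above, where the R/K's only come together after folding, because that information is not in the linear sequence.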
One reason that the simple pair plus one (BBxxxxB) was found so easily is that the sequence typically sits on coils that only take shape in the presence of heparin. Thus the rigid binding of the domains to heparin is a result of the shape of the protein induced by the heparin. A related example of this phenomenon is the way heparin facilitates the formation of amyloid fibers. The beta amyloid of Alzheimer’s disease, for example, consists of a stack of small amyloid peptides with basic amino acids that line up and bind heparin along the length of the stack. Heparin is also an essential component in the amyloids of diabetes. Prions also seem to involve heparin. It is assumed that the cytoplasmic tau fibers of Alzheimer’s disease also have a similar facilitating polyanion (if not heparin), but it has not been identified.
Because HSPG recycling is essential, it is interesting that amyloid formation is toxic when the amyloid is in contact with cells. Perhaps the amyloid paralyzes HSPG recycling and thereby kills the cells. Treatments that disrupt amyloid binding to heparin, e.g. methylene blue, spare the neurons. This would also suggest that berberine, a fluorescent dye for heparin and a common herbal remedy for arthritis, might be useful in the treatment of many amyloid diseases.
The pair plus one is the minimal grouping of R/K’s that binds heparin, but larger groups bind more strongly and increase the complexity of the interaction between proteins and a cell. A triplet of R/K’s results in a protein binding to the heparan sulfate proteoglycans (HSPGs) on the surface of a cell, but as the HSPGs are recycled by being brought into vesicles within the cell, the bound proteins are also internalized. The vesicles carrying these internalized proteins then fuse with lysosomes and the proteins are at least partially degraded by proteases. The proteins are released from the HSPGs by the degradation of the heparan sulfate. The modified proteins have a variety of fates. Some return to the Golgi for secretion, e.g. HSPGs and heparanase, whereas others are degraded in proteasomes and presented as potential antigen fragments on surface receptors, and still others are transported to the nucleus. Those proteins transported into the nucleus have four R/K’s or two neighboring pairs of R/K’s, e.g. HIV-TAT, heparanase and possibly transglutaminase 2. Heparanase is intimately involved in cancer proliferation and transglutaminase is involved in celiac disease and inflammation.
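The size thresholds in this paragraph can be written as a rough rule of thumb. This is a sketch under stated assumptions, not a validated predictor: the function names are hypothetical, a "triplet" and "four R/K's" are both taken to mean uninterrupted runs of R/K, and the case of two neighboring pairs is left out because the text does not say how close the pairs must be.

```python
import re


def longest_basic_run(sequence):
    """Length of the longest uninterrupted stretch of R/K residues."""
    runs = re.findall(r"[RK]+", sequence.upper())
    return max((len(run) for run in runs), default=0)


def predicted_fate(sequence):
    """Map the longest R/K run onto the fates described in the text."""
    n = longest_basic_run(sequence)
    if n >= 4:
        # Assumption: four consecutive R/K's stand for "four R/K's".
        return "nuclear transport"
    if n == 3:
        return "internalized with HSPGs"
    return "no triplet or larger run"


# Fragments taken from the sequences listed at the end of the post.
for fragment in ("KRRKLR", "KKKMEKRFVFNK", "KKLVGTK"):
    print(fragment, longest_basic_run(fragment), predicted_fate(fragment))
```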
I have reproduced below the sequences of several human proteins from the National Center for Biotechnology Information. For simplicity, I have deleted the “uninteresting” amino acids between the heparin-binding domains. You will also see occasional negatively charged amino acids (D/E) within the R/K groups and their hydrophobic neighbors. These amino acids bind to the amino sugars of the heparin.
transglutaminase 2
M---REKLVVRR---KFLKNAGRDCSRR---RRWK---KIRILGEPKQKRK
heparanase
M---REHYQKKFKNSTYSR---KLLRKSTFKNAK---RRKTAKMLKSFLK---RPGKK---KKLVGTK---KRRKLR
Tat [Human immunodeficiency virus 1]
M---KCYCKK---RKKRKHRRGTPQSSK---KEQKKTVASKAER
Chain A, Interleukin-1 Beta
A---KKKMEKRFVFNK
lactoferrin
M---RRRR---RNMRKVR---RRAR---KGKK---KRKPVTEAR
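The abridged sequences above use "---" to mark where amino acids were deleted, so they can be split back into their individual fragments before any further analysis. The helper names below are made up for illustration, and the fraction of R/K per fragment is just one simple summary, not a measure of binding strength.

```python
def split_fragments(excerpt):
    """Split an abridged sequence on the '---' deletion markers."""
    return [part for part in excerpt.split("---") if part]


def basic_fraction(fragment):
    """Fraction of residues in a fragment that are R or K."""
    return sum(residue in "RK" for residue in fragment) / len(fragment)


# The lactoferrin excerpt exactly as listed above.
lactoferrin = "M---RRRR---RNMRKVR---RRAR---KGKK---KRKPVTEAR"
for fragment in split_fragments(lactoferrin):
    print(fragment, round(basic_fraction(fragment), 2))
```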
Is herpes involved in inflammation? If so how & can breakouts be prevented? Can it be cured?