A substantial portion of the volume of tissues is extracellular space, which is largely filled by an
intricate network of macromolecules constituting the extracellular matrix, ECM. The ECM is composed
of two major classes of biomolecules: glycosaminoglycans (GAGs), most often covalently linked to
protein forming the proteoglycans, and fibrous proteins which include collagen, elastin, fibronectin, and
laminin. These components are secreted locally and assembled into the organized meshwork that is the
ECM.
Connective tissue refers to the matrix composed of the ECM, cells (primarily fibroblasts), and
ground substance that is tasked with holding other tissues and cells together forming the organs.
Ground substance is a complex mixture of GAGs, proteoglycans, and glycoproteins (primarily laminin
and fibronectin) but generally does not include the collagens. In most connective tissues, the matrix
constituents are secreted principally by fibroblasts but in certain specialized types of connective tissues,
such as cartilage and bone, these components are secreted by chondroblasts and osteoblasts,
respectively. In addition to the extracellular matrix, typical connective tissues contain cells (primarily
fibroblasts) all of which are surrounded by ground substance. The ECM is not only critical for connecting
cells together to form the tissues, but is also a substrate upon which cell migration is guided during the
process of embryonic development and importantly, during wound healing. In addition, the ECM is
responsible for the relay of environmental signals to the surfaces of individual cells.
The extracellular matrix is composed of three major classes of biomolecules:
1. Structural proteins: e.g. the collagens, the fibrillins, and elastin.
2. Specialized proteins: e.g. fibronectin, the various laminins, and the various integrins.
3. Proteoglycans: these are composed of a protein core to which are attached long chains of
repeating disaccharide units termed glycosaminoglycans (GAGs), forming extremely complex high
molecular weight components of the ECM.
Collagens
Collagens are the most abundant proteins found in the animal kingdom. The various collagens
constitute the major proteins comprising the ECM. There are at least 44 different collagen genes dispersed
through the human genome. These 44 genes generate proteins that combine in a variety of ways to
create over 28 different types of collagen fibrils. The different collagen fibril types are identified by
Roman numeral designation. Types I, II and III collagens are the most abundant and all three types form
fibrils of similar structure. Of these three major types of collagen, type I is by far the most abundant,
constituting nearly 90% of all the collagen in the human body. Type IV collagen forms a two-dimensional
reticulum and is a major component of all basement membranes. Collagens are predominantly
synthesized by fibroblasts but epithelial cells are also responsible for the synthesis of some of the ECM
collagen.
Collagens are synthesized as preproproteins (see next section) and undergo extensive co- and
post-translational processing. Collagen protein monomers (termed α-chains) self-associate into a triple
helical structure. Most of these triple helix structures (termed collagen fibrils) are composed of two
identical alpha chains (e.g. α1) and a different alpha chain (e.g. α2). The nomenclature for collagens
involves the chain composition and the numbering of the collagen gene encoding that particular chain.
For example, type I collagen proteins are encoded by the COL1A1 and COL1A2 genes with two of
the triple helix proteins encoded by one gene and one by the other. Therefore type I collagen fibrils are
denoted [α1(I)]2[α2(I)] where the Roman numeral designates the fibril as type I. Collagen proteins have a
unique amino acid composition unlike any other protein in the human body. These proteins contain
upwards of hundreds of repeats of the sequence Gly-Pro-X or Gly-X-HyP (where X represents any
amino acid except glycine or proline, and HyP denotes hydroxyproline). As much as 35% of a collagen
monomer is composed of glycine with another 20-25% being proline.
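The repeat pattern described above (glycine at every third position) can be checked mechanically. The following is a minimal Python sketch, not a validated sequence-analysis tool. It assumes one-letter amino acid codes with hydroxyproline written as O, which is a common but not universal convention. The function names and the toy sequence are hypothetical and were invented for illustration; the toy sequence is not a real collagen sequence.

```python
def glycine_repeat_fraction(seq: str) -> float:
    """Fraction of consecutive tripeptide units in seq that begin with Gly (G)."""
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial triplet
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not triplets:
        return 0.0
    return sum(t[0] == "G" for t in triplets) / len(triplets)


def residue_fraction(seq: str, residue: str) -> float:
    """Fraction of positions in seq occupied by the given one-letter residue."""
    return seq.count(residue) / len(seq) if seq else 0.0


# Toy sequence (assumption, for illustration only): Gly-Pro-HyP, Gly-Pro-Ala
# and Gly-Glu-HyP units, with O standing in for hydroxyproline.
toy = "GPO" * 5 + "GPA" * 3 + "GEO" * 2

print(glycine_repeat_fraction(toy))           # 1.0: every triplet starts with Gly
print(round(residue_fraction(toy, "G"), 2))   # 0.33: one residue in three is Gly
```

A perfect Gly-X-Y repeat gives a glycine content of one third, which is in line with the figure of up to about 35% glycine quoted in the text.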
Collagen Synthesis and Processing
Collagens, like the majority of secreted proteins, are synthesized in the rough endoplasmic
reticulum (rough ER). Like all secreted and processed precursor proteins, collagens originate as longer
precursor proteins called preprocollagens. Following removal of the signal peptide from the
preprocollagen precursor, which occurs in the lumen of the rough ER, the remaining protein is referred
to as a procollagen (or tropocollagen). Procollagen proteins contain extra amino acids at the N- and C-termini that will be removed during the further processing that occurs. For example, type I procollagen
contains an additional 150 amino acids at the N-terminus and 250 at the C-terminus. These pro-domains
are globular and form multiple intrachain disulfide bonds. The disulfides stabilize the proprotein
allowing the triple helical section to form. Collagen fibers begin to assemble in the ER and the Golgi
complex.
In addition to signal sequence removal, numerous additional modifications take place to amino
acid residues on the procollagen proteins. These modifications include hydroxylations and
carbohydrate additions. Specific proline residues are hydroxylated by prolyl 3-hydroxylase and prolyl
4-hydroxylase. Specific lysine residues also are hydroxylated by lysyl hydroxylases. Both prolyl and lysyl
hydroxylases are absolutely dependent upon vitamin C as a cofactor. Humans express three distinct
prolyl 3-hydroxylase genes (P3H1, P3H2, and P3H3). Human prolyl 4-hydroxylases are functional as
heterotetrameric enzymes composed of two α-subunits (the catalytic subunits) and two β-subunits.
Humans express three distinct prolyl 4-hydroxylase α-subunit genes (P4HA1, P4HA2, and P4HA3) and
one β-subunit gene (P4HB). Humans express three distinct lysyl hydroxylase genes identified as PLOD1,
PLOD2, and PLOD3. PLOD stands for procollagen-lysine, 2-oxoglutarate 5-dioxygenase. The PLOD1 gene
is located on chromosome 1p36.22 and is composed of 21 exons that encode a precursor protein of 727
amino acids. The PLOD2 gene is located on chromosome 3q24 and is composed of 22 exons that
generate two alternatively spliced mRNAs encoding two isoforms of the PLOD2 enzyme. The PLOD3
gene is located on chromosome 7q22 and is composed of 17 exons that encode a 736 amino acid
precursor protein.
Glycosylations of the O-linked type also take place during procollagen transit through the Golgi
complex. Many, but not all, hydroxylated lysine residues (HyL) are targets for O-glycosylation, whereas
HyP residues are not. The most common sugars added during this step are glucose or galactose as monomeric
sugar attachments. Later, within the Golgi complex, oligosaccharides are added to the procollagen
proteins. The hydroxylation and glycosylation reactions allow the procollagen proteins to twist upon
themselves forming the typical triple helical structure.
Typical triple helical structure of a collagen fibril. Individual collagen monomers, each a left-handed
helix, spontaneously wrap around each other forming a tightly packed right-handed triple helix. The triple
helix forms as the procollagen monomers are processed while transiting through the ER and the Golgi complex.
At this point the globular ends of the triple helices are loose. Following completion of the
processing within the ER and Golgi complex, procollagen proteins are secreted into the extracellular
space. Several reactions take place to a procollagen protein within the extracellular compartment.
Proteases remove the globular pro-domains at both the N- and C-termini. The collagen molecules then
polymerize to form collagen fibrils. Accompanying fibril formation is the oxidation of certain lysine
residues by the extracellular enzyme lysyl oxidase. Lysyl oxidase is an extracellular Cu2+-dependent
enzyme that is also known as protein-lysine 6-oxidase. The lysyl oxidase gene (symbol: LOX) is located on
chromosome 5q23.2 and is composed of 7 exons that generate two alternatively spliced mRNAs. Lysyl
oxidase acts on lysines and hydroxylysines producing aldehyde groups, which will eventually undergo
covalent bonding between tropocollagen molecules. Lysyl oxidase is a major copper-dependent enzyme.
Defects in copper homeostasis, as is evident in Menkes disease, result in numerous manifestations
related to defective collagen production.
The fundamental higher order structure of all collagens is a long, thin, rod-like
protein. The Table below lists the characteristics of the 12 most highly characterized types of collagen
fibrils. As indicated above, there are at least 28 different types of collagen fibrils in the various types of
extracellular matrices of the human body. For example, type I collagen is 300 nm long, 1.5 nm in diameter
and consists of 3 coiled subunits composed of two α1(I) chains and one α2(I) chain. Characteristic of
type I collagen, but highly similar in all other types, there are three amino acids per turn of the helix and
every third amino acid is a Gly. In addition to the high Gly content, collagens are also rich in Pro and HyP
residues. The R-groups of the latter two amino acids reside on the outside of the triple helix. Lateral
interactions of triple helices of collagens result in the formation of fibrils roughly 50 nm in diameter. The
packing of collagen is such that adjacent molecules are displaced approximately 1/4 of their length
(67 nm). This staggered array produces a striated effect that can be seen in the electron microscope.
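The stagger arithmetic can be made explicit with the two figures quoted in the text (a 300 nm molecule and a 67 nm displacement). The sketch below is a simple worked calculation in Python. The function names are hypothetical, and the only inputs are the two values from the paragraph above.

```python
MOLECULE_LENGTH_NM = 300.0   # type I collagen length quoted in the text
STAGGER_NM = 67.0            # displacement between adjacent molecules quoted in the text


def stagger_fraction(length_nm: float, stagger_nm: float) -> float:
    """Displacement expressed as a fraction of the molecule length."""
    if length_nm <= 0 or stagger_nm <= 0:
        raise ValueError("lengths must be positive")
    return stagger_nm / length_nm


def staggers_per_molecule(length_nm: float, stagger_nm: float) -> float:
    """Number of stagger steps that fit along one molecule."""
    if length_nm <= 0 or stagger_nm <= 0:
        raise ValueError("lengths must be positive")
    return length_nm / stagger_nm


fraction = stagger_fraction(MOLECULE_LENGTH_NM, STAGGER_NM)
steps = staggers_per_molecule(MOLECULE_LENGTH_NM, STAGGER_NM)
print(f"{fraction:.2f} of the molecule length, {steps:.1f} steps per molecule")
```

With these inputs the displacement is about 0.22 of the molecule length, so "approximately 1/4" is a rounded description: roughly 4.5 stagger steps fit along one molecule rather than exactly 4.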
Formation of collagen fibers. Collagen fibers are formed within the extracellular compartment. The
assemblage of multiple triple helical collagen fibrils into a collagen fiber requires that certain Lys and
HyL residues be oxidized by extracellular lysyl oxidase. In addition, the loose globular domains at the N- and C-termini are first removed by collagen peptidases prior to fibril assembly into a fiber.
Types of Collagen

Type | Chain Composition | Gene Symbol(s) | Structural Details | Comments
I: fibril forming | [α1(I)]2[α2(I)] | COL1A1, COL1A2 | |
II: fibril forming | [α1(II)]3 | COL2A1 | 300 nm, small 67 nm fibrils | cartilage, vitreous humor
III: fibril forming | [α1(III)]3 | COL3A1 | |
IV: sheet forming | [α1(IV)]2[α2(IV)] | | |
V: fibril forming | [α1(V)][α2(V)][α3(V)] | COL5A1, COL5A2, COL5A3 | |
VI: fibril forming | [α1(VI)][α2(VI)][α3(VI)] | COL6A1, COL6A2, COL6A3 | 150 nm, N+C term. globular domains, microfibrils, 100 nm banded fibrils | most interstitial tissue, assoc. with type I
VII: anchoring | [α1(VII)]3 | COL7A1 | |
VIII | [α1(VIII)]3 | COL8A1, COL8A2 | |
IX: anchoring | [α1(IX)][α2(IX)][α3(IX)] | COL9A1, COL9A2, COL9A3 | |
X | [α1(X)]3 | | |
XI | [α1(XI)][α2(XI)][α3(XI)] | | |
XII: anchoring | α1(XII) | COL12A1 | | FACIT collagen (Fibril Associated Collagens with Interrupted Triple helices); cartilage, associates with type II
which can shed at the slightest touch in some patients. There are three distinct classifications of EB:
simplex, junctional, and dystrophica. Dystrophic epidermolysis bullosa (DEB) is caused by defects in the
COL7A1 gene. The autosomal dominant form of DEB is also known as Cockayne-Touraine disease,
whereas, the autosomal recessive form is also known as Hallopeau-Siemens disease.
Fibrillins and Elastin
The ECM of tissues that undergo significant stretching and/or bending contains significant quantities of
the protein elastin. Elastin is found in a specialized type of fibril called elastic fibers. Elastic fibers are
composed of large masses of cross-linked elastin interspersed with another family of ECM proteins
called the fibrillins. The walls of large arteries are particularly abundant with elastin (and thus elastic
fibers) which allows them to undergo continual deformation and reformation during changes in
intravascular pressure. The lungs and the skin are additional organs whose tissues are rich in elastin and
elastic fibers.
Elastin
Elastin is synthesized as the precursor tropoelastin from the elastin gene (symbol: ELN). The ELN gene is
located on chromosome 7q11.23 and is composed of 34 exons that generate 13 alternatively spliced
mRNAs. Tropoelastin has two major types of alternating domains. One domain is hydrophilic and rich in
Lys (K) and Ala (A) while the other domain is hydrophobic and rich in Val (V), Pro (P), and Gly (G) where
these amino acids are frequently contained in repeats of either VPGVG or VGGVG. The hydrophobic
domains of elastin are responsible for its elastic character. Tropoelastin is expressed then secreted as a
mature protein into the extracellular matrix and accumulates at the surface of the cell. After secretion
and alignment with ECM fibrils, numerous K residues are oxidized by lysyl oxidase (gene symbol: LOX), a
reaction which initiates cross-linking of elastin monomers. This process of elastin cross-linking, induced
by lysyl oxidase, is the same as occurs in the cross-linking of collagens. Although lysyl oxidase activity
promotes elastin cross-linking, the process is unique in that three lysine-derived aldehydes (allysyl)
cross-link with an unmodified lysine forming a tetrafunctional structure called a desmosine. The highly
stable cross-linking of elastin is what ultimately imparts the elastic properties to elastic fibers.
Defects in the elastin gene are found associated with the inherited disorder known as Williams-Beuren
syndrome which is characterized by connective tissue dysfunction that plays a causative role in the
supravalvular aortic stenosis typical of this disorder. Defective elastin is also associated with a group of
skin disorders called cutis laxa (autosomal dominant form in the case of ELN gene defects). In these
disorders the skin has little to no elastic character and hangs in large folds.
Fibrillins
The other major proteins in elastic fibers are the fibrillins. Humans express three fibrillin genes identified
as FBN1, FBN2, and FBN3. The FBN1 gene is located on chromosome 15q21.1 and is composed of 66
exons that encode a 2871 amino acid precursor protein. The FBN2 gene is located on chromosome
5q23-q31 and is composed of 65 exons that encode a 2912 amino acid precursor protein. The FBN3
gene is located on chromosome 19p13 and is composed of 64 exons that encode a 2809 amino acid
precursor protein.
Fibrillin monomers link head to tail in microfibrils which can then form two and three dimensional
structures. The most abundant fibrillin in elastic fibers is the FBN1 encoded protein, fibrillin 1. Fibrillin 1
serves as the scaffold in elastic fibers upon which cross-linked elastin is deposited. The observed
patterns of fibrillin gene expression are consistent with their roles in extracellular matrix structure of
connective tissue. FBN1 expression is high in most cell types of mesenchymal origin, particularly bone.
FBN2 expression is highest in fetal cells and has more restricted expression in mesenchymal cell types
postnatally. FBN3 is expressed in embryonic and fetal tissues in humans. The patterns of fibrillin gene
expression indicate that these proteins are important in maintaining the structure and integrity of the
extracellular matrix.
Mutations in fibrillin genes result in connective tissue disorders referred to as fibrillinopathies. These
disorders are characterized by structural failure of the extracellular matrix due to the absence or
abnormality of fibrillin proteins. The various fibrillinopathies that have been characterized to date result
from mutations in either the FBN1 or FBN2 genes. No diseases are currently known to be associated
with the FBN3 gene in humans. The FBN1-associated fibrillinopathies include Marfan syndrome (MFS),
familial ectopia lentis, familial ascending aortic aneurysm and dissection, autosomal dominant
Weill-Marchesani syndrome type 2 (WMS2), and MASS syndrome (MASS designates the involvement of the
mitral valve, aorta, skeleton, and skin). The FBN2-associated fibrillinopathy is congenital contractural
arachnodactyly.
Fibronectin
Fibronectin is a major fibrillar glycoprotein of the ECM where its role is to attach cells to a variety of
extracellular matrices. Fibronectin attaches
cells to all matrices except type IV. Type IV matrices involve laminins as the adhesive proteins.
Fibronectin is functional as a dimer of two similar peptide chains. Each chain is 60-70 nm long and
2-3 nm thick. At least 11 different fibronectin proteins have been identified that arise by alternative RNA
splicing of the primary transcript from a single fibronectin gene (symbol: FN1). The FN1 gene is located
on chromosome 2q34 and is composed of 47 exons that generate at least 20 different alternatively
spliced mRNAs. Not all the resulting mRNAs encode functional fibronectin proteins, but at least 11
fibronectin preproproteins have been characterized.
Fibronectin consists of a multimodular structure composed predominantly of three different amino acid
repeat domains termed modules. These repeat domains are termed FN-I, FN-II, and FN-III. The three
fibronectin repeat domains are each composed of two anti-parallel β-sheets. In the FN-I and FN-II
domains these β-sheets are held together by intrachain disulfide bonding. Each of the two fibronectin
subunits in a functional fibronectin dimer consists of twelve FN-I, two FN-II, and fifteen to seventeen
FN-III modules, respectively. These functional modules are responsible for fibronectin binding to fibrin,
collagen, heparan sulfate proteoglycans (HSPGs), DNA, and the integrins in plasma membranes. The
primary amino acid sequence motif in fibronectin that binds to an integrin is a tripeptide, Arg-Gly-Asp
(RGD).
The FN-I domain is approximately 40 amino acids in length and contains four conserved cysteine
residues that are required for disulfide bond formation. Several other proteins, in addition to
fibronectin, contain one or more of the three FN-domains. Tissue plasminogen activator (tPA) and factor
XII, both proteins involved in the regulation of hemostasis, contain FN-I domains. In tPA the FN-I domain
plays a role in binding to fibrin in a fibrin clot. The FN-II domain is composed of approximately 60 amino
acids and, like the FN-I domain, contains four conserved cysteine residues necessary for disulfide bond
formation. Several proteins contain FN-II domains including factor XII, the IGF-2 receptor, and the
receptor for a secretory phospholipase PLA2 (sPLA2) family member. The FN-III domain is composed of
approximately 100 amino acids that form a β-sandwich structure. The FN-III domain is widely
distributed, being found in over 100 human proteins, many of which are involved in the formation of the
extracellular matrix.
Fibronectin is also found in a soluble compact non-functional form in the blood. This form of fibronectin
is also known as cold-insoluble globulin (CIg). The transformation from the compact form to the
extended fibrillar form of fibronectin is referred to as fibrillogenesis. This transformation requires the
application of mechanical forces generated by cells. This occurs as cells bind and exert forces on
fibronectin through transmembrane receptor proteins of the integrin family (see below).
Laminins
All basal lamina contain a common set of proteins, GAGs, and proteoglycans. These include type IV
collagen, heparan sulfate proteoglycans (HSPGs), nidogens (entactins), and laminins. Because of the
presence of type IV collagens, the basal lamina is often referred to as the type IV extracellular matrix.
Each of the components of the basal lamina is synthesized by the cells that rest upon it. Laminins anchor
cell surfaces to the basal lamina.
Laminins are heterotrimeric proteins that contain an α-chain, a β-chain, and a γ-chain. The historical
designations for these three protein chains were A, B1 and B2, respectively. Humans express five genes
encoding the α-chains, four encoding the β-chains, and three encoding the γ-chains. The five laminin
α-chain encoding genes are identified as LAMA1, LAMA2, LAMA3, LAMA4, and LAMA5. The four laminin
β-chain encoding genes are identified as LAMB1, LAMB2, LAMB3, and LAMB4. The three laminin γ-chain
encoding genes are identified as LAMC1, LAMC2, and LAMC3. The different laminin proteins have been
found to form at least 15 different types of heterotrimers. The nomenclature for a laminin molecule
relates to the peptide chain composition. For example, the laminin molecule identified as laminin-111
(originally identified as laminin-1) is composed of the α1, β1, and γ1 gene encoded proteins (α1β1γ1
composition), and laminin-211 (formerly laminin-2) has the composition α2β1γ1. Laminin heterotrimers
are quite large, ranging from under 500,000 to nearly 1,000,000 Da in mass. Laminins contain common
structural features that include a tandem distribution of globular, rod-like and coiled-coil domains. The
coiled-coil domains are responsible for joining the three chains into a characteristic heterotrimeric
structure.
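The naming rule described above maps directly onto the gene symbols: the three digits of a laminin name give the α-, β-, and γ-chain numbers in that order. The Python sketch below illustrates that mapping under the gene counts stated in the text (five LAMA, four LAMB, three LAMC genes). The function name is hypothetical. The function only checks that each digit corresponds to an existing gene; it does not check whether the heterotrimer is one of the combinations actually observed.

```python
def laminin_genes(name: str) -> tuple:
    """Translate a name such as 'laminin-211' into (alpha, beta, gamma) gene symbols."""
    digits = name.strip().lower().replace("laminin-", "")
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"expected a name like 'laminin-211', got {name!r}")
    alpha, beta, gamma = (int(d) for d in digits)
    # Gene counts taken from the text: LAMA1-5, LAMB1-4, LAMC1-3.
    if not (1 <= alpha <= 5 and 1 <= beta <= 4 and 1 <= gamma <= 3):
        raise ValueError(f"no such laminin chain gene in {name!r}")
    return (f"LAMA{alpha}", f"LAMB{beta}", f"LAMC{gamma}")


print(laminin_genes("laminin-111"))   # ('LAMA1', 'LAMB1', 'LAMC1')
print(laminin_genes("laminin-211"))   # ('LAMA2', 'LAMB1', 'LAMC1')
```

Read this way, laminin-332 is encoded by LAMA3, LAMB3, and LAMC2.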
Laminin | Chain composition | Previous name
Laminin-111 | α1β1γ1 | Laminin-1
Laminin-211 | α2β1γ1 | Laminin-2
Laminin-121 | α1β2γ1 | Laminin-3
Laminin-221 | α2β2γ1 | Laminin-4
Laminin-332 | α3β3γ2 | Laminin-5
Laminin-311 | α3β1γ1 | Laminin-6
Laminin-321 | α3β2γ1 | Laminin-7
Laminin-411 | α4β1γ1 | Laminin-8
Laminin-421 | α4β2γ1 | Laminin-9
Laminin-511 | α5β1γ1 | Laminin-10
Laminin-521 | α5β2γ1 | Laminin-11
Laminin-213 | α2β1γ3 | Laminin-12
Laminin-423 | α4β2γ3 | Laminin-14
Laminin-522 | α5β2γ2 |
Laminin-523 | α5β2γ3 | Laminin-15
The laminins are glycoproteins that constitute the structural scaffolding of all basement
membranes. The heterotrimer composition of laminins results from self-assembly following
secretion of the individual protein subunits. Laminins are critical components of the ECM that
bind to the integrins, the dystroglycans, and numerous other receptors. These laminin
interactions are critical for cell differentiation, cell movement, cell shape, and the promotion of
cell survival. Given this broad range of contributions to tissue formation and survival, it is not
surprising that loss of functional laminin genes can result in potentially devastating disorders. As
described above in the collagen section, the disorders known as epidermolysis bullosa (EB) are
a family of disorders that are associated with excessive blistering in response to mechanical
injury or trauma. The junctional form of EB can be caused by mutations in either an integrin
gene (ITGA6 and ITGB4) or a laminin gene (LAMA3, LAMB3, and LAMC2). A form of congenital
muscular dystrophy is caused by defective laminin-211 production due to defects in the LAMA2
gene.
Laminin Gene | Location & Gene Structure | Comments
LAMA1 | 18p11.3; 64 exons |
LAMA2 | 6q22-q23; 66 exons |
LAMA3 | 18q11.2; 77 exons |
LAMA4 | 6q21; 39 exons |
LAMA5 | |
LAMB1 | 7q22; 35 exons |
LAMB2 | |
LAMB3 | |
LAMB4 | 7q31; 42 exons |
LAMC1 | 1q31; 28 exons |
LAMC2 | 1q24-q31; 23 exons |
LAMC3 | 9q34.12; 29 exons |
Integrins
The term integrin was derived from the observations that these cell-surface proteins (the integrins)
served as transmembrane linkers (integrators) whose functions were to mediate the interactions
between the extracellular matrix and the intracellular cytoskeleton. Integrins function as heterodimeric
glycoproteins, composed of an α- and a β-subunit. Humans express 18 integrin α-subunit genes and 8
integrin β-subunit genes. Both subunits of an integrin are single pass transmembrane proteins, which
bind components of the extracellular matrix or counter-receptors expressed on other cells. Several
different matrix proteins are bound by integrins, such as laminins and fibronectin. Many ligands for the
integrins bind only in the presence of the divalent cations, Ca2+ or Mg2+.
One class of integrin contains an inserted domain (I) in its α-subunit, and if present (in α1, α2, α10, α11,
αD, αE, αL, αM and αX), this I domain contains the ligand binding site. All β-subunits possess a similar
I-like domain, which has the capacity to bind ligand, often recognizing the RGD motif such as that present
in fibronectin. The I and I-like domains of the integrins are what bind the divalent cations that are
essential, in many types of integrin, for ligand binding.
Integrins provide a link between ligand and the actin cytoskeleton via short intracellular domains. This
linkage between the outside (extracellular matrix) and inside (cytoskeleton) of cells, mediated by
integrin-ligand interactions, allows for both outside-in and inside-out signal transduction. Integrin-ligand
interactions can trigger certain intracellular signal transduction pathways via the regulation of the
activity of certain protein kinases. These kinases include focal adhesion kinase (FAK) and integrin-linked
kinase (ILK).
Integrin Gene | Common Name | Comments
ITGAD | integrin, alpha D | formerly CD11d
ITGAE | integrin, alpha E |
ITGAL | integrin, alpha L |
ITGAM | integrin, alpha M |
ITGAV | integrin, alpha V | formerly CD51
ITGAX | integrin, alpha X |
ITGA1 | integrin, alpha 1 |
ITGA2 | integrin, alpha 2 |
ITGA2B | integrin, alpha 2b |
ITGA3 | integrin, alpha 3 |
ITGA4 | integrin, alpha 4 |
ITGA5 | integrin, alpha 5 |
ITGA6 | integrin, alpha 6 |
ITGA7 | integrin, alpha 7 |
ITGA8 | integrin, alpha 8 |
ITGA9 | integrin, alpha 9 |
ITGA10 | integrin, alpha 10 |
ITGA11 | integrin, alpha 11 |
ITGB1 | integrin, beta 1 | commonly called platelet glycoprotein GPIIa (GP2A); also known as the fibronectin receptor, beta polypeptide; formerly CD29
ITGB2 | integrin, beta 2 |
ITGB3 | integrin, beta 3 |
ITGB4 | integrin, beta 4 |
ITGB5 | integrin, beta 5 |
ITGB6 | integrin, beta 6 | forms a complex with integrin αV subunit; complex can bind fibronectin
ITGB7 | integrin, beta 7 |
ITGB8 | integrin, beta 8 |
Glycosaminoglycans
The most abundant heteropolysaccharides in the body are the glycosaminoglycans (GAGs). These
molecules are long unbranched polysaccharides containing a repeating disaccharide unit. The
disaccharide units contain either of two modified sugars, N-acetylgalactosamine (GalNAc) or
N-acetylglucosamine (GlcNAc), and a hexuronic acid such as glucuronate (GlcA) or iduronate (IdA). GAGs
are highly negatively charged molecules, with extended conformation that imparts high viscosity to the
solution in which they reside. GAGs are located primarily on the surface of cells or in the extracellular
matrix but are also found in secretory vesicles in some types of cells.
Along with the high viscosity of GAGs comes low compressibility, which makes these molecules ideal for
a lubricating fluid in the joints. At the same time, their rigidity provides structural integrity to cells and
provides passageways between cells, allowing for cell migration. The specific GAGs of physiological
significance are hyaluronic acid, dermatan sulfate, chondroitin sulfate, heparin, heparan sulfate, and
keratan sulfate. Although each of these GAGs has a predominant disaccharide component,
heterogeneity does exist in the sugars present in the make-up of any given class of GAG.
As indicated in the Table below, and discussed in greater detail in the Glycosaminoglycans and
Proteoglycans page, the various GAGs include the hyaluronates, chondroitin sulfates, keratan sulfates,
dermatan sulfates, heparan sulfates, and heparins. Hyaluronic acid (also called hyaluronan) is unique
among the GAGs in that it does not contain any sulfate and is not found covalently attached to proteins
forming a proteoglycan. It is, however, a component of non-covalently formed complexes with
proteoglycans in the ECM. Hyaluronic acid polymers are very large (with molecular weights of 100,000 to
10,000,000) and can displace a large volume of water. Indeed, the hyaluronans are the largest
polysaccharides produced by vertebrate cells. The immense size of these molecules makes them
excellent lubricators and shock absorbers in the joints.
Characteristics of GAGs

GAG | Localization | Comments
Heparan sulfate | basement membranes, components of cell surfaces |
Heparin | component of intracellular granules of mast cells, lining the arteries of the lungs, liver and skin |
[...] | | [...] valve prolapse
Keratan sulfate | |
Proteoglycans
The majority of GAGs in the body are linked to core proteins, forming proteoglycans (also called
mucopolysaccharides). The GAGs extend perpendicularly from the core in a brush-like structure. The
linkage of GAGs to the protein core, in most but not all proteoglycans, involves a specific tetrasaccharide
linker composed of a glucuronic acid (GlcA) residue, two galactose (Gal) residues, and a xylose (Xyl)
residue forming a structure such as: GAG(n)-GlcA-Gal-Gal-Xyl-Ser-protein. The tetrasaccharide linker
is coupled to the protein core through an O-glycosidic bond to a Ser or Thr residue in the protein. The
tetrasaccharide linker is most commonly seen in proteoglycans that contain heparins, heparan sulfates,
dermatan sulfates, and chondroitin sulfates. Although the tetrasaccharide linker is most common, some
GAGs are linked to the protein core of proteoglycans via a trisaccharide linkage that lacks the GlcA
residue. In the case of the keratan sulfates, attachment of the sugar linker to the core protein can occur
via O-linkage or via N-linkage. There are two major types of keratan sulfates (KSI and KSII) where KSI
containing proteoglycans are formed via N-linkage and KSII containing proteoglycans are formed via O-linkage.
The protein cores of proteoglycans are rich in Ser and Thr residues, which allows multiple sites of
polymeric GAG attachment. Following the formation of the tetrasaccharide linker, if the next sugar
added is N-acetylglucosamine (GlcNAc) the resulting attached GAGs will be either heparins or heparan
sulfates. If the next sugar is N-acetylgalactosamine (GalNAc) instead, then the attached GAGs will be
either chondroitin sulfates or dermatan sulfates.
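The branching rule in the preceding paragraph (the first sugar added after the tetrasaccharide linker decides the GAG family) can be written as a small lookup. The following Python sketch is only an illustration of that rule as stated in the text. The function and constant names are hypothetical, the chain is represented as a simple list of residue abbreviations read from the core protein outward, and the trisaccharide and keratan sulfate linkages mentioned above are deliberately not modelled.

```python
# Tetrasaccharide linker read from the core protein outward:
# protein-Ser-Xyl-Gal-Gal-GlcA-(GAG chain)
LINKER_FROM_CORE = ("Xyl", "Gal", "Gal", "GlcA")

# First sugar after the linker -> GAG families that can follow (from the text).
GAG_CLASSES_BY_FIRST_SUGAR = {
    "GlcNAc": ("heparin", "heparan sulfate"),
    "GalNAc": ("chondroitin sulfate", "dermatan sulfate"),
}


def possible_gag_classes(residues):
    """Return the GAG families implied by the first sugar after the linker."""
    residues = tuple(residues)
    if residues[:4] != LINKER_FROM_CORE:
        raise ValueError("chain does not start with the Xyl-Gal-Gal-GlcA linker")
    if len(residues) < 5:
        raise ValueError("no sugar has been added after the linker")
    first_sugar = residues[4]
    if first_sugar not in GAG_CLASSES_BY_FIRST_SUGAR:
        raise ValueError(f"unexpected first sugar: {first_sugar}")
    return GAG_CLASSES_BY_FIRST_SUGAR[first_sugar]


print(possible_gag_classes(["Xyl", "Gal", "Gal", "GlcA", "GlcNAc"]))
print(possible_gag_classes(["Xyl", "Gal", "Gal", "GlcA", "GalNAc"]))
```

The first call returns the heparin and heparan sulfate pair and the second returns the chondroitin sulfate and dermatan sulfate pair, matching the two cases described above.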
Essentially all mammalian cells have the capacity to synthesize proteoglycans and to secrete them into
the ECM, or insert them into the plasma membrane, or to store them in secretory vesicles. The overall
composition of a given type of ECM will ultimately determine the physical characteristics of the tissues it
surrounds and also the many biological properties of the cells embedded in it. The proteoglycans found
in the ECM interact with other ECM components keeping the level of fluidity high (forming a hydrated
gel-like composition) and providing resistance to compressive forces. Different cell types produce
different types of membrane-associated proteoglycans. Membrane proteoglycans have either a single
membrane-spanning domain (a type I orientation) or they are linked to the membrane via a
glycosylphosphatidylinositol (GPI) anchor. In addition, in some cells the proteoglycans are concentrated
within secretory vesicles along with the other vesicle components. The role of vesicle proteoglycans is to
help sequester and regulate the availability of positively charged vesicle components (e.g. proteases and
bioactive amines such as neurotransmitters) via their interactions with the negatively charged polymeric
GAG chains.
There exists a huge variability of proteoglycans in human tissues and cells (discussed in greater detail in
the Glycosaminoglycans and Proteoglycans page). This variability is due to several factors including the
large number of different proteoglycan core proteins and the ability to add one or two different types of
polymeric GAG chains to the protein core. Some proteoglycans contain only one GAG chain (e.g.,
decorin), whereas others can have several hundred GAG chains (e.g., aggrecan). Proteoglycan variability
also results from the stoichiometry of GAG chain substitution. As an example, the proteoglycan,
syndecan-1, has five attachment sites for GAGs, but not all of the sites are used equally. Another level of
variability results from the fact that different cell types produce proteoglycans, from the same protein
core, that exhibit differences in the number of GAG chains, the GAG chain polymeric length, and the
arrangement of sulfated residues within the GAG chain.
Source: http://themedicalbiochemistrypage.org/extracellularmatrix.php