1. Control of transcription initiation
B. Negative Regulation
C. Positive Regulation
B. Antitermination
4. Other mechanisms
B. DNA methylation
C. Supercoiling of DNA
An obvious place to regulate transcription is at or around the
PROMOTER region, thereby controlling access of RNA Polymerase
to the promoter. The site where regulatory proteins bind is called
the OPERATOR.
Regulatory proteins may prevent transcription (NEGATIVE CONTROL;
see Figures 25 and 26) or may increase transcription (POSITIVE
CONTROL; see Figures 27 and 28). Regulatory proteins often
respond allosterically to the presence or absence of ligand effector
molecules.
OPERONS are defined as several distinct genes, arranged
in a colinear fashion and under the control of a common regulatory
region, that are transcribed as a polycistronic mRNA. REGULONS
refer to genes scattered around the genome that are nonetheless
under the control of a common regulatory protein.
Reading: Science 271: 1245-1254.
Also see cover of this issue.
Genes of the lac operon are responsible for utilization
of lactose as a carbon source. Lactose is a disaccharide that
can be cleaved to form D-glucose and D-galactose.
The lac operon model was proposed in 1961 by Jacob and Monod prior
to the discovery of mRNA.
The lac operon contains three genes:
lacZ Encodes β-galactosidase, which cleaves lactose into monosaccharides. Tetramer of identical subunits of 116.4 kDa.
lacY Encodes lactose permease, a symport permease that couples protonmotive force to lactose uptake. Hydrophobic protein, 46.5 kDa.
lacA Encodes thiogalactoside transacetylase (30 kDa). Transfers acetyl groups to some galactosides that may be toxic.
All three genes are adjacent and co-transcribed. They are under
the control of the LacI (Lactose Repressor) protein. The lactose
repressor forms tetramers of identical subunits of 38.6 kDa each.
Lactose Repressor is a member of a large family of proteins known
as Helix-Turn-Helix (HTH) proteins, which have the ability
to bind DNA by making specific base contacts in the major groove
of the DNA double helix. Most HTH proteins bind to DNA as dimers or tetramers,
with each monomer recognizing half of an operator binding site.
The LacI repressor binds to the lac operator, a region of about
24 nt in length that overlaps the lac promoter, which provides
the binding site for RNA polymerase.
The lac promoter region actually contains 3 binding
sites for LacI repressor (see Figure 29). The strongest,
O1, overlaps the promoter binding site for RNA polymerase, and
most recent studies indicate that binding of LacI to the O1 operator
precludes RNA polymerase binding. When all 3 binding sites are
present, LacI repressor can suppress initiation of transcription
approximately 1000-fold. If either O2 or O3 is missing, then
repression is only about 500-fold. If both O2 and O3 are missing,
repression is only 20-fold (i.e., 50-times less than the complete
system). When LacI binds to O1 and O3, then looping of the intervening
DNA occurs that may further reduce the binding of RNA polymerase
to the promoter. Formation of O1/O2 looping probably directly
interferes with mRNA synthesis/elongation.
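The repression figures quoted above can be collected into a small lookup that makes the contribution of the auxiliary operators explicit. This is only a sketch: the function name `lac_repression_fold` is hypothetical, the fold values are the approximate ones given in these notes, and O1 is assumed to be present in every case.

```python
def lac_repression_fold(has_o2: bool, has_o3: bool) -> int:
    """Approximate fold-repression by LacI when O1 is present (values from the notes)."""
    if has_o2 and has_o3:
        return 1000   # all three operators present
    if has_o2 or has_o3:
        return 500    # exactly one auxiliary operator missing
    return 20         # O1 alone


full = lac_repression_fold(True, True)
o1_only = lac_repression_fold(False, False)
# Ratio between the complete system and O1 alone
print(full // o1_only)  # 50
```

The 50-fold ratio is the "50-times less" figure in the text, which shows how much of the total repression depends on the auxiliary operators and the looping they allow.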
Induction of the lac operon occurs due to the binding of
an INDUCER to the LacI protein, causing an allosteric change in
conformation in the protein that leads to release from the lac
operator. The in vivo inducer is allolactose (galactose linked
to the 6 position of glucose rather than the 4 position; a by-product of
LacZ β-galactosidase catalysis).
GRATUITOUS INDUCERS are molecules which are not substrates
for LacZ but which bind tightly to the LacI repressor.
The lac control region also contains a secondary promoter
that binds RNA polymerase tightly but that allows very inefficient
initiation. Once initiated, however, transcription can proceed past the
bound LacI at the operator. The message produces a stem-loop
(from the O1 operator) that includes the ribosome binding site
of the lacZ gene, so the mRNA is translated poorly.
Permease is synthesized along with a minimal amount of LacZ.
The lac operon is also controlled by a second, positive regulation
system. E. coli exhibits diauxic growth when presented
with glucose and another carbon source, such as lactose. No lactose
is used until all glucose is utilized, and lactose utilization
enzymes are not induced (see Figure 30).
E. coli will grow faster on glucose than on any other carbon
source, and when growing on glucose, cAMP (cyclic 3'-5'
AMP) levels are low. cAMP is formed by the enzyme ADENYLATE
CYCLASE. Adenylate cyclase is regulated in turn by Enzyme
IIA^Glc of the PTS system (see Figure 18 on
WWW). The phosphorylated form of Enzyme IIA^Glc
acts as an activator of adenylate cyclase, causing cAMP levels to
rise. The dephosphorylated form of Enzyme IIA^Glc
inhibits adenylate cyclase, causing cAMP levels to remain low.
The levels of phosphorylated Enzyme IIA^Glc
will be low when glucose is actively being transported by the
PTS system, but will rise as glucose levels are depleted.
How is the cAMP signal used to regulate gene transcription? cAMP
binds to a DNA binding protein called CATABOLITE ACTIVATOR
PROTEIN (CAP; also known as CRP, cAMP receptor protein)
which acts as a positive regulator of many operons, including
lac, gal, ara, and others. REGULONS
are many operons under common control.
Binding of CAP-cAMP complex to DNA occurs at the CAP site, immediately
upstream from the promoter -35 region (see Fig. 6, p. 337 of textbook).
Binding of CAP-cAMP to DNA causes bending of the DNA; the protein
interacts with the α subunits of RNA polymerase to activate transcription.
CAP-cAMP facilitates RNA polymerase binding and promotes helix
destabilization, allowing increased expression from lac
operon.
CAP-cAMP binding to operator actually tightens negative control
of the lac operon under conditions where lactose is not present.
This is logical if one considers that low glucose concentrations,
which would favor high cAMP levels, would signal that the cell
is energy limited. Under conditions of energy sufficiency, tight
control is less important, and O1 binding alone by repressor could
be sufficient to control transcription. However, when energy
is limiting (glucose is limiting), then the cell does not want
to waste any ATPs at all. CAP-cAMP binding facilitates LacI repressor
binding by bending DNA, and the resulting loop places the lac
promoter on the inner surface of the loop, nicely sequestered
from RNA polymerase. If lactose is available, however, then the
loop is released and CAP-cAMP promotes RNA polymerase binding
and initiation of transcription.
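The interplay of the two control systems can be summarized as a truth table over the two carbon sources. The sketch below is a deliberately coarse, qualitative model: the function name and the three state labels are illustrative choices rather than terms from the notes, and it ignores basal expression and the extra tightening of repression by CAP-cAMP described above.

```python
def lac_operon_state(glucose: bool, lactose: bool) -> str:
    """Qualitative lac transcription state for one growth condition."""
    # Glucose transport keeps cAMP low, so CAP-cAMP forms only without glucose.
    cap_camp_bound = not glucose
    # Allolactose (made from lactose) releases LacI from the operator.
    laci_bound = not lactose

    if laci_bound:
        return "repressed"
    if cap_camp_bound:
        return "fully induced"
    return "not activated"   # repressor released, but no CAP-cAMP stimulation


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_operon_state(glucose, lactose)}")
```

Only the combination of lactose present and glucose absent gives full induction, which is the behaviour behind diauxic growth.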
The gal operon (see Figure 31) of E. coli
consists of 3 structural genes: galE (epimerase), galT
(galactose transferase), and galK (galactokinase),
which are transcribed from two overlapping promoters PG1
and PG2 upstream from galE. Regulation
of the operon is complex since the GalE product, an epimerase
that converts UDP-glucose into UDP-galactose, is required for
the formation of UDP-galactose for cell wall biosynthesis even when
cells are not using galactose as a carbon/energy source. The
gal operon is controlled by CRP-cAMP as for the lac
operon. CRP-cAMP binds to the -35 region, promoting transcription
from PG1 but inhibiting transcription from
PG2. When cells are grown in glucose, basal
level transcription occurs from PG2. The
unlinked galR gene encodes the repressor for this system.
A tetrameric GalR repressor binds to 2 operators, one located
at +55 and one located at -60 relative to the PG1
start site. Looping of the DNA blocks the access of RNA polymerase
to promoters and/or inhibits formation of the open complex. When
GalR binds as a dimer to the -60 site only, promoter PG2
is activated, not repressed, allowing basal levels of GalE to
be produced. In this state promoter PG1 is
inactivated through interactions with the alpha subunit of RNA
polymerase.
The carbohydrate L-arabinose is catabolized by E. coli through the use of 3 enzymes, which are the products of the araA, araB, and araD genes (see Figure 32). L-arabinose is converted to L-ribulose by AraA, or L-arabinose isomerase. L-ribulokinase, encoded by araB, phosphorylates L-ribulose to form L-ribulose 5-phosphate, which is converted to D-xylulose 5-phosphate by AraD, or L-ribulose-5-phosphate 4-epimerase. The product can be further metabolized via the pentose phosphate pathway. Regulation of the araBAD operon occurs via araC, which is adjacent to and divergently transcribed from araBAD. The araC promoter (PC) and araBAD promoter (PBAD) are both stimulated by cAMP-CRP. AraC acts as an activator of PBAD in the presence of arabinose but represses both promoters in the absence of arabinose.
How can AraC be both a repressor and an activator? As a repressor,
AraC binds to two sites (O2 and I1) to form a loop (see Figure
). This state blocks promoter PC and promoter
PBAD is not activated because AraC is not
positioned appropriately for activation. Addition of arabinose,
which causes a conformation change in AraC, opens the DNA loop
with the assistance of CRP-cAMP. Binding to adjacent sites on
the DNA becomes favored, and activation of promoter PBAD
occurs. AraC is positioned to make contact with RNA polymerase
at promoter PBAD. Promoter PC
remains available transiently (for about 10 minutes usually) until
AraC concentration rises and fills the O1 binding sites and represses
the promoter PC. CRP-cAMP destabilizes the
looping, and thus favors activation of both promoters.
Tryptophan is one of the 20 standard amino acids from which all
proteins are formed. It is typically the least abundant amino
acid in most proteins, however. When available in the medium,
transporters readily take up this amino acid and incorporate it
into proteins. In E. coli, the tryptophan (trp)
operon consists of a promoter, an operator (for repressor binding),
the trpL gene (encoding the tryptophan leader peptide), and
five structural genes (trpEDCBA) which encode the critical
enzymatic activities that specifically convert chorismate into
tryptophan. The trpR gene encoding the repressor protein
occurs at a distant site in the genome. The TrpR repressor is
inactive in the absence of tryptophan, but if tryptophan levels
in cells rise, tryptophan, the COREPRESSOR, binds to repressor,
which then binds to trp operator (between promoter and
trpL gene), and mRNA synthesis is repressed. The end-product
of the tryptophan biosynthetic pathway inhibits the synthesis
of the enzymes (by blocking mRNA synthesis), and this is referred
to as END-PRODUCT REPRESSION. The cell balances the production
of enzymes for biosynthesis of this amino acid with the consumption
of this amino acid during protein synthesis, leading to energy efficiency.
Repression via TrpR can provide approximately 100-fold regulation
of tryptophan biosynthesis enzymes.
Starvation of E. coli for tryptophan results in derepression of the trp operon in two stages. First, the absence of tryptophan releases the TrpR protein from the operator. A second level of control increases the rate of trp mRNA synthesis a further 10-fold in response to insufficient charged tRNA^Trp. When sufficient charged tRNA^Trp is found in cells, 9 of 10 transcripts initiated from the trp promoter terminate prior to the structural genes. This effect is due to ATTENUATION control of transcription and requires a tight coupling of transcription and translation. Attenuation of this kind can therefore occur only in procaryotes.
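The two stages can be put together with a little arithmetic. The sketch below derives the 10-fold attenuation factor from the "9 of 10 transcripts terminate" figure and then combines it with the roughly 100-fold TrpR repression. Treating the two controls as independent, multiplicative factors is an assumption made here for illustration and is not stated in the notes; the function and constant names are likewise hypothetical.

```python
def attenuation_fold(initiated: int, terminated_in_leader: int) -> float:
    """Fold increase in readthrough when attenuation is fully relieved."""
    read_through = initiated - terminated_in_leader
    return initiated / read_through


TRPR_REPRESSION_FOLD = 100                 # approximate value quoted in the notes
atten = attenuation_fold(10, 9)            # 9 of 10 transcripts terminate
combined = TRPR_REPRESSION_FOLD * atten    # assumption: the two controls multiply
print(atten, combined)                     # 10.0 1000.0
```

Under that assumption the operon would span about a 1000-fold range between full repression with attenuation and full derepression.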
Between the trp promoter and the trpE gene is the trpL
gene, a gene of only 14 codons containing two sequential tryptophan
codons:
Met Lys Ala Ile Phe Val Leu Lys Gly Trp Trp Arg Thr Ser
The trpL gene has a normal start codon and a strong ribosome binding
sequence. The nucleotide sequence of the trpL region is
such that three possible stem loops can be formed. Either sequences
1 + 2 and 3 + 4 form tandem loops known as the PAUSE LOOP
and the TERMINATOR LOOP or sequences 2 + 3 form the ANTITERMINATOR
LOOP. The terminator loop 3+4 is identical to rho-independent
terminators which we discussed earlier in the course: a G+C-rich
stem-loop followed by 7 U residues in the mRNA sequence. Which
stem-loops form depends upon the rate of translation by ribosomes
and the charged tRNA^Trp content of the cells. Note
that the Trp repressor and attenuation actually respond to different
measures of the tryptophan status of the cell.
When RNA polymerase binds to the trp promoter and initiates
mRNA synthesis, the RNA polymerase pauses at the Pause Loop. If
a ribosome does not load quickly, an indication of limited protein
synthesis, then RNA polymerase is slowly released and Terminator
Loop 3+4 is formed, and transcription is terminated. (No protein
synthesis means the cell has no need to synthesize tryptophan.)
Ribosome binding to the paused transcription complex can also
release RNA polymerase. If there is an adequate supply of tryptophan,
the ribosome moves to the stop codon of trpL; this blocks
the formation of the 2+3 antiterminator and the 3+4 terminator
forms, releasing the transcript and halting transcription. A
small fraction of the time, ribosome release will occur prior
to formation of the terminator, and some readthrough will occur
when the 2+3 antiterminator forms rather than the 3+4 terminator stem-loop.
This will provide basal levels of the Trp biosynthetic enzymes.
Ribosome binding to the paused transcription complex in the absence
of sufficient charged tRNA^Trp will also release the
paused RNA polymerase. However, the ribosome will stall when
it reaches the Trp-Trp codon pair. This prevents the formation
of the 1+2 stem loop and thus favors the formation of antiterminator
loop 2+3, in turn preventing the formation of the terminator loop
3+4. RNA polymerase thus transcribes the complete trp
operon.
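The three cases described above reduce to a small decision function. This is a minimal sketch of the qualitative logic only: the function name, argument names, and return labels are hypothetical, and the occasional basal readthrough in the tryptophan-replete case is ignored.

```python
from typing import Tuple


def trp_leader_outcome(ribosome_loaded: bool, trna_trp_charged: bool) -> Tuple[str, str]:
    """Return (stem-loop formed, fate of the transcript) for the trp leader."""
    if not ribosome_loaded:
        # No translation: polymerase leaves the pause site on its own and
        # the 3+4 terminator forms.
        return ("3+4 terminator", "terminated")
    if trna_trp_charged:
        # Ribosome reaches the trpL stop codon and blocks 2+3,
        # so 3+4 forms.
        return ("3+4 terminator", "terminated")
    # Ribosome stalls at the Trp-Trp codons: 1+2 cannot form, 2+3 does.
    return ("2+3 antiterminator", "read through")


print(trp_leader_outcome(True, False))   # starved for charged tRNA-Trp
print(trp_leader_outcome(True, True))    # tryptophan plentiful
print(trp_leader_outcome(False, True))   # little or no translation
```

Only the stalled-ribosome case yields the antiterminator, so attenuation reports on charged tRNA^Trp rather than on free tryptophan.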
Attenuation is an important control mechanism in other organisms
and other operons encoding enzymes for biosynthesis of amino acids.
The following operons have no repressors and are regulated exclusively
by attenuation:
phe, his, leu, thr, ilv
Leader gene sequences in each of the above named operons have
multiple codons for the appropriate amino acids. Since Leu, Ile
and Val are all derived from Threonine, the thr and ilv
leader sequences contain multiple codons for all four amino
acids.
In Bacillus subtilis, regulation of the trp operon
also involves the formation of a leader transcript with the possibility
of alternative structures. However, in B. subtilis it
is not the translating ribosome but a regulatory protein, TRAP,
that determines the outcome. TRAP (trp-RNA-binding
attenuator protein) has 11 identical 8 kDa subunits; when
tryptophan is bound, each subunit can bind a G/U-A-G trinucleotide
which is repeated many times in the leader region. TRAP binding
promotes attenuation of transcription by blocking the formation
of the antiterminator, thereby promoting formation of the terminator.
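To make the trinucleotide-repeat idea concrete, the sketch below counts GAG and UAG triplets in an RNA string and compares the count with the 11 subunits of TRAP. The sequence `toy_leader` is an invented toy string, not the real B. subtilis leader, and the helper name `count_trap_triplets` is hypothetical. Counting non-overlapping matches is a simplification that ignores the spacing between repeats.

```python
import re

TRAP_SUBUNITS = 11  # identical subunits per TRAP complex (from the notes)


def count_trap_triplets(rna: str) -> int:
    """Count non-overlapping GAG or UAG triplets in an RNA sequence."""
    return len(re.findall(r"[GU]AG", rna.upper()))


# Hypothetical toy sequence: four triplets separated by short spacers.
toy_leader = "GAG" + "CC" + "UAG" + "AA" + "GAG" + "UU" + "UAG" + "C"
sites = count_trap_triplets(toy_leader)
occupied = min(sites, TRAP_SUBUNITS)   # one triplet per subunit at most
print(sites, occupied)                 # 4 4
```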
The pyrBI operon encodes the catalytic (PyrB) and regulatory
(PyrI) subunits of aspartate transcarbamylase, the first committed
step in pyrimidine biosynthesis (U, T, C). This operon can be
regulated ~100-fold by the intracellular concentration of UTP;
UMP is the precursor for UTP, TTP, and CTP.
The pyrBI operon is negatively regulated by RNA polymerase
pausing rather than ribosome stalling, although translational
coupling still plays a role. As for the trp operon, there
is a leader peptide with U-rich (T's in the DNA sequence) stretches.
Pausing of the RNA polymerase as it waits for UTPs allows the ribosome
to "catch up" to RNA polymerase. The ribosome prevents
the formation of a termination loop, and the pyrBI genes
are transcribed. If UTP is abundant, then the ribosome lags behind
RNA polymerase, the termination loop is allowed to form, and transcription
is attenuated.
Genes encoding enzymes for arginine biosynthesis are scattered
around the E. coli chromosome, but all are regulated by
the ArgR repressor. This system is known as the "arg
Regulon." The ArgR repressor functions as a hexamer, which is relatively
unusual; levels of transcription from different operons vary considerably
due to different effects at promoters and different affinities
of repressor for operator sequences. The operator sequences for
ArgR (the ARG boxes) are conserved sequences of 18 bp showing dyad symmetry.

Arginine acts as a corepressor to promote binding of ArgR to the
ARG boxes to regulate transcription. Arginine limitation maximally
derepresses the regulon. The concentration of ArgR in cells is
high (about 500 hexamers per cell, roughly 1 µM),
and the affinity for arginine is rather low (Kd
about 10^-4 M), but the Kd
for DNA of the active repressor is between 10^-9
M and 10^-10 M. Repression is therefore largely dependent
upon arginine levels in cells.
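A quick calculation shows why arginine, not repressor abundance, is the limiting factor. The sketch below converts molecules per cell into molarity and applies a simple one-site binding isotherm. The cell volume of about 1 femtolitre is an assumed round figure, the function names are hypothetical, and the isotherm ignores cooperativity between subunits, so the numbers are order-of-magnitude estimates only.

```python
AVOGADRO = 6.022e23      # molecules per mole
CELL_VOLUME_L = 1e-15    # assumed E. coli cell volume, about 1 femtolitre


def molar_concentration(molecules_per_cell: float,
                        volume_l: float = CELL_VOLUME_L) -> float:
    """Convert a copy number per cell into mol/L."""
    return molecules_per_cell / (AVOGADRO * volume_l)


def fraction_bound(ligand_conc: float, kd: float) -> float:
    """One-site binding isotherm: [L] / ([L] + Kd)."""
    return ligand_conc / (ligand_conc + kd)


argr = molar_concentration(500)          # 500 hexamers per cell
print(f"{argr * 1e6:.2f} uM")            # 0.83 uM

# Operator occupancy if every hexamer were active (Kd for DNA = 1e-9 M)
print(round(fraction_bound(argr, 1e-9), 3))
# Fraction of repressor sites holding arginine at 1e-4 M arginine (= Kd)
print(fraction_bound(1e-4, 1e-4))        # 0.5
```

With these assumptions the repressor concentration is hundreds of times above its Kd for DNA, so operator occupancy is set almost entirely by how much of the repressor is activated by arginine.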
Translational control occurs for pyrC, encoding the pyrimidine
biosynthetic enzyme dihydroorotase. Level of enzyme varies 10-fold
as a function of pyrimidine availability, and is not due to changes
in mRNA level but to changes in the mRNA sequence (see Figure
33). Transcription can start at any of 4 nucleotides: 5'
CCGG 3'. When pyrimidines are in excess, mRNA synthesis initiates
preferentially at C2. The resulting sequence can form a stem-loop
that sequesters the ribosome binding site, and no translation
of the mRNA can occur. Alternatively, when there is a limiting
pyrimidine concentration, mRNA synthesis preferentially initiates
with G4. No stem-loop can form, the ribosome binding site is
available for ribosome binding, and dihydroorotase is synthesized.
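The start-site switch can be written as a two-branch rule. This is a qualitative sketch: the function name and dictionary keys are hypothetical, and the real system shifts the preference between start sites rather than switching completely.

```python
def pyrc_translation(pyrimidines_in_excess: bool) -> dict:
    """Map pyrimidine availability to the preferred start site and its consequence."""
    start_site = "C2" if pyrimidines_in_excess else "G4"
    # Only the longer transcript (starting at C2) can form the stem-loop
    # that hides the ribosome binding site.
    rbs_sequestered = start_site == "C2"
    return {
        "start_site": start_site,
        "rbs_sequestered": rbs_sequestered,
        "dihydroorotase_made": not rbs_sequestered,
    }


print(pyrc_translation(True))
print(pyrc_translation(False))
```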
E. coli has 52 ribosomal proteins and three rRNA components
whose synthesis must be coordinated relative to the growth rate
of the cell. The ribosomal protein operons comprise 16 transcription
units having between 1 and 11 genes. The AUTOGENOUS CONTROL MODEL
suggests that the translation of a group of ribosomal proteins,
encoded on a polycistronic mRNA, is inhibited by one of the proteins
encoded within that operon. In the mRNA from the rpl11-rpl1
operon, Rpl1 binds to the mRNA, preventing translation. The proteins
that control translation typically bind directly to the 16S or 23S
rRNAs, so if growth slows and these rRNAs decrease, the proteins
bind to mRNAs and prevent translation. Regulation is achieved
by having similar binding sites in the mRNA to those normally
bound by the protein in rRNA. Binding of protein to mRNA can
affect the translation of more than one gene.
Salmonella typhimurium has two genes for flagellin, fliC and fljB. The two flagellins are significantly different in structure and react differently to antibodies. Only one gene is expressed in any given cell at one time, but the bacterium can switch from one gene to the other at a frequency of about 10^-3 to 10^-5 per cell per generation. Figure 34 shows the organization of the fliC and fljB genes, which are not close to one another on the genome. The fljB-fljA operon includes an upstream region of 993 bp, called the H region. The H region is flanked by two 26 bp sequences, designated hixL (repeat L) and hixR (repeat R). Each 26 bp sequence is composed of 2 imperfect 13 bp repeats, and thus can serve as a site of recombination. The promoter for fljB-fljA is located within the H region, and in one orientation can transcribe this operon. When the fljB-fljA operon is transcribed, the FljA repressor prevents transcription of the fliC flagellin gene and thus prevents the formation of the H1 flagellin.
The hin gene, which has its own promoter internal within
the H sequence, encodes a site-specific recombinase that can flip
the H region so that the promoter for fljB-fljA is
no longer oriented in the proper direction to transcribe the fljB-fljA
genes. Inversion by Hin is stimulated by a second
protein, Fis, which binds to enhancer sequences in the H region.
Hin-Fis protein-protein interactions, forming an "INVERTASOME,"
facilitate H region inversion by a looping mechanism. Similar
mechanisms are used to regulate expression of alternative genes
in many other organisms.
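The flagellar switch behaves like a two-state toggle, which the sketch below models. The function names are hypothetical. The formula in `prob_switched` is the standard result for a two-state chain and assumes the inversion rate is the same in both directions and constant per generation, which is a simplification not stated in the notes.

```python
def hin_invert(promoter_faces_operon: bool) -> bool:
    """One Hin-mediated inversion flips the orientation of the H region."""
    return not promoter_faces_operon


def flagellin_made(promoter_faces_operon: bool) -> str:
    """Which flagellin a cell produces for a given H-region orientation."""
    if promoter_faces_operon:
        # fljB-fljA is transcribed; FljA represses fliC.
        return "FljB"
    # No FljB and no FljA repressor, so fliC is expressed.
    return "FliC"


def prob_switched(rate: float, generations: int) -> float:
    """Chance a lineage is in the opposite phase after n generations,
    assuming equal inversion rates in both directions."""
    return (1 - (1 - 2 * rate) ** generations) / 2


state = True
print(flagellin_made(state), "->", flagellin_made(hin_invert(state)))
for rate in (1e-3, 1e-5):
    print(rate, prob_switched(rate, 1000))
```

Because two successive inversions restore the original orientation, the switched fraction approaches one half over many generations rather than one.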