GENE EXPRESSION: INDIVIDUAL OPERONS

READING: Chapter 12 (pp. 320-350)

TRANSCRIPTIONAL CONTROL

General concepts to be familiar with (see text and Figure 3, p. 329):

1. Control of transcription initiation

2. Control of transcription termination

3. Control of translation

4. Other mechanisms


An obvious place to regulate transcription is at or around the PROMOTER region, thereby controlling access of RNA polymerase to the promoter. The site where regulatory protein(s) bind is called the OPERATOR.

Regulatory proteins may prevent transcription (NEGATIVE CONTROL; see Figures 25 and 26) or may increase transcription (POSITIVE CONTROL; see Figures 27 and 28). Regulatory proteins often respond allosterically to the presence or absence of ligand effector molecules.

OPERONS are defined as several distinct genes, arranged in a colinear fashion and under the control of a common regulatory region, that are transcribed as a polycistronic mRNA. REGULONS refer to genes scattered around the genome that are nonetheless under the control of a common regulatory protein.

LACTOSE OPERON

Reading: Science 271: 1245-1254. Also see cover of this issue.

Genes of the lac operon are responsible for utilization of lactose as a carbon source. Lactose is a disaccharide that can be cleaved to form D-glucose and D-galactose.

The Lac Operon Model was proposed in 1961 by Jacob and Monod, prior to the discovery of mRNA.

lac operon contains three genes:

lacZ Encodes β-galactosidase, which cleaves lactose into monosaccharides. Tetramer of identical subunits of 116.4 kDa.

lacY Encodes lactose permease. Symport permease that couples protonmotive force to lactose uptake. Hydrophobic protein, 46.5 kDa.

lacA Encodes thiogalactoside transacetylase (30 kDa). Transfers acetyl groups to some galactosides that may be toxic

All three genes are adjacent and co-transcribed. They are under the control of the LacI (Lactose Repressor) protein. The lactose repressor forms tetramers of identical subunits of 38.6 kDa each.

Lactose Repressor is a member of a large family of proteins known as Helix-Turn-Helix (HTH) proteins, which bind DNA by making specific base contacts in the major groove of the DNA double helix. Most HTH proteins bind to DNA as dimers or tetramers, with each monomer recognizing half of an operator binding site.

LacI repressor binds to lac operator, a region of about 24 nt in length that overlaps the lac promoter that provides the binding site for RNA polymerase.

The lac promoter region actually contains 3 binding sites for LacI repressor (see Figure 29). The strongest, O1, overlaps the promoter binding site for RNA polymerase, and most recent studies indicate that binding of LacI to the O1 operator precludes RNA polymerase binding. When all 3 binding sites are present, LacI repressor can suppress initiation of transcription approximately 1000-fold. If either O2 or O3 is missing, then repression is only about 500-fold. If both O2 and O3 are missing, repression is only 20-fold (i.e., 50 times less than the complete system). When LacI binds to O1 and O3, looping of the intervening DNA occurs, which may further reduce the binding of RNA polymerase to the promoter. Formation of O1/O2 looping probably directly interferes with mRNA synthesis/elongation.
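The contribution of the auxiliary operators can be made explicit by tabulating the fold-repression figures quoted above. The following Python sketch is illustrative only: the dictionary, its keys, and the function name are hypothetical labels, and the values are the approximate figures from these notes rather than measurements.

```python
# Approximate fold repression of lac transcription by LacI,
# using the figures quoted in these notes (illustrative, not measured here).
FOLD_REPRESSION = {
    "O1+O2+O3": 1000,          # all three operators present
    "O1+one auxiliary": 500,   # either O2 or O3 missing
    "O1 only": 20,             # both O2 and O3 missing
}


def relative_loss(configuration):
    """Return how many times weaker repression is than in the complete system."""
    return FOLD_REPRESSION["O1+O2+O3"] / FOLD_REPRESSION[configuration]


for name, fold in FOLD_REPRESSION.items():
    print(f"{name}: {fold}-fold repression, "
          f"{relative_loss(name):g}x weaker than the complete system")
```

Losing one auxiliary operator costs only a factor of 2, whereas losing both costs a factor of 50, which is the arithmetic behind the statement that the auxiliary operators act largely through looping with O1.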

Induction of the lac operon occurs due to the binding of an INDUCER to the LacI protein, causing an allosteric change in conformation in the protein that leads to release from the lac operator. The in vivo inducer is allolactose (galactose on the 6 position rather than the 4 position; a by-product of LacZ β-galactosidase catalysis).

GRATUITOUS INDUCERS are molecules which are not substrates for LacZ but which bind tightly to the LacI repressor.

The lac control region also contains a secondary promoter that binds RNA polymerase tightly but allows only very inefficient initiation. Once initiated, however, transcription can proceed past the LacI bound at the operator. The message produces a stem-loop (from the O1 operator) that includes the ribosome binding site of the lacZ gene, so the mRNA is poorly translated. Permease is synthesized along with a minimal amount of LacZ.

CATABOLITE CONTROL

lac operon is controlled by a second, positive regulation system. E. coli exhibits diauxic growth when presented with glucose and another carbon source, such as lactose. No lactose is used until all glucose is utilized, and lactose utilization enzymes are not induced (see Figure 30).

E. coli will grow faster on glucose than on any other carbon source, and when growing on glucose, cAMP (cyclic 3'-5' AMP) levels are low. cAMP is formed by the enzyme ADENYLATE CYCLASE. Adenylate cyclase is regulated in turn by Enzyme IIAGlc of the PTS system (see Figure 18 on WWW). The phosphorylated form of Enzyme IIAGlc acts as an activator of adenylate cyclase, causing cAMP levels to rise. The dephosphorylated form of Enzyme IIAGlc inhibits adenylate cyclase, causing cAMP levels to remain low. The levels of phosphorylated Enzyme IIAGlc will be low when glucose is actively being transported by the PTS system, but will rise as glucose levels are depleted.

How is the cAMP signal used to regulate gene transcription? cAMP binds to a DNA binding protein called CATABOLITE ACTIVATOR PROTEIN (CAP; also known as CRP, cAMP receptor protein), which acts as a positive regulator of many operons, including lac, gal, ara, and others. REGULONS are many operons under common control.

Binding of CAP-cAMP complex to DNA occurs at the CAP site, immediately upstream from the promoter -35 region (see Fig. 6, p. 337 of textbook). Binding of CAP-cAMP to DNA causes bending of the DNA; the protein interacts with the α subunits of RNA polymerase to activate transcription. CAP-cAMP facilitates RNA polymerase binding and promotes helix destabilization, allowing increased expression from the lac operon.

CAP-cAMP binding actually tightens negative control of the lac operon under conditions where lactose is not present. This is logical if one considers that low glucose concentrations, which favor high cAMP levels, signal that the cell is energy limited. Under conditions of energy sufficiency, tight control is less important, and O1 binding alone by repressor could be sufficient to control transcription. However, when energy is limiting (glucose is limiting), the cell cannot afford to waste any ATP at all. CAP-cAMP binding facilitates LacI repressor binding by bending DNA, and the resulting loop places the lac promoter on the inner surface of the loop, nicely sequestered from RNA polymerase. If lactose is available, however, then the loop is released and CAP-cAMP promotes RNA polymerase binding and initiation of transcription.
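The combined negative (LacI) and positive (CAP-cAMP) control described above can be summarized as a small truth table. The Python sketch below is a qualitative simplification: the function name and the three output labels are hypothetical, and real expression levels are graded rather than categorical.

```python
def lac_expression(glucose_present, lactose_present):
    """Qualitative lac operon output for a given sugar environment."""
    # LacI leaves the operator only when inducer (derived from lactose) is present.
    repressor_bound = not lactose_present
    # cAMP rises, and CAP-cAMP binds the CAP site, only when glucose is depleted.
    cap_camp_bound = not glucose_present

    if repressor_bound:
        return "repressed"
    if cap_camp_bound:
        return "high"
    return "low"  # repressor released, but no CAP-cAMP activation


for glucose in (True, False):
    for lactose in (True, False):
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> "
              f"{lac_expression(glucose, lactose)}")
```

Only the combination of lactose present and glucose absent gives full expression, which is the behavior seen in the second phase of diauxic growth.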

GALACTOSE (gal) OPERON

The gal operon (see Figure 31) of E. coli consists of 3 structural genes: galE (epimerase), galT (galactose transferase), and galK (galactokinase), which are transcribed from two overlapping promoters, PG1 and PG2, upstream from galE. Regulation of the operon is complex because the GalE product, an epimerase that converts UDP-glucose into UDP-galactose, is required for the formation of UDP-galactose for cell wall biosynthesis even when cells are not using galactose as a carbon/energy source. The gal operon is controlled by CRP-cAMP, as is the lac operon. CRP-cAMP binds to the -35 region, promoting transcription from PG1 but inhibiting transcription from PG2. When cells are grown in glucose, basal level transcription occurs from PG2. The unlinked galR gene encodes the repressor for this system. A tetrameric GalR repressor binds to 2 operators, one located at +55 and one located at -60 relative to the PG1 start site. Looping of the DNA blocks the access of RNA polymerase to the promoters and/or inhibits formation of the open complex. When GalR binds as a dimer to the -60 site only, promoter PG2 is activated, not repressed, allowing basal levels of GalE to be produced. In this state promoter PG1 is inactivated through interactions with the alpha subunit of RNA polymerase.

ARABINOSE (ara) OPERON

The carbohydrate L-arabinose is catabolized by E. coli through the use of 3 enzymes, which are the products of the araA, araB, and araD genes (see Figure 32). L-arabinose is converted to L-ribulose by AraA, or L-arabinose isomerase. L-ribulokinase, encoded by araB, phosphorylates L-ribulose to form L-ribulose 5-phosphate, which is isomerized to D-xylulose 5-phosphate by AraD, or L-ribulose-5-phosphate 4-epimerase. The product can be further metabolized via the pentose phosphate pathway. Regulation of the araBAD operon occurs via araC, which is adjacent to and divergently transcribed from araBAD. The araC promoter (PC) and araBAD promoter (PBAD) are both stimulated by cAMP-CRP. AraC acts as an activator of PBAD in the presence of arabinose but represses both promoters in the absence of arabinose.

How can AraC be both a repressor and an inducer? As a repressor, AraC binds to two sites (O2 and I1) to form a loop (see Figure ). This state blocks promoter PC and promoter PBAD is not activated because AraC is not positioned appropriately for activation. Addition of arabinose, which causes a conformation change in AraC, opens the DNA loop with the assistance of CRP-cAMP. Binding to adjacent sites on the DNA becomes favored, and activation of promoter PBAD occurs. AraC is positioned to make contact with RNA polymerase at promoter PBAD. Promoter PC remains available transiently (for about 10 minutes usually) until AraC concentration rises and fills the O1 binding sites and represses the promoter PC. CRP-cAMP destabilizes the looping, and thus favors activation of both promoters.

TRYPTOPHAN OPERON: trp REPRESSOR (TrpR)

Tryptophan is one of the 20 standard amino acids from which all proteins are formed. It is typically the least abundant amino acid in most proteins, however. When tryptophan is available in the medium, transporters readily take it up and the cell incorporates it into proteins. In E. coli, the tryptophan (trp) operon consists of a promoter, an operator (for repressor binding), the trpL gene (encoding the tryptophan leader peptide), and five structural genes (trpEDCBA) that encode the enzymatic activities that specifically convert chorismate into tryptophan. The trpR gene encoding the repressor protein occurs at a distant site in the genome. The TrpR repressor is inactive in the absence of tryptophan, but if tryptophan levels in cells rise, tryptophan, the COREPRESSOR, binds to the repressor, which then binds to the trp operator (between the promoter and the trpL gene), and mRNA synthesis is repressed. The end-product of the tryptophan biosynthetic pathway thus inhibits the synthesis of the enzymes (by blocking mRNA synthesis), and this is referred to as END-PRODUCT REPRESSION. The cell balances the production of enzymes for biosynthesis of this amino acid with the consumption of this amino acid during protein synthesis, leading to energy efficiency. Repression via TrpR can provide approximately 100-fold regulation of tryptophan biosynthesis enzymes.

ATTENUATION: trp OPERON

Starvation of E. coli for tryptophan results in derepression of the trp operon in two stages. First, the absence of tryptophan releases the TrpR protein from the operator. A second level of control further regulates the trp operon by increasing the rate of trp mRNA synthesis a further 10-fold in response to insufficient charged tRNATrp. When sufficient charged tRNATrp is found in cells, 9 of 10 transcripts initiated from the trp promoter terminate prior to the structural genes. This effect is due to ATTENUATION control of transcription and requires a tight coupling of transcription and translation. Attenuation can only occur in procaryotes.
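The two stages of control can be combined arithmetically to estimate the overall regulatory range. The short Python sketch below assumes that repression and attenuation act independently, so that their effects multiply; that independence is a simplifying assumption, and the constant names are hypothetical.

```python
# Approximate regulatory ranges quoted in these notes.
REPRESSION_FOLD = 100    # TrpR repression
ATTENUATION_FOLD = 10    # attenuation at the trpL leader

# ASSUMPTION: the two mechanisms act independently, so their effects multiply.
combined_fold = REPRESSION_FOLD * ATTENUATION_FOLD

# With sufficient charged tRNATrp, 9 of 10 transcripts terminate early,
# so only 1 in 10 reads through into the structural genes.
readthrough_fraction = 1 / ATTENUATION_FOLD

print(f"Combined regulatory range: about {combined_fold}-fold")
print(f"Readthrough with sufficient charged tRNATrp: {readthrough_fraction:.0%}")
```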

Between the trp promoter and the trpE gene is the trpL gene, a gene of only 14 codons containing two sequential tryptophan codons:

Met Lys Ala Ile Phe Val Leu Lys Gly Trp Trp Arg Thr Ser

The trpL gene has a normal start codon and a strong ribosome binding sequence. The nucleotide sequence of the trpL region is such that three possible stem-loops can be formed. Either sequences 1 + 2 and 3 + 4 form tandem loops known as the PAUSE LOOP and the TERMINATOR LOOP, or sequences 2 + 3 form the ANTITERMINATOR LOOP. The terminator loop 3+4 is identical to the rho-independent terminators that we discussed earlier in the course: a G+C-rich stem-loop followed by 7 U residues in the mRNA sequence. Which stem-loops form depends upon the rate of translation by ribosomes and the charged tRNATrp content of the cells. Note that the Trp repressor and attenuation actually respond to different measures of the tryptophan status of the cell.

When RNA polymerase binds to the trp promoter and initiates mRNA synthesis, the RNA polymerase pauses at the Pause Loop. If a ribosome does not load quickly, an indication of limited protein synthesis, then RNA polymerase is slowly released, Terminator Loop 3+4 is formed, and transcription is terminated. (No protein synthesis means the cell has no need to synthesize tryptophan.)

Ribosome binding to the paused transcription complex can also release RNA polymerase. If there is an adequate supply of tryptophan, the ribosome moves to the stop codon of trpL; this blocks the formation of the 2+3 antiterminator, and the 3+4 terminator forms, releasing the transcript and halting transcription. A small fraction of the time, ribosome release will occur prior to formation of the terminator, and some readthrough will occur when the 2+3 antiterminator forms rather than the 3+4 terminator stem-loop. This will provide basal levels of the Trp biosynthetic enzymes.

Ribosome binding to the paused transcription complex in the absence of sufficient tRNATrp will also release the paused RNA polymerase. However, the ribosome will stall when it reaches the Trp-Trp codon pair. This prevents the formation of the 1+2 stem loop and thus favors the formation of antiterminator loop 2+3, in turn preventing the formation of the terminator loop 3+4. RNA polymerase thus transcribes the complete trp operon.
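The three scenarios described above can be collected into a single decision function. The Python sketch below is a qualitative summary only; the function name, argument names, and returned strings are hypothetical, and the occasional readthrough seen with adequate tryptophan is noted in a comment rather than modelled.

```python
def trp_attenuation(ribosome_loaded, charged_trna_trp_sufficient):
    """Return (stem-loop formed, transcriptional outcome) for the trp leader."""
    if not ribosome_loaded:
        # No translation: polymerase is slowly released and 3+4 forms.
        return ("3+4 terminator", "terminated")
    if charged_trna_trp_sufficient:
        # Ribosome reaches the trpL stop codon and blocks 2+3, so 3+4 forms.
        # (A small fraction of transcripts still read through.)
        return ("3+4 terminator", "terminated")
    # Ribosome stalls at the Trp-Trp codons, 1+2 cannot form, so 2+3 forms.
    return ("2+3 antiterminator", "read through into trpEDCBA")


for loaded, charged in [(False, True), (True, True), (True, False)]:
    loop, outcome = trp_attenuation(loaded, charged)
    print(f"ribosome_loaded={loaded}, charged tRNATrp sufficient={charged}: "
          f"{loop}, {outcome}")
```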

Attenuation is an important control mechanism in other organisms and other operons encoding enzymes for biosynthesis of amino acids. The following operons have no repressors and are regulated exclusively by attenuation:

phe, his, leu, thr, ilv

Leader gene sequences in each of the above named operons have multiple codons for the appropriate amino acids. Since Leu, Ile and Val are all derived from Threonine, the thr and ilv leader sequences contain multiple codons for all four amino acids.

In Bacillus subtilis, regulation of the trp operon also involves the formation of a leader transcript with the possibility of alternative structures. However, in B. subtilis it is not the translating ribosome but a regulatory protein, TRAP, that determines the outcome. TRAP (trp-RNA-binding attenuator protein) has 11 identical 8 kDa subunits; when tryptophan is bound, each subunit can bind a G/U-A-G trinucleotide which is repeated many times in the leader region. TRAP binding promotes attenuation of transcription by blocking the formation of the antiterminator, thereby promoting formation of the terminator.

TRANSCRIPTIONAL ATTENUATION: pyrBI

The pyrBI operon encodes the catalytic (PyrB) and regulatory (PyrI) subunits of aspartate transcarbamylase, which catalyzes the first committed step in pyrimidine biosynthesis (U, T, C). This operon can be regulated ~100-fold by the intracellular concentration of UTP; UMP is the precursor for UTP, TTP, and CTP.

The pyrBI operon is negatively regulated by RNA polymerase pausing rather than ribosome stalling, although translational coupling still plays a role. As for the trp operon, there is a leader peptide with U-rich (T's in the DNA sequence) stretches. Pausing of the RNA polymerase as it waits for UTP allows the ribosome to "catch up" to RNA polymerase. The ribosome prevents the formation of a termination loop, and the pyrBI genes are transcribed. If UTP is abundant, then the ribosome lags behind RNA polymerase, the termination loop is allowed to form, and transcription is attenuated.
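The pyrBI logic differs from trp attenuation in that the polymerase, not the ribosome, is the sensor. The following Python sketch restates the two cases described above; the function name and returned labels are hypothetical, and the model is qualitative.

```python
def pyrbi_outcome(utp_abundant):
    """Qualitative outcome at the pyrBI leader as a function of UTP supply."""
    # RNA polymerase pauses in U-rich stretches only when UTP is scarce.
    polymerase_pauses = not utp_abundant
    # A paused polymerase lets the translating ribosome catch up.
    ribosome_close_behind = polymerase_pauses
    # A ribosome close behind the polymerase prevents the termination loop.
    terminator_forms = not ribosome_close_behind
    return "attenuated" if terminator_forms else "pyrBI transcribed"


print("UTP abundant:", pyrbi_outcome(True))
print("UTP limiting:", pyrbi_outcome(False))
```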

ARGININE REPRESSION

Genes encoding enzymes for arginine biosynthesis are scattered around the E. coli chromosome, but all are regulated by the ArgR repressor. The system is known as the "arg Regulon." The ArgR repressor functions as a hexamer, which is unusual; levels of transcription from different operons vary considerably due to different effects at promoters and different affinities of repressor for operator sequences. The operator sequences for ArgR (ARG boxes) are conserved sequences of 18 bp showing dyad symmetry.


Arginine acts as a corepressor to promote binding of ArgR to the ARG boxes to regulate transcription. Arginine limitation maximally derepresses the regulon. The concentration of ArgR in cells is high (about 500 hexamers per cell, or roughly 1 µM), and the affinity for arginine is rather low (Kd about 10^-4 M), but the Kd for DNA of the active repressor is between 10^-9 M and 10^-10 M. Repression is largely dependent upon arginine levels in cells.
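The concentration estimate and the effect of the arginine Kd can be checked with a short calculation. The Python sketch below assumes an E. coli cell volume of about 1 femtolitre and simple one-site binding with no cooperativity; both are simplifying assumptions not stated in these notes, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
CELL_VOLUME_L = 1e-15      # ASSUMPTION: ~1 femtolitre per E. coli cell


def molar_concentration(molecules_per_cell, volume_l=CELL_VOLUME_L):
    """Convert a copy number per cell into a molar concentration."""
    return molecules_per_cell / (AVOGADRO * volume_l)


def fraction_bound(ligand_molar, kd_molar):
    """Fraction of sites occupied, ASSUMING simple one-site binding."""
    return ligand_molar / (kd_molar + ligand_molar)


argr = molar_concentration(500)
print(f"500 ArgR hexamers per cell is about {argr:.1e} M")

KD_ARGININE = 1e-4  # M, approximate figure from these notes
for arginine in (1e-5, 1e-4, 1e-3):
    occupied = fraction_bound(arginine, KD_ARGININE)
    print(f"[arginine] = {arginine:.0e} M: fraction of sites occupied = {occupied:.2f}")
```

Because the arginine Kd is relatively high, occupancy of the repressor changes steeply over the 10^-5 to 10^-3 M range, which is consistent with repression tracking intracellular arginine levels.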

TRANSLATIONAL CONTROL: pyrC

Translational control occurs for pyrC, encoding the pyrimidine biosynthetic enzyme dihydroorotase. The level of enzyme varies 10-fold as a function of pyrimidine availability; this is due not to changes in mRNA level but to changes in mRNA sequence (see Figure 33). Transcription can start at any of 4 nucleotides: 5' CCGG 3'. When pyrimidines are in excess, mRNA synthesis initiates preferentially at C2. The resulting sequence can form a stem-loop that sequesters the ribosome binding site, and no translation of the mRNA can occur. Alternatively, when the pyrimidine concentration is limiting, mRNA synthesis preferentially initiates with G4. No stem-loop can form, the ribosome binding site is available for ribosome binding, and dihydroorotase is synthesized.

TRANSLATIONAL REPRESSION: RIBOSOMAL PROTEINS

E. coli has 52 ribosomal proteins and three rRNA components whose synthesis must be coordinated relative to the growth rate of the cell. The ribosomal protein operons comprise 16 transcription units having between 1 and 11 genes. The AUTOGENOUS CONTROL MODEL suggests that the translation of a group of ribosomal proteins, encoded on a polycistronic mRNA, is inhibited by one of the proteins encoded within that operon. In the mRNA from the rpl11-rpl1 operon, Rpl1 binds to the mRNA, preventing translation. The proteins that control translation typically bind directly to the 16S or 23S rRNAs, so if growth slows and these rRNAs decrease, the proteins bind to mRNAs and prevent translation. Regulation is achieved by having binding sites in the mRNA similar to those normally bound by the protein in rRNA. Binding of protein to mRNA can affect the translation of more than one gene.

RECOMBINATIONAL REGULATION OF GENE EXPRESSION: FLAGELLAR PHASE VARIATION

Salmonella typhimurium has two genes for flagellin, fliC and fljB. The two flagellins are significantly different in structure and react differently to antibodies. Only one gene is expressed in any given cell at one time, but the bacterium can switch from one gene to the other at a frequency of about 10^-3 to 10^-5 per cell per generation. Figure 34 shows the organization of the fliC and fljB genes, which are not close to one another on the genome. The fljB-fljA operon includes an upstream region of 993 bp, called the H region. The H region is flanked by two 26 bp sequences, designated hixL (repeat L) and hixR (repeat R). Each 26 bp sequence is composed of 2 imperfect 13 bp repeats, and thus can serve as a site of recombination. The promoter for fljB-fljA is located within the H region, and in one orientation can transcribe this operon. When the fljB-fljA operon is transcribed, the FljA repressor prevents transcription of the fliC flagellin gene and prevents the formation of the H1 flagellin.

The hin gene, which has its own promoter within the H sequence, encodes a site-specific recombinase that can flip the H region so that the promoter for fljB-fljA is no longer oriented in the proper direction to transcribe the fljB-fljA genes. Inversion by Hin is stimulated by a second protein, Fis, which binds to enhancer sequences in the H region. Hin-Fis protein-protein interactions, forming an "INVERTASOME," facilitate H region inversion by a looping mechanism. Similar mechanisms are used to regulate expression of alternative genes in many other organisms.
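Phase variation behaves like a two-state genetic switch, which can be sketched in a few lines of Python. The function names and the "on"/"off" orientation labels below are hypothetical, and the sketch ignores the low switching frequency; it shows only which flagellin gene is expressed in each orientation of the H region.

```python
def expressed_flagellin(h_orientation):
    """Return the flagellin gene expressed for a given H region orientation.

    h_orientation is "on" when the H region promoter faces fljB-fljA,
    and "off" after Hin has inverted the region.
    """
    if h_orientation == "on":
        # fljB-fljA is transcribed; the FljA repressor shuts off fliC.
        return "fljB"
    # Promoter faces away: no FljB and no FljA, so fliC is expressed.
    return "fliC"


def hin_invert(h_orientation):
    """Model one Hin-mediated inversion of the H region."""
    return "off" if h_orientation == "on" else "on"


state = "on"
for generation in range(3):
    print(f"orientation={state}: expresses {expressed_flagellin(state)}")
    state = hin_invert(state)
```

Because the switch is a physical inversion of DNA, the state is heritable until the next inversion event, and two successive inversions restore the original orientation.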