C2006/F2402 '07 OUTLINE OF LECTURE #10

(c) 2007 Dr. Deborah Mowshowitz, Columbia University, New York, NY. Last update 02/20/2007 03:05 PM.

Handouts:  10A&B
10A -- Regulatory Elements & Picture of a typical Eukaryotic Gene (not on web).
10B -- Alternative splicing.
Extra copies of all handouts are in boxes on 7th floor of Fairchild.

I. How Do you turn a Eukaryotic Gene On? Note this section is different from the corresponding section in the last lecture. It has been updated.

    A. The Problem: Need to unfold/loosen chromatin before transcription possible. Can't just add RNA polymerase to DNA and start transcription. DNA is in chromatin and must be made accessible.

    B. So how can transcription occur?

1. Need multiple steps

a. First must de-condense (loosen up) euchromatin  to a transcribable state = relatively loose (compared to heterochromatin and compared to inactive euchromatin). Pull out 30nm fiber to beads-on-a-string stage?

b. Then basal transcription factors (proteins that stimulate transcription) can bind to DNA.

c. Polymerase binds to TF's (not directly to the DNA) and you get actual transcription

 2. What changes state of chromatin?  (To tighten or loosen.)

a. Remodeling proteins: these are responsible for moving and/or loosening up nucleosomes. See Purves 14.16 (14.17).

b. Enzymes that modify histone tails. Changes in modification may have a direct effect and/or affect binding of regulatory proteins.

3. What triggers the unfolding process? Possible Models

a. Addition of TF's is primarily what opens up the chromatin -- binding of  TF triggers remodeling/modification step.

b. Remodeling/modification occurs first to allow TF's to get to DNA; then TF's bind and trigger transcription.

c. Unfolding (decondensation) and addition of TF's occur simultaneously.

d. Current combination model (combines a & b, two possibilities listed last time):

(1). Two types of TF's -- basal and regulatory.

(2). Regulatory TF's (activators) bind first; that triggers remodeling, modification, etc.

(3). After chromatin is loosened up, basal TF's (& possibly more regulatory TF's) can bind to the DNA, pol II can bind to the TF's, and transcription occurs.

4. How does this fit with the DNase sensitivity results?

a. Loosest -- Regions where transcription factors bind -- have nucleosomes removed = hypersensitive sites.

b. Looser -- Regions being transcribed -- have nucleosomes somehow "loosened up" or "remodeled" but not removed.

c. Loose -- Regions not being transcribed -- have regular nucleosomes ('loose', relative to heterochromatin).

II. Introduction to Regulation of Eukaryotic Gene Expression
--  What has to be done to make a protein?  What steps can be regulated?

    A. In prokaryote (for comparison) -- process relatively simple. 

1. Most regulation at transcription. 

2. Translation in same compartment as transcription; translation follows automatically.

3. Most mRNA has short half-life.

    B. In eukaryote -- Gene expression has many more steps & complications than in prokaryotes, so there are more points of regulation. See Becker fig. 23-11 (21-11).

1. Transcription is main point of control (see below), but

a. Need to unfold chromatin to make transcript.  (See above.)

b. Once transcript is made it is not automatically translated. (See below.)

2. Transcript must be processed (capped, spliced, polyadenylated, etc.) -- any of these steps can be regulated, and there is more than one way to process most primary transcripts.

3. Transcription & translation occur in separate compartments.

a.  mRNA must be transported to cytoplasm.

b. Translation can be regulated (independently of transcription)   -- can control usage and/or fate of mRNA, not just supply of mRNA. For any particular mRNA, can regulate 1 or both of following:

(1). Rate of initiation -- can control how often ribosomes attach and start translation. 

(2). Rate of degradation -- can control half life of mRNA. Some mRNA's are long lived and some have a  very short half life.
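The effect of half life on the amount of mRNA can be illustrated with a small calculation. This is a minimal sketch, assuming a simple model that is not part of the lecture: constant synthesis and first-order decay, so that the steady-state level is (synthesis rate)/k with k = ln 2/(half life). The function names and the numbers are hypothetical.

```python
import math


def decay_constant(half_life_min):
    """First-order decay constant k (per minute) for a given half life."""
    return math.log(2) / half_life_min


def steady_state_mrna(synthesis_rate, half_life_min):
    """Steady-state mRNA copies per cell when synthesis balances decay.

    Assumed model (not from the lecture): d[mRNA]/dt = synthesis - k*[mRNA],
    so at steady state [mRNA] = synthesis / k.
    """
    return synthesis_rate / decay_constant(half_life_min)


# Hypothetical numbers: same transcription rate, tenfold different half lives.
short_lived = steady_state_mrna(synthesis_rate=2.0, half_life_min=5.0)
long_lived = steady_state_mrna(synthesis_rate=2.0, half_life_min=50.0)
print(round(long_lived / short_lived, 3))  # ratio of the two steady-state levels
```

Under this model, a tenfold longer half life gives a tenfold higher steady-state level with no change in transcription, which is why control of mRNA fate is a separate point of regulation from control of mRNA supply.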

III. Major features of gene regulation in Eukaryotes 
   What steps are actually regulated?

    A. How can amount of protein be controlled? If cell makes more or less protein, which step(s) are regulated? Many possible points of regulation in eukaryotes. See Becker fig. 23-11 (21-11), or Purves 14.11 (14.13), and list of steps above.

        1. Most common point of regulation is at transcription (in both euk. & prok.) If you need more protein, usually make more mRNA.

        2. Transcription is not the only step controlled, especially in euk. For example, half life of mRNA may be regulated. (Some examples will be discussed later.)

    B. If cells make different proteins, how is that controlled? If two eukaryotic cells (from a multicellular organism) make different proteins, what is (usually) different between them?

1. Is DNA different? (No, except in cells of immune system.)

2. Is state of chromatin usually different? (Ans: yes) How is this tested? Method & result described last time. See figure 23-17 (21-17) in Becker. What causes the difference in states of chromatin? Not clear what is cause and what is effect. More on this below & when we get to development.

3. Is mRNA usually different? (Ans: yes). This means you can get tissue specific sequences from a cDNA library. (cDNA library = collection of all cDNA's from a particular cell type.) DNA from each cell type is the same; mRNA and therefore cDNA is not. See Becker fig. 23-20 (21-20).

4. If mRNA's are different, why is that? Is the difference due entirely to differences in transcription?

a. Transcription is different in different cells. It could be that all cells transcribe all genes, but only some RNA's are exported to the cytoplasm and the remaining nuclear RNAs are degraded.  This is not the case. Only selected genes are transcribed in each cell type, and RNA's from those genes are processed to make mRNA.

b. Processing: Splicing and processing of same primary transcripts can be  different (in different cells or at different times). Different mRNA's (& therefore proteins) can be produced from the same  transcript by alternative splicing and/or poly A addition.                               

Try Problems 4-10 & 4-11.

IV. Details of transcription in eukaryotes (as vs. prokaryotes)
See Becker Ch 21, pp 665-670 (Ch 19, pp 640-644).

    A. More of everything needed for transcription in eukaryotes.

1. Multiple RNA Polymerases (see last lecture). We will focus on pol II (makes mRNA).

2. More Regulatory Sequences & Regulatory Proteins -- An Overview & Some terminology

a. Control elements/sequences -- Two types: cis & trans acting. (See 2nd table on handout 10A.)

  • Cis acting regulatory element =  affects only the nucleic acid molecule on which it occurs. Usually is a DNA sequence that binds some regulatory protein.

  • Trans acting regulatory element = affects target nucleic sequences anywhere in the cell. The regulatory sequence codes for a regulatory molecule -- usually a protein -- that binds to a target -- usually a DNA sequence.

  • The term "trans acting" can be used to refer to the regulatory molecule (usually a protein) or to the DNA sequence that codes for it.

  • In euk. the number of different types of cis and trans acting control elements is larger.

b. Regulatory Proteins = trans acting factors = products of trans acting regulatory elements

  • Two Common Types. Regulation can be "+" or "-" depending on the function of the protein -- repressor or activator? Negative control (use of repressors) seems to be more common in prok.; positive control (use of activators) in euk. (See 1st table on handout 10A)

  • How you tell positive and negative control apart -- by effects of deletions.

  • Terminology: Eukaryotic proteins that affect transcription are usually called transcription factors or TF's. (Details below.)

            (i) Basal TF's -- required for transcription in all cells -- all exert a + effect.

            (ii) Regulatory TF's -- decrease or increase basal transcription (usually called repressors or activators, respectively). Exert a + or - effect.
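The deletion test for telling positive and negative control apart can be sketched as a simple comparison. This is only an illustration: the function name, the arbitrary units, and the threshold for "no effect" are assumptions, not part of the lecture.

```python
def classify_control(rate_wild_type, rate_after_deletion, threshold=0.1):
    """Infer the type of control from the effect of deleting a regulatory element.

    Rates are transcription levels in arbitrary units (hypothetical values).
    `threshold` is the smallest difference treated as a real effect.
    """
    difference = rate_after_deletion - rate_wild_type
    if difference < -threshold:
        # Transcription drops: the deleted element was needed to turn the gene on.
        return "positive"
    if difference > threshold:
        # Transcription rises: the deleted element was needed to turn the gene off.
        return "negative"
    return "no effect"


print(classify_control(10.0, 1.0))   # deletion lowers transcription
print(classify_control(1.0, 10.0))   # deletion raises transcription
```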

c. How Trans and Cis acting elements work together

  • Cis acting elements = DNA itself = same in all cells of multicellular organism = target of trans acting regulatory molecules.

  • Trans acting regulatory molecules = product of DNA = TF's & other molecules = different in different cell types and at different times.

    B. Details of regulatory sites in the DNA. Prokaryotes have promotors and operators. What sequences do eukaryotes have in the DNA that affect transcription? (The following discussion refers mostly to regulation of transcription by RNA pol II. See texts esp. Becker for details about promotors etc. for pol I & III.) See Purves Fig. 14.13 (14.15) or Becker fig.23-21 (21-21) or handout 10A for structure of regulatory sites for a typical protein coding gene. Three types of regulatory sites:

1. Start of Transcription

a. Numbering. Position of bases is usually counted along the sense strand from the start of transcription. For example:

(1). "Start" = Point where transcription actually begins (usually marked with bent arrow) = zero.

(2). +10 = 10 bases downstream from start = 10 bases after start of transcription. (Downstream = Going toward the 3' end on sense strand = in direction of transcription)

(3). -12 = 12 bases upstream  from start = 12 bases before reaching start of transcription. (Upstream = Going toward 5' end on sense strand = in opposite direction from transcription)

(4). +1 = first base in transcript; one that gets a cap.

Note: in some cases, the position of bases is counted along the sense strand from the start of translation. If it is done this way, the A in the first AUG is +1. However, numbering is assumed to be from the start of transcription unless specified otherwise.
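The numbering rules above can be sketched in code. This is a minimal illustration, assuming that "start" (zero) is the boundary just before the first transcribed base, so that no base itself carries the number 0. The function names and the example index are hypothetical.

```python
def relative_position(index, first_transcribed_index):
    """Convert a 0-based index along the sense strand to +/- numbering.

    The first transcribed base is +1 and the base just upstream is -1.
    Assumption: no base is numbered 0 (zero marks the start point itself).
    """
    offset = index - first_transcribed_index
    return offset + 1 if offset >= 0 else offset


def format_position(position):
    """Write a position with an explicit sign, e.g. +10 or -12."""
    return f"{position:+d}"


FIRST_TRANSCRIBED = 30  # hypothetical index of the base that gets the cap

for index in (5, 18, 29, 30, 39):
    print(index, format_position(relative_position(index, FIRST_TRANSCRIBED)))
```

With these hypothetical indices, a base 25 positions before the first transcribed base comes out as -25, which is about where a TATA box would sit.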

b. Core Promotor. Includes

(1). Actual point for start of transcription (where bent arrow is)

(2). Binding sites: Part where binding of basal TF's and RNA polymerase starts -- often includes a short sequence called a TATA box, close to the start (usually about 25 bases before start).

(3). Additional Features:  Core promotor is defined by what you need to allow RNA polymerase to start in the right spot. Includes some additional sequences besides those specified. (If you are interested in details, see Becker 21-13, b, or 23-21)

2. Proximal Control Elements (Proximal = Near).

a. Location: Near core promotor and start of transcription; usually "upstream" (on 5' side of start of transcription). Usually includes regulatory elements up to -100 or -200 (bases).

b. Terminology: Sometimes considered part of core promotor.

c. Function -- binding of appropriate proteins promotes or inhibits transcription. Identified by effects of deletions. Sequence and mechanism of action varies. 

3. Distal Control Elements (Distal = Far)

a. Two kinds: Enhancers & silencers. These control elements can decrease (silencers) or increase transcription (enhancers).

b. These can be quite far from the gene they control (in either 5' or 3' direction = upstream or downstream)

c. These can work in both orientations -- Inverting them has no effect, unlike with promotors. See Becker fig. 23-22 (21-22).

d. Mechanism of action -- bind TF's; see below.

4. Terminology & Misc. Details -- this is for reference; may not be discussed in class.

a. Boxes = short sequences that are found in regulatory regions (ex: TATA box)

b. Consensus sequences = sequence containing the most common base found at each position for all sequences of that type. Any individual version of sequence is likely to be different from the consensus at one or more positions. (Ex: TATAAAA = consensus sequence for TATA box. Means T is most common base in first position, A is most common in second position, etc.)

c. For multicellular organisms, the term "operator" is not used for the site/DNA sequence where a regulatory protein sits. Why? Because there are no polycistronic mRNA's & no operons in higher eukaryotes. (There are some in unicellular euk.)
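The definition of a consensus sequence in 4b can be sketched in code: take the most common base at each position of a set of aligned sequences. The example sequences are made up for illustration, and the function name is hypothetical.

```python
from collections import Counter


def consensus(sequences):
    """Return the most common base at each position of aligned sequences.

    All sequences must have the same length. If two bases tie at a position,
    the one seen first in that column is used (an arbitrary choice).
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned and of equal length")
    result = []
    for column in zip(*sequences):
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Made-up TATA-like sequences; each differs from the consensus at one position.
examples = ["TATATAA", "TATAAAT", "CATAAAA", "TATAAGA", "TTTAAAA"]
print(consensus(examples))
```

In this made-up set no individual sequence matches the consensus exactly, which is the point made in 4b.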

    C. Details of regulatory proteins or TF's = transcription factors

1. Basal TF's. Needed to start transcription in all cells. See Purves fig. 14.12 (14.14) or Becker fig. 21-14 (19-14).

a. Many basal TF's needed.

b. Basal TF's for RNA pol. II.

(1). Terminology: Basal TF's for pol II are called TFIIA, TFIIB, etc.

(2). Major one is TFIID; it itself has many subunits. Most studied subunit is TBP (TATA binding protein -- See Becker fig. 21-15 (19-15).) Recognizes TATA box when there is one.

(3). Other polymerases have TF's too, but TF's for pol II are of major interest, since pol II makes mRNA.

c. Basal TF's bind first to core promotor, and then RNA pol binds to them. Takes a lot of proteins to get started. RNA polymerase does not bind directly to the DNA.

2. Regulatory or Tissue Specific TF's --  used only in certain cell types or at certain times. See Becker fig. 23-24 (21-24).

a. Bind to areas outside the core promotor -- usually to enhancers or silencers (distal control elements) but sometimes to proximal control elements

b. When regulatory TF's bind, can decrease or promote transcription.

(1). Activators. TF's called activators if bind to enhancers and/or increase transcription.

(2). Repressors. TF's called repressors if bind to silencers and/or decrease transcription.

c. How regulatory TF's affect transcription: DNA thought to loop around so silencer/enhancer is close to core promotor.  TF's on enhancer help stabilize (or block) binding of  basal TF's directly or indirectly to core promotor. (See Becker fig. 23-23 (21-23) or Purves 14.13 (14.15) and section on regulatory TF's below.)

d. Role of Co-activators -- Proteins that bind to TF's on the enhancer and influence transcription (but don't bind directly to the DNA) are often called co-activator (or co-repressor) proteins. There are 2 ways co-activators affect transcription:

(1). Connect two parts of the transcription machine -- Bind to TF on enhancer and to basal transcription factors (or pol II) at core promotor.

(2). Modify state of chromatin. Bind to TF on enhancer and loosen up chromatin in gene to be transcribed. Remodeling proteins and histone modifying enzymes are included in this category.

To review gene structure & TF's, try problems 4R-2, 4R-5A & 4R-6A.

e. Co-ordinate control.  A group of genes can all be turned on or off at once in response to the same signal (heat shock, hormone, etc.).

(1). Prokaryotes vs. Eukaryotes: Both prok. and euk. exhibit co-ordinate control, but mechanism is different. (See table below.)

(2). Location:

(a). In prokaryotes, coordinately controlled genes are located together in operons.

(b). In eukaryotes, coordinately controlled genes do not need to be near each other -- they just have to have the same control elements. See Purves 14.14 (14.16). 

(3). Common control elements: All genes turned on in the same cell type and/or under the same conditions have the same control elements -- therefore these genes all respond to the same regulatory TF's. Result is multiple mRNA's, all made in response to the same signal(s).

(4). Comparison of situation in prokaryotes vs eukaryotes: 


                                 Prokaryotes                 Multicellular Eukaryotes

Co-ordinately controlled         Located together            Not necessarily near each other;
genes are                        in operons                  share the same control elements

Messenger RNA is                 Polycistronic               Monocistronic
                                 (1 mRNA/operon)             (1 mRNA/gene)

Control elements are found       Once per operon             Once per gene

Control can be positive or       Negative -- repressors      Positive -- activators
negative but is more often       needed to turn gene off     needed to turn gene on
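Co-ordinate control in eukaryotes (genes scattered, but sharing control elements) can be sketched as a lookup. This is a simplified illustration that considers activators only. All gene names, control element names, and TF names are hypothetical.

```python
# Hypothetical genes and the control elements each one carries.
GENE_CONTROL_ELEMENTS = {
    "gene_A": {"heat_shock_element"},
    "gene_B": {"heat_shock_element", "hormone_response_element"},
    "gene_C": {"hormone_response_element"},
    "gene_D": set(),
}

# Hypothetical regulatory TF's (activators) and the element each recognizes.
TF_TARGETS = {
    "heat_shock_TF": "heat_shock_element",
    "hormone_receptor": "hormone_response_element",
}


def genes_turned_on(active_tfs):
    """List the genes carrying at least one element bound by an active TF."""
    bound_elements = {TF_TARGETS[tf] for tf in active_tfs}
    return sorted(
        gene
        for gene, elements in GENE_CONTROL_ELEMENTS.items()
        if elements & bound_elements
    )


print(genes_turned_on({"heat_shock_TF"}))
```

Nothing in the sketch depends on where the genes are located. Only the shared control elements matter, which is the contrast with operons.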

f. Structure & Function of regulatory TF's is modular

(1). Each TF has multiple domains.

(a). Each TF has a transcription regulation domain (also called trans acting domain or in many cases transcription activating domain) -- determines effects of DNA binding by given TF (activation vs inhibition of transcription)

(b). Each TF has a DNA-binding domain -- specific for particular sequence(s) and/or gene(s)

(c). TF's that are hormone receptors also have a hormone-binding domain.

(d). TF may have additional domains, such as dimerization domain. Many TF's must dimerize to work. (Monomer is inactive.) Some form dimers with other molecules of the same protein (result is a homodimer) and some form dimers with a different protein (result is a heterodimer). 

(2). Modules (domains) can be switched -- Recombinant DNA methods can be used to make hybrid TF's. This has allowed us to figure out what domain does what, and has many uses in research. Some examples are in later problem sets (for example, 6-19). 

(3). How do regulatory TF's act?

     (a) By binding to control elements with their DNA-binding domains, and

     (b) By binding to other proteins with their transcriptional regulation domains. "Other proteins" can be basal TF's, other regulatory TF's, or co-activator or co-repressor proteins.

(4). Types of DNA-binding domains. These are often classified by the structural motifs in the DNA binding region. Some common motifs (secondary structural elements) found in DNA binding domains are listed below for reference only (FYI). For pictures, see Purves 14.15 (or p. 273 in 6th ed.) or Becker fig. 23-25 (21-25), use Google images, or the Pubmed bookshelf.

  • Zinc finger

  • Leucine zipper

  • Helix-loop-helix

  • Helix-turn-helix.
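The modular structure of regulatory TF's, and the domain switching described in (2), can be sketched with a small data structure. This is an illustration only: it models just two domains per TF, and the class, function, and TF names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptionFactor:
    name: str
    dna_binding_target: str  # control element recognized by the DNA-binding domain
    regulation_effect: str   # "activate" or "repress" (transcription regulation domain)


def swap_domains(binding_donor, regulation_donor):
    """Build a hybrid TF: DNA-binding domain from one TF, regulation domain from another."""
    return TranscriptionFactor(
        name=f"{binding_donor.name}/{regulation_donor.name} hybrid",
        dna_binding_target=binding_donor.dna_binding_target,
        regulation_effect=regulation_donor.regulation_effect,
    )


def effect_on_gene(tf, gene_control_elements):
    """Effect of a TF on a gene: its regulation effect if it can bind, else none."""
    if tf.dna_binding_target in gene_control_elements:
        return tf.regulation_effect
    return "none"


# Hypothetical TF's and a hypothetical gene with one control element.
activator = TranscriptionFactor("TF_X", "element_X", "activate")
repressor = TranscriptionFactor("TF_Y", "element_Y", "repress")
hybrid = swap_domains(binding_donor=activator, regulation_donor=repressor)

print(effect_on_gene(hybrid, {"element_X"}))
```

The hybrid binds wherever the DNA-binding donor binds but has the effect of the regulation-domain donor. That separation is what lets a hybrid show which domain does what.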

To review transcription, try problems 4R-5 and 4R-6A.

V. Domains & Motifs in General  -- This section is included for reference

A. Terminology: The terms "domain" and "motif" are sometimes used interchangeably. Strictly speaking they are different -- A domain is considered a unit of tertiary structure, while a motif is considered a unit of secondary structure.

1. Domains. A domain refers to a discrete, locally folded unit of tertiary structure. It usually has a particular function (DNA binding, kinase, intracellular, etc.) and it may contain one or more motifs or unusual structures.   Domains are sometimes referred to by their functions (such as DNA binding) or by their structures (for example, SH2). It is assumed that domains of common structure have common functions, but not necessarily vice versa. (All DNA binding domains do not have the same structure -- see above.) Domains are usually named after the first protein in which they are found.

2. Motifs. A motif refers to a region with a particular combination of secondary structural elements such as alpha helices, beta sheets, loops, turns, etc. A certain motif (such as a zinc finger) always has the same 3D structure.

B. Significance of Domains 

1.  Common structure implies common function.  Existence of domain(s) of known function in a new protein is often used to deduce function(s) of the newly discovered genes/proteins. This is an important tool in analysis of results from the human genome project.

2. Common structure implies common origin. (FYI)

a. Same domains turn up again and again in different proteins (in different combinations). How did same domain end up in different places?

b. Domains & exons: Often one domain (in protein) corresponds to one exon (or group of exons) in DNA

c. Origin: Modular nature of proteins & genes (see a & b above) implies new genes may arise from reshuffling of old modules, not only from duplication and divergence of entire genes. One possible mechanism for reshuffling of exons & domains is as follows:

Two nonhomologous genes pair up (by mistake). Recombination occurs in introns → new combination of exons → protein with new combination of preexisting functional modules (domains) encoded by separate exons.

VI. Regulation at Splicing -- Alternative Splicing -- this example will be done in #11

    A. There are two ways to get a collection of similar proteins

1. Gene families -- multiple, similar genes exist due to duplication and divergence of genes. Example: the globin genes constitute a family. Different family members code for myoglobin, beta-chains, alpha-chains, delta-chains, etc.

2. Alternative splicing etc (See C below)  -- only one gene, but primary transcript spliced in more than one way.

    B. An example of alternative splicing -- see handout 10 B and Becker fig. 23-31 (21-31) -- how to get either soluble or membrane-bound antibody from alternative splicing of the same transcript. (See Purves 14.20 for another example.)

1. Antibody can be membrane bound or secreted. Fate of antibody depends on whether peptide has a stop transfer sequence or not.

a. If has stop transfer: locks into membrane of ER and goes thru Golgi etc. to plasma membrane; stays in membrane.

b. If has no stop transfer: enters lumen of ER, goes thru Golgi etc., and then is secreted.

2. Gene has two alternative polyA addition sites. Which one is used determines final location of protein. 

a. Option 1: If one (at end of exon 4) is used, protein contains no hydrophobic stop transfer sequence, and protein is secreted.

b. Option 2: If other one (at end of exon 6) is used, protein contains hydrophobic sequence encoded by exons 5 & 6, and protein stays in plasma membrane.

3.  mRNA can be spliced and/or poly A added in two alternate ways. Location of protein (antibody) depends on whether splicing of intron 4 or poly A addition happens first. Think of it as a competition. Either

a.  Poly A adding enzymes get there before the spliceosome. In that case, poly A is added to site near end of exon 4, and rest of intron 4 (and rest of gene) is never transcribed, or

b. The spliceosome gets there first. In that case, intron 4 is transcribed and spliced out before poly A can be added. (In this case, poly A is added at the end of exon 6 instead.)

4. Why are 2 forms of antibody needed?

a. Membrane-bound form of antibody: serves as receptor for antigen = trap to detect when antigen is present. Binding of antigen (ligand) to antibody (receptor) serves as trigger to start secreting antibody.

b. Secreted (soluble) form: binds to soluble antigen in body fluids and triggers destruction of antigen in multiple ways.
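The link between poly A site choice and the fate of the antibody (items 2 and 3 above) can be sketched as follows. This is a simplified illustration, assuming a gene with six exons numbered as in the outline. The names are hypothetical, and details of the real gene are omitted.

```python
EXONS = ["exon1", "exon2", "exon3", "exon4", "exon5", "exon6"]

# Per the outline, the hydrophobic stop transfer sequence is encoded by exons 5 & 6.
STOP_TRANSFER_EXONS = {"exon5", "exon6"}


def mature_mrna_exons(poly_a_site):
    """Exons present in the mRNA for a given poly A addition site."""
    if poly_a_site == "end_of_exon4":
        # Poly A added before intron 4 is spliced: message ends with exon 4.
        return EXONS[:4]
    if poly_a_site == "end_of_exon6":
        # Intron 4 spliced out first: message includes all six exons.
        return EXONS[:]
    raise ValueError(f"unknown poly A site: {poly_a_site}")


def antibody_fate(poly_a_site):
    """Secreted or membrane-bound, depending on the stop transfer sequence."""
    exons = mature_mrna_exons(poly_a_site)
    has_stop_transfer = any(exon in STOP_TRANSFER_EXONS for exon in exons)
    return "membrane-bound" if has_stop_transfer else "secreted"


for site in ("end_of_exon4", "end_of_exon6"):
    print(site, "->", antibody_fate(site))
```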

    C. The general Principle -- You can get many different proteins from a single gene by the processes listed below. Therefore biologists are interested in proteomics (study of all proteins made in a cell) not only genomics (study of entire DNA or gene content).

1. Starting transcription at different points

2. Ending transcription (adding poly A) at different points

3. Splicing out different sections (exons as well as introns) of the primary transcript -- alternative splicing.

 To review regulation & alternative splicing, try problems 4-12 & 4-13, & 4R-6.

Next Time: Regulation at translation and afterwards. We'll Finish Regulation & then start Signaling.