C2006/F2402 '04 OUTLINE OF LECTURE #9

Handouts (not on web): 8A & 8B from last time (nucleosome structure) & 9 (Regulatory Elements & Picture of a typical Eukaryotic Gene)

I. Histones & Details of Chromatin Structure

A. Structure of Chromatin -- how are proteins and DNA arranged? The evidence.

1. Appearance in EM -- low salt ("beads on a string" appearance) vs physiological salt (30 nm fiber). See Becker fig. 16-17 or Purves fig. 9.7 (9.8).

2. Results of treatment with DNase (micrococcal nuclease; cuts exposed DNA at random; does not cut at specific base sequences.)

a. Treat chromatin with a little mic. nuclease; then remove protein and isolate the DNA, then run DNA on a gel. Get 200 Base Pair (BP) "ladder" = sequences of multiples of 200 BP on gel. Implies repeating structure ("bead") with about 200 BP per repeat; DNA exposed (so easily cut) about once per 200 BP. (See Becker fig. 16-18 & 16-19.)

b. More treatment with nuclease --> resistant core of 140-145 BP. Implies part of 200 BP repeat is relatively protected in/on core of "bead"; rest goes between beads and is more exposed.

3. Histones

a. 5 types: H2A, H2B (slightly lys rich), H3, H4 (arg rich) & H1 (lys rich). All relatively small, basic proteins.

b. All histones are highly conserved evolutionarily (this implies histones carry out a critical function that depends on a particular structure -- can't change structure much without ruining function).

c. How much per 200 BP of DNA? 2 molecules each of H2A, H2B, H3, H4 plus one molecule of H1.

d. Low salt removes H1.

B. Model for Chromatin (Beads on string level) -- nucleosomes (See small picture in '91 article (8A) and Becker fig. 16-20 or Purves fig. 9.7). Where is the protein relative to the DNA?

1. Octamer of 2 each of H2A, H2B, H3, H4 (+ some DNA) = one bead.

2. DNA wound 2X around (on outside of) each octamer -- protects core.

3. Linker DNA between cores = 50-60 BP; most exposed & most sensitive to nuclease used above.

4. H1 is on outside of DNA/octamer

5. Nucleosome = repeating unit = 200 BP DNA + octamer (H1 optional).
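
The numbers in A and B fit together arithmetically. A minimal sketch, assuming an idealized repeat of exactly 200 BP and a core of 145 BP (actual values vary); the function names are illustrative only:

```python
REPEAT_BP = 200   # approximate DNA per nucleosome repeat (from the ladder)
CORE_BP = 145     # nuclease-resistant core (upper end of the 140-145 BP range)

def ladder_sizes(n_bands, repeat_bp=REPEAT_BP):
    """Fragment sizes expected after light micrococcal nuclease digestion."""
    return [repeat_bp * k for k in range(1, n_bands + 1)]

def linker_length(repeat_bp=REPEAT_BP, core_bp=CORE_BP):
    """DNA left between cores once the protected core is subtracted."""
    return repeat_bp - core_bp

print(ladder_sizes(5))               # [200, 400, 600, 800, 1000]
print(linker_length())               # 55
print(linker_length(core_bp=140))    # 60
```

Subtracting the protected core from the repeat gives 55-60 BP, in line with the 50-60 BP of linker DNA quoted in B.3.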

C. Nucleosomes & Higher Levels of Structure (requires H1) -- how does chromatin fold up? See big picture in '97 article (8B) -- and books for detailed pictures. (Becker fig. 16-21 or Purves fig. 9.7 (9.8 or 14.3)) Stages of folding are as follows (details in table):

1. Nucleosomes. DNA + histones --> Chain of nucleosomes; about 1/7 original length of DNA (see table below).

2. 30 nm fiber. Chain of nucleosomes folds back on itself (supercoils) --> 30 nm fiber (sometimes called solenoid). Exact structure unclear. Fiber is about 3 beads across; 6 beads/turn = 1/42 length of DNA. Need tails (of histones) and H1 to form 30 nm fiber. (See '97 article.)

3. Loops. 30 nm fiber --> loops about 300nm in diameter (1/750 orig. length). Different sections may be tighter or looser. Individual loops are stretched out (probably to beads-on-a-string stage) when actually transcribed.

4. Higher Orders of Folding. Looped structure folds further --> --> heterochromatin (not transcribable)

a. Form structures/fibers about 700 nm across (per chromatid)

b. At metaphase = tightest = 1/15,000-1/20,000 orig. length (Chromosome is 4-5 microns long but contains 75 mm of double helical DNA)

D. Summary of States of Folding -- compare to handout 8B.

Structure	Compaction relative to previous	Packing Ratio*	Diameter	H1 Needed?
DNA	none	1	2 nm	--
Nucleosomes -- beads on a string	7X	7 (1/7th length of DNA)	10-11 nm	no
30 nm fiber	6X	42	30 nm	yes
loops	15-20X	750	300 nm	yes
heterochromatin (metaphase)	20X	15,000-20,000	700 nm (per chromatid)	yes

* Packing ratio = length of DNA or DNA/protein complex relative to original length of DNA.
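
The packing ratios in the table are the running product of the stepwise compactions. A minimal sketch, assuming 18X as a midpoint of the 15-20X range given for loops (that midpoint and the function names are illustrative choices, not from the handout):

```python
# Fold compaction at each stage, relative to the previous stage (from the table).
# 18 is an assumed midpoint of the 15-20X range given for loops.
STAGES = [
    ("nucleosomes", 7),
    ("30 nm fiber", 6),
    ("loops", 18),
    ("metaphase chromosome", 20),
]

def cumulative_packing(stages):
    """Multiply the stepwise compactions to get the overall packing ratio."""
    ratio = 1
    result = {}
    for name, fold in stages:
        ratio *= fold
        result[name] = ratio
    return result

def packing_ratio(dna_length_m, structure_length_m):
    """Length of extended DNA divided by length of the folded structure."""
    return dna_length_m / structure_length_m

print(cumulative_packing(STAGES))
# Metaphase check from the text: 75 mm of DNA in a chromosome 4-5 microns long.
print(packing_ratio(75e-3, 5e-6), packing_ratio(75e-3, 4e-6))
```

With these inputs the running products are 7, 42, 756 and 15,120, close to the rounded values in the table. The metaphase check gives roughly 15,000 to 19,000, matching the 1/15,000-1/20,000 figure in C.4.b.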

E. Modifications of histones -- helps tighten up or loosen chromatin

1. Phosphorylation of H1 occurs in M; changes in kinase and phosphatase activity affect state of histones and folding of chromatin in parallel with changes in lamins.

2. Acetylation of lys side chains of histones may serve regulatory purpose. Acetylation of histones --> more active, looser chromatin (see '91 article). Acetylation of H3 & H4 is higher in active chromatin.

3. Other modifications, such as methylation, occur and may serve regulatory purposes as well. (Both DNA and histones can be methylated.)

To review nucleosome structure, try problems 4-1, 4-3 A, 4-4 A & 4-8 A.

II. Does Genetic Activity Correlate with States of Chromatin? How to test?

A. Polytene chromosomes (See Becker Ch. 21 esp. fig. 21-15 & 21-16) Special Interesting Case. Not covered in lecture; is here FYI.

1. Advantages:

a. Special case where many chromatids -- about 1000 -- are lined up together with genes and loops of chromatin aligned; therefore can see differences in state of folding in light microscope.

b. Can compare changes in folding of chromatin and level of transcription (by incorporation of labeled U) in a single tissue as it responds to hormones that alter gene expression.

2. Disadvantage: Can't compare regions of active/inactive chromatin from many dif. tissues.

3. Results: Active (transcribed) regions are clearly looser -- individual loops stretched out more. (See Becker )

B. Results of treatment with DNase I (of ordinary chromosomes/chromatin)

1. The problem -- Not all euchromatin is transcribed in any one cell -- is it all folded to the same degree or not?

a. All (eu)chromatin looks about the same in the light microscope. Therefore indirect methods (such as DNase treatment) are necessary to test state of folding.

b. Use of DNases. States of folding of (eu)chromatin are often distinguished by effects of treatment with various types of DNase. State of (eu)chromatin will determine relative sensitivity of DNA (while still in chromatin) to degradation by various DNases. DNA that is in tighter areas of chromatin will be more protected from degradation. (Some examples of this will be discussed below and are in problem sets.)

2. Method -- see Becker fig. 21-17. Look at chromatin region containing a particular gene, say globin genes, from dif. tissues. (Note: this region is euchromatic in interphase in all these tissues, even though it is not expressed in all of them. No gross difference is visible; that's why indirect methods are necessary.)

a. Overall idea: can't isolate areas of chromatin that are more or less active and then test for which DNA (which genes) are present. Have to do it in reverse. So treat chromatin with DNase first, then fish out known genes and see if they were degraded or not.

b. Details of DNase treatment: Treat chromatin with DNase I (different enzyme from one used previously to get a "ladder"). Then isolate DNA and see what state it's in. (See c.)

(1). Can vary conditions -- amount of enzyme, time, etc. to distinguish various degrees of sensitivity to the enzyme.

(2). Can test chromatin from many different tissues, say erythrocytes (in chickens -- still have nuclei) vs. brain

(3). Can test state of many different genes using diff. probes (in step c below).

(4). DNase I cuts the DNA differently than micrococcal nuclease. DNA that is simply wound around nucleosomes is not totally protected from DNase I -- further folding up is required. So resistance to DNase I is a measure of how tightly the nucleosomes are folded in on themselves. DNase I does not cut preferentially at any particular sequence or at any particular place relative to the position of the nucleosome. Does not give "ladder."

c. Expected result: If chromatin is relatively loose, DNA will be unprotected and readily degraded by DNase I. If chromatin is relatively tight, DNA in that region will NOT be easily degraded.

d. How to measure state of DNA (after treatment of chromatin):

(1). Prepare DNA. Extract DNA (remove histones); treat naked DNA with restriction enzyme (to give pieces of reasonable size IF DNA was protected from DNase I). If DNA was in loose region of chromatin, it will already have been degraded by DNase I.

(2). Find regions corresponding to known genes. Run restriction fragments on gel; do blot, identify regions of interest (say, globin genes) with probe. Only regions from areas with relatively tight chromatin will give clear, undegraded bands on the gel.

3. Results of treatment of euchromatin with DNase I and other nucleases.

a. Almost all eukaryotic DNA is in nucleosomes. (See (c) for exceptions.) How do we know?

(1). Almost all DNA is much more resistant to digestion by DNase I than naked DNA

(2). Almost all DNA forms ladder when treated with micrococcal nuclease.

b. Actually transcribed (coding) DNA is more sensitive to DNase I than ordinary euchromatic DNA. But transcribed DNA is much more resistant than naked DNA -- it is still in nucleosomes.

c. Hypersensitive sites exist -- these are the only sections not in nucleosomes.

(1). Some hypersensitive sites found = very sensitive regions (100X more sensitive to degradation by DNase I than heterochromatin; 10X more sensitive than active euchromatin.)

(2). Hypersensitive sites correspond to sites without nucleosomes

(3). DNA in hypersensitive sites is not naked -- it has other proteins, but no histones. See Becker fig. 21-18.

(4). Hypersensitive sites correspond to regulatory, not coding, regions (in areas of active transcription).

(5). Why are hypersensitive sites so sensitive to DNase? Transcription factors (regulatory proteins) have replaced histones and the other proteins don't protect the DNA as well as histones do.
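
The relative sensitivities quoted in c.(1) can be put on one scale. A minimal sketch, taking heterochromatin as the reference (sensitivity 1). The exponential survival model and the dose units are assumptions made purely for illustration and are not from the lecture:

```python
import math

# Relative DNase I sensitivity, heterochromatin = 1.
# Hypersensitive sites are 100X heterochromatin and 10X active euchromatin,
# so active euchromatin works out to 100 / 10 = 10.
SENSITIVITY = {
    "heterochromatin": 1,
    "active euchromatin": 100 // 10,
    "hypersensitive site": 100,
}

def fraction_intact(state, dose):
    """Toy model (assumed): surviving fraction = exp(-sensitivity * dose)."""
    return math.exp(-SENSITIVITY[state] * dose)

# At one arbitrary dose, the tighter the chromatin, the more DNA survives.
for state in SENSITIVITY:
    print(state, round(fraction_intact(state, dose=0.05), 3))
```

Whatever the real kinetics, the ordering is the point: at a given dose a probe for a hypersensitive region loses its band first, an actively transcribed region next, and heterochromatin last.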

C. Possible States of Interphase Chromatin -- what is inferred from all the results, and the implications. (See table below. )

a. Super Tight = heterochromatin = essentially not uncoiled from mitosis. Genetically inactive.

b. Euchromatin -- 3 major states?

(1). Loose. Looser than heterochromatin, but not being transcribed (genetically inactive) at the moment. 30 nm fiber?

(2). Looser -- but still has nucleosomes. Beads-on-a-string? Stretched out loops? Example: coding regions that are active -- they are actually being transcribed. (May include genes that were transcribed recently, or were next to transcribed regions, etc.)

(3). Loosest or hypersensitive region (missing some histones; has regulatory proteins instead). Example: active promotor or enhancer.

c. A caution: There are probably intermediate states, and the ones above are simply the only ones that have been clearly distinguished using the methods currently available. For example, all inactive euchromatin is probably not the same. See ** below.

D. Summary of States of Interphase Chromatin (for reference):

Interphase Chromatin
Heterochromatin*		Euchromatin
(relatively tight)		(relatively loose)
Constitutive	Facultative	Loose	Looser	Loosest
Always heterochromatic	Sometimes heterochromatic -- depends on time, tissue, etc.	inactive **	actually being transcribed	regulatory region w/ regulatory proteins instead of histones
Never Transcribed	Transcribed Sometimes
				Only region/state not in nucleosomes

* All chromatin in mitosis/meiosis is heterochromatin.
**Most chromatin in interphase is euchromatin even if the DNA is unlikely to be transcribed in that interphase. The "loose" or "inactive" category may include several different states of "transcribability."

To review the differences in states of chromatin, uses of DNase, etc., try the rest of 4-3 to 4-8.

III. Introduction to Regulation of Eukaryotic Gene Expression

A. What has to be done to turn a eukaryotic gene on/off? What steps can be regulated?

1. In prokaryote (for comparison) -- process relatively simple.

a. Most regulation at transcription.

b. Translation in same compartment as transcription; translation follows automatically.

c. Most mRNA has short half-life.

2. In eukaryote -- Most regulation is at transcription, but there are more steps & complications -- more points of regulation. See Becker fig. 21-11.

a. Need to unfold/loosen chromatin before transcription possible. We know that modification of histones, binding of certain non-histone proteins, and/or methylation of DNA is correlated with state of folding. Not clear what is primary cause and what is secondary effect of unfolding. Two possible models (most current data favors the first):

(1). Two Step Model for regulation. See Becker fig. 21-10.

(a). First must de-condense (loosen up) euchromatin to a transcribable state = loose (compared to heterochromatin and compared to inactive euchromatin). Pull out 30nm fiber to beads-on-a-string stage?

(b). Then add transcription factors (more or less as for prokaryotes) --> actual transcription

(1). Regions with transcription factors = nucleosomes removed = hypersensitive sites

(2). Regions being transcribed = nucleosomes are somehow "loosened up" or "remodeled" but not removed.

(c). Proteins responsible for removing and/or loosening up nucleosomes are called remodeling proteins.

(2). One Step Model -- addition of TF's is primarily what opens up the chromatin -- don't need a separate remodeling step first. Alternatively, both unfolding/decondensation and addition of TF's occur simultaneously.

b. Transcript must be processed (capped, spliced, polyadenylated, etc.) -- any of these steps can be regulated, and there is more than one way to process most primary transcripts. (An example will be discussed next time.)

c. Transcription & translation occur in separate compartments.

(1). mRNA must be transported to cytoplasm.

(2). Translation can be regulated (independently of transcription) -- can control usage of mRNA, not just supply of mRNA. (An example will be discussed next time.) For any particular mRNA, can regulate 1 or both of following:

(a). rate of initiation of mRNA translation (how often ribosomes attach and start translation)

(b). rate of degradation of mRNA

IV. Major features of gene regulation in Eukaryotes

A. How can amount of protein be controlled? If cell makes more or less protein, which step(s) are regulated? Many possible points of regulation in eukaryotes. See Becker fig. 21-11 and list of steps above.

1. Most common point of regulation is at transcription (in both euk. & prok.) If you need more protein, usually make more mRNA.

2. Transcription is not the only step controlled, especially in euk. (Some examples will be discussed later.)

B. If cells make different proteins, how is that controlled? If two eukaryotic cells (from a multicellular organism) make different proteins, what is (usually) different between them? Is the difference this time at transcription?

1. Is DNA different? (No, except in immune system.)

2. Is state of chromatin usually different? (Ans: yes) How is this tested? Method & result described above. See figure 21-17 in Becker. What causes the difference in states of chromatin? Not clear what is cause and what is effect.

3. Is mRNA usually different? (Ans: yes). This means you can get tissue specific probes from a cDNA library. (cDNA library = collection of all cDNA's from a particular cell type.) DNA from each cell type is the same; mRNA and therefore cDNA is not.

4. If mRNA's are different, why is that? Is the difference due to differences in transcription? (Ans: yes, but not entirely.)

a. Transcription is different in different cells. It could be that all cells transcribe all genes, but only some RNA's are exported to the cytoplasm and the remaining nuclear RNAs are degraded. This is not the case. Only selected genes are transcribed in each cell type, and RNA's from those genes are processed to make mRNA.

b. Splicing and processing of same primary transcripts can be different (in different cells or at different times). Different proteins can be produced from the same gene by alternative splicing and/or poly A addition.

To review possible steps in regulation, try problems 4-9 to 4-11.

V. Details of transcription in eukaryotes (vs. prokaryotes) See Becker Ch 19, pp 640-644 (651-655).

A. More of everything needed for transcription in eukaryotes.

1. Multiple RNA Polymerases (see last lecture).

2. More Regulatory Sequences & Regulatory Proteins -- An Overview & Some terminology

a. Control elements/sequences -- Two types: cis & trans acting.

Cis acting = a DNA sequence

Trans acting = codes for a protein that binds to a DNA sequence. (The term "trans acting" can be used to refer to the protein or to the gene for the protein.)

In euk. the # of different types of cis and trans acting control elements is larger.

Type of control element	cis acting	trans acting
Works on or affects	only the DNA on which it occurs	all copies of target DNA's in the cell
Examples	promotor, operator, or enhancer	[gene for] repressor or activator protein or TF
Control element is	binding site for a regulator protein	[gene for] a regulator protein.

b. Regulatory Proteins -- Two Types. Regulation can be "+" or "-" depending on the function of the protein. Negative control seems to be more common in prok.; positive control in euk.

Type of Control	positive	negative
Type of Regulatory Protein	activator protein	repressor protein
Regulatory Protein needed to	turn on transcription	turn off transcription
Effect of loss of Regulatory Protein	No transcription	Constitutive Transcription
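
The positive/negative table reduces to a two-line rule. A minimal sketch; the function name and arguments are illustrative, and real genes often combine several regulators:

```python
def is_transcribed(control_type, regulator_bound):
    """Return True if the gene is transcribed.

    control_type: "positive" (activator needed to turn transcription on)
                  or "negative" (repressor needed to turn it off).
    regulator_bound: True if a functional regulatory protein is bound.
    """
    if control_type == "positive":
        return regulator_bound
    if control_type == "negative":
        return not regulator_bound
    raise ValueError("control_type must be 'positive' or 'negative'")

# Loss of the regulatory protein (e.g. by mutation) means it is never bound.
print(is_transcribed("positive", regulator_bound=False))  # False: no transcription
print(is_transcribed("negative", regulator_bound=False))  # True: constitutive
```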

B. Details of regulatory sites in the DNA. Prokaryotes have promotors and operators. What sequences do eukaryotes have in the DNA that affect transcription? (The following discussion refers mostly to regulation of transcription by RNA pol II. See texts esp. Becker for details about promotors etc. for pol I & III.) See Purves Fig. 14.15 (14.17 in 5th ed.) or Becker fig.21-21 or handouts for structure of regulatory sites for a typical protein coding gene. Three types of regulatory sites:

1. Core Promotor

a. Structure. Usually divided into

(1). "Start" = Point where transcription actually begins (usually marked with bent arrow)

(2). Part where basal TF's and RNA polymerase binding starts -- often includes short sequence called a TATA box. Close to start (usually about 25 bases before start).

b. Numbering. Position of bases is counted from the start of transcription. For example:

(1). +10 = 10 bases downstream from start = 10 bases after start of transcription. (Downstream = Going toward the 3' end on sense strand = in direction of transcription)

(2). -12 = 12 bases upstream from start = 12 bases before reaching start of transcription. (Upstream = Going toward 5' end on sense strand = in opposite direction from transcription)

(3). +1 = first base in transcript; one that gets a cap.
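
The numbering scheme can be written out explicitly. A minimal sketch, assuming the usual convention that there is no position 0 (the base just before +1 is -1); the index values are hypothetical:

```python
def relative_position(index, start_index):
    """Convert a 0-based index on the sense strand to +/- numbering.

    The first transcribed base is +1 and the base just before it is -1;
    by the usual convention there is no position 0.
    """
    offset = index - start_index
    return offset + 1 if offset >= 0 else offset

def format_position(pos):
    """Write downstream positions with an explicit plus sign."""
    return f"+{pos}" if pos > 0 else str(pos)

start = 30  # hypothetical index of the first transcribed base
print(format_position(relative_position(30, start)))  # +1
print(format_position(relative_position(39, start)))  # +10
print(format_position(relative_position(18, start)))  # -12
print(format_position(relative_position(5, start)))   # -25, typical TATA box region
```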

2. Proximal Control Elements (Proximal = Near).

a. Location: Near core promotor and start of transcription; usually "upstream" (on 5' side of start of transcription.) Usually includes regulator elements up to -100 or -200 (bases).

b. Terminology: Sometimes considered part of core promotor.

c. Function -- binding of appropriate proteins promotes or inhibits transcription. Identified by effects of deletions. Sequence and mechanism of action varies.

3. Distal Control Elements (Distal = Far)

a. These control elements can decrease (silencers) or increase transcription (enhancers).

b. These can be quite far from the gene they control (in either 5' or 3' direction) and work in both orientations (see Becker fig. 21-22), unlike promotors.

c. Mechanism of action -- DNA thought to loop around and silencer/enhancer helps stabilize (or block) binding of TF's directly or indirectly to core promotor. (See Becker fig. 21-23 or Purves 14.15 (14.17) and section on regulatory TF's below.)

4. Terminology & Misc. Details -- this is for reference; may not all be discussed in class.

a. Boxes = short sequences that are found in regulatory regions (ex: TATA box)

b. Consensus sequences = sequence containing the most common base found at each position for all sequences of that type. Any individual version of sequence is likely to be different from the consensus at one or more positions. (Ex: TATAAAA = consensus sequence for TATA box)

c. For multicellular organisms, term "operator" is not used for site/DNA sequence where a regulatory protein sits. Why? Because no polycistronic mRNA & no operons in higher eukaryotes. (There are some in unicellular euk.)
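
The definition of a consensus sequence in 4.b is a simple column-by-column count. A minimal sketch; the five sites below are invented for illustration and are not real promotor sequences:

```python
from collections import Counter

def consensus(aligned_sites):
    """Most common base at each position of equal-length aligned sequences."""
    if len({len(s) for s in aligned_sites}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*aligned_sites)
    )

def mismatches(site, cons):
    """Number of positions at which a site differs from the consensus."""
    return sum(a != b for a, b in zip(site, cons))

# Hypothetical TATA-like sites, invented for illustration only.
sites = ["TATAAAA", "TATATAA", "TATAAAT", "CATAAAA", "TATAAGA"]
cons = consensus(sites)
print(cons)                                  # TATAAAA
print([mismatches(s, cons) for s in sites])  # [0, 1, 1, 1, 1]
```

Four of the five made-up sites differ from the consensus at one position, which illustrates the point that an individual sequence usually does not match the consensus exactly.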

C. Details of regulatory proteins or TF's = transcription factors

1. Basal TF's. Needed to start transcription in all cells. (See Purves fig. 14.14 (14.16) or Becker fig. 19-14.)

a. Many basal TF's needed.

b. People usually are interested in basal TF's for RNA pol. II. (since pol II --> mRNA)

(1). These are called TFIIA, TFIIB, etc.

(2). Major one is TFIID; it itself has many subunits. Most studied subunit is TBP (TATA binding protein -- See Becker fig. 19-15.) Recognizes TATA box when there is one.

c. Basal TF's bind first to core promotor, and then RNA pol binds to them. Takes a lot of proteins to get started. RNA polymerase does not bind directly to the DNA.

2. Regulatory or Tissue Specific TF's -- used only in certain cell types or at certain times.

a. Bind to areas outside the core promotor -- usually to enhancers or silencers (distal control elements) but sometimes to proximal control elements

b. When regulatory TF's bind, can decrease or promote transcription.

(1). Activators. TF's called activators if bind to enhancers and/or increase transcription.

(2). Repressors. TF's called repressors if bind to silencers and/or decrease transcription.

(3). Co-activators. Proteins that connect TF's to each other (but don't bind directly to the DNA) are often called co-activator (or co-repressor) proteins.

c. Co-ordinate control. A group of genes can all be turned on or off at once in response to the same signal (heat shock, hormone, etc.). These genes do not need to be near each other -- they just have to have the same control elements. See Purves 14.16.

(1). Common control elements: All genes turned on in the same cell type and/or under the same conditions have the same control elements -- therefore these genes all respond to the same TF's. Result is multiple mRNA's, all made in response to same signal(s).

(2). Compare to situation in prokaryotes:

	Prokaryotes	Eukaryotes
Co-ordinately controlled genes are	Linked	Unlinked
mRNA is	polycistronic (1 mRNA/operon)	monocistronic (1 mRNA/gene)
Operons?	yes	no
Control elements are found	once per operon	once per gene
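
Co-ordinate control of unlinked genes amounts to a lookup by shared control element. A minimal sketch; the gene names, element names, and the one-signal-one-element mapping are all hypothetical simplifications:

```python
# Hypothetical genes and control elements, for illustration only.
CONTROL_ELEMENTS = {
    "gene_A": {"heat_shock_element"},
    "gene_B": {"heat_shock_element", "hormone_response_element"},
    "gene_C": {"hormone_response_element"},
    "gene_D": set(),
}

# Each signal is assumed to activate one TF that recognizes one element.
SIGNAL_TO_ELEMENT = {
    "heat shock": "heat_shock_element",
    "hormone": "hormone_response_element",
}

def genes_turned_on(signal):
    """Genes that respond to a signal, wherever they sit in the genome."""
    element = SIGNAL_TO_ELEMENT[signal]
    return sorted(g for g, elems in CONTROL_ELEMENTS.items() if element in elems)

print(genes_turned_on("heat shock"))  # ['gene_A', 'gene_B']
print(genes_turned_on("hormone"))     # ['gene_B', 'gene_C']
```

Nothing in the lookup depends on gene position, and each responding gene gives its own mRNA, as in the Eukaryotes column of the table.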

d. Structure & Function of regulatory TF's is modular

(1). Each TF has multiple domains.

(a). Each TF has a DNA-binding domain -- specific for particular sequence(s) and/or gene(s)

(b). Each TF has a transcription regulation domain (also called trans acting domain or in many cases transcription activating domain) -- determines effects of DNA binding by given TF (activation vs inhibition of transcription)

(c). TF's that are hormone receptors also have a hormone-binding domain.

(d). TF may have additional domains, such as dimerization domain. Many TF's must dimerize to work. (Monomer is inactive.) Some form dimers with other molecules of the same protein (result is a homodimer) and some form dimers with a different protein (result is a heterodimer).

(2). Modules (domains) can be switched -- Recombinant DNA methods can be used to make hybrid TF's. This has many uses in research; some examples are in the problem sets.

(3). How do regulatory TF's act?
By binding to control elements with their DNA-binding domains and to basal TF's or each other (or to co-activator or co-repressor proteins) with their transcriptional regulation domains.

(4). Types of DNA-binding domains. These are often classified by the shape of the DNA binding region. Some common DNA binding shapes (called structural motifs) are listed below for reference only (FYI). For pictures, see Purves p. 273 (6th ed) or fig. 14.18 (5th ed) or Becker fig. 21-25.

Zinc finger

leucine zipper

helix-loop-helix

helix-turn-helix.
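
The modular idea in d.(1)-(3) can be pictured as swapping fields on a record. A minimal sketch, treating each TF as just a DNA-binding specificity plus a regulatory effect; the factor and element names are hypothetical:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TranscriptionFactor:
    name: str
    dna_binding_site: str   # control element recognized by the DNA-binding domain
    effect: str             # "activate" or "repress", set by the regulation domain

def swap_regulation_domain(dna_binding_donor, regulation_donor):
    """Hybrid TF: DNA-binding domain from one TF, regulation domain from another."""
    return replace(
        dna_binding_donor,
        name=f"{dna_binding_donor.name}/{regulation_donor.name} hybrid",
        effect=regulation_donor.effect,
    )

# Hypothetical factors, invented for illustration only.
tf_x = TranscriptionFactor("TF-X", "element_1", "activate")
tf_y = TranscriptionFactor("TF-Y", "element_2", "repress")

hybrid = swap_regulation_domain(tf_x, tf_y)
print(hybrid.dna_binding_site, hybrid.effect)  # element_1 repress
```

The hybrid still goes to the genes carrying element_1 but now has the effect set by the other factor's regulation domain. Experiments with hybrid TF's use this same logic.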

To review transcription, try problems 4-16 and 4-17 A.

VI. Domains & Motifs in General -- This section is included for reference

A. Terminology: The terms "domain" and "motif" are sometimes used interchangeably. Strictly speaking they are different -- A domain is considered a unit of tertiary structure, while a motif is considered a unit of secondary structure.

1. Domains. A domain refers to a discrete, locally folded unit of tertiary structure. It usually has a particular function (DNA binding, kinase, intracellular, etc.) and it may contain one or more motifs or unusual structures. Domains are sometimes referred to by their functions (such as DNA binding) or by their structures (for example, SH2). It is assumed that domains of common structure have common functions, but not necessarily vice versa.

2. Motifs. A motif refers to a region with a particular combination of secondary structural elements such as alpha helices, beta sheets, loops, turns, etc. A certain motif (such as a zinc finger) always has the same structure.

B. Significance of Domains

1. Common structure implies common function. Existence of domain(s) of known function in a new protein is often used to deduce function(s) of the newly discovered genes/proteins. Important tool in analysis of results from human genome project.

2. Common structure implies common origin. (FYI)

a. Same domains turn up again and again in different proteins (in different combinations). How did same domain end up in different places?

b. Often one domain corresponds to one exon (or group of exons)

c. Modular nature of proteins & genes (see a & b above) implies new genes may arise from reshuffling of old modules, not only from duplication and divergence of entire genes. One mechanism for reshuffling of exons & domains could occur as follows:

Two nonhomologous genes pair up (by mistake). Recombination occurs in introns --> new combination of exons --> protein with new combination of preexisting functional modules (domains) encoded by separate exons.
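
The reshuffling mechanism can be sketched with genes written as lists of exons. This assumes one exon per domain and a single crossover; the gene contents are hypothetical:

```python
def recombine_in_introns(gene_a, gene_b, break_a, break_b):
    """Join the first `break_a` exons of gene_a to gene_b's exons from `break_b` on.

    Genes are lists of exon labels; a break index marks the intron in which
    the crossover falls, so exons themselves are never split.
    """
    return gene_a[:break_a] + gene_b[break_b:]

# Hypothetical genes; each exon is named for the domain it encodes.
gene_1 = ["DNA-binding", "dimerization", "activation"]
gene_2 = ["signal peptide", "hormone-binding", "kinase"]

print(recombine_in_introns(gene_1, gene_2, break_a=1, break_b=1))
# ['DNA-binding', 'hormone-binding', 'kinase']
```

Because the breaks fall in introns, every exon (and so every domain) stays intact; only the combination is new.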