Have you taken BISC505 at USC? This might be helpful for preparing for the final.
Mike's Section.
1) Sequence Assembly
History: When was the DNA helix discovered? 1953.
When was the first free-living organism sequenced?
When was the human sequence announced?
What is shotgun sequencing?
Facts about repeats in the HG sequence.
Why was the HG sequencing difficult?
What were some of the innovations?
Look at the exercise and midterm question.
There are many relevant databases. Be able to name a few databases.
Multiple choice questions (can be expected).
Tim Chen's section: around 3 questions, 1 for each topic.
2) Gene Finding, Proteomics and Genomic Circuits.
1) Gene Finding:
Overview of gene structures in prokaryotes and eukaryotes. Methods of gene finding (the content of the methods). Scoring the signals.
2) Proteomics:
a) Mass spectrometry, which got a lot of time. Exercise: decide whether a given spectrum was generated by the peptides. Applications of MS. Y2H, gene knockout.
Pros and cons of each technique.
3) Genomic Circuits. Spent a lot of time talking about the functions of each module. Also talked about interactions between modules, and interactions involving DNA motifs or transcription factors. Important to know how to draw conclusions and how to design an experiment: I want to study the function of one module, how do I do that? What do you need to design to make sure the inference is correct? With too many variables there is no clean comparison. To study the function of F, compare this guy with F and this guy w/o F. How do you know the alternative translations? Design an experiment to study a particular function.
No memorization. Just general functions. woohoooooooooooooo
Homology search means protein level homology search.
Simon Tavare.
4 questions, two on material up to and including the 2nd midterm.
- Population genetics: Variation in human genome, Chromosome 21. Some things about LD.
One of the midterms. Something general about
- Quantitative side of microarrays
- Signals or motifs.
Techniques:
1) Genescan. That's based on a fancy version of a motif detection algorithm.
2) Principles of R.
3) Motif detection. You need to know what the ideas of the methods are.
4) Microarray things: there was a biological problem, and we came up with a method to solve it. It was because of 1 slide, 1 bad slide. Probably it was saturated or overscanned. Bad from the get-go. That's how it is. All too common. The same applies to Affymetrix chips. Same story. Although they appear to be much better, they are only a little bit better.
We will put you in a much better position in lab.
When you write up arguments that involve quantitative things, you need to explain how the analysis goes. On the basis of more arrays you need to do more experiments. You need to get organized, like with lab notebooks. That becomes especially true. Not all of the spots were what we were told they were. When you find mysterious things, you should look into what went wrong. We did some pretty good things about biology.
8 questions, 15 min each.
1) Clustering example: asks you to compare 3 different clustering methods. There are 2 aspects to any method. The first is what distance you use. The example we were asked about was slightly different from what we did in class, i.e. Hamming distance or edit distance. The reason it's not a good method is evolution: a distance measure should somehow reflect evolutionary time. The measure we use is one meant for clustering rather than for a phylogenetic tree. It is good for species that did not diverge a long time ago. If we were comparing a human sequence to another, it's good.
Which metric do you use?
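For the two distances named above, here is a minimal Python sketch, not anything from the course material. The function names `hamming_distance` and `edit_distance` are my own, and the edit distance assumes unit cost for each substitution, insertion and deletion.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit costs (an assumption of this sketch)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # match or substitute
            ))
        prev = curr
    return prev[-1]
```

Hamming distance only sees mismatches at aligned positions, while edit distance also allows gaps, so it can compare sequences of different lengths.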
Clustering method. Complete linkage clustering (the distance between 2 groups is the biggest pairwise distance), single linkage clustering (the distance between 2 groups is the smallest), average linkage (the distance between 2 groups is the average). These produce different shapes, different trees and different topologies. The last part is to decide what the clusters are. How you decide what the clusters are determines how they work. Some level of clustering, in the sense of how they may be. K-means. A variety of different ways. SRM, SVM. Methods which we did not talk about: clustering both genes and expression patterns simultaneously (no questions on them).
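The three linkage rules above differ only in how they turn the pairwise distances between two groups into one number. A rough Python sketch under that reading follows. The names `group_distance` and `agglomerate` are hypothetical, and the naive merge loop is for illustration only, not the method used in class.

```python
from itertools import combinations
from statistics import mean

# Each linkage rule reduces all pairwise distances between two groups.
LINKAGES = {"single": min, "complete": max, "average": mean}


def group_distance(g1, g2, dist, linkage):
    """Distance between two groups under the chosen linkage rule."""
    return LINKAGES[linkage]([dist(a, b) for a in g1 for b in g2])


def agglomerate(items, dist, linkage, n_clusters):
    """Naive agglomerative clustering: merge the closest pair of clusters
    until only n_clusters remain."""
    clusters = [[x] for x in items]
    while len(clusters) > n_clusters:
        # Find the pair of clusters (i < j) with the smallest linkage distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: group_distance(
                clusters[p[0]], clusters[p[1]], dist, linkage
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Stopping at `n_clusters` is one simple way of deciding what the clusters are; cutting the tree at a distance threshold is another.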
Control spot questions. One thing in particular is to have it normalized properly. There are different methods for normalizing and different methods for background correction. Most microarrays have a lot of spots that are blank, and they are sometimes put in random places. They also put control organism material on the chip. There are actually 4 different organisms other than C. elegans. The last 6 spots were blank. The first 20 were a mixture of E. coli. If you put your E. coli material on the chip it shouldn't hybridize (this does not hold for an evolutionarily nearby organism). They also tell you something about background and they also tell you something about ???. Correction within a slide for background, and then between slides to recenter and rescale the plots. There is no recipe for what to do. More hype than success. Amersham (bought by GE). Nice product.
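To make the two correction steps concrete, here is a small Python sketch. It is only one possible choice, since the notes say there is no recipe. Everything specific in it is my assumption: background is estimated as the median of the blank spots, intensities are log2-transformed, and each slide is recentered on its median and rescaled by its median absolute deviation. All function names are hypothetical.

```python
import math
from statistics import median


def background_correct(intensities, blank_idx, floor=1.0):
    """Within-slide step: subtract the median blank-spot intensity.
    Values are floored so that a log transform stays defined."""
    bg = median(intensities[i] for i in blank_idx)
    return [max(x - bg, floor) for x in intensities]


def recenter_rescale(log_values):
    """Between-slide step: median 0 and unit median absolute deviation,
    so that different slides end up on a comparable scale."""
    center = median(log_values)
    mad = median(abs(v - center) for v in log_values)
    if mad == 0:
        raise ValueError("cannot rescale: no spread in the log values")
    return [(v - center) / mad for v in log_values]


def normalize_slide(intensities, blank_idx):
    """Background-correct one slide, take log2, then recenter and rescale."""
    corrected = background_correct(intensities, blank_idx)
    return recenter_rescale([math.log2(x) for x in corrected])
```

Running `normalize_slide` on every slide separately puts them all on the same center and scale before they are compared.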
Don't need to remember the function names of the R microarray code. Just the basic idea of what R does and the structure of it.