Archive for the ‘bioinformatics’ Category

Saved a few million lives today

March 14, 2009 edwinhere

My professor gave the class an assignment: train a support vector machine (SVM) on gene expression data from 329 children with various types of acute lymphoblastic leukemia.

The children were diagnosed by oncologists at St. Jude Children’s Research Hospital, which is one of the best repair shops for pediatric cancer. We used their diagnoses because oncologists at that hospital make very few mistakes. (BTW, a wrong diagnosis and the resulting wrong treatment for all types of acute lymphoblastic leukemia result in death and a lot of suffering and pain.)

But such expert doctors are not available all around the planet, so a support vector machine is trained on those doctors’ conclusions together with the gene expression data from the cancer cells of those 329 children. The gene expression levels were collected using the Affymetrix GeneChip, which is a DNA microarray.

Once trained, the support vector machine has “learned” to classify patients’ cancers into their subtypes just by looking at the gene expression levels. So we don’t need the best doctors at St. Jude anymore! We can give the software to any mediocre oncologist, and he/she will be able to put some RT-PCRed cDNA from a patient’s cancer cell onto the DNA microarray, and our SVM will look at the data and automatically classify, with great accuracy, which subtype of acute lymphoblastic leukemia the child has.
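For anyone who wants to see the idea in code, here is a minimal sketch, not the code from the assignment. It assumes only two hypothetical subtypes (the real data has more), uses made-up random "expression" values instead of the St. Jude microarray data, and swaps in a plain linear SVM trained with Pegasos-style stochastic sub-gradient descent rather than whatever SVM package the class used. All function names and parameters here are my own inventions for illustration.

```python
import random

def make_samples(rng, n, n_genes=20, n_marker=5, shift=3.0):
    """Synthetic log-expression profiles for two hypothetical subtypes (+1 / -1).

    Assumption: the first n_marker genes are over-expressed in subtype +1.
    """
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        if label == 1:
            for g in range(n_marker):
                row[g] += shift
        X.append(row)
        y.append(label)
    return X, y

def center(X, means):
    """Subtract per-gene means so that no separate bias term is needed."""
    return [[v - m for v, m in zip(row, means)] for row in X]

def train_linear_svm(X, y, lam=0.01, n_iter=5000, seed=0):
    """Pegasos-style stochastic sub-gradient descent on the regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)  # decaying step size
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        w = [(1.0 - eta * lam) * wj for wj in w]  # shrink (regularization)
        if margin < 1.0:  # sample is inside the margin or misclassified
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, row):
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1

rng = random.Random(42)
X_train, y_train = make_samples(rng, 200)  # the "expert-labelled" patients
X_test, y_test = make_samples(rng, 100)    # patients the SVM has never seen
gene_means = [sum(col) / len(col) for col in zip(*X_train)]

w = train_linear_svm(center(X_train, gene_means), y_train)
predictions = [predict(w, row) for row in center(X_test, gene_means)]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The training labels play the role of the oncologists' diagnoses, and the held-out samples play the role of new patients. A real multi-subtype version would need something like one classifier per subtype, plus far more careful validation than this toy.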

The SVM was so accurate that it was even able to point out mistakes made by the St. Jude oncologists in the data they provided for testing the accuracy of the algorithm.

This assignment was just a replication of original research by my professor, and according to his statistics, it saves our university’s teaching hospital 51.6 million USD per year while simultaneously providing 75-80% cure rates.

Finally, I feel morally superior to a lot of faith healers.

Science…It works, Bitches

Categories: bioinformatics

I think I know why Transcription Start Sites are hard to predict

February 7, 2009 edwinhere

Yesterday, Prof. Limsoon Wong taught Dragon Promoter Finder, which is a party trick for predicting the location of Transcription Start Sites (TSSs). Unlike Translation Initiation Sites, TSSs are hard to predict because there is no start codon.

I think I know why TSSs are hard to predict. Experimental evidence shows that most of the DNA is wound around histones and made to stick to the nuclear membrane. Only some regions are available for transcription.

So all we need to do is train a classifier to recognize sequences that tend to stick to histones and the nuclear membrane, and subject the rest to TSS prediction techniques.
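Here is a rough sketch of what that filtering step could look like. Everything in it is an assumption made for illustration: the training sequences are invented, the split into AT-rich "packed" and GC-rich "open" examples is a toy stand-in and not a biological claim, and the nearest-centroid classifier on 2-mer frequencies is just the simplest thing that fits in a few lines. The TSS predictor itself is not included; the filter only decides which regions would be handed to it.

```python
from collections import Counter
from itertools import product

K = 2
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]

def kmer_profile(seq):
    """Relative frequency of every possible K-mer in the sequence."""
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = max(sum(counts.values()), 1)
    return [counts[k] / total for k in KMERS]

def centroid(seqs):
    """Average K-mer profile of a group of sequences."""
    profiles = [kmer_profile(s) for s in seqs]
    return [sum(col) / len(col) for col in zip(*profiles)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Invented training examples. Real ones would come from experiments that
# measure which regions are actually packed away and which are accessible.
packed_examples = ["ATATATTTAAATATAT", "TTTAAATATATTAAAT", "AATTATATAATTTATA"]
open_examples = ["GCGCGGCCGCGGGCGC", "CCGGCGCGCCGCGGCG", "GGCGCCGCGCGGCCGC"]
packed_centroid = centroid(packed_examples)
open_centroid = centroid(open_examples)

def looks_packed(seq):
    """Nearest-centroid decision: is the profile closer to the packed group?"""
    p = kmer_profile(seq)
    return sq_dist(p, packed_centroid) < sq_dist(p, open_centroid)

def candidate_regions(regions):
    """Keep only the regions worth passing on to a TSS predictor."""
    return [r for r in regions if not looks_packed(r)]

regions = ["ATTTATATAAATTATA", "GCGGCGCCGCGCGGGC", "TATAATTTAAATATTA"]
accessible = candidate_regions(regions)
print(accessible)
```

Only the regions that survive the filter would then go through the TSS prediction step, which is the whole point: fewer places to look, so hopefully fewer false positives.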

Categories: bioinformatics