vincebuffalo.org
My First Recommendation to New Scientific Coders: Learn Visualization | vince buffalo
http://www.vincebuffalo.org/2012/11/14/learn-visualization.html
The blind play of genes. My First Recommendation to New Scientific Coders: Learn Visualization. Scientists are learning programming at an unprecedented rate. I’ve expressed. Over the fast-paced growth of computing across the sciences and what this could mean for reproducibility and incorrect findings in the sciences. Perhaps the best example that illustrates the severity of this issue is Coombes and Baggerly’s Duke Saga. Problems Look Differently When You Can Visualize Quickly. Visualization also drops t...
vincebuffalo.org
Bioinformatics and Interface Design | vince buffalo
http://www.vincebuffalo.org/2013/01/26/bioinfo-interfaces.html
The blind play of genes. Bioinformatics and Interface Design. Day to day bioinformatics involves interfacing and executing many programs to process data. We end up with some refinement of the data from which we extract biological meaning through data analysis. Given how much interfacing bioinformatics involves, this process undergoes very little thought or design optimization. I’m hardly the only one to complain about this. Fred Ross had this sadly accurate description. Estimating from my own experience ...
vincebuffalo.org
Using Named Pipes and Process Substitution in Bioinformatics | vince buffalo
http://www.vincebuffalo.org/2013/08/08/the-mighty-named-pipe.html
The blind play of genes. Using Named Pipes and Process Substitution in Bioinformatics. It’s hard not to fall in love with Unix as a bioinformatician. In a past post I mentioned how Unix pipes are an extremely elegant way to interface bioinformatics programs (and do inter-process communication in general). In exploring other ways of interfacing programs in Unix, I’ve discovered two great but overlooked ways of interfacing programs: the named pipe and process substitution. Why We Love Pipes and Unix.
vincebuffalo.org
Using Bioconductor to Analyze your 23andme Data | vince buffalo
http://www.vincebuffalo.org/2012/03/12/23andme-gwascat.html
The blind play of genes. Using Bioconductor to Analyze your 23andme Data. Bioconductor is one of the open source projects of which I am most fond. The documentation is excellent, the community wonderful, the development fast-paced, and the software. There’s a new package in the development branch (due to be released as 2.10 very soon) called. Is a package that serves as an interface to the NHGRI database of genome-wide association studies. Loading the package with. Instance of SNPs and their diseases.
vincebuffalo.org
The Unbelievable Debate: Some Ramblings on Machine Learning in Science | vince buffalo
http://www.vincebuffalo.org/2012/03/03/the-unbelievable-debate.html
The blind play of genes. The Unbelievable Debate: Some Ramblings on Machine Learning in Science. In between refactoring some. Code this morning and looking at RNA-Seq data, I grabbed some cold brew coffee and caught up on some missed tweets. Admittedly, my brain glosses over most tweets, but this tweet from Drew Conway had the right mix of keywords to actually make me click and read the link: The data science debate: domain expertise or machine learning? Social Sciences and Machine Learning Caution.
vincebuffalo.org
Please developers, don’t be dicks. | vince buffalo
http://www.vincebuffalo.org/2012/02/21/dont-be-a-dick.html
The blind play of genes. Please developers, don’t be dicks. We’ve All Been There (WABT). The first reason to never be a dick is that We’ve All Been There (I’m going to give this the acronym WABT). Even the most voracious and diligent manual readers can suffer from the XY. Not grasp anything. They’re not going to have an ah ha! moment when they’re too busy trying to come up with a witty response to your burn on IRC. Has the same number of letters as RTFM. Someone was once...
dataists.com
dataists » Hilary Mason
http://www.dataists.com/author/hilary
Fresher than seeing your model doesn't have heteroscedastic errors. Live stream the Strata NY Data Science Conference! September 19th, 2011 Author:. Strata New York 2011. Has just begun, and you can view the livestream here:. Snippet: Where the F* k Was I? June 24th, 2011 Author:. James Bridle had an interesting reaction to the revelation that his iPhone was tracking his location: he made a book! He describes his reaction to his phone’s data collection habits rather poetically:. April 15th, 2011 Author:.
vincebuffalo.org
Elucidating k-mer Contamination with Kullback-Leibler Divergence | vince buffalo
http://www.vincebuffalo.org/2012/03/01/kmer-kl.html
The blind play of genes. Elucidating k-mer Contamination with Kullback-Leibler Divergence. Recently a coworker showed me a FASTQ file from an Illumina HiSeq run (which will be packaged in the new release of my Bioconductor package qrqc) that was severely contaminated. Below is the file in less, with a string highlighted: A severely contaminated file in less, with many contaminants highlighted. Holy contamination, Batman! Will match contaminated reads and remove them. My program Scythe. In this case,. Lookin...
vincebuffalo.org
Thoughts on Julia and R | vince buffalo
http://www.vincebuffalo.org/2012/03/07/thoughts-on-julia.html
The blind play of genes. Thoughts on Julia and R. Is an exciting new technical computing language. It’s still in its infancy, but it’s fast (see below), and already does a lot. Comparison of Julia to other languages. What’s wrong with R? Now methods papers in many fields are often accompanied by CRAN or Bioconductor packages. It’s also a brilliant platform for reproducible, open research, as Bioconductor beautifully illustrates with packaged and version-controlled genomes, microarray probesets, etc.