myresearchdiaries.blogspot.com
My Research Diaries: November 2012
http://myresearchdiaries.blogspot.com/2012_11_01_archive.html
Things that make me think, Things that excite me, Things I yearn to learn. Sunday, November 25, 2012. TREC tracks and their meanings: A high-level overview. There are multiple tasks and sub-tasks in this track of TREC. These include blog distillation and opinion polarity. There are about 100,000 blogs in this dataset and 50 queries for which opinion polarity was provided as ground truth. These opinions are categorized as relevant, not relevant, negative, positive, or mixed. DNA and RNA sequences (g...
My Research Diaries: December 2012
http://myresearchdiaries.blogspot.com/2012_12_01_archive.html
Thursday, December 27, 2012. Trying to revive the C3M algorithm. Here are some of the claims made in the paper "Concepts and Effectiveness of the Cover Coefficient Based Clustering Methodology for Text Databases" by Fazli Can and Esen Ozkarahan. A) Clusters are stable. B) The algorithm is independent of the order of the documents, hence we will always get a unique set of clusters. C) The memory overhead is really low. Creating the C matrix. For each ...
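A minimal sketch of how the C matrix could be built, assuming the cover coefficient definition as I understand it from the Can and Ozkarahan paper: c_ij = alpha_i * sum_k(d_ik * beta_k * d_jk), where alpha_i is the reciprocal of the i-th row sum of the document-by-term matrix D and beta_k is the reciprocal of the k-th column sum. The function names and the toy matrix are hypothetical and not taken from the post. The sketch also assumes every document has at least one term and every term occurs in at least one document.

```python
def cover_coefficient_matrix(D):
    """Return the m x m cover coefficient matrix C for a
    document-by-term matrix D (a list of m rows with n entries each)."""
    m, n = len(D), len(D[0])
    # alpha[i]: reciprocal of the row sum for document i
    alpha = [1.0 / sum(row) for row in D]
    # beta[k]: reciprocal of the column sum for term k
    beta = [1.0 / sum(D[i][k] for i in range(m)) for k in range(n)]
    return [
        [alpha[i] * sum(D[i][k] * beta[k] * D[j][k] for k in range(n))
         for j in range(m)]
        for i in range(m)
    ]


def estimated_cluster_count(C):
    """Sum of the diagonal entries (decoupling coefficients) of C."""
    return sum(C[i][i] for i in range(len(C)))


# Hypothetical toy collection: 3 documents, 3 terms.
D = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
C = cover_coefficient_matrix(D)
n_c = estimated_cluster_count(C)
```

Each row of C sums to 1, and C depends only on the contents of D, so reordering the documents only permutes its rows and columns.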
My Research Diaries: Data Science Lightning Talks: #GHC14
http://myresearchdiaries.blogspot.com/2014/10/data-science-lightning-talks-ghc14.html
Friday, October 10, 2014. Data Science Lightning Talks: #GHC14. The lightning talks at GHC's Data Science track were super fun and covered a wide range of topics. Although data scientists and machine learning folks already know most of these concepts, it's great to get a refresher. As a plus, the passion of the speakers was contagious. Trusting User Annotations on the Web for Cultural Heritage Domains. This was a fun end...
My Research Diaries: May 2014
http://myresearchdiaries.blogspot.com/2014_05_01_archive.html
Thursday, May 1, 2014. Building Apache Spark Jars. I already have a Spark cluster set up on a set of machines. Let me call this the "lab" cluster. This lab cluster came pre-installed with Hadoop. I want to run Spark jobs (written in Scala) on this cluster. There are two ways to do it. A) Run "sbt run" from the root of the sbt project that contains the Scala code. B) Run the fat jar created by "sbt assembly" as follows.
My Research Diaries: April 2013
http://myresearchdiaries.blogspot.com/2013_04_01_archive.html
Thursday, April 11, 2013. Surviving the PhD program. A great resource for those who are procrastinating on writing. This webpage also contains tons of other helpful material, like a balanced-life chart, tools for staying organized, positive affirmations on writing, and so on. Life is easier when you can laugh at yourself. Here are some daily affirmations for doctoral students. But I stayed away from PhD comics as much as I could.
My Research Diaries: The Power of Context in Real-World Data Science Applications #GHC14
http://myresearchdiaries.blogspot.com/2014/10/the-power-of-context-in-real-world-data.html
Friday, October 10, 2014. The Power of Context in Real-World Data Science Applications #GHC14. The Data Science in Practical Applications session at the GHC Data Science Track ranged from AI to fraud detection and much more. You can access the notes of the session here. I elaborate a little further on these talks and add my two cents. "AI: Return to Meaning" by David from Bridgewater Associates. David then went on...
My Research Diaries: August 2014
http://myresearchdiaries.blogspot.com/2014_08_01_archive.html
Friday, August 1, 2014. Adding code to your blogger blog. A big shout out to those who wrote this awesome tool that takes in code and formats it in HTML so that you can paste it in your blog. http://codeformatter.blogspot.com/. Posted by Shivani Rao. Running Apache Spark Unit Tests Sequentially with Scala Specs2. Is based on the latest version of Spark and has the right properties to set, namely spark.driver.port. import org.s...
My Research Diaries: July 2013
http://myresearchdiaries.blogspot.com/2013_07_01_archive.html
Thursday, July 18, 2013. Matlab Tip: Changing line-width of all the lines via commandline. Get the handle to the line using: hline = findobj(gcf, 'type', 'line'); Then you can change some property for all the line objects: Or just for some of them: idx = [4 5]; This is not an original post. Here is the original post. Posted by Shivani Rao.
My Research Diaries: Difference between Machine Learning and Data Science: Student Opportunity Lab #GHC14
http://myresearchdiaries.blogspot.com/2014/10/difference-between-machine-learning-and.html
Friday, October 24, 2014. Difference between Machine Learning and Data Science: Student Opportunity Lab #GHC14. I was invited by a young aspiring data scientist, Sally. Last but not least, Data Scientists may do a lot more with presenting their findings. In pure Machine Learning, there are predefined metrics on which one can show whether performance improved or not. In industry problems, these metrics may not be the end-all and ...
My Research Diaries: March 2014
http://myresearchdiaries.blogspot.com/2014_03_01_archive.html
Wednesday, March 26, 2014. The past few days I have been playing with Hive for some data analysis and I wanted to put down what I learned. A) Exporting data from Hive to CSV. If you are using Hue, then it provides a convenient way to export to CSV or Excel format. But if not, then you can use the following preamble before the select statement. B) Hive does not allow "select" statements in the "where" clause. For example. Simple tem...