understructure.com
MongoDB | Understructure
http://understructure.com/home/tag/mongodb
A Journey Through Data Science With Statistics, R, and Python. Use Pig To Load Data To MongoDB (Hadoop 2.6). March 15, 2015. For the past 24 hours or so, I’ve been banging my head on this little problem. I know that Hadoop 2.6 isn’t technically supported by MongoDB, but I figured I’d give it a shot. It took a while, but the pain in my head (and arse) is starting to subside. Here’s what I’m working with: Hadoop 2.6 (Apache distribution) running on Parallels on a MacBook Pro, and MongoDB 2.6.8. NOTE: The jar ...
understructure.com
Bio | Understructure
http://understructure.com/home/bio
I’ve been using SPSS since 1991, SQL Server since 1999, and R since 2011. My previous jobs include Database Administration, Web Development, Web Project Management, BI Engineering, and ETL Specialization. I have a Masters in Industrial and Organizational Psychology, and I currently do statistical and SQL Server consulting.
understructure.com
Storm | Understructure
http://understructure.com/home/tag/storm
Hortonworks Hadoop Sandbox 2.1, Kafka, and Storm. October 4, 2014. In my copious spare time, I’ve been working through Hortonworks’ excellent series of tutorials on simulating realtime event publication and subscription with Apache Kafka and Apache Storm. These tutorials walk you through setting up Kafka as a publisher of events from a .kml file from a New York City trucking company, and setting up Storm to ingest the events. I’ll post mo...
understructure.com
NLP | Understructure
http://understructure.com/home/category/nlp
Natural Language Processing – Thoughtly’s NLP Tutorial Series. February 15, 2015. I can’t seem to figure out exactly where I found this originally, but Thoughtly.co is starting a series on natural language processing (NLP) that looks very promising. I’ve just finished reading the first article in the series, and I’m about to start the second article, which focuses on probability. October 29, 2014. The idea behind a trigram is that you’re look...
understructure.com
maashu | Understructure
http://understructure.com/home/author/maashu
All posts by maashu. Genomic Data Science with Python – Tips For the Final Exam. September 19, 2015. Break down your functions into small, manageable pieces. For instance, I had one function that counted the number of codons (given a reading frame), one that parsed out the codons into a structure for easier management, and one that got the ORFs. I was able to do about 80% of the assignment with just...Under...
understructure.com
Agile Data Science, Pig, and Cloudera’s CDH v4 | Understructure
http://understructure.com/home/2014/09/agile-data-science-pig-and-clouderas-cdh-v4
Agile Data Science, Pig, and Cloudera’s CDH v4. September 12, 2014. I’ve been doing a fair bit of Pig lately, since I took Bill Howe’s EXCELLENT Intro to Data Science class through Coursera.org. I’d highly recommend it for anyone willing to jump into the fire and get wet with it. Yup, mixed metaphors. I got ‘em, you want ‘em. Found this command here, which allowed the build to succeed. Anyone else working through this book?
understructure.com
Natural Language Processing – Thoughtly’s NLP Tutorial Series | Understructure
http://understructure.com/home/2015/02/natural-language-processing-thoughtlys-nlp-tutorial-series
Natural Language Processing – Thoughtly’s NLP Tutorial Series. February 15, 2015. I can’t seem to figure out exactly where I found this originally, but Thoughtly.co is starting a series on natural language processing (NLP) that looks very promising. I’ve just finished reading the first article in the series, and I’m about to start the second article, which focuses on probability. UIMA, cTAKES...
understructure.com
Hadoop | Understructure
http://understructure.com/home/category/hadoop
Use Pig To Load Data To MongoDB (Hadoop 2.6). March 15, 2015. For the past 24 hours or so, I’ve been banging my head on this little problem. I know that Hadoop 2.6 isn’t technically supported by MongoDB, but I figured I’d give it a shot. It took a while, but the pain in my head (and arse) is starting to subside. Here’s what I’m working with: Hadoop 2.6 (Apache distribution) running on Parallels on a MacBook Pro, and MongoDB 2.6.8. NOTE: The jar ...
understructure.com
Hortonworks Hadoop Sandbox 2.1, Kafka, and Storm | Understructure
http://understructure.com/home/2014/10/hortonworks-hadoop-sandbox-2-1-kafka-and-storm
Hortonworks Hadoop Sandbox 2.1, Kafka, and Storm. October 4, 2014. In my copious spare time, I’ve been working through Hortonworks’ excellent series of tutorials on simulating realtime event publication and subscription with Apache Kafka and Apache Storm. These tutorials walk you through setting up Kafka as a publisher of events from a .kml file from a New York City trucking company, and setting up Storm to ingest the events. I’ll post mo...
understructure.com
Use Pig To Load Data To MongoDB (Hadoop 2.6) | Understructure
http://understructure.com/home/2015/03/use-pig-to-load-data-to-mongodb-hadoop
Use Pig To Load Data To MongoDB (Hadoop 2.6). March 15, 2015. For the past 24 hours or so, I’ve been banging my head on this little problem. I know that Hadoop 2.6 isn’t technically supported by MongoDB, but I figured I’d give it a shot. It took a while, but the pain in my head (and arse) is starting to subside. Here’s what I’m working with: Hadoop 2.6 (Apache distribution) running on Parallels on a MacBook Pro, and MongoDB 2.6.8. NOTE: The jar ...