mattfaus.com
May | 2014 | Matt Faus
http://mattfaus.com/2014/05
Monthly Archives: May 2014. Improving Khan Academy’s student knowledge model for better predictions. May 8, 2014. These feature vectors allow us to build the following statistical model to predict a student’s ability to correctly answer the next question in an exercise, even if the next question is the very first for that exercise. This project sets out to achieve two goals: Upgrade the KnowledgeState mechanism s...
mattfaus.com
July | 2014 | Matt Faus
http://mattfaus.com/2014/07
Monthly Archives: July 2014. Khan Academy Mastery Mechanics. July 3, 2014. If you’ve used Khan Academy, I’m sure you’re familiar with this graphic. It is shown at the end of each task to inform you of how you have progressed. You may not be familiar with the (somewhat complex) mechanics behind how students progress through each level of mastery, and that’s what I hope to clarify. Tutorial Mode: Plan your own practice problems. There are three ways stude...
mattfaus.com
March | 2014 | Matt Faus
http://mattfaus.com/2014/03
Monthly Archives: March 2014. BigQuery at Khan Academy. March 26, 2014. Previously, I wrote about the three frameworks we use for data analysis at Khan Academy. Since then, we have automated the export of production data into BigQuery and are regularly using it to perform analysis. We have all but deprecated our Hive pipeline and things are going great! Here, I’ll go over what has gone well, what concerns we have, and how we set everything up. Our BigQ...
mattfaus.com
April | 2014 | Matt Faus
http://mattfaus.com/2014/04
Monthly Archives: April 2014. Data engineering at startups. April 22, 2014. I’ve spent the last year on the data science (a.k.a. analytics) team at Khan Academy. Here are some of the lessons I have learned during that time. These lessons won’t apply to everyone, but if you’re working at a small company that fosters a data-driven process across the company, they should help you be more effective. Running some aggregations over an existing table is easy&#...
mattfaus.com
Matt Faus | Matt Faus
http://mattfaus.com/author/admin
Author Archives: Matt Faus. Speeding up GAE Datastore Reads with Protobuf Projection. December 4, 2014. Like most frameworks these days, the Google Appengine (henceforth, GAE) SDK provides an API for reading and writing objects derived from your classes to the datastore. This saves you the boring work of validating raw data returned from the datastore and repackaging it into an easy-to-use object. In particular, GAE uses protocol buffers. Using the magi...
mattfaus.com
Matt Faus | Programming! | Page 2
http://mattfaus.com/page/2
Google Appengine Mapreduce, In Depth. October 16, 2013. With appengine-mapreduce, you can easily spin up machine instances inside the appengine datacenters for each of your shards, assign work to them, and then coalesce the results. The library has its quirks, but thankfully the source code is available for exploring and tinkering with. Here’s what I’ve learned from working with it over the past few months. DatastoreInputReader – wh...
mattfaus.com
Uncategorized | Matt Faus
http://mattfaus.com/category/uncategorized
Speeding up GAE Datastore Reads with Protobuf Projection. December 4, 2014. Like most frameworks these days, the Google Appengine (henceforth, GAE) SDK provides an API for reading and writing objects derived from your classes to the datastore. This saves you the boring work of validating raw data returned from the datastore and repackaging it into an easy-to-use object. In particular, GAE uses protocol buffers. Using the magic of monkey patching (run wi...
mattfaus.com
October | 2014 | Matt Faus
http://mattfaus.com/2014/10
Monthly Archives: October 2014. Building a machine learning pipeline in Google’s Cloud. October 30, 2014. With Khan Academy’s existing Google Cloud systems (read: pretty much everything) led me to choose Google’s new managed VMs, with some help from BigQuery, to implement this system. I have discussed the training process for the knowledge model before, but a quick refresher is as follows: Here’s a data flow diagram of the entire system: Was pretty ea...