Showing posts from November, 2015

Good Food + Data. I can't ask for more

Been a couple of months since the last post...
Moving on from the beer recommendation topic of my previous post, this time the topic is food. The data set comes from the Kaggle competition "What's Cooking" and consists of approximately 40,000 recipes spanning 20 global cuisines.
How am I using this data?
In this post I will begin with the data preparation steps: standardizing ingredient names, reshaping the data, and aggregating it so that it can be easily consumed downstream. I will then move into Gephi, where I will build network graph visualizations to understand the connections between different elements, i.e. how different ingredients map to different cuisines and how often they are used. I know "Garam Masala" will be the most important predictor for Indian cuisine; however, my confidence intervals are pretty wide when it comes to something like "Anchovies".
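
To make the preparation steps above concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not necessarily the pipeline used for this post: the sample records are made up, the record shape (`id`, `cuisine`, `ingredients`) is assumed to match the competition's JSON file, the `standardize` rules are deliberately crude, and the helper names are hypothetical. The `Source`, `Target`, `Weight` header follows the column names Gephi typically expects when importing an edge list from CSV.

```python
import csv
import io
import re
from collections import Counter

# Made-up sample records; the real data is assumed to have the same shape.
recipes = [
    {"id": 1, "cuisine": "indian", "ingredients": ["Garam Masala", "onions", "Tomatoes"]},
    {"id": 2, "cuisine": "indian", "ingredients": ["garam masala ", "Onion"]},
    {"id": 3, "cuisine": "italian", "ingredients": ["anchovies", "tomatoes"]},
]

def standardize(name):
    """Lowercase, drop non-letters, collapse spaces, and naively singularize."""
    name = re.sub(r"[^a-z ]", " ", name.lower())
    name = re.sub(r"\s+", " ", name).strip()
    # Crude plural handling; a real pipeline would likely need a lemmatizer
    # or a hand-built lookup table.
    if name.endswith("ies"):
        return name[:-3] + "y"
    if name.endswith("oes"):
        return name[:-2]
    if name.endswith("s") and not name.endswith("ss"):
        return name[:-1]
    return name

def edge_list(records):
    """Reshape recipes into (cuisine, ingredient) pairs and count them."""
    counts = Counter()
    for record in records:
        # Use a set so an ingredient counts at most once per recipe.
        for ingredient in {standardize(i) for i in record["ingredients"]}:
            counts[(record["cuisine"], ingredient)] += 1
    return sorted((c, i, n) for (c, i), n in counts.items())

def to_gephi_csv(edges):
    """Serialize the weighted edges as CSV text with Source/Target/Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
    return buffer.getvalue()

edges = edge_list(recipes)
csv_text = to_gephi_csv(edges)
```

Each row of the result links a cuisine to an ingredient, with the weight being the number of recipes in which the pair occurs. That weight is what would drive edge thickness in the network graph.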

Hopefully, by the end, you & I will learn the art of data …