Boilerplate – Apache Spark with Spring profile

This is a boilerplate project you can use to get started quickly with Apache Spark (version 2.10) and Spring profiles: https://github.com/shenghuahe/sparkwithspringprofile. It should let you configure environment-specific properties (e.g. the path to an input file) easily.
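To illustrate what environment-specific properties mean in practice, here is a minimal plain-Java sketch of layering profile-specific values over shared defaults. It does not use Spring and is not taken from the repository; the property keys (`input.path`, `spark.master`), their values, and the class name `ProfileProperties` are all hypothetical, and the two strings merely stand in for a default properties file and a "dev" profile file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class ProfileProperties {
    // Hypothetical stand-in for a shared defaults file.
    static final String DEFAULTS =
            "input.path=/data/input.csv\n" +
            "spark.master=local[*]\n";

    // Hypothetical stand-in for a "dev" profile file that overrides one key.
    static final String DEV =
            "input.path=/tmp/dev-input.csv\n";

    // Loads the defaults first, then layers the profile-specific values on top.
    // Keys missing from the profile fall back to the defaults.
    static Properties load(String defaults, String profileSpecific) throws IOException {
        Properties base = new Properties();
        base.load(new StringReader(defaults));

        Properties merged = new Properties(base); // base acts as the fallback
        merged.load(new StringReader(profileSpecific));
        return merged;
    }

    public static void main(String[] args) throws IOException {
        Properties props = load(DEFAULTS, DEV);
        System.out.println(props.getProperty("input.path"));   // overridden by the profile
        System.out.println(props.getProperty("spark.master")); // inherited from the defaults
    }
}
```

With Spring, the active profile would normally select which overrides apply; the sketch hard-codes the "dev" layer only to keep the example self-contained.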


Boilerplate – Groovy and Spock

This is a boilerplate project you can use to get started quickly with Groovy and Spock: https://github.com/shenghuahe/groovywithspock. I created it because finding all the right dependencies and plugins for Spock can be tricky. This should get you started in no time.


How to: Install a Virtual Apache Hadoop Cluster with Vagrant and Cloudera Manager on a Mac

Feel free to skip some of the steps if you already have certain packages installed.

Get Cask:
    brew install caskroom/cask/brew-cask

Get Vagrant & Vagrant plugins:
    brew cask install virtualbox
    brew cask install vagrant
    brew cask install vagrant-manager
    vagrant plugin install vagrant-hostmanager

Install Hadoop:
    git clone [email protected]:richardhe-awin/vagrant-hadoop-cluster.git
    cd vagrant-hadoop-cluster
    vagrant up

Configure Cloudera Manager (mostly referenced from http://blog.cloudera.com/blog/2014/06/how-to-install-a-virtual-apache-hadoop-cluster-with-vagrant-and-cloudera-manager/) […]
