Commit
updated readme with quick start instructions
Showing 1 changed file with 19 additions and 5 deletions.
@@ -1,8 +1,22 @@
-# hadoop-multi-server-ansible
+# Hadoop multi-node cluster with Ansible
 Multi-server deployment of Hadoop using Ansible

-The hadoop installation anticipates that hadoop binary release is available in
-roles/common/templates/hadoop-2.7.1.tar.gz
+This repository contains a set of Vagrant and Ansible scripts that make it fast and easy to build a fully functional Hadoop cluster, including HDFS, on a single computer using VirtualBox. To run the scripts as they are, you will probably need about 16 GB of RAM and at least 4 CPUs.

-This can be downloaded here:
-http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
+## Quick Start
+
+- Clone this repository
+- Download a binary release of Hadoop (e.g. http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz) and save it to `roles/common/templates/hadoop-2.7.1.tar.gz`
+- Open a command prompt in the directory where you cloned the code
+- Run `vagrant up`
+- Use the commented lines in `bootstrap-master.sh` to do the following:
+  - Run the Ansible playbook: `ansible-playbook -i hosts-dev playbook.yml`
+  - Format the HDFS namenode
+  - Start DFS and YARN
+  - Run an example job: `hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar pi 10 30`
+
+## Additional Details and Explanation
+
+I wrote up a detailed article about how to understand and run these scripts. This includes the expected output and instructions to modify the process to accommodate proxy environments and low RAM environments. You can find that here:
+
+http://software.danielwatrous.com/install-and-configure-a-multi-node-hadoop-cluster-using-ansible/
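
The Quick Start steps added in this commit can be sketched as a shell script. This is only an illustration, not the contents of `bootstrap-master.sh`, which this diff does not show. Several things are assumptions: the direct `archive.apache.org` download URL (the README links the mirror chooser page instead), the use of `hdfs namenode -format`, `start-dfs.sh`, and `start-yarn.sh` for the "format" and "start" steps, those commands being on the `PATH` of the master VM, and the helper names `run`, `host_steps`, and `master_steps`. The script prints the commands by default and only executes them when `DRY_RUN=0`.

```shell
#!/usr/bin/env bash
# Sketch of the README's Quick Start flow (hypothetical helper script).
# Dry-run by default: each command is printed, not executed.
set -e

HADOOP_VERSION="2.7.1"  # version named in the README
TARBALL="roles/common/templates/hadoop-${HADOOP_VERSION}.tar.gz"
# Assumption: direct archive URL; the README links the Apache mirror chooser page.
TARBALL_URL="https://archive.apache.org/dist/hadoop/common/hadoop-${HADOOP_VERSION}/hadoop-${HADOOP_VERSION}.tar.gz"
EXAMPLES_JAR="/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-${HADOOP_VERSION}.jar"

# Print a command; run it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

# Steps run on the host, from the root of the cloned repository.
host_steps() {
  run curl -fL -o "$TARBALL" "$TARBALL_URL"
  run vagrant up
}

# Steps the README attributes to the commented lines in bootstrap-master.sh.
# Assumption: these are run on the master VM with the Hadoop scripts on PATH.
master_steps() {
  run ansible-playbook -i hosts-dev playbook.yml
  run hdfs namenode -format
  run start-dfs.sh
  run start-yarn.sh
  run hadoop jar "$EXAMPLES_JAR" pi 10 30
}

host_steps
master_steps
```

Running the script as-is only lists the seven commands in order, which makes it easy to compare them against the commented lines in `bootstrap-master.sh` before executing anything. Keeping the host steps and master steps in separate functions reflects that they run on different machines.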