Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. It can be run in three different modes: 1. Standalone Mode, 2. Pseudo-Distributed Mode, 3. Fully-Distributed Mode. This post covers setting up Hadoop 2.5.1 in pseudo-distributed mode on an Ubuntu machine. For setting up … Continue reading Configuring Hadoop on Ubuntu in pseudo-distributed mode
Author: Anjana Shankar
Git Basics – A cheat sheet for your daily git needs.
This post is a reference for anyone's daily git needs. We will not be covering any advanced git concepts here. Git is a distributed version control system. Some basic terminology: Directory: A folder that contains multiple files. Repository: A directory where Git has been initialized to start version controlling your files. I have created an … Continue reading Git Basics – A cheat sheet for your daily git needs.
Configuring Hadoop on Mac OSx in pseudo-distributed cluster mode.
Hadoop is an open-source Apache project that enables processing of extremely large datasets in a distributed computing environment. It can be run in three different modes: 1. Standalone Mode, 2. Pseudo-Distributed Mode, 3. Fully-Distributed Mode. This post covers setting up Hadoop 2.5.1 in pseudo-distributed mode. A Pseudo-Distributed mode is one where each … Continue reading Configuring Hadoop on Mac OSx in pseudo-distributed cluster mode.
Setting up keyboard shortcuts in Mac OSX
How many of you have wished that you could maximize your window without having to drag your mouse to the green button on the title bar? You will have to make your own keyboard shortcut for this one, since it isn’t set by default. This post aims to show you how this can … Continue reading Setting up keyboard shortcuts in Mac OSX
Splitting a git repository
Sometimes, when you are starting with version control for your code, you don't know how big your repository is going to become. Deciding whether to create a new repository for every module or to keep them all in one repository can be difficult. When the modules are smaller, you would like to keep … Continue reading Splitting a git repository
An Introduction to Zookeeper – Part I of the Zookeeper series
A distributed system consists of multiple computers that communicate through a computer network and interact with each other to achieve a common goal. The major benefits that distributed systems offer over centralized systems are scalability and redundancy. Systems can be easily expanded by adding more machines as needed, and even if one of the machines is unavailable, … Continue reading An Introduction to Zookeeper – Part I of the Zookeeper series
Multiple Java Installations on OSX and switching between them
Sometimes it may be necessary to have two versions of Java on your OS X machine, and to switch between the two in a fast, reliable and convenient manner. This post aims to show how this can be achieved. I am currently using version 10.9.4 of OS X, and I was required to have both Java 6 and Java 7 … Continue reading Multiple Java Installations on OSX and switching between them
Java Inheritance – Super keyword
Inheritance is an important principle in object-oriented programming. It is a mechanism by which one class acquires the properties of another. The class that provides the properties is called the super/base/parent class, and the one that receives them is called the sub/derived/child class. The child class inherits all fields and methods of … Continue reading Java Inheritance – Super keyword
Configuration and Coordination with Zookeeper
It took me a while to understand the concept of Zookeeper, and it took me another while to understand how to use it for the task that I had begun with. This post is intended to help others cross the bridge faster. Dynamic Configuration Management for today's systems comes with all the nitty-gritty details that are involved … Continue reading Configuration and Coordination with Zookeeper