Agile software development is a group of software development methods in which requirements and solutions evolve through collaboration between self-organizing, cross-functional teams. It promotes adaptive planning, evolutionary development, early delivery, and continuous improvement, and it encourages rapid and flexible response to change …
The following is a summary of compiler storage allocation. 1. Static vs. Dynamic. Static: storage decisions can be made by the compiler looking only at the text of the program. One reason for statically allocating as many data objects as possible is …
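As a loose Java analogy (Java itself allocates all objects on the heap at class-load or run time, so this only mirrors the idea; class and field names are illustrative): a field whose size is fixed in the source text can be reserved once up front, while storage whose size depends on run-time input must be allocated dynamically during execution.

    public class StorageAllocationDemo {
        // "Static" flavour: the size (256) is visible in the program text,
        // and the array is created once when the class is initialized.
        static final int[] LOOKUP_TABLE = new int[256];

        public static void main(String[] args) {
            // "Dynamic" flavour: the size is known only at run time,
            // so the storage has to be allocated while the program runs.
            int n = Integer.parseInt(args[0]);
            int[] buffer = new int[n];
            System.out.println(LOOKUP_TABLE.length + " / " + buffer.length);
        }
    }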
Hadoop – the data processing engine based on MapReduce – is being superseded by new processing engines: Apache Tez, Apache Storm, Apache Spark, and others. YARN makes any data processing future possible. But Hadoop the platform – thanks to YARN …
You might be wondering what bearing a history lesson may have on a technology project such as Apache Drill. In order to truly appreciate Apache Drill, it is important to understand the history of the projects in this space, as well …
Apache Sqoop provides a framework for moving data between HDFS and relational databases in parallel, using Hadoop’s MapReduce framework. As Hadoop becomes more popular in enterprises, there is a growing need to move data from non-relational sources like mainframe …
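A hedged sketch of how such a parallel import might be launched from Java: the snippet below assumes a Sqoop 1.x-style `Sqoop.runTool` entry point on the classpath, and the JDBC URL, table name, target directory, and mapper count are placeholder values.

    import org.apache.sqoop.Sqoop;

    public class SqoopImportExample {
        public static void main(String[] args) {
            // Hypothetical connection details; replace with real values.
            String[] importArgs = {
                "import",
                "--connect", "jdbc:mysql://dbhost/sales",   // source RDBMS
                "--table", "orders",                        // table to copy
                "--target-dir", "/data/orders",             // HDFS destination
                "--num-mappers", "4"                        // degree of MapReduce parallelism
            };
            // runTool launches the import as a MapReduce job and
            // returns a process-style exit code (0 on success).
            int exitCode = Sqoop.runTool(importArgs);
            System.exit(exitCode);
        }
    }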
The set of data storage and processing technologies that define the Apache Hadoop ecosystem is expansive and ever-improving, covering a very diverse set of customer use cases in mission-critical enterprise applications. At Cloudera, we’re constantly pushing the boundaries of …
Hadoop is a popular open-source implementation of the MapReduce framework designed to analyze large data sets. It has two parts: the Hadoop Distributed File System (HDFS) and MapReduce. HDFS is the file system used by Hadoop to store its data. It has …
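To make the two-part split concrete, here is a minimal word-count sketch against Hadoop's MapReduce API: the map phase tokenizes lines read from HDFS, the reduce phase sums the counts, and the input/output paths and job name are illustrative values passed on the command line.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {
        // Map phase: emit (word, 1) for every token in the input split.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            @Override
            protected void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        context.write(word, ONE);
                    }
                }
            }
        }

        // Reduce phase: sum the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            // Input and output are HDFS paths supplied on the command line.
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }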
Big data techniques are becoming mainstream in an increasing number of businesses, but how do people get self-service, interactive access to their big data? And how do they do this without having to train their SQL-literate employees to be advanced …
The HDFS docs have some information, and logically it makes sense to separate the network of the Hadoop nodes from a “management” network. However, in our experience, multi-homed networks can be tricky to configure and support. The pain stems from …
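As a hedged illustration of what a multi-homed setup touches, the hdfs-site.xml fragment below binds the NameNode's RPC and HTTP endpoints to all interfaces and has clients and DataNodes resolve DataNodes by hostname rather than by IP; whether these particular properties apply depends on the Hadoop release and on which network carries which traffic.

    <property>
      <name>dfs.namenode.rpc-bind-host</name>
      <value>0.0.0.0</value>   <!-- listen for RPC on every interface -->
    </property>
    <property>
      <name>dfs.namenode.http-bind-host</name>
      <value>0.0.0.0</value>   <!-- web UI reachable from both networks -->
    </property>
    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value>      <!-- clients connect to DataNodes by hostname -->
    </property>
    <property>
      <name>dfs.datanode.use.datanode.hostname</name>
      <value>true</value>      <!-- DataNodes address each other by hostname -->
    </property>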
Template Method is a behavioral design pattern used to define the skeleton of an algorithm in a method, deferring some of the implementation steps to subclasses. The template method defines the steps needed to execute the algorithm and can provide a default implementation that might be common …
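A compact Java sketch of the pattern (class names are illustrative): the abstract class fixes the order of the steps in a final template method, supplies default implementations for the common steps, and leaves the variable step to subclasses.

    abstract class DataExporter {
        // The template method: fixes the order of the steps and is final
        // so subclasses cannot change the overall algorithm.
        public final void export() {
            open();
            writeData();
            close();
        }

        // Default implementation shared by most subclasses.
        protected void open() {
            System.out.println("opening destination");
        }

        // Step deferred to subclasses.
        protected abstract void writeData();

        protected void close() {
            System.out.println("closing destination");
        }
    }

    class CsvExporter extends DataExporter {
        @Override
        protected void writeData() {
            System.out.println("writing rows as CSV");
        }
    }

    public class TemplateMethodDemo {
        public static void main(String[] args) {
            new CsvExporter().export();  // runs open -> writeData -> close
        }
    }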