Every day people rely on search engines to find specific content in the many terabytes of data that exist on the Internet, but have you ever wondered how this search is actually performed? One approach is Apache’s Hadoop, which is a software framework that enables distributed manipulation of vast amounts of data. This article introduces the Hadoop framework and shows you why it’s one of the most important Linux-based distributed computing frameworks.