Have you heard a colleague mention the term Hadoop and quickly changed the subject because you didn’t want to admit that you had been left in the dark in this area? Fear not - I’m here to save the day.
Hadoop is defined as “an open source software framework that simplifies the workloads in Big Data Applications and runs on open standard-based computing platforms,” according to Jeff Hudgins, vice president of Marketing at appliance deployment provider NEI.
Rather than being forced to rely on expensive hardware to ensure reliability, users who leverage Hadoop can instead overcome an array of obstacles and network challenges at the application level.
As the platform continues to gain traction among IT managers, Hudgins notes that “…One of the primary goals is self-service business intelligence and analytics. Once these massive amounts of data are handled in real-time, the end user can react to changes in real-time.”
In fact, the framework enables huge amounts of unstructured or partially structured data - such as Web server logs, image and video files, independent data sets, and e-mail - to be stored, altered and then moved to a destination such as a data warehouse.
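The processing model behind this is MapReduce, which Hadoop distributes across cluster nodes. The following is a minimal, Hadoop-free sketch of that model applied to Web server logs; the log lines and field layout are hypothetical, and a real Hadoop job would run the mapper and reducer over data splits stored in HDFS rather than an in-memory list.

```python
# Minimal sketch of the MapReduce model Hadoop uses on raw data such as
# Web server logs. Hypothetical log format: the last field is the HTTP status.
from itertools import groupby

log_lines = [
    '10.0.0.1 - - "GET /index.html" 200',
    '10.0.0.2 - - "GET /missing" 404',
    '10.0.0.1 - - "GET /about.html" 200',
]

def mapper(line):
    """Map phase: emit a (status_code, 1) pair for each log line."""
    status = line.rsplit(" ", 1)[-1]
    yield status, 1

def reducer(key, values):
    """Reduce phase: sum the counts emitted for one status code."""
    return key, sum(values)

# Shuffle/sort phase: group mapper output by key, as Hadoop does
# between the map and reduce stages.
pairs = sorted(kv for line in log_lines for kv in mapper(line))
result = dict(
    reducer(k, (v for _, v in grp))
    for k, grp in groupby(pairs, key=lambda kv: kv[0])
)
print(result)  # {'200': 2, '404': 1}
```

Because each mapper sees only its own slice of the input and each reducer sees only one key's values, the same job can scale out across commodity nodes - the property that lets Hadoop handle reliability in software rather than in expensive hardware.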
A recent NEI blog post explains that “...By doing all of the heavy analysis of this data in raw form in Hadoop (versus the structured format of the operating system), expensive disk space and computing resources can be unloaded and reserved so the operating system can handle more of the direct processing responsibilities. In addition, this lowers the capital expense and support costs to small and large enterprises.”
Big data appliances benefit significantly from Hadoop because they can evolve to incorporate plug-and-play processor functionality within an operating system. Hadoop is not limited to appliances, however; it can also be configured to run on installed servers.
At this time, there are multiple Hadoop appliances deployed that range from four to 18 nodes, with 12 cores, 48 GB of RAM and 28 to 36 TB of capacity per node. While the open source community is constantly adding functionality and increasing performance, independent software vendors (ISVs) are able to improve their offerings and reduce time to market in vertical markets that have long struggled with a lack of IT resources. This allows key business units to move forward without having to wait for IT resources to become available.
Want to learn more about Hadoop? Turn to Massachusetts-based NEI, which is deeply rooted in the appliance deployment space.
Edited by Braden Becker