Hadoop’s Impact on the Future of Data Management: Insights from Mike Olson at Strata and Hadoop World 2013


Below and at this link is a video of Mike Olson at the opening of the 2013 Strata Conference and Hadoop World. The context in this discussion is important for technologists, strategists, analysts and executives alike: it provides insights in easy-to-understand ways and succinctly articulates the impact of some key trends in the environment.

The ecosystem around Apache Hadoop has continued to mature, so updates like this from one of its key leaders are critically important.

Mike opened with a great overview of how far the community has come in the last five years:

  • In 2008 the Big Data meme had not happened yet. And you probably hadn’t heard of Hadoop yet.
  • In 2009 the first Hadoop World was held and over 700 people showed up. This year, 3000 people attended in a sold-out venue.
  • Now consider the many vendors in the ecosystem.
  • And consider the big trends.
  • When Hadoop was born, it was a complement to traditional data processing. It sat off to the side: good for batch and for storage, but unable to handle real time. Much of the market did not pay much attention. But real time was always desired. Just a year ago Cloudera announced a real-time platform, Impala, an open source real-time SQL engine.

For context on where we are today and where we are going, Mike noted that:

  • Other real-time capabilities have been added, including Cloudera Search. In the single year since they were announced, over 5000 enterprises have added Impala and Search. Real time has always mattered.
  • Now that real time and search are both available, more work can be done on the platform. Other applications and uses can now be supported on Hadoop. This platform is attracting work and attracting data. And it is attracting more and more users.
  • Enterprise deployments are showing a strong trend: Hadoop is emerging as an enterprise data hub. This meme is big in the industry now.

What is a data hub? It is a scale-out, affordable, reliable platform that can hold any data in any format for as long as you want it. It is a storage layer with security built in that can do access control, auditing, logging and data provenance. A secure storage substrate also requires a rich collection of engines for working with data: you want query, search, machine learning and analytics in place, without moving the data out. That collection of capabilities is hugely valuable and lets you work with the data where it lives. But this alone is still not a hub. A hub needs to connect to the infrastructure you already rely on. That makes it a hub and makes this concept very powerful. This is something new. It is an enterprise data hub. This is a very big deal.

Cloudera announced, via Mike, the release of Cloudera 5, the industry’s first Enterprise Data Hub.

Bottom line of this new Enterprise Data Hub capability: scale-out storage, security, good data governance capability, and a rich collection of engines for working on the data in place and delivering results to your systems and people.

For more, see Mike expand on this concept here.




More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder of Crucial Point and publisher of CTOvision.com.
