The CTOvision Big Data Top Enterprise Tech List: The most important data technologies to accelerate into your infrastructure

By Bob Gourley

We produced this list as an aid to the enterprise CTO seeking information on the most capable mission-enabling infrastructure technologies. This is a companion piece to our list of the top analytical technologists on our Analyst One website. Our methodologies are at the bottom of this list.

We trust you will find this list interesting and informative. Have some new technologies to suggest for the list? Let us know at our Contact Page.

 

The CTOvision Big Data Top Enterprise Technologies List

Aerospike
Aerospike: Aerospike delivers the first flash-optimized in-memory database and the most reliable NoSQL database for revenue-critical, real-time big data applications. The database of choice in advertising, Aerospike is the user store and system of engagement for Internet-scale interaction platforms, such as AppNexus, Bluekai, eXelate, The Trade Desk and [x+1], predictably processing terabytes of data and billions of transactions per day, with 10x better performance, 10x fewer servers and zero downtime. Developers in mobile, video, gaming, social, ecommerce, retail and more can create the most compelling interactions by extending Aerospike to fit their applications. Aerospike is headquartered in Silicon Valley; investors include Alsop Louie, Draper Associates and NEA.

Appfluent
Appfluent: Appfluent provides IT organizations with visibility into usage and performance of data warehouse and business intelligence systems. IT decision makers can view exactly which enterprise data is being used or not used, determine how business intelligence is performing and identify causes of database performance issues. With Appfluent, customers can address exploding data growth and start the smart move to Hadoop and Big Data.

Arista Networks
Arista Networks: Arista Networks was founded to deliver networking solutions for large data center and HPC environments. It delivers a portfolio of Gigabit and 10GbE switches that redefine network architectures, bring extensibility to networking and dramatically change the price/performance of data center networks. At the core of Arista’s platform is the Extensible Operating System (EOS™), a ground-breaking network operating system with single-image consistency across hardware platforms and a modern core architecture enabling in-service upgrades and application extensibility.

Azul Systems
Azul Systems: Azul Zing™ is essential technology for Big Data applications that are critical to business results. Zing is the only Java performance solution that delivers both very low latency and high sustained throughput for real-time analytics and self-service business intelligence. With Zing your Big Data applications can utilize massive in-memory datasets while delivering predictable performance, allowing reports to be run on more live data with faster results. Zing even reduces or eliminates the need for extra caching applications.

Basho Technologies
Basho Technologies: Basho Technologies is the creator and developer of Riak, an open-source distributed database, providing extreme high-availability, fault-tolerance, and operational simplicity even at scale. Riak has rapidly gained adoption throughout the Fortune 100 and has become foundational to many of the world’s fastest-growing Web-based, mobile and social applications.

Cloudera
Cloudera: Cloudera pioneered the business case for Hadoop with CDH, the world’s most comprehensive, tested and widely deployed distribution of Hadoop. Its Platform for Big Data, Cloudera Enterprise, empowers enterprises to Ask Bigger Questions™ and gain rich, actionable insights from all their data to derive real business value and competitive advantage. As the top contributor to the Apache open source community and leading educator of data professionals, with tens of thousands of nodes under management and hundreds of customers across diverse markets, Cloudera is the category leader that sets the standard for Hadoop in the enterprise.

Couchbase
Couchbase: Couchbase is a leading provider of NoSQL database technology and the company behind the Couchbase open source project. Couchbase Server, the company’s flagship product, is a NoSQL document-oriented database with production deployments at AOL, Cisco, Concur, LinkedIn, Orbitz, Salesforce.com, Shuffle Master, Zynga and hundreds of other household names worldwide. It is particularly well suited for interactive applications, providing easy scalability, consistent high performance, 24×365 availability, and a flexible data model for ease of development.

Data Direct Networks
Data Direct Networks: DDN is the world’s largest privately held data storage infrastructure provider. With a unique and exacting focus on the requirements of today’s massive unstructured data generators, DDN has innovated a comprehensive product portfolio for Big Data applications that is optimized for the world’s most data-intensive environments. hScaler, the world’s first truly unified analytics appliance, is factory-configured and solution-ready; it can be deployed in hours and answer queries in seconds. hScaler can put you in charge of your big data while truly lowering your TCO.

Dataguise, Inc.
Dataguise, Inc.: Dataguise provides data privacy protection and risk assessment analytics allowing organizations to safely leverage and share enterprise data. Their solutions simplify governance by automatically protecting the data (masking or encryption) and providing actionable compliance intelligence. These capabilities simplify risk management and reduce regulatory compliance costs.

GridGain
GridGain: GridGain develops scalable, distributed, in-memory data platform technology for real time data processing. The company’s Java-based middleware products enable development of applications and services that can instantly access terabytes to petabytes of information from any data source or file system, distribute computational tasks across any number of machines, and produce results orders of magnitude faster than traditionally architected systems. GridGain’s customers include innovative web and mobile businesses, leading Fortune 500 companies, and top government agencies. The company is headquartered in Foster City, California.

Hadapt
Hadapt: Hadapt has developed the industry’s only Big Data analytic platform natively integrating SQL with Apache Hadoop. The unification of these traditionally segregated platforms enables customers to analyze all of their data (structured, semi-structured and unstructured) in a single platform, with no connectors, complexities or rigid structure. The company’s core technology began as research in the Yale Computer Science department under co-founders Dr. Daniel Abadi and Ph.D. student Kamil Bajda-Pawlikowski. In 2011, led by co-founder and CEO Justin Borgman, Hadapt raised a $9.5MM Series A round of funding from Bessemer Venture Partners and Norwest Venture Partners. The company is headquartered in Cambridge, MA.

Jaspersoft
Jaspersoft: Jaspersoft empowers millions of people every day to make faster decisions by bringing them timely, actionable data inside their apps and business processes. Its embeddable, cost-effective reporting and analytics platform allows anyone to quickly self-serve and get the answers they need and scales architecturally and economically to reach everyone.

Kognitio
Kognitio: Kognitio is an in-memory analytical platform that can be tightly integrated with Hadoop for high-performance advanced analytics that make Big Data more consumable for enterprises, especially those with mature BI environments or ingrained tools. An MPP platform itself, it enables ad-hoc queries in real time, wrapped in industry-standard SQL for easy dissemination without MapReduce. Parallelizing standard binary languages like R and Python to run statistical and algorithmic functions in-memory, it is used by Data Scientists, BI professionals and Systems/Database Administrators to give fast access to data that persists in Hadoop and other data storage layers, enabling a Logical Data Warehouse model.

LucidWorks
LucidWorks: LucidWorks, the trusted name in Search, Discovery and Analytics, transforms the way people access information to enable data-driven decisions. Leveraging both structured and unstructured data built on the power of Apache Lucene/Solr open source search, LucidWorks delivers unmatched stability, scalability, and time-to-delivery for search applications. LucidWorks Search provides ease of use development to access up to billions of documents with sub-second query and faceting response time. LucidWorks Big Data tightly integrates key Apache projects needed to build and deploy applications providing ubiquitous access to the data trapped inside Hadoop.

Mellanox Technologies
Mellanox Technologies: Mellanox Technologies (NASDAQ: MLNX, TASE: MLNX) is a leading supplier of end-to-end InfiniBand and Ethernet interconnect solutions and services for servers and storage. Mellanox interconnect solutions increase data center efficiency by providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance capability. Mellanox offers a choice of fast interconnect products: adapters, switches, software and silicon that accelerate application runtime and maximize business results for a wide range of markets including high performance computing, enterprise data centers, Web 2.0, cloud, storage and financial services. More information is available at www.mellanox.com. Founded in 1999, Mellanox Technologies is headquartered in Sunnyvale, California and Yokneam, Israel.

MemSQL
MemSQL: MemSQL is a distributed database for real-time analytics. Data scientists, analysts, and developers can query high-velocity workloads and historical data simultaneously, all through a convenient SQL interface. By combining significant speed and throughput advantages with complex analytics, an enterprise can gain instant insight into its business and stay competitive in a fast-moving environment.

MetaScale
MetaScale: An early adopter of big data and legacy modernization initiatives, MetaScale provides cutting-edge technologies, Hadoop training and technology solutions to its customers. As a subsidiary of Sears Holdings Corporation, we understand the value of heritage and the need for constant innovation to drive growth. Through this heritage, we offer a deep understanding of employing complex big data tools to solve traditional business problems in the enterprise. Our team brings extensive experience in the migration of workloads off mainframe, large-scale private open-source cloud computing, Hadoop for big data BI and legacy infrastructure modernization.

MongoDB
MongoDB: MongoDB (from humongous) is reinventing data management and powering big data as the leading NoSQL database. Designed for how we build and run applications today, it empowers organizations to be more agile and scalable. MongoDB enables new types of applications, better customer experience, faster time to market and lower costs. It has a thriving global community with over 4 million downloads, 100,000 online education registrations, 20,000 user group members and 20,000 MongoDB Days attendees. The company has more than 600 customers, including many of the world’s largest organizations.

Optensity
Optensity: Optensity provides AppSymphony, a platform that enables businesses and government organizations to exploit big data sources while leveraging scalable computing environments and their current workforce. AppSymphony’s execution engine runs across a variety of compute environments including Amazon EC2, Rackspace, and Google Compute Engine. Once an analytic workflow, or “App”, has been authored and validated, it is discoverable and usable by anyone else in the enterprise, maximizing the App’s utility to the entire organization.

Pentaho
Pentaho: Pentaho is building the future of business analytics. Pentaho’s open source heritage drives our continued innovation in a modern, integrated, embeddable platform built for accessing all data sources. With support for all of the leading Hadoop distributions, NoSQL databases and high performance analytic databases, Pentaho provides the broadest support for big data analytics, as well as integration and orchestration of big data and traditional sources.

Platfora
Platfora: Platfora’s mission is to empower customers to transform their businesses into fact-based enterprises. Platfora masks the complexity of Hadoop, making it easy for customers to understand all the facts in their business across events, actions, behaviors and time. For more details, visit www.platfora.com or follow @platfora and #FactBased on Twitter.

Progress DataDirect
Progress DataDirect: Progress DataDirect is the world leader in data connectivity, offering the most comprehensive software solutions for connecting the world’s most critical applications to data and services, running on any platform, using proven and emerging standards. Progress Software’s DataDirect Cloud product helps you address the challenges associated with cloud data connectivity by providing a managed service offering that delivers standards-based SQL connectivity to a broad spectrum of SaaS, Big Data, Social, and NoSQL data sources. With a proven 20-year history, strong technical leadership and a robust product line, Progress Software’s DataDirect line of products is relied on by software architects worldwide to connect their applications to an unparalleled range of data sources using standards-based interfaces such as ODBC, JDBC, ADO.NET, XQuery and SOAP.

Protegrity
Protegrity: Protegrity, the innovative leader of groundbreaking enterprise data security software, provides high performance, infinitely scalable end-to-end data security solutions for organizations worldwide. Protegrity helps its customers secure all of their sensitive data in Hadoop and across the enterprise, ensuring compliance with all PCI, PHI and Privacy regulations. Protegrity’s solutions give corporations the ability to implement a variety of data protection methods, including vaultless tokenization, strong encryption, masking and monitoring to ensure the protection of their sensitive data.

Rogue Wave Software
Rogue Wave Software: Rogue Wave Software is the largest independent provider of cross-platform software development tools and embedded components for the next generation of HPC applications. Offering a broad portfolio, Rogue Wave enables developers to increase productivity and harness the power of multicore computing while reducing the complexity of developing multi-processor and data-intensive applications. With Rogue Wave’s IMSL Numerical Libraries, businesses and organizations reduce development time, realize a lower total cost of ownership, and improve quality and maintainability. The robust and portable collection of embeddable math and statistical functions, available in native C, C++, C#, Fortran, and Java™, provides sophisticated analytics for high-performance, mission-critical applications.

SGI
SGI: SGI, the trusted leader in technical computing, helps customers solve their most demanding business and technology challenges by delivering high performance computing (HPC), Big Data, and data storage solutions that accelerate time to discovery, innovation, and profitability. Delivering extreme speed, scale, and efficiency, SGI server and storage offerings are utilized by scientific, business, and government communities to solve challenging, data-intensive computing and data management problems, typically requiring large amounts of computing power and fast and efficient data movement both within the computing system and to and from large-scale data storage installations.
SiSense
SiSense: SiSense Prism is a Big Data Analytics Solution that provides the benefits of In-Memory without its disadvantages. SiSense In-Memory Columnar Datastore analyzes 100 times more data at 10 times the speed of comparable solutions. There is no need to set up complex data warehouse systems or OLAP cubes, and no need for programming either, regardless of where the data comes from or how big it is.

Skytree Inc.
Skytree Inc.: Skytree’s Machine Learning platform gives organizations the power to discover deep analytic insights, predict future trends, make recommendations and reveal untapped markets and customers. Predictive Analytics and Machine Learning are quickly becoming must-have technologies in the age of Big Data, and Skytree provides the Enterprise-grade foundation. Skytree’s flagship product – Skytree Server – is the only general purpose scalable Machine Learning system on the market, built for the highest accuracy at unprecedented speed and scale.

SoftwareAG
SoftwareAG: SoftwareAG provides big data tools and infrastructure including Enterprise Ehcache. Enterprise Ehcache snaps into enterprise applications for a faster, easier, more broadly applicable approach to achieving high-performance scalability. Based on the de facto caching standard for enterprise Java, Enterprise Ehcache is an easy-to-deploy solution for hard-to-solve problems. With just a few config changes, you can achieve a 10-times improvement in application response times; gain headroom for terabytes of data growth; offload slow, expensive databases or mainframes; and save on licensing, administration and hardware costs.

Splunk
Splunk: Splunk Inc. (NASDAQ: SPLK) provides the engine for machine data. Splunk software collects, indexes and harnesses the machine-generated big data coming from the websites, applications, servers, networks and mobile devices that power business. Splunk software enables organizations to monitor, search, analyze, visualize and act on massive streams of real-time and historical machine data. More than 4,800 enterprises, universities, government agencies and service providers in over 80 countries use Splunk Enterprise to gain Operational Intelligence that deepens business and customer understanding, improves service and uptime, reduces cost and mitigates cyber-security risk. Splunk Storm, a cloud-based subscription service, is used by organizations developing applications in the cloud.

Sqrrl
Sqrrl: Sqrrl is a Big Data software company whose employees have dealt with the world’s largest, most complex, and most sensitive datasets for the last decade. Sqrrl’s software product, Sqrrl Enterprise, is the most secure and scalable Big Data platform for building real-time analytical applications and is powered by Apache Accumulo™ and Hadoop. Sqrrl Enterprise extends the capabilities of Accumulo with additional data ingest, security, and real-time analytical features that help unlock the power of Big Data.

Zettaset
Zettaset: Zettaset, the leader in secure Big Data management, automates, accelerates, and simplifies Hadoop deployment for the enterprise. Zettaset Orchestrator™ is the only Big Data management solution designed to address enterprise requirements for security, high availability, manageability and scalability in a distributed computing environment. Orchestrator helps organizations move Hadoop from pilot into production, replacing open source management with a more robust approach that easily fits into existing enterprise security and policy frameworks. Zettaset Orchestrator provides comprehensive fail-over for all critical cluster services, facilitates integration with the most widely adopted ETL and analytics applications, and is compatible with the leading Hadoop distributions.

 

Our Methodologies 

We firmly believe that technologies must be supported by strong companies, so we focus on companies with a proven ability to serve real enterprises. In most cases we select VC-backed firms because those come with staying power. We love open source, but open source solutions should also be supported by a strong firm. We also believe it is important to report only on firms whose products are really available now (no vaporware). Additionally, we believe most firms that have a capability that can make a difference for the modern analyst will be interested in demonstrating that capability at Hadoop World. This last assumption allowed us to get a jumpstart on our first list. We started our process by reviewing the full list of sponsors and exhibitors at the upcoming Hadoop World (for a full list of all exhibitors see here). We then reviewed previous research at our CTOlabs.com and CTOvision.com sites to round out this initial list.

We know our methodology has some holes. But as good analysts we are going to keep our eyes and ears open for other technologies we can report on and will modify this list as required. We also know we have you, dear readers, to check our assumptions and give us feedback on the list. If you have or know of a firm we should consider for this, let us know.

 

 


More Stories By Bob Gourley

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.
