Welcome!

Cloud Security Authors: Pat Romanski, Liz McMillan, Elizabeth White, Zakia Bouachraoui, Yeshim Deniz

Related Topics: SYS-CON ITALIA, Containers Expo Blog

SYS-CON ITALIA: Article

Cloud Computing: Making Analytics in the Cloud a Reality

There will soon be a myriad of announcements of DBMS offerings in the cloud

There will soon be a myriad of announcements of DBMS offerings in the cloud. Many of these will NOT be marriages made in heaven. However, the most innovative new DBMS software combined with new cloud computing services are here today and truly take advantage of the cloud architecture in order to change the economics and the responsiveness of business analytics.

My belief is that cloud computing will change the economics of business intelligence (BI) and enable a variety of new analytic data management projects and business possibilities. It does so by making the hardware, networking, security, and software needed to create data marts and data warehouses available on demand with a pay-as-you-go approach to usage and licensing.

A computing cloud, such as the Amazon Elastic Compute Cloud, is composed of thousands of commodity servers running multiple virtual machine instances (VMs) of the applications hosted in the cloud. As customer demand for those applications changes, new servers are added to the cloud or idled and new VMs are instantiated or terminated.

Cloud computing infrastructure differs dramatically from the infrastructure underlying most in-house data warehouses and data marts. There are no high-end servers with dozens of CPU cores, SANs, replicated systems, or proprietary data warehousing appliances available in the cloud. Therefore, a new DBMS software architecture is required to enable large volumes of data to be analyzed quickly and reliably on the cloud's commodity hardware. Recent DBMS innovations make this a reality today, and the best cloud DBMS architectures will include:

  1. Shared-nothing, massively parallel processing (MPP) architecture. In order to drive down the cost of creating a utility computing environment, the best cloud service providers use huge grids of identical (or similar) computing elements. Each node in the grid is typically a compute engine with its own attached storage. For a cloud database to successfully "scale out" in such an environment, it is essential that the database have a shared-nothing architecture utilizing the resources (CPU, memory, and disk) found in server nodes added to the cluster. Most databases popularly used in BI today have shared-everything or shared-storage architectures, which will limit their ability to scale in the cloud.

  2. Automatic high availability. Within a cloud-based analytic database cluster, node failures, node changes, and connection disruptions can occur. Given the vast number of processing elements within a cloud, these failures can be made transparent to the end user if the database has the proper built-in failover capabilities. The best cloud databases will replicate data automatically across the nodes in the cloud cluster, be able to continue running in the event of 1 or more node failures ("k-safety"), and be capable of restoring data on recovered nodes automatically -- without DBA assistance. Ideally, the replicated data will be made "active" in different sort orders for querying to increase performance.

  3. Ultra-high performance. One of the game-changing advantages of the cloud is the ability to get an analytic application up quickly (without waiting for hardware procurement). However, there can be some performance penalty due to Internet connectivity speeds and the virtualized cloud environment. If the analytic performance is disappointing, the advantage is lost. Fortunately, the latest shared-nothing columnar databases are designed specifically for analytic workloads, and they have demonstrated dramatic performance improvements over traditional, row-oriented databases (as verified by industry experts, such as Gartner and Forrester, and by customer benchmarks). This software performance improvement, coupled with the hardware economies of scale provided by the cloud environment, results in a new economic model and competitive advantage for cloud analytics.

  4. Aggressive compression. Since cloud costs are typically driven by charges for processor and disk storage utilization, aggressive data compression will result in very large cost savings. Row-oriented databases can achieve compression factors of about 30% to 50%; however, the addition of necessary indexes and materialized views often swells databases to 2 to 5 times the size of the source data. But since the data in a column tends to be more similar and repetitive than attributes within rows, column databases often achieve much higher levels of compression. They also don't require indexes. The result is normally a 4x to 20x reduction in the amount of storage needed by columnar databases and a commensurate reduction in storage costs.

  5. Standards-based connectivity. While there are a number of special-purpose file systems that have been developed for the cloud environment that can provide high performance, they lack the standard connectivity needed to support general-purpose business analytics. The broad base of analytic users will use existing commercial ETL and reporting software that depend on SQL, JDBC, ODBC, and other DBMS connectivity standards to load and query cloud databases. Therefore, it's imperative for cloud databases to support these connection standards to enable widespread use of analytic applications.
In summary, cloud databases with the architectural characteristics described above will be able to not just run in the cloud, but thrive there by:

  • "Scaling out," as the cloud itself does
  • Running fast without high-end or custom hardware
  • Providing high availability in a fluid computing environment
  • Minimizing data storage, transfer, and CPU utilization (to keep cloud computing fees low)

More Stories By Jerry Held

Jerry Held is Executive Chairman of Vertica and CEO of the Held Consulting Group, a firm that provides strategic consulting to CEOs and senior executives of technology firms ranging from startups to very large organizations and private equity firms. Prior to his current position, Held was a senior executive at both Oracle Corp. and Tandem Computers.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


IoT & Smart Cities Stories
At CloudEXPO Silicon Valley, June 24-26, 2019, Digital Transformation (DX) is a major focus with expanded DevOpsSUMMIT and FinTechEXPO programs within the DXWorldEXPO agenda. Successful transformation requires a laser focus on being data-driven and on using all the tools available that enable transformation if they plan to survive over the long term. A total of 88% of Fortune 500 companies from a generation ago are now out of business. Only 12% still survive. Similar percentages are found throug...
The platform combines the strengths of Singtel's extensive, intelligent network capabilities with Microsoft's cloud expertise to create a unique solution that sets new standards for IoT applications," said Mr Diomedes Kastanis, Head of IoT at Singtel. "Our solution provides speed, transparency and flexibility, paving the way for a more pervasive use of IoT to accelerate enterprises' digitalisation efforts. AI-powered intelligent connectivity over Microsoft Azure will be the fastest connected pat...
As you know, enterprise IT conversation over the past year have often centered upon the open-source Kubernetes container orchestration system. In fact, Kubernetes has emerged as the key technology -- and even primary platform -- of cloud migrations for a wide variety of organizations. Kubernetes is critical to forward-looking enterprises that continue to push their IT infrastructures toward maximum functionality, scalability, and flexibility. As they do so, IT professionals are also embr...
CloudEXPO has been the M&A capital for Cloud companies for more than a decade with memorable acquisition news stories which came out of CloudEXPO expo floor. DevOpsSUMMIT New York faculty member Greg Bledsoe shared his views on IBM's Red Hat acquisition live from NASDAQ floor. Acquisition news was announced during CloudEXPO New York which took place November 12-13, 2019 in New York City.
In an age of borderless networks, security for the cloud and security for the corporate network can no longer be separated. Security teams are now presented with the challenge of monitoring and controlling access to these cloud environments, at the same time that developers quickly spin up new cloud instances and executives push forwards new initiatives. The vulnerabilities created by migration to the cloud, such as misconfigurations and compromised credentials, require that security teams t...
The graph represents a network of 1,329 Twitter users whose recent tweets contained "#DevOps", or who were replied to or mentioned in those tweets, taken from a data set limited to a maximum of 18,000 tweets. The network was obtained from Twitter on Thursday, 10 January 2019 at 23:50 UTC. The tweets in the network were tweeted over the 7-hour, 6-minute period from Thursday, 10 January 2019 at 16:29 UTC to Thursday, 10 January 2019 at 23:36 UTC. Additional tweets that were mentioned in this...
The term "digital transformation" (DX) is being used by everyone for just about any company initiative that involves technology, the web, ecommerce, software, or even customer experience. While the term has certainly turned into a buzzword with a lot of hype, the transition to a more connected, digital world is real and comes with real challenges. In his opening keynote, Four Essentials To Become DX Hero Status Now, Jonathan Hoppe, Co-Founder and CTO of Total Uptime Technologies, shared that ...
After years of investments and acquisitions, CloudBlue was created with the goal of building the world's only hyperscale digital platform with an increasingly infinite ecosystem and proven go-to-market services. The result? An unmatched platform that helps customers streamline cloud operations, save time and money, and revolutionize their businesses overnight. Today, the platform operates in more than 45 countries and powers more than 200 of the world's largest cloud marketplaces, managing mo...
When Enterprises started adopting Hadoop-based Big Data environments over the last ten years, they were mainly on-premise deployments. Organizations would spin up and manage large Hadoop clusters, where they would funnel exabytes or petabytes of unstructured data.However, over the last few years the economics of maintaining this enormous infrastructure compared with the elastic scalability of viable cloud options has changed this equation. The growth of cloud storage, cloud-managed big data e...
Your applications have evolved, your computing needs are changing, and your servers have become more and more dense. But your data center hasn't changed so you can't get the benefits of cheaper, better, smaller, faster... until now. Colovore is Silicon Valley's premier provider of high-density colocation solutions that are a perfect fit for companies operating modern, high-performance hardware. No other Bay Area colo provider can match our density, operating efficiency, and ease of scalability.