Welcome!

Cloud Security Authors: Elizabeth White, Terry Ray, Liz McMillan, Zakia Bouachraoui, Yeshim Deniz

Related Topics: SDN Journal, Microservices Expo, Containers Expo Blog, @CloudExpo, Cloud Security, @DXWorldExpo

SDN Journal: Blog Post

Aggregation Is Good. Aggregation Is Bad.

The vast majority of networking equipment is driven by specialized hardware

For as long as I remember networking has struggled with the balance between aggregated and individual traffic flows. Following the abilities of the technology components we use, we have been forced to aggregate, only to be allowed to de-aggregate or skip aggregation when technology caught up or surpassed the needs of today.

The vast majority of networking equipment is driven by specialized hardware. For datacenter switches, speed and port density are driving the requirements and physics and our technology capabilities create trade-offs that ultimately lead to some form of aggregation. Higher speed and more ports are traded off against memory, table space and functionality. These trade-offs will always exist, no matter what we are trying to build. Networking based in servers will have oodles of memory and table space to do very specific things for many many flows, making it extremely flexible, but those same servers cannot touch the packet processing speeds of the specialized packet processing hardware from Broadcom, Intel or Marvell, or the custom ASICs from Cisco, Juniper, or most anyone else.

funnelSo like it or not, we will want to do more than our hardware is capable of and as a result, we create aggregation points in the network where we lump a bunch of flows together into an aggregate flow and start making decisions on those. Nothing new, even good ole IP forwarding is doing so on an aggregate set of flows, it only makes decisions for all flows destined to a specific IP address.

Network tunnels are the most obvious examples of aggregation, their purpose is to hide information from intermediate networking equipment. In some cases we hide it to keep our table sizes under control, in some cases we hide it because we do not want the intermediate equipment to be able to see what we are transporting (IPSec, SSL, etc). And while sometimes the intermediate systems can see everything that is there, managing the complexity of that visibility simply becomes too expensive. This is why networks that are entirely managed and controlled per flow do not really exist at any reasonable scale, and probably never will.

For the exact same reason we aggregate, we lose the ability to act on specifics. When our tables are not large enough to track each and every flow, we can only make decisions based on what we have decided to keep in common. When talking about tunnels, the tunnel endpoints put new headers onto the original packets and intermediate systems can only act (with minor exceptions) on the information provided in these new headers. The original detail is still there and often visible to the intermediate system, but the intermediate system does not have the capacity to act on the sheer volume of that detail.

And there is the struggle. If I have more information, I can make better decisions. But when I aggregate because I cannot handle that extra information (due to sheer size or management complexity), my decisions by definition become more coarse and as a result, less accurate. But we want it all. We want the power to make decisions based on the most specific information we can, but want to aggregate for operational simplicity or because our hardware dictates. And this is where we get creative and start to turn what used to be black and white into gray.

There is nothing wrong with attempting to act on specifics for aggregate flows, but in so many cases its done as an afterthought and becomes hard to manage, control or specify. Some of the techniques we use are fairly clean, like taking the DSCP values from a packet and replicating it in the outer header of that same packet in a tunnel. Others are far more obscure like calculating some hash function on a packet header and using it as the UDP source port for the VXLAN encapsulated version of that packet. In even others, the original internals may be completely invisible to intermediate systems. STT for instance re-uses the format of TCP packets for its own purpose, but as a side effect of using it as a streaming-like protocol is that the original packet headers may not actually be in an IP packet on the wire. The STT header provides for a 64 bit Context-ID that can be used to take some bits of information from the original packet, but that STT header only appears in the first of what could be many individual packets that are re-assembled in the receiving NIC. Over the Christmas break I spent some time looking at each of the overlay formats and the tools modern day packet processors give you to act on these headers. I will share some of this in this forum next week.

Ultimately, overlay networks are creating a renewed emphasis on the choices between aggregation and individuality. Designed specifically to allow for more complex and scaled networks that hide a lot of the details from the dedicated network hardware, it comes with the price of less granular decisions by that hardware, which can certainly lead to less than optimal use of the available network.

[Today's fun fact: In the Netherlands, there is a 40% higher chance of homeowner insurance claims on the home owner's birthday. Those are some good parties.]

The post Aggregation is Good. Aggregation is Bad. appeared first on Plexxi.

Read the original blog entry...

More Stories By Marten Terpstra

Marten Terpstra is a Product Management Director at Plexxi Inc. Marten has extensive knowledge of the architecture, design, deployment and management of enterprise and carrier networks.

IoT & Smart Cities Stories
Apps and devices shouldn't stop working when there's limited or no network connectivity. Learn how to bring data stored in a cloud database to the edge of the network (and back again) whenever an Internet connection is available. In his session at 17th Cloud Expo, Ben Perlmutter, a Sales Engineer with IBM Cloudant, demonstrated techniques for replicating cloud databases with devices in order to build offline-first mobile or Internet of Things (IoT) apps that can provide a better, faster user e...
In his keynote at 19th Cloud Expo, Sheng Liang, co-founder and CEO of Rancher Labs, discussed the technological advances and new business opportunities created by the rapid adoption of containers. With the success of Amazon Web Services (AWS) and various open source technologies used to build private clouds, cloud computing has become an essential component of IT strategy. However, users continue to face challenges in implementing clouds, as older technologies evolve and newer ones like Docker c...
The Founder of NostaLab and a member of the Google Health Advisory Board, John is a unique combination of strategic thinker, marketer and entrepreneur. His career was built on the "science of advertising" combining strategy, creativity and marketing for industry-leading results. Combined with his ability to communicate complicated scientific concepts in a way that consumers and scientists alike can appreciate, John is a sought-after speaker for conferences on the forefront of healthcare science,...
Disruption, Innovation, Artificial Intelligence and Machine Learning, Leadership and Management hear these words all day every day... lofty goals but how do we make it real? Add to that, that simply put, people don't like change. But what if we could implement and utilize these enterprise tools in a fast and "Non-Disruptive" way, enabling us to glean insights about our business, identify and reduce exposure, risk and liability, and secure business continuity?
To Really Work for Enterprises, MultiCloud Adoption Requires Far Better and Inclusive Cloud Monitoring and Cost Management … But How? Overwhelmingly, even as enterprises have adopted cloud computing and are expanding to multi-cloud computing, IT leaders remain concerned about how to monitor, manage and control costs across hybrid and multi-cloud deployments. It’s clear that traditional IT monitoring and management approaches, designed after all for on-premises data centers, are falling short in ...
"The Striim platform is a full end-to-end streaming integration and analytics platform that is middleware that covers a lot of different use cases," explained Steve Wilkes, Founder and CTO at Striim, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
"MobiDev is a Ukraine-based software development company. We do mobile development, and we're specialists in that. But we do full stack software development for entrepreneurs, for emerging companies, and for enterprise ventures," explained Alan Winters, U.S. Head of Business Development at MobiDev, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
The deluge of IoT sensor data collected from connected devices and the powerful AI required to make that data actionable are giving rise to a hybrid ecosystem in which cloud, on-prem and edge processes become interweaved. Attendees will learn how emerging composable infrastructure solutions deliver the adaptive architecture needed to manage this new data reality. Machine learning algorithms can better anticipate data storms and automate resources to support surges, including fully scalable GPU-c...
As IoT continues to increase momentum, so does the associated risk. Secure Device Lifecycle Management (DLM) is ranked as one of the most important technology areas of IoT. Driving this trend is the realization that secure support for IoT devices provides companies the ability to deliver high-quality, reliable, secure offerings faster, create new revenue streams, and reduce support costs, all while building a competitive advantage in their markets. In this session, we will use customer use cases...
Machine learning has taken residence at our cities' cores and now we can finally have "smart cities." Cities are a collection of buildings made to provide the structure and safety necessary for people to function, create and survive. Buildings are a pool of ever-changing performance data from large automated systems such as heating and cooling to the people that live and work within them. Through machine learning, buildings can optimize performance, reduce costs, and improve occupant comfort by ...