
Navigating the Complexities of Hybrid and Multi-cloud Environments

Phil Gervasi, Director of Tech Evangelism

Summary

Today’s evolving digital landscape requires both hybrid cloud and multi-cloud strategies to drive efficiency, innovation, and scalability. But this means more complexity and a unique set of challenges for network and cloud engineers, particularly when it comes to managing and gaining visibility across these environments.


General Eric K. Shinseki once said, “If you don’t like change, you’re going to like irrelevance a lot less.” For IT operations, the shifting strategies, technologies, and politics around public cloud exemplify this perfectly.

In 2006, AWS launched Elastic Compute Cloud (EC2), marking the advent of the public cloud provider. After a time, many IT leaders were determined to have nothing on-premises and put everything in a public cloud. Whether the goal was reducing cost, improving developer productivity, reducing operational burden, or improving the ability to scale, engineers had to re-tool, architectures had to change, and the bean counters had to shift from a purely CapEx model to OpEx.

As resources began moving to the cloud, many organizations ended up with hybrid cloud environments whether they wanted to or not: some resources had been migrated, others had not, and some were still in process.

However, many saw over time that some resources performed better and were cheaper to run on-premises in their own data centers. And so network engineers, sysadmins, and newly minted “cloud engineers” saw another shift, this time to architectures deliberately running a mixture of public cloud and private data centers, which is what we now refer to as hybrid cloud.

Container environments, SD-WAN, and the rise of remote work all certainly played a role, as did various other technical, financial, and political factors. Not long ago, we saw another shift to new designs with multiple cloud service providers, including SaaS.

Today’s evolving digital landscape leverages both hybrid cloud and multi-cloud strategies to drive efficiency, innovation, and scalability. However, this has meant more complexity and a unique set of challenges for network and cloud engineers, particularly when it comes to managing and gaining visibility across these environments.

Understanding hybrid and multi-cloud challenges

Today’s hybrid and multi-cloud environments pose new challenges for IT teams, especially for teams that run both.

First, connectivity is more complex. Each cloud service provider (CSP) has its own set of APIs, networking constructs, capabilities, and management tools. Engineers managing the connectivity to multiple clouds must somehow manage these disparate systems collectively.

Second, because we’re dealing with different providers, there is potential for inconsistent security policies. Especially in today’s cybersecurity landscape, ensuring consistent security policies across all platforms is absolutely critical. Each environment might have different security controls and compliance requirements, and the mechanisms used to enforce policies may vary from CSP to CSP.

Setting up effective and accurate security policies is difficult. Good governance in a public cloud environment is even harder because different teams are often deploying simultaneously. This is compounded further when trying to do that across multiple clouds and aligning public cloud with on-premises resources.
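One way to picture the consistency problem is as a comparison between an intended baseline and what each environment actually allows. The sketch below is purely illustrative: the Rule schema, the environment names, and the policy_drift helper are assumptions made for this example, not any CSP’s or Kentik’s API, and it presumes each provider’s native rules have already been translated into this common form.

```python
from __future__ import annotations

from typing import NamedTuple


class Rule(NamedTuple):
    """A provider-neutral allow rule (hypothetical schema for illustration)."""
    protocol: str      # "tcp" or "udp"
    port: int
    source_cidr: str


def policy_drift(
    baseline: set[Rule], per_env: dict[str, set[Rule]]
) -> dict[str, dict[str, set[Rule]]]:
    """Report, per environment, rules missing from or extra to the baseline."""
    report: dict[str, dict[str, set[Rule]]] = {}
    for env, rules in per_env.items():
        missing = baseline - rules   # intended but not enforced here
        extra = rules - baseline     # enforced here but never intended
        if missing or extra:
            report[env] = {"missing": missing, "extra": extra}
    return report


# Example data (invented): SSH should only be reachable from 10.0.0.0/8.
baseline = {Rule("tcp", 443, "0.0.0.0/0"), Rule("tcp", 22, "10.0.0.0/8")}
environments = {
    "aws": {Rule("tcp", 443, "0.0.0.0/0"), Rule("tcp", 22, "10.0.0.0/8")},
    "azure": {Rule("tcp", 443, "0.0.0.0/0"), Rule("tcp", 22, "0.0.0.0/0")},
}
drift = policy_drift(baseline, environments)
```

In this invented data set, only the second environment shows up in the report, because its SSH rule is open to the world instead of restricted to the internal range. The hard part in practice is the translation step into a common schema, which this sketch assumes away.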

Next, remember that we’re using resources on-premises, in the cloud, and from SaaS providers to deliver services, generally in the form of applications, to actual people. This makes performance monitoring vital to ensuring a great user experience.

The problem is that the different CSPs offer different levels of network performance, reliability, and visibility. This may be due to their internal infrastructure, where their data centers are located in the world, and what internet service providers are involved. When workloads rely upon components that span multiple environments, monitoring all these factors to understand and troubleshoot performance issues is very difficult for engineers, mainly because each CSP tends to have its own visibility and monitoring tools that don’t work across all clouds.

Visibility doesn’t just mean performance, though. CSPs also charge their customers for several services, including the ingress or egress of data from their cloud. Managing costs in multi-cloud environments can quickly become a disaster without proper visibility. Engineers need tools that provide insights into resource utilization and cost-efficiency across all platforms. Today, cloud monitoring means not only technical alerts for engineers but also billing alerts for business leaders. Surprise cloud bills are a problem for budget-conscious IT teams, and alerting on significant traffic changes gives decision makers the information they need long before the bill comes.
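As a rough illustration of what alerting on a significant traffic change can mean, the sketch below compares today’s egress volume against a simple historical average. The function name, the 50% threshold, and the sample numbers are all assumptions chosen for the example; a real system would likely use a more robust baseline that accounts for seasonality.

```python
from __future__ import annotations

from statistics import mean


def egress_alert(history_gb: list[float], today_gb: float, threshold: float = 0.5) -> bool:
    """Return True when today's egress exceeds the historical mean
    by more than `threshold`, expressed as a fraction of that mean."""
    if not history_gb:
        return False              # no baseline yet, so nothing to compare against
    baseline = mean(history_gb)
    if baseline == 0:
        return today_gb > 0       # any traffic is a change from zero
    return (today_gb - baseline) / baseline > threshold


# Invented daily egress volumes in GB for one cloud region.
history = [100.0, 110.0, 90.0, 100.0]
spike = egress_alert(history, 180.0)    # 80% above the mean of 100 GB
normal = egress_alert(history, 120.0)   # 20% above the mean of 100 GB
```

With these sample numbers, the 180 GB day trips the alert and the 120 GB day does not. Running the same check per provider and per region is what lets a cost conversation happen before the invoice arrives.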

Lastly, remember that even though our cloud resources are in someone else’s data center, often across the globe, data sovereignty and compliance remain a top priority. Adhering to legal and regulatory requirements across geographic and digital boundaries is complex, particularly when data resides across multiple clouds and regions.

Kentik’s role in enhancing cloud network visibility

Comprehensive visibility into network operations across all environments is critical to address these challenges. This is where Kentik’s platform excels. Kentik is vendor-agnostic when it comes to CSPs, so it provides a unified network analytics and visibility platform designed to work across modern hybrid and multi-cloud architectures.

 
Navigating the Complexities of Hybrid and Multi-cloud Environments with Phil Gervasi

Comprehensive data collection

Comprehensive visibility means ingesting many types of data from a wide variety of sources.

Chart showing types of network data that Kentik monitors

From the CSPs, Kentik ingests:

  • AWS VPC flow logs
  • AWS cloud metrics
  • Azure NSG flow logs
  • Azure Firewall logs
  • Azure cloud metrics
  • Google Cloud VPC flow logs
  • Google Cloud metrics
  • OCI VCN flow logs
  • OCI cloud metrics

From on-premises infrastructure, Kentik ingests:

  • Flow logs
  • SNMP information
  • Streaming telemetry
  • IPAM information
  • DNS information
  • Application and security tag information

From SaaS providers, Kentik monitors:

  • Packet loss, latency, and jitter from hundreds of vantage points to each SaaS provider
  • Path tracing between vantage points and individual points of presence

For internet service providers, Kentik ingests:

  • The global routing table
  • Path tracing among agents and points of presence
  • BGP information

This comprehensive data collection gives engineers a unified view of network performance across all cloud regions, providers, containers, and environments, including campus, on-premises data centers, public cloud, SaaS providers, internet service providers, and the pathways connecting everything.
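To make the idea of a unified view concrete, here is a minimal sketch of normalizing one of the sources listed above into a common record. It parses a single line in the AWS VPC flow log default (version 2) format. The FlowRecord schema and the parse_aws_default function are hypothetical names invented for this example and say nothing about how Kentik ingests or stores data internally; each other provider would need its own parser feeding the same record type.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Field order of the AWS VPC flow log default format (version 2).
AWS_DEFAULT_FIELDS = [
    "version", "account-id", "interface-id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log-status",
]


@dataclass(frozen=True)
class FlowRecord:
    """Provider-neutral flow record (hypothetical schema for illustration)."""
    provider: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int      # IANA protocol number, e.g. 6 for TCP
    packets: int
    byte_count: int
    start: int         # Unix seconds
    end: int           # Unix seconds
    action: str        # "ACCEPT" or "REJECT"


def parse_aws_default(line: str) -> Optional[FlowRecord]:
    """Parse one default-format line; return None for NODATA/SKIPDATA records."""
    parts = line.split()
    if len(parts) != len(AWS_DEFAULT_FIELDS):
        raise ValueError(f"expected {len(AWS_DEFAULT_FIELDS)} fields, got {len(parts)}")
    raw = dict(zip(AWS_DEFAULT_FIELDS, parts))
    if raw["log-status"] != "OK":
        return None    # no traffic data was recorded for this interval
    return FlowRecord(
        provider="aws",
        src_ip=raw["srcaddr"],
        dst_ip=raw["dstaddr"],
        src_port=int(raw["srcport"]),
        dst_port=int(raw["dstport"]),
        protocol=int(raw["protocol"]),
        packets=int(raw["packets"]),
        byte_count=int(raw["bytes"]),
        start=int(raw["start"]),
        end=int(raw["end"]),
        action=raw["action"],
    )


sample = ("2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
          "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK")
record = parse_aws_default(sample)
```

Once every source lands in the same shape, questions like "which flows crossed between clouds in the last hour" become a single query rather than one per provider.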

Contextual network visibility

All of this data lives in a single unified data repository (UDR), so it can be analyzed in context. This means that Kentik shows what’s happening and provides context on why it’s happening. For instance, Kentik can correlate traffic spikes in one part of the network with events in another. This deep insight is crucial for proactive management and rapid troubleshooting.

Kentik incorporates information from routing tables, DNS information, security tags, and application-layer information to help engineers understand the data as it relates to application performance. With this data, Kentik builds a clear visualization of these resources across on-premises data centers, containers, and clouds so that an engineer can clearly see, troubleshoot, and understand application traffic from end to end.

Consider a typical problem in a multi-cloud environment. Imagine a Kubernetes environment in Google Cloud in which two container pods experience high latency between each other. This, in turn, increases the delay in response to the request made by the web front-end service running in AWS. A delay experienced by the web front end normally means a longer page load or rendering time.

In other words, we have a user experience degradation due to a problem in one cloud affecting the performance of resources in another cloud.
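A simple way to reason about a scenario like this is to compare per-segment latency against a known-good baseline and see which segment moved the most. The sketch below does exactly that. The segment names and millisecond values are invented for the example, and worst_segment is a hypothetical helper, not a Kentik function.

```python
from __future__ import annotations


def worst_segment(
    baseline_ms: dict[str, float], current_ms: dict[str, float]
) -> tuple[str, float]:
    """Return the segment whose latency grew the most, with the increase in ms."""
    # Only compare segments that were measured in both snapshots.
    deltas = {
        seg: current_ms[seg] - baseline_ms[seg]
        for seg in baseline_ms
        if seg in current_ms
    }
    if not deltas:
        raise ValueError("no segments are present in both measurements")
    seg = max(deltas, key=lambda s: deltas[s])
    return seg, deltas[seg]


# Invented measurements for the scenario described above.
baseline = {
    "user -> web front end (AWS)": 40.0,
    "web front end (AWS) -> API (Google Cloud)": 25.0,
    "pod A -> pod B (Google Cloud)": 2.0,
}
current = {
    "user -> web front end (AWS)": 45.0,
    "web front end (AWS) -> API (Google Cloud)": 27.0,
    "pod A -> pod B (Google Cloud)": 60.0,
}
segment, increase_ms = worst_segment(baseline, current)
```

With these numbers, the pod-to-pod segment in Google Cloud stands out even though the symptom is reported at the AWS front end. That comparison is only possible when measurements from both clouds are available in one place.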

Visibility across service provider networks

Remember that the internet is the primary delivery mechanism for applications today, so monitoring ISPs is crucial for an overall hybrid and multi-cloud strategy.

Kentik offers visibility into service provider networks, bridging the visibility gap between on-premises infrastructure and cloud resources. This is particularly beneficial for tracking the performance of ISP and cloud provider links, which are critical for end-to-end service delivery.

This includes analyzing routing tables, BGP information, and the paths over the internet connecting our resources. With Kentik, we can see where a problem occurs with network performance between our end users and the public cloud resources they’re trying to consume.

Mastering the complexity of multi-cloud environments

Today, most organizations are hybrid, multi-cloud, or both. Architectures and our strategies for consuming new technologies keep changing, especially as we learn to manage them better.

Network and cloud visibility have become more complex, but they’ve also become more critical. Kentik is the modern, vendor-agnostic solution to staying on top of change regardless of the architecture, vendor, or the latest initiative from the CIO’s office.
