Intro
It's been three years since I started my journey into the cloud, and every day, there's something new to set your eyes on. About a year into my journey in the cloud (AWS), I noticed a growing requirement for cloud engineers with a solid understanding of networking.
As organisations and businesses grow rapidly in the cloud, so does some of their technical debt. More and more, I see a lot of this debt surrounding networking in the cloud space, and as such, there is a substantial requirement for network engineers to gain a better understanding of networking in the cloud.
This post will mostly serve as an introductory guide for network engineers who wish to transition to cloud networking. I have been in the realm of traditional networking for over 15 years and, in the last couple of years, sat and passed the AWS Architect Associate and the Advanced Networking Speciality exams.
This post is part of a series aimed at teaching network engineers about networking in AWS. While this post is predominantly high-level, the following posts will dive into the technical details.
Quick disclaimer: I still work in traditional networking and aim to move further into the cloud networking space when an opportunity arises. In the meantime, I'd like to share my journey as a Network Engineer moving towards that space, along with the knowledge I've picked up from working with AWS and passing the Advanced Networking exam.
Before we start, I want to quickly outline the key differences between cloud and "traditional networking".
- Physical Infrastructure vs Virtual Resources
- As a traditional network engineer, you're mostly used to physical hardware like routers and switches, often on-premises in a data centre.
- Cloud networking relies on virtual resources being provisioned in the cloud. We no longer worry about infrastructure maintenance or capacity planning (to some extent).
- Scalability and Elasticity
- Cloud networking offers scalability and elasticity beyond what you could think of on-premises unless you can rack and stack all the kit to your heart's desire. You can easily scale your network resources up and down based on demand, whereas in traditional networking, scaling in this way would generally require the kit purchase, racking the equipment, configuring the kit, etc.
- Pay-as-you-go Model
- Cloud networking typically follows a pay-as-you-go or pay-for-what-you-use model. You will only ever pay for the resources you consume, requiring less upfront capital investment.
- Global Reach
- In a traditional network environment, if you wanted a global reach and the end user to have the best experience, you would have to place hardware as close as possible to the user. This means multiple PoPs or DCs wherever your services are to be accessed.
- Cloud networks offer a global reach, enabling organisations and businesses to get closer to their users quickly. Whether you utilise CDNs (Content Delivery Networks) or replicate your infrastructure from one location to another using IaC (Infrastructure as Code), you have ready-made ways to give users a better experience.
- Automation and Orchestration (My favourite part)
- Cloud networking relies heavily on automation and orchestration tools. Those times you had to log into your Cisco router at 2 am and manually update some configuration? Gone. Tasks that were manual in traditional networking, like provisioning, can be automated in the cloud, increasing efficiency and reducing the risk of human error.
- Security and Compliance
- Security in cloud networking fundamentally requires a different approach, as data and resources are stored off-premises. As robust as AWS is with their security features, organisations and businesses must ensure compliance with cloud-specific standards. More on this a little later.
- Network Monitoring and Management
- Cloud networking offers cloud-native monitoring and management tools, often with more extensive insights into resource performance than traditional network monitoring solutions. Many other tools also plug into AWS' APIs and provide much more insight again.
Section 1: Understanding Cloud Networking
Cloud networking is the practice of leveraging cloud computing resources to build, manage and optimise networks and network-based services. It allows organisations to create virtualised, scalable and resilient network architectures. Unlike traditional networking approaches that rely on physical hardware and data centres, cloud networking operates and utilises a virtualised, software-defined environment, offering many benefits.
So, how does cloud networking differ from traditional on-premises networking?
- Physical Infrastructure vs Virtual Resources
- In traditional networking, organisations are typically tethered to physical hardware, requiring substantial upfront investments in equipment, space and maintenance. In contrast, cloud networking is purely virtual resources, removing the need for physical infrastructure and the associated complexities.
- Scalability and Elasticity
- Cloud networking brings scalability and elasticity that are typically unavailable with on-premises infrastructure, unless you constantly keep a high level of headroom on your equipment and can add hardware at a moment's notice. It lets you quickly build out and tear down large-scale environments and scale them out as demand requires.
- Pay-as-you-go Model
- One of the leading cloud networking features is the pay-as-you-go type model. This removes any upfront investments and barriers to entry compared to purchasing and utilising physical network infrastructure. You only pay for the resources you consume, optimising cost-efficiency and aligning expenses with actual usage. As a network engineer, this would also mean it's just as easy for you to spin up a lab-type environment and get learning!
Section 2: Key Concepts in Cloud Networking
As we look at cloud networking, we must familiarise ourselves with some foundational concepts that define networking in the cloud. In this section, I want to briefly touch on some key concepts central to cloud networking, focusing on AWS.
A service you may have heard about during corridor talk is the VPC, but what is it? At the heart of almost all technologies deployed in AWS, and your first step in your cloud networking journey, is AWS' VPC (Virtual Private Cloud). Think of a VPC as your private area of the AWS cloud, a virtual network dedicated exclusively to your account. It serves as the foundational building block for creating and managing a multitude of resources, including network resources, in a cloud-native manner. You can have many VPCs per AWS account, all of which can be connected in many ways. Think of a VPC as your private data centre in the cloud.
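As a rough illustration of what "a virtual network dedicated to your account" means in practice, the sketch below carves an address range into subnets using Python's standard `ipaddress` module. The 10.0.0.0/16 range and the /24 subnet size are assumptions chosen for the example, not values from this post. The one AWS-specific detail is that five addresses in every subnet are reserved (the first four and the last).

```python
import ipaddress

# Hypothetical VPC range, chosen only for illustration.
vpc_cidr = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 subnets and keep the first four.
subnets = list(vpc_cidr.subnets(new_prefix=24))[:4]

# AWS reserves five addresses in every subnet (first four and the last).
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(subnet: ipaddress.IPv4Network) -> int:
    """Return how many addresses are left for resources in a subnet."""
    return subnet.num_addresses - AWS_RESERVED_PER_SUBNET


for subnet in subnets:
    print(f"{subnet} -> {usable_hosts(subnet)} usable addresses")
```

If you're used to subtracting two addresses (network and broadcast) per subnet, the main adjustment here is subtracting five instead.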
The Role of a VPC in Isolated Network Environments
The primary purpose of a VPC is to enable the creation of isolated network environments within AWS. VPCs provide a virtualised, software-defined approach to network isolation, similar to how you would physically segment your network in some environments or virtually if your hardware is capable.
So why use VPCs and isolate your resources?
- Enhance Security
- VPCs allow for strict control over network traffic by defining who can communicate and access which resources. Security groups and NACLs (Network Access Control Lists) play vital roles in governing inbound and outbound traffic.
- Isolate Workloads
- VPCs can host a wide range of resources, such as EC2 instances (virtual servers), databases, storage and more, all within a logically segregated environment. This isolation prevents resource interference, enhances performance, and provides a separate environment for different applications or services. For example, a VPC for Service A and a VPC for Service B.
VPCs vs Traditional Data Centres
The best way to describe the significance of VPCs is to compare them with the traditional data centre model.
- Flexibility and Agility
- VPCs offer a completely different level of flexibility and agility, allowing organisations to provision and configure network resources swiftly. The traditional data centre often entails lengthy procurement and setup processes.
- Scalability
- VPCs support dynamic scaling, where resources can be added, removed, scaled out or in, up or down as needed to accommodate changing workloads. The traditional data centre typically faces scalability limitations and may require upfront investment and planning for expansion.
- Global Reach
- VPCs can provide organisations with global reach, enabling them to deploy resources in multiple geographical regions seamlessly. With traditional data centres, you would typically have to build and maintain a data centre in each location.
The Concept of Elasticity
Elasticity is a core capability of cloud networking and computing. It is the ability to dynamically allocate and de-allocate resources based on demand. Within a VPC, elasticity allows organisations to:
- Auto-scale Resources
- Elasticity enables the automatic (and manual) scaling of resources like EC2 instances or containers in response to changes in traffic and load. When demand increases, additional resources are provisioned; when demand decreases, the excess resources are terminated.
- Optimise Costs
- By scaling resources in response to demand, organisations can optimise costs, ensuring they only pay for the resources they use. In the traditional model, by contrast, it is common to over-provision resources to handle peak loads and leave them running all day.
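To make the scale-out and scale-in idea concrete, here is a minimal sketch of a proportional scaling rule. This is an assumption for illustration, not AWS's actual Auto Scaling algorithm: the function name `desired_capacity`, the metric values and the size limits are all hypothetical.

```python
import math


def desired_capacity(current_instances: int, observed_metric: float,
                     target_metric: float, min_size: int = 1,
                     max_size: int = 10) -> int:
    """Proportional rule: keep the per-instance metric near the target.

    Simplified illustration only; real scaling policies also consider
    cooldowns, warm-up periods and metric smoothing.
    """
    if current_instances < 1 or target_metric <= 0:
        raise ValueError("need at least one instance and a positive target")
    wanted = math.ceil(current_instances * observed_metric / target_metric)
    # Clamp to the group's configured bounds.
    return max(min_size, min(max_size, wanted))


# 4 instances at 90% average CPU with a 60% target: scale out.
print(desired_capacity(4, 90.0, 60.0))  # 6
# 4 instances at 15% average CPU with a 60% target: scale in.
print(desired_capacity(4, 15.0, 60.0))  # 1
```

The clamping step is what keeps costs bounded: demand can push the group up to `max_size` but no further, and it drops back to `min_size` when things are quiet.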
The Importance of Resource Isolation
Resource isolation is a fundamental principle in cloud networking for security and performance reasons. In a cloud-native environment like AWS, resource isolation involves:
- Security Isolation
- Isolating resources in a VPC prevents unauthorised access and potential security breaches. Security Groups and NACLs act as your virtual security perimeter around resources.
- Performance Isolation
- Resource isolation ensures that individual workloads do not interfere with each other's performance. Each VPC can have its own dedicated resources, guaranteeing consistent performance.
The VPC is the bedrock of isolated network environments and resources, enabling organisations to benefit from elasticity and resource isolation to build secure, flexible and efficient network architectures in the cloud.
Section 3: AWS Networking Components
In this section, I'd like to cover some of the key cloud networking services and components that you will see daily and that form the backbone of many services built on AWS's cloud infrastructure. These services and components play a vital role in enabling a seamless and robust networking experience.
Elastic Compute Cloud (EC2) Instances
At the core of AWS are EC2 instances. Think of these as the Virtual Machines of the cloud. They make up the majority of workloads on AWS, offering scalable compute capacity on demand.
AWS EC2 instances allow organisations to:
- Run Applications
- EC2 instances serve as the foundation for running applications, services and workloads in AWS. They provide the processing power needed to execute a wide range of tasks.
- Customise Environments
- EC2 instances come in various instance types, each tailored to specific workloads, from GPU to CPU enhanced. Organisations can choose instances with their preferred CPU, memory, storage and network performance balance.
- Enhanced Scalability
- Instances can be scaled up or down (manually or automatically) to adapt to changing demand. Auto Scaling groups enable automatic provisioning and termination of instances based on your preferences and resource utilisation.
- Custom Images for enhanced networking capabilities
- Using custom images created by network vendors, you can enhance your networking ability in the cloud by using familiar software & systems. These images can range from vendors such as Juniper, Cisco, F5, Fortinet and more, bringing a familiar experience you're probably used to.
Simple Storage Service (S3)
Another integral part of AWS is Amazon's S3 services, a highly scalable object storage service. While S3 is mostly known for its role in data storage, its significance in networking is vitally important.
- Scalable Object Storage
- S3 provides organisations with a scalable and durable storage solution for many data types, including documents, images, videos and backups.
- Data Transfer
- S3 facilitates efficient data transfer over the internet, supporting HTTP, HTTPS and other protocols, making it an essential component for content delivery (at global scale) and data distribution.
- Integration with Services
- S3 is integrated with various AWS services, allowing for storing data used by other AWS components, such as EC2 instances or Lambda functions. It can also be integrated with other networking-based services, like CloudFront, and provide businesses with an easy and cost-effective way to distribute content globally with low latency (by caching data at endpoints) and high data transfer speeds.
Elastic Load Balancers (ELB)
Elastic Load Balancers (ELB) evenly distribute incoming traffic across multiple resources within an AWS environment. ELBs help to ensure high availability, fault tolerance and scalability within AWS, similar to how a load balancer in your data centre would work (like F5).
Some notable features are:
- Traffic Distribution
- ELBs distribute incoming traffic across multiple EC2 instances or other AWS resources (such as containers, etc.), ensuring that workloads are balanced and no single resource is overwhelmed.
- Failover and Redundancy
- ELBs provide failover capabilities, automatically routing traffic to resources in case of failures. This ensures continuous service availability.
- SSL Termination
- ELBs can also handle SSL/TLS termination, offloading the encryption and decryption process from your backend resources, enhancing performance and allowing you to inspect traffic (depending on your setup) before it arrives at your compute.
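The traffic distribution and failover behaviour described above can be sketched as a round-robin picker that skips unhealthy targets. This is a toy model under stated assumptions: the class `RoundRobinBalancer` and the target names are hypothetical, and real ELBs support other algorithms and perform their own health checks.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinBalancer:
    """Toy load balancer: rotate through targets, skipping unhealthy ones."""

    def __init__(self, targets: List[str]) -> None:
        self.health: Dict[str, bool] = {t: True for t in targets}
        self._ring: Iterator[str] = cycle(targets)
        self._size = len(targets)

    def mark(self, target: str, healthy: bool) -> None:
        """Record the result of a (simulated) health check."""
        self.health[target] = healthy

    def next_target(self) -> str:
        """Return the next healthy target, trying each target at most once."""
        for _ in range(self._size):
            candidate = next(self._ring)
            if self.health[candidate]:
                return candidate
        raise RuntimeError("no healthy targets available")


lb = RoundRobinBalancer(["instance-a", "instance-b", "instance-c"])
print([lb.next_target() for _ in range(3)])
lb.mark("instance-b", healthy=False)  # simulate a failed health check
print([lb.next_target() for _ in range(3)])
```

Once `instance-b` is marked unhealthy, requests rotate between the remaining two targets until it is marked healthy again.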
Route 53
Amazon's Route 53 is AWS's scalable and highly available DNS service. While DNS typically is an insignificant part of traditional network engineering concepts, in cloud networking and AWS, it plays a much more substantial role.
Route 53's significance in cloud networking includes:
- DNS Management
- Route 53 allows organisations to manage domain names and route internet traffic to AWS resources, such as EC2 instances, S3 buckets or ELBs.
- X-Based Routing
- Route 53 offers various routing options based on the routing policies you choose. These can range from simple routing policies, geolocation routing policies (route traffic to services close to users), latency routing policies (route traffic to the location that provides the best latency for the user) and much more.
- Failover and DNS Health Checks
- Route 53 can automatically route traffic away from unhealthy resources, ensuring high availability and fault tolerance.
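As a rough mental model of latency-based routing combined with health checks, the sketch below picks the lowest-latency endpoint among those currently healthy. The `Endpoint` type, region names and latency figures are hypothetical and only illustrate the decision, not how Route 53 measures or stores anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Endpoint:
    region: str
    latency_ms: float  # latency as seen from the user (made-up numbers)
    healthy: bool = True


def pick_endpoint(endpoints: List[Endpoint]) -> Endpoint:
    """Choose the healthy endpoint with the lowest latency."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        raise LookupError("no healthy endpoints to route to")
    return min(healthy, key=lambda e: e.latency_ms)


endpoints = [
    Endpoint("eu-west-2", 12.0),
    Endpoint("us-east-1", 85.0),
    Endpoint("ap-southeast-2", 260.0),
]
print(pick_endpoint(endpoints).region)  # eu-west-2

# If the closest region fails its health check, traffic moves to the next best.
endpoints[0].healthy = False
print(pick_endpoint(endpoints).region)  # us-east-1
```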
Transit Gateway
AWS' Transit Gateway is a service that simplifies network connectivity in complex multi-VPC architectures. It acts as a central hub (think of a central router) for routing traffic between VPCs and on-premises data centres using either VPN connections or AWS Direct Connect.
Some benefits of Transit Gateway include:
- Simplified Connectivity
- Transit Gateway simplifies VPC-to-VPC and VPC-to-DC connectivity, reducing the need for complex peering relationships between VPCs and for multiple VPN connections.
- Scalability
- Each Transit Gateway supports up to 5,000 VPCs and can handle significant traffic volume, making it well suited to large-scale deployments.
- Centralised Control
- Transit Gateway provides centralised control, visibility and management over network traffic.
- Security & Routing
- It offers fine-grained control over routing and access policies, enhancing network security.
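If you think of Transit Gateway as a central router, its route table behaves the way you'd expect: the most specific matching prefix wins. The sketch below shows a longest-prefix-match lookup over a made-up route table. The CIDRs and attachment names are assumptions for illustration only.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical route table: destination CIDR -> attachment name.
ROUTE_TABLE: Dict[str, str] = {
    "10.1.0.0/16": "vpc-a",
    "10.2.0.0/16": "vpc-b",
    "10.2.5.0/24": "vpc-inspection",
    "192.168.0.0/16": "dx-onprem",
    "0.0.0.0/0": "vpc-egress",
}


def lookup(route_table: Dict[str, str], destination: str) -> Optional[str]:
    """Return the attachment for the most specific route matching destination."""
    dest = ipaddress.ip_address(destination)
    best = None  # (network, attachment) of the longest match seen so far
    for cidr, attachment in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, attachment)
    return best[1] if best else None


print(lookup(ROUTE_TABLE, "10.2.3.4"))     # vpc-b
print(lookup(ROUTE_TABLE, "10.2.5.9"))     # vpc-inspection (the /24 beats the /16)
print(lookup(ROUTE_TABLE, "192.168.1.1"))  # dx-onprem
print(lookup(ROUTE_TABLE, "8.8.8.8"))      # vpc-egress (default route)
```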
These components come together as the building blocks to create a flexible and efficient networking environment in the cloud and are key concepts to understand.
Section 4: Connectivity Options
In this section, we'll briefly explore some of the connectivity options for AWS.
- VPN Connections
- Connect your on-premises infrastructure to your cloud environment utilising VPN connections. Your VPN connections can be terminated on AWS Transit Gateway or VPCs, depending on your requirements.
- Direct Connect (DX)
- With VPN connections, you're limited to a maximum throughput of 1.25 Gbps per tunnel. This is where Direct Connect comes in. DX acts as your own dedicated link into AWS, providing throughput of up to 100 Gbps. These connections come with a lead time for the physical installation but allow organisations to utilise a low-latency, high-speed link purely for AWS.
- Transit Gateway
- As detailed above, Transit Gateway is a central hub for connecting your VPCs and bringing in your on-premises connections, whether VPN or DX. Transit Gateway simplifies your cloud network configuration for large-scale deployments. These are a step up from VPC Peering.
- VPC Peering
- AWS VPC Peering is a service that is utilised to connect VPCs together. This service pre-dates Transit Gateway and is only to be used for certain use cases, such as connecting 2 or 3 VPCs together or if you created a shared services VPC for others to connect into. VPCs using VPC peering can only communicate with a directly connected peer, meaning you can't have transitive peering relationships.
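The non-transitive nature of VPC peering, and why a hub scales better, can be shown with a small model. The VPC names are hypothetical. The only real logic is that a peering connects exactly two VPCs, so a full mesh of n VPCs needs n(n-1)/2 peerings, whereas a hub needs one attachment per VPC.

```python
from typing import FrozenSet, Set


def can_reach_via_peering(peerings: Set[FrozenSet[str]], src: str, dst: str) -> bool:
    """Peering is non-transitive: only directly peered VPCs can talk."""
    return frozenset((src, dst)) in peerings


def peerings_for_full_mesh(vpc_count: int) -> int:
    """Number of peering connections needed so every VPC reaches every other."""
    return vpc_count * (vpc_count - 1) // 2


# Hypothetical setup: A <-> B and B <-> C are peered, A <-> C is not.
peerings = {frozenset(("vpc-a", "vpc-b")), frozenset(("vpc-b", "vpc-c"))}

print(can_reach_via_peering(peerings, "vpc-a", "vpc-b"))  # True
print(can_reach_via_peering(peerings, "vpc-a", "vpc-c"))  # False: no transit via B

# Full mesh of 10 VPCs versus one hub attachment per VPC.
print(peerings_for_full_mesh(10))  # 45 peerings, compared with 10 hub attachments
```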
Section 5: Security and Compliance
This section will delve into the realm of security and compliance in AWS. We'll cover some of the services you may encounter daily as a network engineer working with AWS.
Fortifying AWS Infrastructure and Resources
- IAM (Identity and Access Management)
- IAM is your control centre for managing users' access to AWS services and resources down to a granular level. It can also be used to control which resources can interact with other resources, like an EC2 instance accessing S3 buckets. This can be done with IAM Roles.
- Security Groups
- These let you define inbound and outbound traffic rules for your AWS resources and act in a STATEFUL way. You only add what you want to allow; everything else is denied access, down to protocol and IP address levels. These operate at the interface level for your resources.
- Network Access Control Lists (NACLs)
- NACLs in AWS operate at the subnet level and function as network-level firewalls. Unlike security groups, they are stateless, so return traffic must be explicitly allowed. They filter traffic based on rules you define, providing an additional layer of defence on top of security groups. NACLs act as border guards for your subnets, allowing or denying traffic and maintaining a secure perimeter.
- AWS Network Firewall
- AWS Network Firewall is an advanced threat protection service. It offers stateful inspection plus intrusion detection and prevention to safeguard network traffic. It is a paid service that you deploy wherever required, and it allows for much more advanced traffic inspection.
- AWS Web Application Firewall (WAF)
- AWS WAF shields your web applications from common web exploits and attacks, such as SQL injection and XSS (Cross-site scripting). It does this by inspecting incoming HTTP/ HTTPS requests and allowing or blocking them based on predefined rules. This, again, is a paid-for service that AWS offers.
- AWS Shield
- AWS Shield is designed to protect against DDoS attacks. It monitors and mitigates the impact of DDoS attacks, ensuring that your applications remain available and responsive, even under attack.
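To contrast how the two most common controls above make decisions, here is a simplified sketch. NACL rules are evaluated in rule-number order and the first match wins, with an implicit deny at the end. Security group rules are allow-only, and any match permits the traffic. The rule types, CIDRs and ports are hypothetical, and the model ignores protocols, port ranges and connection tracking.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass
class NaclRule:
    number: int   # lower numbers are evaluated first
    cidr: str
    port: int
    allow: bool


@dataclass
class SgRule:
    cidr: str     # security groups only have allow rules
    port: int


def nacl_allows(rules: List[NaclRule], source_ip: str, port: int) -> bool:
    """First matching rule (by rule number) decides; otherwise deny."""
    src = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        if port == rule.port and src in ipaddress.ip_network(rule.cidr):
            return rule.allow
    return False  # implicit deny when nothing matches


def sg_allows(rules: List[SgRule], source_ip: str, port: int) -> bool:
    """Any matching allow rule permits the traffic; otherwise deny."""
    src = ipaddress.ip_address(source_ip)
    return any(port == r.port and src in ipaddress.ip_network(r.cidr) for r in rules)


nacl = [
    NaclRule(100, "203.0.113.0/24", 443, allow=False),  # block one range
    NaclRule(200, "0.0.0.0/0", 443, allow=True),        # allow everyone else
]
sg = [SgRule("0.0.0.0/0", 443)]

print(nacl_allows(nacl, "203.0.113.7", 443))   # False: rule 100 matches first
print(nacl_allows(nacl, "198.51.100.5", 443))  # True: falls through to rule 200
print(sg_allows(sg, "203.0.113.7", 443))       # True: SGs cannot express a deny
```

This is why explicit blocks belong in a NACL (or a firewall service) rather than a security group.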
Conclusion
As we wrap up this high-level journey through AWS networking essentials, I want to take a moment to recap key terms and services we covered.
Key AWS Networking Terms and Services
- AWS Virtual Private Cloud (VPC): Your private slice of the AWS cloud, where you create and manage your network resources.
- Security Groups and NACLs: Virtual firewalls that control traffic in and out of your AWS resources and subnets.
- AWS Transit Gateway: A centralised hub simplifying network connections across multiple VPCs and on-premises environments.
- Direct Connect and VPN: Secure connectivity options for accessing AWS resources from different locations.
- IAM: Identity and Access Management for controlling user access to AWS services and resources.
- AWS Network Firewall and WAF: Advanced security services protecting your network traffic and web applications.
- AWS Shield: A guardian against DDoS attacks, ensuring your applications remain available.
Continuing Your AWS Networking Learning Journey
This brief exploration today came at a very high level. AWS offers many services and technologies you would use while networking in the cloud. As with most technologies, the cloud is dynamic and staying curious and informed is key to mastering the services.
I want to include a link to AWS's core network services. This should help take you beyond this post if you are curious about how it works until I take the time to develop further posts.
Part 1 - The Building Blocks of Cloud Networking - A Guide for Network Engineers
Part 2 - Deep Dive - VPCs - A Guide for Network Engineers