VPC design: subnets, NAT, endpoints — Best Practices in 2025 — Practical Guide (Oct 21, 2025)
Level: Intermediate
As of October 21, 2025
Introduction
Designing a Virtual Private Cloud (VPC) topology remains a core skill for software engineers, cloud architects, and DevOps professionals. In 2025, cloud vendors continue to evolve their networking features, and a sound design has to balance security, cost, and operational agility. This article outlines best practices for designing VPCs, focusing on subnets, Network Address Translation (NAT), and VPC endpoints. The examples use AWS, but the concepts carry over to Azure and Google Cloud Platform (GCP) under their own service names.
Prerequisites
- Basic understanding of IP networking and CIDR notation.
- Familiarity with cloud provider VPC fundamentals (e.g., AWS VPC, Azure VNet, GCP VPC).
- Knowledge of common cloud services and concepts such as EC2 instances, Lambda functions, and managed services.
- Access to a cloud environment with permission to create VPCs, subnets, gateways, and security groups.
Core Concepts and Design Considerations
Subnets: Private, Public, and Isolated
Subnets are the basic building blocks of your VPC design, logically partitioning your IP address space. The most common classifications are:
- Public subnets – Subnets whose route table has a default route to an Internet Gateway, used for resources that must accept inbound traffic from the internet (e.g., web servers).
- Private subnets – Subnets with no route to an Internet Gateway; any outbound internet access goes through a NAT device. They typically house backend services or databases.
- Isolated (or protected) subnets – Subnets with no internet route in either direction, not even through NAT, used exclusively for highly sensitive resources.
Best practice: Keep each subnet to a single tier, and replicate the same set of tiers in every Availability Zone (AZ) you use. Identical tiering across multiple AZs improves fault tolerance and reduces blast radius.
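The tiering advice above can be sketched as a small planning script. This is a minimal illustration using Python's standard `ipaddress` module; the function name `plan_subnets`, the /16 VPC range, the /24 subnet size, and the two AZ names are assumptions chosen for the example, not values prescribed by any provider.

```python
import ipaddress

def plan_subnets(vpc_cidr, azs, tiers=("public", "private", "isolated"), new_prefix=24):
    """Assign one subnet per (AZ, tier) pair from consecutive blocks of the VPC CIDR."""
    vpc = ipaddress.ip_network(vpc_cidr)
    blocks = vpc.subnets(new_prefix=new_prefix)  # generator of equally sized blocks
    plan = {}
    for az in azs:
        for tier in tiers:
            try:
                plan[(az, tier)] = next(blocks)
            except StopIteration:
                raise ValueError("VPC CIDR is too small for the requested layout") from None
    return plan

# Hypothetical layout: three tiers replicated across two AZs
plan = plan_subnets("10.0.0.0/16", ["us-east-1a", "us-east-1b"])
for (az, tier), net in plan.items():
    print(az, tier, net)
```

With these inputs, every AZ receives the same three tiers, and the blocks are handed out consecutively starting at 10.0.0.0/24. A real plan would usually leave gaps between tiers for growth; this sketch does not.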
NAT: Internet Access for Private Subnets
Private subnets often require outbound internet access for software updates or external API calls. This is enabled through NAT solutions:
- NAT gateways – A managed, scalable AWS service (Azure NAT Gateway and GCP Cloud NAT are the equivalents) that provides outbound NAT. On AWS, each NAT gateway lives in a single AZ, so high availability means deploying one per AZ.
- NAT instances – Self-managed EC2 instances configured with NAT capabilities; more complex and less scalable, generally discouraged in 2025.
When to choose NAT Gateway vs. other solutions: For most workloads, managed NAT gateways offer better uptime, ease of maintenance, and consistent performance. NAT instances might be suitable only if customised routing or software inspection is required, but they add operational overhead.
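Because an AWS NAT gateway is created in one specific subnet and AZ, a common pattern is to point each private subnet's default route at the NAT gateway in its own AZ. The sketch below models that mapping in plain Python; the function `default_routes` and all subnet and gateway IDs are hypothetical, and the fallback rule (use the gateway of the alphabetically first AZ) is an arbitrary assumption made only to keep the example deterministic.

```python
def default_routes(private_subnets, nat_gateways):
    """Map each private subnet to the NAT gateway that should receive 0.0.0.0/0.

    private_subnets: dict of subnet ID -> AZ name
    nat_gateways:    dict of AZ name -> NAT gateway ID
    Returns (routes, cross_az), where cross_az lists the subnets that had to
    fall back to a NAT gateway in a different AZ.
    """
    if not nat_gateways:
        raise ValueError("at least one NAT gateway is required")
    fallback = sorted(nat_gateways.items())[0][1]  # gateway of the first AZ by name
    routes, cross_az = {}, []
    for subnet_id, az in sorted(private_subnets.items()):
        nat_id = nat_gateways.get(az)
        if nat_id is None:
            nat_id = fallback
            cross_az.append(subnet_id)
        routes[subnet_id] = nat_id
    return routes, cross_az

routes, cross_az = default_routes(
    {"subnet-priv-a": "us-east-1a", "subnet-priv-b": "us-east-1b", "subnet-priv-c": "us-east-1c"},
    {"us-east-1a": "nat-aaa", "us-east-1b": "nat-bbb"},
)
print(routes)
print("cross-AZ subnets:", cross_az)
```

Subnets reported in `cross_az` depend on another zone for outbound traffic, which is worth reviewing before relying on the layout for fault tolerance.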
Endpoints: Private Connectivity to Cloud Services
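To keep traffic to cloud services off the public internet, modern VPC designs in 2025 rely heavily on VPC endpoints. On AWS these create private connections to services such as S3 and DynamoDB; Azure (Private Endpoints, e.g., for Blob Storage) and GCP (Private Service Connect or Private Google Access, e.g., for Cloud Storage) offer comparable mechanisms.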
Types of endpoints include:
- Interface endpoints – Powered by AWS PrivateLink; each one places an elastic network interface with a private IP in the subnets you select, giving private access to supported services.
- Gateway endpoints – Available only for S3 and DynamoDB; they are added as targets in route tables rather than as network interfaces, and they carry no additional charge.
Best practice: Use endpoints for services that private subnets access frequently. This keeps that traffic off the NAT path, which lowers cost, can reduce latency, and improves security by removing the need for public IPs.
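One way to see whether an interface endpoint is actually in use is to check that a service hostname resolves to addresses inside the VPC range. The following is a minimal sketch using only the standard library; the helper names `resolve_ipv4` and `addresses_inside_vpc`, the VPC CIDR, and the sample addresses are assumptions for illustration. It applies to interface endpoints with private DNS; gateway endpoints work through route tables and do not change DNS answers.

```python
import ipaddress
import socket

def resolve_ipv4(hostname, port=443):
    """Return the sorted IPv4 addresses that the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, family=socket.AF_INET, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})

def addresses_inside_vpc(resolved_ips, vpc_cidr):
    """Map each address to True if it falls inside the VPC CIDR, else False."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return {ip: ipaddress.ip_address(ip) in vpc for ip in resolved_ips}

# Hypothetical resolver answers: one endpoint ENI address and one public address
result = addresses_inside_vpc(["10.0.2.15", "52.94.0.10"], "10.0.0.0/16")
print(result)
```

On an instance inside the VPC, the output of `resolve_ipv4("ec2.us-east-1.amazonaws.com")` could be passed to `addresses_inside_vpc`; any `False` entry suggests the request would leave through the NAT or internet path instead of the endpoint.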
Hands-on Steps
Step 1: Designing Subnets and CIDRs
# Example: create one subnet per tier in a single AZ with the AWS CLI
# (replace vpc-12345678 with your VPC ID; repeat with new CIDRs for each additional AZ)
# Public subnet
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.1.0/24 --availability-zone us-east-1a
# Private subnet
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.2.0/24 --availability-zone us-east-1a
# Isolated subnet
aws ec2 create-subnet --vpc-id vpc-12345678 --cidr-block 10.0.3.0/24 --availability-zone us-east-1a
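When sizing the CIDR blocks in this step, it helps to remember that AWS reserves five addresses in every subnet (the first four and the last), so a /24 does not yield 256 usable addresses. A short sketch of the arithmetic, with `usable_hosts` as a hypothetical helper name:

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one in each subnet

def usable_hosts(cidr):
    """Number of addresses left for resources after AWS reservations."""
    total = ipaddress.ip_network(cidr).num_addresses
    return max(total - AWS_RESERVED_PER_SUBNET, 0)

for cidr in ("10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24", "10.0.0.0/28"):
    print(cidr, usable_hosts(cidr))
```

Each /24 above leaves 251 usable addresses, and a /28, the smallest subnet AWS allows, leaves 11.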
Step 2: Setting up a NAT Gateway
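# Allocate an Elastic IP for the NAT gateway; note the AllocationId in the output
aws ec2 allocate-address --domain vpc
# Create the NAT gateway in the public subnet; note the NatGatewayId in the output
aws ec2 create-nat-gateway --subnet-id subnet-public-id --allocation-id eipalloc-12345678
# Wait until the NAT gateway is available before routing traffic to it
aws ec2 wait nat-gateway-available --nat-gateway-ids nat-12345678
# Point the private subnet's default route (0.0.0.0/0) at the NAT gateway
aws ec2 create-route --route-table-id rtb-private-id --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-12345678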
Step 3: Creating VPC Endpoints
# Create a gateway endpoint for S3 (AWS) and attach it to the private route table
aws ec2 create-vpc-endpoint --vpc-id vpc-12345678 \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-private-id
# Create an interface endpoint (e.g., for the EC2 API) in the private subnet
aws ec2 create-vpc-endpoint --vpc-id vpc-12345678 \
  --vpc-endpoint-type Interface \
  --service-name com.amazonaws.us-east-1.ec2 \
  --subnet-ids subnet-private-id \
  --security-group-ids sg-12345678
Common Pitfalls
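- Overlapping CIDR blocks: IP ranges that overlap with peered VPCs or on-premises networks complicate routing and can block VPN or Direct Connect connectivity. (AWS already rejects overlapping subnets within a single VPC.)
- Insufficient NAT gateway capacity: NAT gateways scale automatically, but very high throughput or many concurrent connections to a single destination can exhaust bandwidth or ports. Splitting traffic across multiple gateways, or moving service traffic onto endpoints, relieves the pressure.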
- Ignoring endpoint limits: Most cloud providers impose soft and hard limits on the number of endpoints per VPC; monitor and request quota increases as needed.
- Incorrect security group or route table associations: Endpoint interfaces require proper security group rules and subnet associations; misconfiguration blocks connectivity.
- Relying on public IPs in private subnets: This defeats isolation principles and exposes unnecessary attack surfaces.
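The first pitfall is easy to catch before anything is deployed. The sketch below compares every pair of planned ranges with `ipaddress.ip_network(...).overlaps`; the network names and CIDRs are made up for the example, and `find_overlaps` is a hypothetical helper.

```python
import ipaddress
from itertools import combinations

def find_overlaps(named_cidrs):
    """Return the pairs of network names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]

overlaps = find_overlaps({
    "vpc-prod": "10.0.0.0/16",
    "vpc-staging": "10.1.0.0/16",
    "on-prem": "10.0.128.0/17",
})
print(overlaps)
```

Here the on-premises range sits inside the production VPC range, so that pair is reported while the staging VPC is clean.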
Validation
- Connectivity Testing: From an instance in a private subnet, verify outbound internet access via NAT on standard ports (e.g., HTTP 80, HTTPS 443).
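- Service reachability: Confirm that interface endpoints resolve service DNS names to private IPs in your VPC and that gateway endpoint routes (prefix-list destinations) appear in the private route tables, so traffic to managed services does not traverse the public internet.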
- Routing Table Inspection: Check each subnet’s route table for accurate next hops (IGW for public, NAT gateway for private, none for isolated).
- Security group audit: Ensure least privilege — only necessary ports/protocols open between subnets and endpoints.
- Logging and Monitoring: Enable VPC Flow Logs and NAT Gateway CloudWatch metrics to detect anomalies or bottlenecks.
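The routing table inspection can be partly automated. The sketch below classifies a route table by its default route, assuming route entries shaped like the `Routes` list returned by `aws ec2 describe-route-tables` (keys `DestinationCidrBlock`, `GatewayId`, `NatGatewayId`). The function name, the `"other"` label, and the sample IDs are assumptions for illustration.

```python
def classify_route_table(routes):
    """Return 'public', 'private', 'isolated', or 'other' based on the default route."""
    for route in routes:
        if route.get("DestinationCidrBlock") != "0.0.0.0/0":
            continue  # only the IPv4 default route decides the tier
        if route.get("GatewayId", "").startswith("igw-"):
            return "public"
        if route.get("NatGatewayId", "").startswith("nat-"):
            return "private"
        return "other"  # e.g. a NAT instance or transit gateway target
    return "isolated"  # no default route at all

local = {"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}
tables = {
    "rtb-public": [local, {"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-12345678"}],
    "rtb-private": [local, {"DestinationCidrBlock": "0.0.0.0/0", "NatGatewayId": "nat-12345678"}],
    "rtb-isolated": [local],
}
tiers = {name: classify_route_table(routes) for name, routes in tables.items()}
print(tiers)
```

Comparing the computed tier with the tier each subnet is supposed to have makes a misrouted subnet, such as an isolated subnet that has picked up a NAT route, stand out quickly.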
Checklist / TL;DR
- Define separate public, private, and isolated subnets across multiple AZs for fault tolerance.
- Use managed NAT Gateways for private subnet outbound connectivity; avoid NAT instances.
- Leverage VPC endpoints to access cloud services privately, improve security, and lower latency.
- Plan CIDR ranges carefully to prevent overlaps and to accommodate future scaling.
- Validate route tables and security group configurations regularly.
- Monitor NAT gateway and endpoint usage, and scale out or request quota increases as needed.
- Prefer Infrastructure as Code (IaC) tools like Terraform or AWS CloudFormation to enforce configuration consistency.