VPC design: subnets, NAT, endpoints — Design Review Checklist — Practical Guide (Oct 7, 2025)
Level: Intermediate Software Engineers
As of October 7, 2025
Introduction
Designing a Virtual Private Cloud (VPC) is foundational to secure, scalable cloud infrastructure. Whether you use AWS, Azure, or Google Cloud Platform (GCP), understanding how to structure subnets, use Network Address Translation (NAT), and leverage service endpoints directly impacts cost, security, and operational complexity.
This article distils best practices as of late 2025 for designing VPCs in modern cloud environments, focusing on subnet design, NAT usage, and endpoint configuration. Practical hands-on steps, common pitfalls, and a concise checklist will help you confidently review or design VPCs.
Prerequisites
- Familiarity with cloud networking basics: IP addressing, routing, security groups/firewalls.
- Understanding of your cloud provider’s core VPC features — e.g. AWS VPC, Azure VNets, or GCP VPCs.
- Knowledge of CIDR (Classless Inter-Domain Routing) notation and subnetting.
- Access to the cloud provider’s console or CLI tools for hands-on validation steps.
Hands-on Steps
1. Designing subnets: public, private, and isolated
Good subnet design segments workloads by exposure level and purpose. Typical categories:
- Public subnets: Routable to the internet via an Internet Gateway or equivalent.
- Private subnets: No direct internet routing, but can initiate outbound traffic via NAT.
- Isolated subnets: No internet access, typically for sensitive databases or backend services.
Key design points:
- Assign CIDR blocks that do not overlap with on-premises or peered networks; overlapping ranges cannot be routed between the two sides.
- Provision enough IPs for scaling but avoid excessive waste. Providers reserve some addresses in every subnet (AWS reserves five) and enforce minimum and maximum subnet sizes (on AWS, /28 to /16).
- Associate route tables per subnet type to control traffic flow clearly.
# Example AWS subnet CIDR approach (within a 10.0.0.0/16 VPC)
# Public subnet:   10.0.1.0/24 (256 addresses, 251 usable after AWS reserves 5)
# Private subnet:  10.0.2.0/24 (256 addresses, 251 usable)
# Isolated subnet: 10.0.3.0/28 (16 addresses, 11 usable, for a small DB tier)
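The CIDR plan above can be sanity-checked before anything is provisioned. The following is a minimal sketch using Python's standard `ipaddress` module; the helper names (`usable_hosts`, `find_overlaps`, `outside_vpc`) and the `on_prem` range are illustrative assumptions, and the reserved-address count of five applies to AWS specifically.

```python
import ipaddress
from itertools import combinations

# AWS reserves 5 addresses in every subnet; adjust for other providers.
AWS_RESERVED_PER_SUBNET = 5


def usable_hosts(cidr: str, reserved: int = AWS_RESERVED_PER_SUBNET) -> int:
    """Return the number of assignable addresses in a subnet."""
    return max(ipaddress.ip_network(cidr).num_addresses - reserved, 0)


def find_overlaps(named_cidrs: dict) -> list:
    """Return every pair of names whose CIDR blocks overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]


def outside_vpc(vpc_cidr: str, subnets: dict) -> list:
    """Return the names of subnets that do not fit inside the VPC range."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return [
        name
        for name, cidr in subnets.items()
        if not ipaddress.ip_network(cidr).subnet_of(vpc)
    ]


if __name__ == "__main__":
    plan = {
        "public": "10.0.1.0/24",
        "private": "10.0.2.0/24",
        "isolated": "10.0.3.0/28",
        "on_prem": "192.168.0.0/16",  # hypothetical on-premises range
    }
    for name, cidr in plan.items():
        print(name, cidr, usable_hosts(cidr))
    print("overlaps:", find_overlaps(plan))
```

Running a check like this in CI against the documented allocation is one way to catch an overlap before it becomes a routing conflict.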
2. Choosing NAT solutions – Gateway vs Instance
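Resources in private subnets reach the internet through NAT, which lets them initiate outbound connections without being reachable from outside. Common options:
- Managed NAT Gateway (AWS) / NAT Gateway (Azure) / Cloud NAT (GCP): Patched, scaled, and operated by the provider. An AWS NAT Gateway is redundant only within a single Availability Zone, so multi-AZ designs typically deploy one per zone. Preferred when operational simplicity matters more than the per-GB data processing charges.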
- NAT Instances / Self-managed NAT: Allows full instance-level control, custom routing or monitoring. Requires patching, scaling, and high availability management.
When to choose: For most architectures in 2025, managed NAT gateways are recommended given their availability and maintenance advantages. Use NAT instances only if you need customisation beyond gateway features.
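# Example AWS Terraform snippet for a NAT Gateway in a public subnet
# Elastic IP that gives the NAT Gateway a stable public address
resource "aws_eip" "nat" {
  domain = "vpc" # AWS provider v5+; older provider versions used `vpc = true`
}
# The gateway must sit in a public subnet (one routed to an Internet Gateway)
resource "aws_nat_gateway" "gw" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public_subnet.id
}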
3. Using VPC endpoints to improve performance and security
VPC endpoints connect your VPC privately to supported AWS services, so that traffic to those services never crosses the public internet. Other providers offer comparable features under different names.
Types include:
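- Interface endpoints: Elastic network interfaces with private IPs in your subnets that act as entry points to a service (e.g., Systems Manager, CloudWatch; S3 also supports them). They are billed per hour and per GB processed.
- Gateway endpoints: Route table targets available only for Amazon S3 and DynamoDB (AWS-specific), at no additional charge.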
Benefits:
- Eliminate exposure of traffic to the public internet.
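- Reduce NAT data processing charges by taking service traffic off the NAT path.
- Allow fine-grained endpoint policies that restrict which principals, actions, and resources can be used through the endpoint.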
When to choose: Enable endpoints for services heavily accessed by private instances, especially storage and logging services. Prefer gateway endpoints where available (currently Amazon S3 and DynamoDB), since they carry no additional charge. For other services, use interface endpoints, and weigh their hourly per-AZ cost against the NAT charges they avoid.
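Whether an interface endpoint pays for itself depends on traffic volume. The sketch below estimates the break-even point against sending the same traffic through a NAT gateway that exists anyway. All names are illustrative, and the rates in the example are hypothetical placeholders, not current prices; substitute figures from your provider's pricing page.

```python
from dataclasses import dataclass
from typing import Optional

HOURS_PER_MONTH = 730  # common billing approximation


@dataclass(frozen=True)
class Rates:
    """Placeholder rates; replace with your provider's current pricing."""
    nat_per_gb: float       # NAT data processing charge per GB
    endpoint_hourly: float  # interface endpoint charge per hour, per AZ
    endpoint_per_gb: float  # interface endpoint data processing per GB


def cost_via_nat(gb_per_month: float, rates: Rates) -> float:
    # The NAT hourly charge is excluded: the gateway is assumed to stay
    # in place for other traffic regardless of this decision.
    return gb_per_month * rates.nat_per_gb


def cost_via_interface_endpoint(gb_per_month: float, az_count: int, rates: Rates) -> float:
    fixed = az_count * rates.endpoint_hourly * HOURS_PER_MONTH
    return fixed + gb_per_month * rates.endpoint_per_gb


def break_even_gb(az_count: int, rates: Rates) -> Optional[float]:
    """Monthly GB above which the endpoint is cheaper, or None if never."""
    saving_per_gb = rates.nat_per_gb - rates.endpoint_per_gb
    if saving_per_gb <= 0:
        return None
    return az_count * rates.endpoint_hourly * HOURS_PER_MONTH / saving_per_gb


if __name__ == "__main__":
    example = Rates(nat_per_gb=0.05, endpoint_hourly=0.01, endpoint_per_gb=0.01)  # hypothetical
    print("break-even GB/month:", break_even_gb(az_count=2, rates=example))
```

Gateway endpoints need no such calculation because they add no charge, which is why they are the default choice where available.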
# Create an interface VPC endpoint for AWS Systems Manager
# (the IDs below are placeholders; substitute your own)
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-12345678 \
  --service-name com.amazonaws.us-east-1.ssm \
  --vpc-endpoint-type Interface \
  --subnet-ids subnet-abc1 subnet-abc2 \
  --security-group-ids sg-0123456789abcdef0
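Once an interface endpoint exists, one way to confirm it is actually being used is to check that the service hostname resolves to addresses inside the VPC. This is a minimal sketch, assuming private DNS is enabled on the endpoint and that the VPC uses the 10.0.0.0/16 range from the earlier example; the function names are illustrative, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, List, Sequence


def system_resolver(hostname: str) -> List[str]:
    """Resolve a hostname to IPv4 addresses using the OS resolver."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def addresses_outside_vpc(
    hostname: str,
    vpc_cidr: str,
    resolve: Callable[[str], Sequence[str]] = system_resolver,
) -> List[str]:
    """Return resolved addresses that are NOT inside the VPC range.

    An empty list suggests the name resolves to the endpoint's private IPs.
    """
    vpc = ipaddress.ip_network(vpc_cidr)
    return [a for a in resolve(hostname) if ipaddress.ip_address(a) not in vpc]
```

Run from an instance inside the VPC, a call such as `addresses_outside_vpc("ssm.us-east-1.amazonaws.com", "10.0.0.0/16")` should return an empty list when the endpoint is in effect. This check does not apply to gateway endpoints, which work through route tables rather than DNS.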
Common Pitfalls
- Overlapping CIDRs: Designing subnets overlapping with other VPCs, on-premises networks, or peered VPCs causes routing conflicts. Plan and document CIDR allocations carefully.
- Public access inappropriately granted: Leaving critical resources in public subnets can expose them to attacks. Only put components that require direct internet access (e.g., bastion hosts, load balancers) in public subnets.
- Underestimating NAT-related costs: Managed NAT gateways incur hourly and per-GB data processing charges and can be costly at scale. Monitor usage and move high-volume service traffic (such as S3) onto VPC endpoints. Do not assume compute discounts such as savings plans cover NAT charges; check your provider's pricing terms.
- Ignoring endpoint security: Interface endpoints need tightly scoped security groups, and both endpoint types need tightly scoped endpoint policies. Open policies can expose services or enable privilege escalation.
- Route table misconfigurations: Without explicit correct routes, instances may lose connectivity or allow unwanted traffic flows.
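Several of these pitfalls come down to a subnet's default route not matching its intended tier. The following sketch lints an in-memory description of route tables against the public/private/isolated model used in this guide. The `Subnet` structure and `check_routes` function are hypothetical; in practice the data would be exported from your provider's API or IaC state. It relies on the AWS ID prefixes `igw-` and `nat-` and checks only the IPv4 default route.

```python
from dataclasses import dataclass
from typing import Dict, List

DEFAULT_ROUTE = "0.0.0.0/0"

# Expected target prefix of the default route per tier (None = no default route).
EXPECTED_DEFAULT_TARGET = {"public": "igw-", "private": "nat-", "isolated": None}


@dataclass(frozen=True)
class Subnet:
    name: str
    tier: str                # "public", "private", or "isolated"
    routes: Dict[str, str]   # destination CIDR -> target ID


def check_routes(subnets: List[Subnet]) -> List[str]:
    """Return human-readable findings for subnets whose default route is wrong."""
    findings = []
    for s in subnets:
        if s.tier not in EXPECTED_DEFAULT_TARGET:
            findings.append(f"{s.name}: unknown tier '{s.tier}'")
            continue
        expected = EXPECTED_DEFAULT_TARGET[s.tier]
        actual = s.routes.get(DEFAULT_ROUTE)
        if expected is None:
            if actual is not None:
                findings.append(f"{s.name}: isolated subnet has a default route to {actual}")
        elif actual is None:
            findings.append(f"{s.name}: {s.tier} subnet has no default route")
        elif not actual.startswith(expected):
            findings.append(
                f"{s.name}: {s.tier} subnet default route targets {actual}, expected {expected}*"
            )
    return findings


if __name__ == "__main__":
    demo = [
        Subnet("web", "public", {"10.0.0.0/16": "local", DEFAULT_ROUTE: "igw-0a1"}),
        Subnet("app", "private", {"10.0.0.0/16": "local", DEFAULT_ROUTE: "igw-0a1"}),
        Subnet("db", "isolated", {"10.0.0.0/16": "local"}),
    ]
    for finding in check_routes(demo):
        print(finding)
```

In the demo data, the `app` subnet is labelled private but routes straight to an Internet Gateway, which is the kind of drift this check is meant to surface.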
Validation
- Subnet Validation: Check subnet IP ranges using provider console or CLI. Confirm no overlaps and correct association to route tables.
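- Connectivity Tests: From an instance in a private subnet, test internet egress (e.g., an HTTPS request to a known external host, since some NAT services do not pass ICMP). Also confirm that unsolicited inbound connections from the internet fail, and that isolated subnets have no egress at all.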
- NAT Gateway or Instance health: Use cloud monitoring dashboards to verify NAT resource availability and traffic metrics.
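- Endpoint Verification: For interface endpoints, use DNS queries from within the VPC to confirm the service name resolves to private IPs. For gateway endpoints, confirm the route tables contain the endpoint's prefix-list route. In both cases, validate that the endpoint policy allows only the intended access.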
- Audit Logs: Enable flow logs or equivalent to verify traffic flows align with your design and security policy.
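As an example of the audit-log check, the sketch below scans AWS VPC Flow Log records in the default (version 2) space-separated format and flags accepted traffic leaving an isolated subnet for destinations outside an allow-list. The function names and sample CIDRs are assumptions for illustration; custom log formats would need a different field list.

```python
import ipaddress
from typing import Dict, Iterable, List, Optional

# Field order of the default (version 2) AWS VPC Flow Log format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
).split()


def parse_record(line: str) -> Optional[Dict[str, str]]:
    """Split one log line into named fields; return None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def unexpected_egress(
    lines: Iterable[str], isolated_cidr: str, allowed_cidrs: List[str]
) -> List[Dict[str, str]]:
    """Return ACCEPT records sent from the isolated subnet to non-allowed destinations."""
    isolated = ipaddress.ip_network(isolated_cidr)
    allowed = [ipaddress.ip_network(c) for c in allowed_cidrs]
    hits = []
    for line in lines:
        rec = parse_record(line)
        if rec is None or rec["action"] != "ACCEPT":
            continue
        try:
            src = ipaddress.ip_address(rec["srcaddr"])
            dst = ipaddress.ip_address(rec["dstaddr"])
        except ValueError:
            continue  # NODATA / SKIPDATA records carry "-" instead of addresses
        if src in isolated and not any(dst in net for net in allowed):
            hits.append(rec)
    return hits
```

With the subnet plan from earlier, a call such as `unexpected_egress(lines, "10.0.3.0/28", ["10.0.0.0/16"])` should return an empty list if the isolated tier is truly isolated.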
Checklist / TL;DR
- Plan subnet CIDRs carefully respecting isolation and scale.
- Use public subnets only for resources requiring direct internet access.
- Implement private subnets with NAT for secure internet egress.
- Prefer managed NAT gateways over self-managed NAT where possible.
- Use VPC endpoints to reduce internet exposure and costs, applying the most appropriate endpoint type.
- Secure endpoints with strict security groups and IAM policies.
- Verify route tables correctly route traffic as intended.
- Monitor NAT usage and VPC flow logs for anomalies.
- Review and document your VPC design for future audits and scaling.