Production-Grade EKS Platform with ALB, HTTPS & IRSA
Designed and deployed a secure, production-style Kubernetes platform on AWS using Terraform, Amazon EKS, AWS Load Balancer Controller, IRSA, and HTTPS termination via ACM. The project mirrors real-world cloud and DevOps operational patterns.
Status: Completed
🎯 Problem & Objective
The goal was to deploy a containerized application on Kubernetes while maintaining strong security boundaries, minimal IAM exposure, and clean infrastructure reproducibility. The platform needed to support HTTPS, controlled ingress, and safe teardown, all without leaking credentials or exposing workloads directly to the internet.
🏗️ High-Level Architecture
The architecture follows a layered cloud-native approach. A custom VPC hosts an Amazon EKS cluster with managed node groups. An internet-facing Application Load Balancer serves as the single entry point, routing traffic to Kubernetes services through the AWS Load Balancer Controller using IP mode.
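As a rough illustration of the ingress side of this architecture, the sketch below builds an Ingress manifest that asks the AWS Load Balancer Controller for an internet-facing ALB in IP mode. The two annotations are the controller's documented `scheme` and `target-type` settings; the resource names, the port, and the `alb` ingress class name are assumptions for illustration and are not taken from the project.

```python
import json


def build_alb_ingress(name, namespace, service_name, service_port):
    """Return an Ingress manifest (as a dict) for an internet-facing ALB in IP mode."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Public ALB rather than an internal one.
                "alb.ingress.kubernetes.io/scheme": "internet-facing",
                # IP mode: the ALB registers pod IPs directly as targets.
                "alb.ingress.kubernetes.io/target-type": "ip",
            },
        },
        "spec": {
            # Assumption: the controller's IngressClass is named "alb".
            "ingressClassName": "alb",
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service_name,
                                        "port": {"number": service_port},
                                    }
                                },
                            }
                        ]
                    }
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical names; kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(build_alb_ingress("web", "default", "web-svc", 80), indent=2))
```

In IP mode the ALB sends traffic straight to pod IPs instead of going through node ports, which is why the pods must be reachable from the load balancer's subnets inside the VPC.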
🧠 Key Design Decisions
- EKS with Managed Node Groups: Selected to reduce operational overhead while retaining Kubernetes flexibility and control.
- AWS Load Balancer Controller: Used to integrate ALB lifecycle management directly with Kubernetes ingress resources.
- IRSA (IAM Roles for Service Accounts): Implemented to eliminate static AWS credentials inside pods and enforce least privilege.
- ACM HTTPS Termination: TLS is terminated at the ALB layer using an ACM certificate, so application containers do not have to handle certificates or private keys.
- Terraform Modularization: Infrastructure split into VPC, EKS, node groups, and controller modules for clarity and reuse.
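To make the IRSA decision above more concrete, here is a minimal sketch of the kind of IAM trust policy that IRSA relies on: the role can be assumed only through the cluster's OIDC provider, and only by one named service account. The account ID, OIDC issuer, namespace, and service account name are placeholders chosen for illustration; the project's actual values are not given in this write-up.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build an IAM trust policy that lets one Kubernetes service account assume a role.

    oidc_provider is the cluster's OIDC issuer URL without the "https://" prefix.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Restrict the role to a single service account.
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                        # Only accept tokens issued for AWS STS.
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not taken from the project.
    policy = irsa_trust_policy(
        account_id="111122223333",
        oidc_provider="oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",
        namespace="kube-system",
        service_account="aws-load-balancer-controller",
    )
    print(json.dumps(policy, indent=2))
```

The `sub` condition is what enforces least privilege here: without it, any service account in the cluster could assume the role through the same OIDC provider.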
🛠 Tools & Technologies
✅ Execution & Verification
Infrastructure was provisioned incrementally using Terraform, followed by Kubernetes deployments and ingress configuration. HTTPS connectivity was validated using curl and browser testing. Health checks and routing were verified through ALB target groups and Kubernetes service inspection.
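The target group check can also be scripted. The sketch below assumes the health data was exported as JSON in the shape returned by the ELBv2 DescribeTargetHealth API (for example via `aws elbv2 describe-target-health`); whether the project used the CLI or the console for this step is not stated, and the sample IPs and port are hypothetical.

```python
def unhealthy_targets(describe_output):
    """Return (target_id, port, state, reason) for every target that is not healthy."""
    problems = []
    for desc in describe_output.get("TargetHealthDescriptions", []):
        health = desc.get("TargetHealth", {})
        state = health.get("State", "unknown")
        if state != "healthy":
            target = desc.get("Target", {})
            problems.append(
                (target.get("Id"), target.get("Port"), state, health.get("Reason"))
            )
    return problems


if __name__ == "__main__":
    # Hypothetical sample data in the DescribeTargetHealth response shape.
    sample = {
        "TargetHealthDescriptions": [
            {
                "Target": {"Id": "10.0.1.15", "Port": 3000},
                "TargetHealth": {"State": "healthy"},
            },
            {
                "Target": {"Id": "10.0.2.27", "Port": 3000},
                "TargetHealth": {
                    "State": "unhealthy",
                    "Reason": "Target.FailedHealthChecks",
                },
            },
        ]
    }
    for target_id, port, state, reason in unhealthy_targets(sample):
        print(f"{target_id}:{port} is {state} ({reason})")
```

With IP mode, each target ID is a pod IP, so an unhealthy entry can be matched directly against the output of `kubectl get pods -o wide`.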
🚧 Challenges Faced
- ALB provisioning failures: Caused by overly restrictive IAM permissions, resolved by refining the controller policy.
- Terraform state lock issues: Triggered by interrupted applies and resolved through DynamoDB lock inspection and safe recovery.
- HTTPS certificate validation errors: Fixed by creating a dedicated subdomain and validating DNS without affecting the main domain.
- 502 Bad Gateway errors: Traced to container port mismatch and health check configuration.
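The port mismatch behind the 502 errors is the kind of problem a small pre-deploy check can catch. The following is a sketch only: it compares a Service's `targetPort` values against the `containerPort` entries declared in a Deployment, both given as already-parsed manifests. The manifests and port numbers in the example are hypothetical, and because `containerPort` is informational in Kubernetes, the check is a heuristic rather than proof of what the container actually listens on.

```python
def port_mismatches(deployment, service):
    """Return Service targetPorts that match no containerPort declared in the Deployment."""
    containers = deployment["spec"]["template"]["spec"]["containers"]
    declared_numbers = set()
    declared_names = set()
    for container in containers:
        for port in container.get("ports", []):
            declared_numbers.add(port["containerPort"])
            if "name" in port:
                declared_names.add(port["name"])

    mismatches = []
    for svc_port in service["spec"]["ports"]:
        # Kubernetes defaults targetPort to the Service port when it is omitted.
        target = svc_port.get("targetPort", svc_port["port"])
        if isinstance(target, int):
            matched = target in declared_numbers
        else:
            # A string targetPort refers to a named container port.
            matched = target in declared_names
        if not matched:
            mismatches.append(target)
    return mismatches


if __name__ == "__main__":
    # Hypothetical manifests: the container declares 3000, the Service targets 80.
    deployment = {
        "spec": {"template": {"spec": {"containers": [
            {"name": "web", "ports": [{"containerPort": 3000}]}
        ]}}}
    }
    service = {"spec": {"ports": [{"port": 80, "targetPort": 80}]}}
    print(port_mismatches(deployment, service))
```

The same idea extends to the ALB health check: if the health check path or port does not match what the application serves, targets stay unhealthy and the ALB has nowhere to route requests.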
💡 Key Learnings
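- Most of the failures in this project originated in the IAM or networking layers rather than in Kubernetes itself.
- IRSA improves security posture in production clusters by replacing static credentials in pods with short-lived credentials scoped to a single service account's role.
- Terraform hygiene, including modular code and careful handling of state locks, is critical for long-term maintainability.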
- Production readiness requires observability and clean teardown paths.
✅ Outcome & Final Result
The final platform successfully delivered HTTPS traffic to a containerized Next.js application through an ALB-backed Kubernetes ingress. All resources were reproducible via Terraform and safely destroyed after validation to control costs.