AWS — Cloud Infrastructure Guide
Amazon Web Services — the world's most widely used cloud platform. From containers and Kubernetes to databases, DNS, CDN, and networking. Everything you need to run production workloads.
🗺️ AWS Services Map
🔐 IAM — Identity & Access Management
IAM is the foundation of AWS security. It controls who (users, roles, services) can do what (actions) on which resources.
# Create user
aws iam create-user --user-name alice
# Create access keys for a user
aws iam create-access-key --user-name alice
# Create a role for EC2 instances
aws iam create-role \
--role-name ec2-app-role \
--assume-role-policy-document '{
"Version": "2012-10-17",
"Statement": [{
"Effect": "Allow",
"Principal": {"Service": "ec2.amazonaws.com"},
"Action": "sts:AssumeRole"
}]
}'
# Attach a managed policy
aws iam attach-role-policy \
--role-name ec2-app-role \
--policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
# Create inline policy from file
aws iam put-role-policy \
--role-name ec2-app-role \
--policy-name AppPolicy \
--policy-document file://policy.json
# Check who you are logged in as
aws sts get-caller-identity
# Assume a role temporarily
aws sts assume-role \
--role-arn arn:aws:iam::123456789012:role/DeployRole \
--role-session-name deploy-session
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "S3BucketAccess",
"Effect": "Allow",
"Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
"Resource": "arn:aws:s3:::my-app-bucket/*"
},
{
"Sid": "S3ListBucket",
"Effect": "Allow",
"Action": ["s3:ListBucket"],
"Resource": "arn:aws:s3:::my-app-bucket"
},
{
"Sid": "SecretsRead",
"Effect": "Allow",
"Action": ["secretsmanager:GetSecretValue"],
"Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:/myapp/*"
}
]
}
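The policy above grants object actions on `bucket/*` and `ListBucket` on the bucket itself; mixing up those two ARNs is a common cause of access-denied errors. A minimal sketch of a generator for that document follows. The function name `s3_app_policy` is hypothetical, and it assumes the bucket name contains no characters that need JSON escaping.

```shell
# Print a least-privilege S3 policy for one bucket.
# Object actions need the "bucket/*" ARN; ListBucket needs the bucket ARN itself.
s3_app_policy() {
  bucket=$1
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "S3BucketAccess",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    },
    {
      "Sid": "S3ListBucket",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    }
  ]
}
EOF
}

# Example: write the file that put-role-policy reads.
# s3_app_policy my-app-bucket > policy.json
```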
🌐 VPC — Virtual Private Cloud
A VPC is your private network inside AWS. Compute and database resources such as EC2 instances, ECS tasks, and RDS instances run inside a VPC. You control IP ranges, subnets, routing, and firewalls.
# Create VPC
aws ec2 create-vpc --cidr-block 10.0.0.0/16 --region us-east-1
# Create subnets (one public, one private per AZ)
aws ec2 create-subnet --vpc-id vpc-xxx --cidr-block 10.0.1.0/24 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-xxx --cidr-block 10.0.10.0/24 --availability-zone us-east-1a
# Create and attach Internet Gateway (for public subnets)
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --vpc-id vpc-xxx --internet-gateway-id igw-xxx
# Create NAT Gateway (private subnets need this for outbound internet)
aws ec2 allocate-address --domain vpc # get Elastic IP
aws ec2 create-nat-gateway --subnet-id subnet-public --allocation-id eipalloc-xxx
# Security Groups (stateful firewall per resource)
aws ec2 create-security-group \
--group-name web-sg \
--description "Web tier" \
--vpc-id vpc-xxx
# Allow HTTPS inbound from anywhere
aws ec2 authorize-security-group-ingress \
--group-id sg-xxx \
--protocol tcp --port 443 --cidr 0.0.0.0/0
# Allow inbound from another security group
aws ec2 authorize-security-group-ingress \
--group-id sg-db \
--protocol tcp --port 5432 \
--source-group sg-app
| Concept | What It Is | Key Rule |
|---|---|---|
| VPC | Your private AWS network | One per environment (dev/prod) |
| Public Subnet | Has route to Internet Gateway | Put ALBs and NAT Gateways here |
| Private Subnet | No direct internet route | Put app containers and databases here |
| Security Group | Stateful firewall per resource | Default: deny all inbound, allow all outbound |
| NACL | Stateless firewall per subnet | Use for broad subnet-level rules |
| Internet Gateway | VPC → Internet | One per VPC |
| NAT Gateway | Private subnet → outbound internet | One per AZ (for HA) |
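The "one public, one private subnet per AZ" layout can be planned before any resource is created. The sketch below only prints a plan; it assumes a 10.0.0.0/16 VPC and the same numbering as the commands above (public third octets start at 1, private at 10), so it is safe for at most nine AZs. `plan_subnets` is a hypothetical helper name.

```shell
# Print one public and one private /24 per AZ inside a 10.0.0.0/16 VPC.
# Public subnets use third octets 1, 2, 3...; private use 10, 11, 12...
plan_subnets() {
  i=0
  for az in "$@"; do
    printf 'public  %s 10.0.%d.0/24\n' "$az" $((i + 1))
    printf 'private %s 10.0.%d.0/24\n' "$az" $((i + 10))
    i=$((i + 1))
  done
}

# Example: feed the plan into create-subnet (vpc-xxx is a placeholder).
# plan_subnets us-east-1a us-east-1b | while read -r tier az cidr; do
#   aws ec2 create-subnet --vpc-id vpc-xxx --cidr-block "$cidr" --availability-zone "$az"
# done
```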
🪣 S3 — Object Storage & Static Websites
S3 stores any file (objects) in buckets. It's also the cheapest and simplest way to host static websites.
# Create a bucket
aws s3 mb s3://my-company-assets --region us-east-1
# Upload files
aws s3 cp ./dist s3://my-company-assets/ --recursive
aws s3 sync ./dist s3://my-company-assets/ --delete # sync + remove old files
# Download a file
aws s3 cp s3://my-company-assets/app.js ./app.js
# List bucket contents
aws s3 ls s3://my-company-assets/
aws s3 ls s3://my-company-assets/ --recursive --human-readable
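# Make a file public (only works if ACLs are enabled and Block Public Access is off; new buckets have ACLs disabled and Block Public Access on by default)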
aws s3api put-object-acl \
--bucket my-company-assets \
--key index.html \
--acl public-read
# Enable static website hosting
aws s3 website s3://my-site/ \
--index-document index.html \
--error-document 404.html
# Set bucket policy (allow public read; Block Public Access must be turned off for the bucket first)
aws s3api put-bucket-policy \
--bucket my-site \
--policy file://bucket-policy.json
# Presigned URL — share a private file for 1 hour
aws s3 presign s3://my-bucket/private.pdf --expires-in 3600
{
"Version": "2012-10-17",
"Statement": [{
"Sid": "PublicReadGetObject",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::my-site/*"
}]
}
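`aws s3 presign --expires-in` takes seconds, and presigned URLs cannot outlive 7 days (604800 seconds). A small sketch that converts hours to seconds and rejects values outside that range; `presign_seconds` is a hypothetical helper, not part of the AWS CLI.

```shell
# Convert whole hours to the seconds value expected by --expires-in.
# Rejects non-integers, zero, and anything above 7 days (604800 s).
presign_seconds() {
  hours=$1
  case $hours in
    ''|0*|*[!0-9]*)
      echo "hours must be a positive integer without leading zeros" >&2
      return 1 ;;
  esac
  seconds=$((hours * 3600))
  if [ "$seconds" -gt 604800 ]; then
    echo "expiry cannot exceed 7 days" >&2
    return 1
  fi
  printf '%s\n' "$seconds"
}

# Example: share a private file for 12 hours.
# aws s3 presign s3://my-bucket/private.pdf --expires-in "$(presign_seconds 12)"
```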
⚡ CloudFront — CDN
CloudFront caches your content at hundreds of edge locations worldwide. Users load assets from the nearest location, which is much faster than fetching from a single region. It also handles HTTPS termination.
# Create distribution (S3 origin)
aws cloudfront create-distribution \
--distribution-config file://cloudfront-config.json
# Invalidate cached files (force refresh after deploy)
aws cloudfront create-invalidation \
--distribution-id EDFDVBD6EXAMPLE \
--paths "/*" # all files
aws cloudfront create-invalidation \
--distribution-id EDFDVBD6EXAMPLE \
--paths "/index.html" "/app.js" # specific files
# Get distribution info
aws cloudfront get-distribution --id EDFDVBD6EXAMPLE
{
"CallerReference": "my-site-initial",
"Comment": "Static site",
"Origins": {
"Quantity": 1,
"Items": [{
"Id": "S3-my-site",
"DomainName": "my-site.s3.amazonaws.com",
"S3OriginConfig": { "OriginAccessIdentity": "" }
}]
},
"DefaultCacheBehavior": {
"TargetOriginId": "S3-my-site",
"ViewerProtocolPolicy": "redirect-to-https",
"CachePolicyId": "658327ea-f89d-4fab-a63d-7e88639e58f6",
"Compress": true
},
"DefaultRootObject": "index.html",
"HttpVersion": "http2",
"Enabled": true
}
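When only a few files changed, the invalidation can list specific paths instead of `/*`. The sketch below turns a local build directory into invalidation paths; `invalidation_paths` is a hypothetical helper, and it assumes file names contain no spaces or newlines.

```shell
# List every file under a build directory as a CloudFront invalidation path
# ("/index.html", "/js/app.js", ...), one per line, sorted.
invalidation_paths() {
  (cd "$1" && find . -type f | sed 's|^\.||' | sort)
}

# Example (unquoted on purpose so each path becomes its own argument):
# aws cloudfront create-invalidation \
#   --distribution-id EDFDVBD6EXAMPLE \
#   --paths $(invalidation_paths ./dist)
```

If most of the site changed, a single `"/*"` path is simpler.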
🌍 Route 53 — DNS
Route 53 is AWS's managed DNS service. Register domains, create DNS records, route traffic with health checks, and do geo-based routing.
# List hosted zones
aws route53 list-hosted-zones
# Create DNS record (A record → ALB)
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890 \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "app.example.com",
"Type": "A",
"AliasTarget": {
"HostedZoneId": "Z35SXDOTRQ7X7K",
"DNSName": "my-alb-1234.us-east-1.elb.amazonaws.com",
"EvaluateTargetHealth": true
}
}
}]
}'
# Create CNAME record → CloudFront
aws route53 change-resource-record-sets \
--hosted-zone-id Z1234567890 \
--change-batch '{
"Changes": [{
"Action": "UPSERT",
"ResourceRecordSet": {
"Name": "www.example.com",
"Type": "CNAME",
"TTL": 300,
"ResourceRecords": [{"Value": "d1234.cloudfront.net"}]
}
}]
}'
| Record Type | Points To | Use Case |
|---|---|---|
| A | IP address or AWS alias | Root domain, ALB, CloudFront |
| CNAME | Another hostname | www → root, subdomains |
| ALIAS | AWS resource (ALB, CloudFront, S3) | Like CNAME but works at apex + free queries |
| MX | Mail servers | Email routing (Google Workspace etc.) |
| TXT | Text string | Domain verification, SPF, DKIM |
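As the table notes, a CNAME cannot sit at the zone apex, while an alias record can. A minimal sketch of a change-batch generator that enforces that rule; `cname_change_batch` is a hypothetical helper and the zone name is passed in by the caller.

```shell
# Print an UPSERT change batch for a CNAME record.
# Refuses the zone apex, where only an alias A record is valid.
cname_change_batch() {
  zone=$1 name=$2 target=$3 ttl=${4:-300}
  if [ "${name%.}" = "${zone%.}" ]; then
    echo "CNAME is not allowed at the zone apex; use an alias A record" >&2
    return 1
  fi
  cat <<EOF
{
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "${name}",
      "Type": "CNAME",
      "TTL": ${ttl},
      "ResourceRecords": [{"Value": "${target}"}]
    }
  }]
}
EOF
}

# Example:
# aws route53 change-resource-record-sets --hosted-zone-id Z1234567890 \
#   --change-batch "$(cname_change_batch example.com www.example.com d1234.cloudfront.net)"
```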
🔒 ACM — SSL/TLS Certificates
AWS Certificate Manager provides free SSL certs that auto-renew. Use with ALB and CloudFront for HTTPS.
# Request a certificate (DNS validation is easiest)
aws acm request-certificate \
--domain-name example.com \
--subject-alternative-names "*.example.com" \
--validation-method DNS \
--region us-east-1 # CloudFront requires us-east-1
# List certificates
aws acm list-certificates --region us-east-1
# Get the DNS validation records (add to Route 53)
aws acm describe-certificate \
--certificate-arn arn:aws:acm:us-east-1:123456789012:certificate/abc \
--query 'Certificate.DomainValidationOptions'
ACM certs used with ALB or CloudFront are completely free. They auto-renew 60 days before expiry — no more Let's Encrypt cron jobs. For CloudFront, always request the cert in us-east-1 regardless of where your origin is.
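The request above lists both `example.com` and `*.example.com` because a wildcard covers exactly one label: it matches `www.example.com` but neither the apex nor `a.b.example.com`. The sketch below illustrates that matching rule; `cert_covers` is a hypothetical helper and it compares names case-sensitively for simplicity.

```shell
# Return 0 if the certificate name (exact or "*.domain") covers the host.
cert_covers() {
  host=$1 pattern=$2
  case $pattern in
    '*.'*)
      suffix=${pattern#?}            # "*.example.com" -> ".example.com"
      case $host in
        *"$suffix")
          label=${host%"$suffix"}    # part in front of the suffix
          case $label in
            ''|*.*) return 1 ;;      # empty or more than one label
            *) return 0 ;;
          esac ;;
        *) return 1 ;;
      esac ;;
    *) [ "$host" = "$pattern" ] ;;
  esac
}

# Example:
# cert_covers api.example.com '*.example.com' && echo covered
```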
🗄️ RDS — Managed Databases
RDS manages the database server for you — backups, patching, failover, replicas. Supports PostgreSQL, MySQL, MariaDB, Oracle, SQL Server, and Aurora.
# Create a PostgreSQL RDS instance
# --multi-az: standby in another AZ (failover)
# --backup-retention-period 7: 7-day automated backups
# --deletion-protection: can't accidentally delete
# --no-publicly-accessible: only accessible inside VPC
aws rds create-db-instance \
--db-instance-identifier myapp-db \
--db-instance-class db.t3.micro \
--engine postgres \
--engine-version "15.4" \
--master-username appuser \
--master-user-password "SecurePass@123" \
--db-name myappdb \
--allocated-storage 20 \
--storage-type gp3 \
--vpc-security-group-ids sg-xxx \
--db-subnet-group-name my-db-subnet-group \
--multi-az \
--backup-retention-period 7 \
--deletion-protection \
--no-publicly-accessible
# Create a read replica
aws rds create-db-instance-read-replica \
--db-instance-identifier myapp-db-replica \
--source-db-instance-identifier myapp-db
# Take a manual snapshot
aws rds create-db-snapshot \
--db-instance-identifier myapp-db \
--db-snapshot-identifier myapp-db-snap-$(date +%Y%m%d)
# Restore from snapshot
aws rds restore-db-instance-from-db-snapshot \
--db-instance-identifier myapp-db-restored \
--db-snapshot-identifier myapp-db-snap-20241215
# List instances
aws rds describe-db-instances \
--query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus,Endpoint.Address]'
| Instance Class | vCPU | RAM | Use For |
|---|---|---|---|
| db.t3.micro | 2 | 1 GB | Dev / testing |
| db.t3.medium | 2 | 4 GB | Small production |
| db.r6g.large | 2 | 16 GB | Memory-heavy workloads |
| db.r6g.4xlarge | 16 | 128 GB | Large production DB |
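Manual snapshots are kept until you delete them, so date-stamped identifiers like `myapp-db-snap-20241215` are worth pruning. The sketch below filters identifiers older than a cutoff; `snapshots_before` is a hypothetical helper and it assumes every identifier ends in `-YYYYMMDD`.

```shell
# Read snapshot identifiers (one per line) from stdin and print those
# whose trailing -YYYYMMDD stamp is strictly before the cutoff date.
snapshots_before() {
  cutoff=$1
  while read -r id; do
    stamp=${id##*-}
    case $stamp in
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])
        [ "$stamp" -lt "$cutoff" ] && printf '%s\n' "$id" ;;
    esac
  done
  return 0
}

# Example: list manual snapshots taken before 1 December 2024.
# aws rds describe-db-snapshots --snapshot-type manual \
#   --query 'DBSnapshots[*].DBSnapshotIdentifier' --output text \
#   | tr '\t' '\n' | snapshots_before 20241201
```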
⚡ ElastiCache — Redis / Memcached
Managed in-memory cache. Use Redis for sessions, rate limiting, pub/sub, and caching. Cache hits are typically served in about a millisecond or less, often an order of magnitude faster than running the same query against RDS.
# Create a Redis cluster (--num-cache-clusters 2 = primary + 1 replica)
aws elasticache create-replication-group \
--replication-group-id myapp-redis \
--replication-group-description "App cache" \
--engine redis \
--engine-version "7.0" \
--cache-node-type cache.t3.micro \
--num-cache-clusters 2 \
--cache-subnet-group-name my-cache-subnet \
--security-group-ids sg-redis \
--at-rest-encryption-enabled \
--transit-encryption-enabled
# Connect (from inside VPC; --tls is required because transit encryption is enabled)
redis-cli --tls -h myapp-redis.xxxxx.ng.0001.use1.cache.amazonaws.com -p 6379
# Describe clusters
aws elasticache describe-replication-groups \
--replication-group-id myapp-redis \
--query 'ReplicationGroups[*].NodeGroups[*].PrimaryEndpoint'
🔑 Secrets Manager & Parameter Store
# ── Secrets Manager (for sensitive values) ─────────────
# Store a secret
aws secretsmanager create-secret \
--name /myapp/production/db-password \
--secret-string "SuperSecret@123"
# Store JSON secret (multiple values)
aws secretsmanager create-secret \
--name /myapp/production/db \
--secret-string '{"host":"db.xxx.us-east-1.rds.amazonaws.com","port":"5432","user":"app","password":"Secret@1"}'
# Read a secret
aws secretsmanager get-secret-value \
--secret-id /myapp/production/db \
--query SecretString --output text | jq .
# Rotate secret (rotation must already be configured for the secret)
aws secretsmanager rotate-secret \
--secret-id /myapp/production/db-password
# ── Parameter Store (for config + non-sensitive values) ──
# Store a parameter
aws ssm put-parameter \
--name /myapp/production/log-level \
--value "info" \
--type String
# Store encrypted parameter (SecureString)
aws ssm put-parameter \
--name /myapp/production/api-key \
--value "sk-live-xxxxx" \
--type SecureString
# Get parameter
aws ssm get-parameter \
--name /myapp/production/api-key \
--with-decryption \
--query Parameter.Value --output text
# Get all parameters under a path
aws ssm get-parameters-by-path \
--path /myapp/production/ \
--with-decryption
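Parameters stored under one path can be turned into environment variables at container start. The sketch below assumes the input is tab-separated `name<TAB>value` lines, which is what `--query 'Parameters[*].[Name,Value]' --output text` produces. `params_to_env` is a hypothetical helper, and values containing single quotes are not handled.

```shell
# Read "name<TAB>value" lines from stdin and print shell export statements.
# "/myapp/production/log-level" with prefix "/myapp/production/" -> LOG_LEVEL
params_to_env() {
  prefix=$1
  while IFS="$(printf '\t')" read -r name value; do
    key=${name#"$prefix"}
    key=$(printf '%s' "$key" | tr 'a-z/-' 'A-Z__')
    printf "export %s='%s'\n" "$key" "$value"
  done
}

# Example:
# eval "$(aws ssm get-parameters-by-path --path /myapp/production/ --with-decryption \
#   --query 'Parameters[*].[Name,Value]' --output text \
#   | params_to_env /myapp/production/)"
```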
🗄️ ECR — Container Registry
# Authenticate Docker to ECR
aws ecr get-login-password --region us-east-1 \
| docker login --username AWS \
--password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
# Create repository
aws ecr create-repository \
--repository-name myapp \
--image-scanning-configuration scanOnPush=true \
--encryption-configuration encryptionType=AES256
# Build, tag, push
IMAGE=123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp
docker build -t "$IMAGE:latest" -t "$IMAGE:$(git rev-parse --short HEAD)" .
docker push --all-tags "$IMAGE"
# Set lifecycle policy (auto-expire old images)
aws ecr put-lifecycle-policy \
--repository-name myapp \
--lifecycle-policy-text file://ecr-lifecycle.json
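The `ecr-lifecycle.json` file referenced above is not shown. One plausible shape is a single rule that expires everything beyond the newest N images, sketched below. `ecr_lifecycle_policy` is a hypothetical helper and the retention count is the caller's choice.

```shell
# Print an ECR lifecycle policy that keeps only the newest N images.
ecr_lifecycle_policy() {
  keep=$1
  case $keep in
    ''|0*|*[!0-9]*)
      echo "keep count must be a positive integer" >&2
      return 1 ;;
  esac
  cat <<EOF
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the ${keep} most recent images",
      "selection": {
        "tagStatus": "any",
        "countType": "imageCountMoreThan",
        "countNumber": ${keep}
      },
      "action": { "type": "expire" }
    }
  ]
}
EOF
}

# Example:
# ecr_lifecycle_policy 10 > ecr-lifecycle.json
```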
🚢 ECS — Elastic Container Service
# Create cluster
aws ecs create-cluster \
--cluster-name production \
--capacity-providers FARGATE FARGATE_SPOT
# Register task definition
aws ecs register-task-definition --cli-input-json file://task-def.json
# Create service
aws ecs create-service \
--cluster production \
--service-name myapp \
--task-definition myapp:1 \
--desired-count 3 \
--launch-type FARGATE \
--load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:...,containerName=myapp,containerPort=8080" \
--network-configuration "awsvpcConfiguration={subnets=[subnet-a,subnet-b],securityGroups=[sg-app],assignPublicIp=DISABLED}"
# Deploy new image version (re-pulls the tag referenced by the current task definition, e.g. :latest)
aws ecs update-service \
--cluster production \
--service myapp \
--force-new-deployment
# Wait for service to stabilize
aws ecs wait services-stable --cluster production --services myapp
# Shell into running Fargate task (the service must have been created or updated with --enable-execute-command)
TASK=$(aws ecs list-tasks --cluster production --service-name myapp --query 'taskArns[0]' --output text)
aws ecs execute-command \
--cluster production \
--task "$TASK" \
--container myapp \
--interactive --command "/bin/sh"
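The `task-def.json` passed to `register-task-definition` is not shown. A minimal Fargate task definition might look like the output of the sketch below. The execution role ARN, CPU/memory sizes, and the `fargate_task_def` helper name are assumptions; the container name, port, and log group follow the other commands in this guide.

```shell
# Print a minimal Fargate task definition for one container image.
# The executionRoleArn below is a placeholder; use your own role.
fargate_task_def() {
  image=$1
  cat <<EOF
{
  "family": "myapp",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [{
    "name": "myapp",
    "image": "${image}",
    "essential": true,
    "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-group": "/ecs/myapp",
        "awslogs-region": "us-east-1",
        "awslogs-stream-prefix": "ecs"
      }
    }
  }]
}
EOF
}

# Example:
# fargate_task_def 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest > task-def.json
```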
☸ EKS — Elastic Kubernetes Service
# Install eksctl
curl -sLO "https://github.com/eksctl-io/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz"
tar -xzf "eksctl_$(uname -s)_amd64.tar.gz" -C /tmp && sudo mv /tmp/eksctl /usr/local/bin/
# Create cluster with managed node group (--with-oidc enables IRSA)
eksctl create cluster \
--name production \
--region us-east-1 \
--nodegroup-name standard-workers \
--node-type t3.medium \
--nodes 3 --nodes-min 2 --nodes-max 10 \
--managed \
--with-oidc \
--ssh-access --ssh-public-key my-key
# Connect kubectl
aws eks update-kubeconfig --region us-east-1 --name production
# Add Fargate profile (serverless pods)
eksctl create fargateprofile \
--cluster production \
--name fp-myapp \
--namespace myapp
# Enable IAM role for a service account (IRSA)
eksctl create iamserviceaccount \
--name myapp-sa --namespace myapp \
--cluster production \
--attach-policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess \
--approve
# Install AWS Load Balancer Controller (formerly ALB Ingress Controller)
helm repo add eks https://aws.github.io/eks-charts
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
-n kube-system \
--set clusterName=production \
--set serviceAccount.name=aws-load-balancer-controller
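The eksctl download above hard-codes `amd64`, which is wrong on ARM machines. A small sketch that picks the release asset name from the OS and CPU architecture; `eksctl_asset` is a hypothetical helper and only the two most common architectures are mapped.

```shell
# Print the eksctl release asset name for an OS and machine type.
# With no arguments, both are detected with uname.
eksctl_asset() {
  os=${1:-$(uname -s)}
  machine=${2:-$(uname -m)}
  case $machine in
    x86_64|amd64) arch=amd64 ;;
    aarch64|arm64) arch=arm64 ;;
    *) echo "unsupported architecture: $machine" >&2; return 1 ;;
  esac
  printf 'eksctl_%s_%s.tar.gz\n' "$os" "$arch"
}

# Example:
# curl -sLO "https://github.com/eksctl-io/eksctl/releases/latest/download/$(eksctl_asset)"
```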
📊 CloudWatch — Monitoring & Logs
# View logs from ECS container
aws logs tail /ecs/myapp --follow --format short
# Get logs from a specific time range
aws logs get-log-events \
--log-group-name /ecs/myapp \
--log-stream-name ecs/myapp/task-id \
--start-time $(date -d '1 hour ago' +%s000)
# Create a CloudWatch alarm
aws cloudwatch put-metric-alarm \
--alarm-name high-cpu \
--metric-name CPUUtilization \
--namespace AWS/ECS \
--dimensions Name=ClusterName,Value=production Name=ServiceName,Value=myapp \
--statistic Average \
--period 60 \
--evaluation-periods 3 \
--threshold 80 \
--comparison-operator GreaterThanThreshold \
--alarm-actions arn:aws:sns:us-east-1:123456789012:alert-topic
# Get a metric
aws cloudwatch get-metric-statistics \
--namespace AWS/ECS \
--metric-name CPUUtilization \
--dimensions Name=ServiceName,Value=myapp Name=ClusterName,Value=production \
--start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ) \
--end-time $(date -u +%Y-%m-%dT%H:%M:%SZ) \
--period 300 --statistics Average
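The `date -d '1 hour ago'` calls above rely on GNU date and fail on macOS/BSD. A portable sketch for the epoch-milliseconds value that `get-log-events --start-time` expects; `epoch_ms_ago` is a hypothetical helper and it assumes `date +%s` is available, which holds on GNU and BSD systems.

```shell
# Print the epoch time in milliseconds for N seconds before now.
epoch_ms_ago() {
  seconds=$1
  now=$(date +%s)
  printf '%s\n' "$(( (now - seconds) * 1000 ))"
}

# Example: events from the last hour.
# aws logs get-log-events \
#   --log-group-name /ecs/myapp \
#   --log-stream-name ecs/myapp/task-id \
#   --start-time "$(epoch_ms_ago 3600)"
```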
🔁 Full CI/CD Pipeline on AWS
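One way to chain the ECR and ECS commands from the earlier sections into a single deploy step is sketched below. It is an illustration, not a complete pipeline: it assumes Docker is already logged in to ECR, the task definition references the `:latest` tag, and `run` and `deploy` are hypothetical helper names. Setting `DRY_RUN=1` prints the commands instead of running them.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Build, push, roll out, and wait. Stops at the first failing step.
deploy() {
  image=$1 cluster=$2 service=$3
  run docker build -t "$image:latest" . &&
  run docker push "$image:latest" &&
  run aws ecs update-service --cluster "$cluster" --service "$service" --force-new-deployment &&
  run aws ecs wait services-stable --cluster "$cluster" --services "$service"
}

# Example (prints four commands without touching AWS):
# DRY_RUN=1 deploy 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp production myapp
```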
📌 AWS CLI Cheat Sheet
| Task | Command |
|---|---|
| Who am I | aws sts get-caller-identity |
| Switch profile | export AWS_PROFILE=myprofile |
| ECR login | aws ecr get-login-password \| docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.REGION.amazonaws.com |
| List S3 buckets | aws s3 ls |
| Sync to S3 | aws s3 sync ./dist s3://bucket/ --delete |
| Invalidate CloudFront | aws cloudfront create-invalidation --distribution-id ID --paths "/*" |
| List ECS clusters | aws ecs list-clusters |
| Deploy ECS | aws ecs update-service --cluster C --service S --force-new-deployment |
| Connect to EKS | aws eks update-kubeconfig --name CLUSTER --region REGION |
| Stream logs | aws logs tail /ecs/myapp --follow |
| Get secret | aws secretsmanager get-secret-value --secret-id /app/db --query SecretString --output text |
| Get parameter | aws ssm get-parameter --name /app/key --with-decryption |
| Describe RDS | aws rds describe-db-instances --query 'DBInstances[*].Endpoint.Address' |