AWS — Interview Questions Booklet (50 Q&A)

Accounts & IAM • VPC & Networking • EC2/Lambda/Containers • S3/EBS/EFS/DBs • Eventing & Integration • Observability & Ops • Security & Governance • Cost & Well-Architected • Real-world Scenarios

Section 1 — AWS Fundamentals & Account Strategy

1) What are AWS Regions and Availability Zones, and why do they matter for resiliency?

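Answer: Regions are separate geographic areas; each contains multiple Availability Zones, and each AZ is one or more physically isolated data centers with independent power, cooling, and networking. Spanning AZs protects against single-facility failures, while choosing Regions close to users reduces latency.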
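
As a rough illustration of why spanning AZs helps, the sketch below estimates the chance that at least one of several AZs is up. It assumes AZ failures are independent and uses a hypothetical per-AZ availability of 99%; neither the figure nor the independence assumption is an AWS commitment.

```python
def combined_availability(per_az: float, az_count: int) -> float:
    """Probability that at least one AZ is up, assuming independent failures."""
    if not 0.0 <= per_az <= 1.0:
        raise ValueError("per_az must be between 0 and 1")
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    # All AZs must fail at once for the workload to be down.
    return 1.0 - (1.0 - per_az) ** az_count


if __name__ == "__main__":
    # 0.99 is a made-up illustrative figure, not a published SLA.
    for n in (1, 2, 3):
        print(n, f"{combined_availability(0.99, n):.6f}")
```

Real failures are rarely fully independent, so treat the result as an upper bound on the benefit rather than a forecast.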

2) When should an organization use a multi-account strategy with AWS Organizations?

Answer: Use multiple accounts to isolate workloads, billing, and blast radius; apply guardrails with SCPs; enable cost allocation and delegated admin per domain (e.g., prod vs. dev, business units).

3) How do Service Control Policies (SCPs) differ from IAM policies?

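Answer: SCPs set the maximum permissions available to accounts in an organization or OU (guardrails) and never grant access themselves. IAM policies grant permissions to principals inside an account, and effective permissions are limited to what both the SCPs and the IAM policies allow. SCPs do not apply to the management account.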
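
The relationship can be pictured as an intersection. The sketch below is a deliberately simplified model, not the real IAM evaluation engine: it ignores resources, conditions, permission boundaries, and session policies, and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def matches(action: str, patterns: Iterable[str]) -> bool:
    """True if the action matches any pattern; IAM action names are case-insensitive."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


def is_allowed(
    action: str,
    scp_allow: Iterable[str],
    scp_deny: Iterable[str],
    iam_allow: Iterable[str],
    iam_deny: Iterable[str] = (),
) -> bool:
    """Simplified check: an explicit deny wins, then both layers must allow."""
    if matches(action, scp_deny) or matches(action, iam_deny):
        return False
    return matches(action, scp_allow) and matches(action, iam_allow)


if __name__ == "__main__":
    # IAM grants everything, but the SCP only leaves S3 available.
    print(is_allowed("ec2:RunInstances", ["s3:*"], [], ["*"]))
    print(is_allowed("s3:GetObject", ["s3:*"], [], ["*"]))
```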

4) What is the Shared Responsibility Model on AWS?

Answer: AWS secures the cloud (facilities, hardware, managed services); customers secure what they run/configure in the cloud (data, identity, network, OS, apps).

5) How do landing zones help large enterprises start on AWS?

Answer: They provide baseline multi-account structure, identity, network, logging, and guardrails so teams can onboard workloads consistently and compliantly.

Section 2 — Identity & Access Management

6) How do IAM users, groups, and roles differ, and which should be preferred?

Answer: Users are long-lived identities with long-term credentials; groups attach a shared set of policies to users; roles are identities that are assumed to obtain short-lived credentials. Prefer roles with SSO and MFA over static user access keys.

7) What is the principle of least privilege, and how do you enforce it on AWS?

Answer: Grant only required actions/resources, use IAM conditions (tags, IPs), scoped roles, access advisor, and permission boundaries; review regularly.
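
A minimal sketch of a scoped identity policy, built as a Python dictionary so it can be serialized to JSON. The bucket name, prefix, CIDR range, and tag value are all hypothetical placeholders; `aws:SourceIp` and `aws:PrincipalTag/<key>` are standard global condition keys.

```python
import json


def build_read_only_policy(bucket: str, prefix: str, cidr: str, team: str) -> dict:
    """Allow reading one prefix of one bucket, only from one network and one team tag."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadReportsOnly",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],  # a single action, no wildcards
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
                "Condition": {
                    "IpAddress": {"aws:SourceIp": cidr},
                    "StringEquals": {"aws:PrincipalTag/team": team},
                },
            }
        ],
    }


if __name__ == "__main__":
    # All argument values below are illustrative placeholders.
    policy = build_read_only_policy("example-bucket", "reports/", "203.0.113.0/24", "analytics")
    print(json.dumps(policy, indent=2))
```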

8) When would you use resource-based policies instead of identity-based policies?

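Answer: Use resource policies on S3, KMS, SQS, etc., to grant cross-account access or to support public access patterns. Within one account, an allow in the resource policy is generally enough on its own; for cross-account access, the caller's identity policy in its own account must also allow the action.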

9) How do AWS SSO (IAM Identity Center) and federation improve access management?

Answer: They centralize identity via corporate IdPs, provide SSO into accounts/roles, enforce MFA, and issue short-lived credentials with auditability.

10) What is AWS KMS used for, and how do CMKs integrate with services?

Answer: KMS creates and manages encryption keys; services like S3, EBS, and RDS can use KMS keys (formerly called customer master keys, CMKs) for server-side encryption, with fine-grained access via key policies and grants.

Section 3 — Networking & Connectivity (VPC)

11) What core components make up a VPC, and how do they interact?

Answer: Subnets (public/private), route tables, Internet/NAT gateways, NACLs, security groups, and endpoints; routes + ACLs/SecGroups control reachability and filtering.

12) How do security groups and network ACLs differ?

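Answer: Security groups are stateful firewalls attached to instance network interfaces and support allow rules only; NACLs are stateless, subnet-level rule lists that support both allow and deny and are evaluated in rule-number order. Use SGs for most filtering; NACLs for coarse subnet rules.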
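
To make the NACL side concrete, the sketch below models ordered, first-match evaluation with an implicit final deny. It is a simplification: it checks only source address and a single port, ignores protocol and direction, and does not model security groups. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Iterable


@dataclass(frozen=True)
class NaclRule:
    number: int      # lower numbers are evaluated first
    cidr: str
    from_port: int
    to_port: int
    allow: bool


def evaluate_nacl(rules: Iterable[NaclRule], source_ip: str, port: int) -> bool:
    """Return the action of the first matching rule; deny if nothing matches."""
    addr = ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_range = addr in ip_network(rule.cidr)
        if in_range and rule.from_port <= port <= rule.to_port:
            return rule.allow
    return False  # stands in for the implicit catch-all deny rule


if __name__ == "__main__":
    rules = [
        NaclRule(100, "198.51.100.0/24", 0, 65535, allow=False),  # block one range
        NaclRule(200, "0.0.0.0/0", 443, 443, allow=True),         # allow HTTPS
    ]
    print(evaluate_nacl(rules, "198.51.100.7", 443))
    print(evaluate_nacl(rules, "203.0.113.9", 443))
```

Because NACLs are stateless, a real configuration also needs a matching rule for return traffic in the opposite direction.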

13) When do you choose VPC peering, Transit Gateway, or AWS PrivateLink?

Answer: Peering: simple one-to-one private routing (non-transitive). Transit Gateway: hub-and-spoke scalable routing across many VPCs/VPNs. PrivateLink: private, one-directional access to specific services via interface endpoints, without connecting whole networks.
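
The scaling argument comes down to simple counting. Because peering is not transitive, full connectivity between n VPCs needs one peering per pair, while a hub needs one attachment per VPC. The helper names below are hypothetical.

```python
def full_mesh_peerings(vpc_count: int) -> int:
    """Peering connections needed so every VPC can reach every other VPC."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count * (vpc_count - 1) // 2


def transit_gateway_attachments(vpc_count: int) -> int:
    """Attachments needed when every VPC connects to a single hub."""
    if vpc_count < 0:
        raise ValueError("vpc_count must be non-negative")
    return vpc_count


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, full_mesh_peerings(n), transit_gateway_attachments(n))
```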

14) What are interface and gateway endpoints, and why use them?

Answer: Interface endpoints place ENIs with private IPs in your subnets to reach AWS services or PrivateLink services; gateway endpoints add route-table entries that send S3 and DynamoDB traffic over private paths. Both improve security and avoid NAT egress.

15) How do you connect on-prem networks to AWS securely and reliably?

Answer: Use Site-to-Site VPN for quick IPsec tunnels, Direct Connect for dedicated links/consistent throughput, or both for resilience (DX + VPN failover).

Section 4 — Compute & Containers

16) What EC2 purchase options exist, and when is each appropriate?

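Answer: On-Demand (flexible, no commitment), Savings Plans/Reserved Instances (discounts for steady workloads in exchange for a 1- or 3-year commitment), and Spot (spare capacity at up to 90% off On-Demand, for fault-tolerant workloads that can handle interruption). Mix them to balance cost and performance.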

17) How do Auto Scaling Groups maintain availability and cost efficiency?

Answer: They scale instances based on metrics/schedules, replace unhealthy nodes across AZs, and support mixed instance/Spot strategies.

18) When would you choose Lambda over EC2?

Answer: Choose Lambda for event-driven, intermittent workloads that need no server administration and finish within the 15-minute execution limit; use EC2 for long-running processes or custom OS/runtime needs.

19) What are the differences between ECS, EKS, and Fargate?

Answer: ECS is AWS's own container orchestrator; EKS is managed Kubernetes; Fargate is a serverless compute option that runs containers for either ECS or EKS without managing EC2 instances.

20) How do you implement blue/green or canary deployments for compute workloads?

Answer: Use load balancers/weighted routing, CodeDeploy for EC2/ECS, or service mesh/gateway for EKS; shift traffic gradually with health checks and rollbacks.
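
The gradual shift with health checks and rollback can be sketched independently of any particular service. In the example below, `set_weight` and `is_healthy` are hypothetical callbacks standing in for whatever adjusts traffic weights and reads health signals in a given setup; nothing here calls an AWS API.

```python
from typing import Callable, Iterable


def run_canary(
    steps: Iterable[int],
    set_weight: Callable[[int], None],
    is_healthy: Callable[[], bool],
) -> bool:
    """Shift traffic to the new version step by step; roll back on the first failure."""
    for percent in steps:
        set_weight(percent)        # send this share of traffic to the new version
        if not is_healthy():
            set_weight(0)          # roll back: all traffic returns to the old version
            return False
    return True


if __name__ == "__main__":
    applied = []
    ok = run_canary([10, 50, 100], applied.append, lambda: True)
    print(ok, applied)
```

A real rollout would also wait and observe metrics between steps; that bake time is omitted here to keep the sketch short.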

Section 5 — Storage & Databases

21) What are key S3 storage classes, and how do lifecycle policies reduce cost?

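Answer: Standard, Intelligent-Tiering, Standard-IA/One Zone-IA, and Glacier Instant Retrieval/Flexible Retrieval/Deep Archive. Lifecycle rules transition objects to cheaper classes or expire them based on age; Intelligent-Tiering moves objects automatically based on access patterns.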
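
A lifecycle rule might look like the sketch below, written as the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument. The rule ID, prefix, and day counts are illustrative assumptions, and the code only builds the structure without calling AWS.

```python
import json


def build_lifecycle_configuration(prefix: str) -> dict:
    """Tier objects under a prefix to cheaper classes over time, then expire them."""
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",          # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},       # delete after one year
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_lifecycle_configuration("logs/"), indent=2))
```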

22) How do S3 encryption options differ (SSE-S3, SSE-KMS, SSE-C, client-side)?

Answer: SSE-S3 uses S3-managed keys, SSE-KMS uses KMS keys with per-use auditing in CloudTrail and key-policy access control, SSE-C has S3 encrypt with keys you supply on each request, and client-side encryption encrypts before upload for full client control.

23) When do you use EBS, EFS, or FSx for workloads?

Answer: EBS: block storage per instance; EFS: shared POSIX file storage across instances; FSx: managed file systems like Windows/ONTAP/Lustre for specific protocols/perf.

24) How do RDS Multi-AZ and read replicas differ?

Answer: Multi-AZ provides synchronous standby for HA/failover; read replicas are asynchronous copies for read scaling and offloading analytics.

25) What DynamoDB design choices affect scale and cost?

Answer: Choosing partition keys for even workload, using on-demand vs. provisioned + autoscaling, GSIs/LSIs judiciously, TTL to prune data, and DAX for caching.
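
One common way to even out a hot partition key is write sharding: appending a deterministic suffix so writes for one logical key spread across several partition key values. The sketch below is one possible approach under that assumption; the function names, the `#` separator, and the shard count of 10 are hypothetical choices.

```python
import hashlib
from typing import List


def sharded_partition_key(base_key: str, item_id: str, shard_count: int = 10) -> str:
    """Derive a stable shard suffix from the item ID and append it to the base key."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> List[str]:
    """Keys a reader must query to see every item for the logical key."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


if __name__ == "__main__":
    print(sharded_partition_key("2024-06-01", "order-12345"))
    print(all_shard_keys("2024-06-01", 3))
```

The trade-off is on the read side: fetching everything for one logical key now takes one query per shard.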

Section 6 — Serverless & Integration

26) How do API Gateway and Application Load Balancer differ for HTTP APIs?

Answer: API Gateway is feature-rich for serverless APIs (auth, throttling, usage plans). ALB suits L7 load-balancing to containers/EC2/Lambda with simpler routing.

27) When do you choose SQS vs. SNS vs. EventBridge?

Answer: SQS for durable queues and decoupling, SNS for pub/sub fan-out, EventBridge for event bus with routing by rules across SaaS/AWS sources.
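
Rule-based routing in EventBridge relies on event patterns, where each field lists the values it accepts. The matcher below is a simplified illustration of that idea, assuming exact-value matching only; it does not implement prefix, numeric, or other advanced operators, and the function name is hypothetical.

```python
def matches_pattern(event: dict, pattern: dict) -> bool:
    """True if every field in the pattern is present in the event with an accepted value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching nested object.
            if not isinstance(actual, dict) or not matches_pattern(actual, expected):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of accepted values.
            return False
    return True


if __name__ == "__main__":
    event = {
        "source": "aws.ec2",
        "detail-type": "EC2 Instance State-change Notification",
        "detail": {"state": "terminated", "instance-id": "i-0123456789abcdef0"},
    }
    pattern = {"source": ["aws.ec2"], "detail": {"state": ["terminated", "stopped"]}}
    print(matches_pattern(event, pattern))
```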

28) What are Step Functions Standard vs. Express workflows used for?

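Answer: Standard for long-running (up to one year), exactly-once orchestrations with full execution history; Express for high-throughput, short-lived (up to five minutes) orchestrations at lower cost, with at-least-once execution for asynchronous invocations.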

29) How do you handle Lambda cold starts and concurrency limits?

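Answer: Use provisioned concurrency to keep execution environments warm, keep packages slim, and reuse connections across invocations. Set reserved concurrency per function to guarantee or cap its capacity, and request an account-level concurrency quota increase when needed.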
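
Connection reuse usually means creating clients once per execution environment rather than once per invocation. The sketch below shows the pattern with a stand-in `FakeClient` class (hypothetical, used here so the example runs without any AWS dependency); in a real function this would be an SDK client or database connection.

```python
from typing import Optional


class FakeClient:
    """Stand-in for an expensive-to-create SDK client or database connection."""

    instances = 0

    def __init__(self) -> None:
        FakeClient.instances += 1  # count how often setup cost is paid

    def query(self, key: str) -> str:
        return f"value-for-{key}"


_client: Optional[FakeClient] = None  # lives as long as the execution environment


def get_client() -> FakeClient:
    """Create the client on first use, then reuse it on later (warm) invocations."""
    global _client
    if _client is None:
        _client = FakeClient()
    return _client


def handler(event: dict, context: object) -> dict:
    return {"result": get_client().query(event["key"])}
```

Calling `handler` repeatedly in the same process constructs `FakeClient` only once, which mirrors what happens across warm invocations.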

30) How does AWS Glue/Athena fit into a serverless data lake?

Answer: The Glue Data Catalog stores table metadata (often populated by crawlers) and Glue jobs run ETL; Athena queries data in S3 with SQL on demand; together they enable schema-on-read analytics without managing clusters.

Section 7 — Observability, Operations & Automation

31) What are CloudWatch Metrics, Logs, and Alarms used for?

Answer: Metrics track numeric telemetry, Logs capture application/system output, and Alarms notify/act when thresholds or anomaly detections trigger.

32) How do CloudTrail and AWS Config complement each other?

Answer: CloudTrail records API activity for audit; Config tracks resource state/drift and evaluates compliance against rules over time.

33) What is AWS X-Ray, and when is it valuable?

Answer: X-Ray provides distributed tracing across services/functions, helping find latency bottlenecks and error hotspots in microservices/serverless apps.

34) How does Systems Manager (SSM) help with fleet operations?

Answer: SSM enables patching, state management, secure remote commands/Session Manager, parameter/secret storage, and runbooks for automation.

35) What options exist for IaC on AWS, and how do they differ?

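Answer: CloudFormation (declarative templates), CDK (infrastructure defined in general-purpose languages and synthesized to CloudFormation), and Terraform (third-party, multi-cloud, with its own state management). Choose based on ecosystem and governance.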

Section 8 — Security, Governance & Compliance

36) How do you protect internet-facing applications on AWS?

Answer: Use WAF for L7 filtering, Shield for DDoS protection, CloudFront at the edge in front of an ALB, TLS everywhere, and least-privilege backends.

37) What are GuardDuty, Inspector, and Macie used for?

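Answer: GuardDuty: threat detection from logs such as CloudTrail, VPC Flow Logs, and DNS logs; Inspector: automated vulnerability scanning of EC2 instances, container images, and Lambda functions; Macie: sensitive data discovery in S3.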

38) How do you enforce baseline governance across many accounts?

Answer: Use Organizations with SCPs, Control Tower blueprints/guardrails, centralized logging/CloudTrail, and Config conformance packs.

39) What S3 features reduce data exfiltration risk?

Answer: Block Public Access, bucket policies with aws:PrincipalOrgID, VPC endpoints, object ownership, and SSE-KMS with strict key policies.
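
A bucket policy using `aws:PrincipalOrgID` might look like the sketch below, which denies any request from a principal outside the organization. The bucket name and organization ID are placeholders, and a real policy may need exceptions, for example for AWS services that write to the bucket.

```python
import json


def build_org_only_policy(bucket: str, org_id: str) -> dict:
    """Deny all S3 actions on the bucket for principals outside the given organization."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideOrganization",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",      # bucket-level actions
                    f"arn:aws:s3:::{bucket}/*",    # object-level actions
                ],
                "Condition": {"StringNotEquals": {"aws:PrincipalOrgID": org_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Both values are illustrative placeholders.
    print(json.dumps(build_org_only_policy("example-bucket", "o-exampleorgid"), indent=2))
```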

40) How do you design key management for compliance?

Answer: Use CMKs with rotation, separate admin vs. usage roles, key grants for services, detailed CloudTrail/CloudWatch logging, and periodic access reviews.

Section 9 — Cost Management & Architecture

41) How do you track and allocate AWS costs effectively?

Answer: Use cost allocation tags, Cost Explorer, and Budgets with alerts; split costs by account and OU; enable the Cost and Usage Report (CUR) for detailed analytics and chargeback.
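
Chargeback by tag reduces to grouping line items and summing. The sketch below works on a simplified, hypothetical row format (a `cost` value and a `tags` dictionary); actual billing exports use different column names, so the field names here are assumptions for illustration only.

```python
from collections import defaultdict
from decimal import Decimal
from typing import Dict, Iterable


def cost_by_tag(
    rows: Iterable[dict],
    tag_key: str = "team",
    untagged_label: str = "(untagged)",
) -> Dict[str, Decimal]:
    """Sum costs per tag value, collecting untagged spend under its own label."""
    totals: Dict[str, Decimal] = defaultdict(Decimal)
    for row in rows:
        label = row.get("tags", {}).get(tag_key) or untagged_label
        totals[label] += Decimal(str(row["cost"]))  # Decimal avoids float rounding drift
    return dict(totals)


if __name__ == "__main__":
    # Made-up sample rows, not real billing data.
    sample = [
        {"cost": "10.50", "tags": {"team": "data"}},
        {"cost": "4.50", "tags": {"team": "data"}},
        {"cost": "3.00", "tags": {}},
    ]
    print(cost_by_tag(sample))
```

Surfacing untagged spend explicitly is useful because it shows how much cost the tagging policy is not yet covering.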

42) When should you choose Savings Plans over Reserved Instances?

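Answer: Compute Savings Plans apply flexibly across instance families, Regions, and compute services (EC2, Fargate, Lambda); EC2 Instance Savings Plans give a deeper discount but are tied to one instance family in one Region. RIs are less flexible but can reserve capacity (zonal RIs) and also cover services such as RDS, ElastiCache, and Redshift. Choose based on predictability and flexibility needs.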

43) How do the AWS Well-Architected pillars guide design decisions?

Answer: They cover Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability—use reviews to find risks and remediations.

44) How do you architect for high availability vs. disaster recovery?

Answer: HA uses multi-AZ with automatic failover; DR plans cross-region strategies (backup/restore, pilot-light, warm standby, active-active) based on RPO/RTO.

45) What techniques reduce data transfer and egress costs?

Answer: Keep traffic within regions/VPCs, use PrivateLink/endpoints, compress content, cache with CloudFront, and review inter-AZ/region flows.

Section 10 — Real-World Scenarios & Troubleshooting

46) An S3 bucket suddenly became public; how do you lock it down quickly?

Answer: Enable Block Public Access at account/bucket, remove public ACLs/policies, add aws:PrincipalOrgID restrictions, and verify via Access Analyzer.

47) An EC2 service is unreachable from the internet; what are your first checks?

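Answer: Confirm the instance is in a public subnet with a route to an IGW and has a public or Elastic IP, then check security group ingress, NACLs (both directions, since they are stateless), instance health and whether the port is listening, and correct ALB/NLB target registration.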

48) A Lambda function is timing out intermittently; how do you triage?

Answer: Check CloudWatch logs/X-Ray traces, increase the timeout or memory (which also raises CPU allocation), reuse connections, add retries with backoff, and examine downstream latency.
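
Retries with backoff can be sketched as below, using exponential delays with full jitter. The function name and default values are hypothetical; `sleep` and `rng` are injectable so the behavior can be tested without waiting. Catching every `Exception` keeps the example short; real code would normally retry only on errors known to be transient.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.2,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Call func, retrying on failure with exponentially growing, jittered delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(ceiling * rng())  # full jitter: random wait up to the ceiling
    raise RuntimeError("unreachable")
```

The total time spent retrying must still fit inside the function's configured timeout, so the attempt count and delays should be sized against it.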

49) Cross-account access to an S3 bucket fails; what might be wrong?

Answer: A missing bucket policy allow, a missing allow in the caller's own IAM policy, a wrong principal or role trust policy, a KMS key policy that does not permit the caller, or Block Public Access interfering with intended access.

50) A database migration to RDS is behind schedule; how can you accelerate safely?

Answer: Use DMS with ongoing replication, pre-cutover validation, scale target IOPS/instance, enable Multi-AZ after load, and perform phased cutover during low traffic.
