Cloud Technologies
Module 6: Cloud AWS
Duration: 70 Hours
Topic 6.1: AWS Fundamentals
Theory:
Introduction to AWS Global Infrastructure (Regions, AZs)
IAM: Roles, Users, Policies
EC2: Elastic Compute Cloud – virtual servers
S3: Simple Storage Service – object storage
EBS: Elastic Block Store – persistent volumes
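To make the IAM policy concept concrete, a policy can be viewed as a small JSON document that lists allowed actions on specific resources. The sketch below builds one in Python. The helper name `build_s3_read_write_policy` and the bucket name are hypothetical and chosen only for illustration; the field names (`Version`, `Statement`, `Effect`, `Action`, `Resource`) are the standard IAM policy elements.

```python
import json


def build_s3_read_write_policy(bucket_name: str) -> dict:
    """Return a policy document allowing list, read, and write on one bucket.

    Hypothetical helper for illustration; bucket_name is an assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket_name}"],
            },
            {
                # Reads and writes apply to the objects inside the bucket.
                "Sid": "ReadWriteObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket_name}/*"],
            },
        ],
    }


# Print the document as JSON, the form in which it is attached to a role or user.
print(json.dumps(build_s3_read_write_policy("example-reports-bucket"), indent=2))
```

Keeping the bucket-level and object-level statements separate reflects that `s3:ListBucket` targets the bucket ARN while object actions target the `/*` ARN.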
Lab:
Launch an EC2 instance and connect via SSH
Store and retrieve data from S3
Create IAM roles and attach them to services
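The S3 store-and-retrieve lab step could look roughly like the sketch below. It assumes a boto3 S3 client is passed in (created with `boto3.client("s3")`), so the function itself needs no imports; the function name, bucket, and key are hypothetical.

```python
def store_and_retrieve(s3_client, bucket: str, key: str, data: bytes) -> bytes:
    """Upload bytes to S3, then download the same object and return its content.

    s3_client is assumed to be a boto3 S3 client (or any object exposing
    put_object and get_object with the same keyword arguments).
    """
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # The response body is a stream; read() returns the object content as bytes.
    return response["Body"].read()


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   store_and_retrieve(s3, "example-lab-bucket", "hello.txt", b"hello")
```

Passing the client in as a parameter also makes the function easy to exercise with a stub before running it against a real bucket.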
Scenarios:
Use EC2 to run a batch Spark job
Store daily reports in S3 for compliance
Tasks:
Upload KYC documents to S3 with versioning
Configure an EC2 instance to pull data and write to S3
Challenges:
EC2 boot time delays affecting pipeline schedules
IAM permission errors causing S3 upload failures
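One possible way to cope with the EC2 boot delay challenge is to have the pipeline wait until the instance passes its status checks before submitting work. The sketch below uses the boto3 EC2 waiter `instance_status_ok`; the function name, instance ID, and the delay and attempt values are assumptions to tune per pipeline.

```python
def wait_until_ready(ec2_client, instance_id: str,
                     delay_s: int = 15, max_attempts: int = 40) -> None:
    """Block until the instance passes its status checks.

    ec2_client is assumed to be a boto3 EC2 client. With boto3, the waiter
    raises an exception if the checks do not pass within
    delay_s * max_attempts seconds, so the caller can reschedule the job.
    """
    waiter = ec2_client.get_waiter("instance_status_ok")
    waiter.wait(
        InstanceIds=[instance_id],
        WaiterConfig={"Delay": delay_s, "MaxAttempts": max_attempts},
    )


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   wait_until_ready(ec2, "i-0123456789abcdef0")  # hypothetical instance ID
```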
Topic 6.2: AWS for Data Engineering
Theory:
RDS & DynamoDB: managed relational and NoSQL databases
Lambda: serverless execution
EventBridge & CloudWatch: event triggers and monitoring
ECS/EKS: container services
Data lifecycle policies and cost control
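As an illustration of data lifecycle policies for cost control, the sketch below builds an S3 lifecycle configuration that moves objects to cheaper storage classes and later expires them. The helper name, the prefix, and the day thresholds are assumptions, not recommended retention periods; the dictionary follows the shape accepted by the boto3 call `put_bucket_lifecycle_configuration`.

```python
def build_lifecycle_config(prefix: str, ia_days: int = 30,
                           glacier_days: int = 90,
                           expire_days: int = 365) -> dict:
    """Build a lifecycle configuration: tier objects down, then expire them.

    Hypothetical helper; all thresholds are illustrative assumptions.
    """
    if not (ia_days < glacier_days < expire_days):
        raise ValueError("expected ia_days < glacier_days < expire_days")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_days},
            }
        ]
    }


# Typical usage (requires boto3 and configured AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-reports-bucket",
#       LifecycleConfiguration=build_lifecycle_config("reports/"),
#   )
```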
Lab:
Launch an RDS PostgreSQL instance and connect via SQL client
Build a Lambda function to trigger ETL on new file upload
Set up CloudWatch alarms for data latency
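For the Lambda lab step, a handler that reacts to a new file upload mostly needs to read the bucket and key out of the S3 event notification. The sketch below shows that parsing; `start_etl` is a hypothetical placeholder for whatever actually launches the ETL job, and the handler name `lambda_handler` is just a common convention.

```python
from urllib.parse import unquote_plus


def extract_s3_objects(event: dict) -> list:
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def start_etl(bucket: str, key: str) -> None:
    """Hypothetical placeholder: replace with the real ETL trigger."""
    print(f"Starting ETL for s3://{bucket}/{key}")


def lambda_handler(event, context):
    """Entry point: start one ETL run per uploaded object."""
    objects = extract_s3_objects(event)
    for bucket, key in objects:
        start_etl(bucket, key)
    return {"processed": len(objects)}
```

Decoding the key before use matters in practice, because file names containing spaces or parentheses would otherwise fail to resolve when the ETL step reads the object.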
Scenarios:
Trigger fraud detection pipeline when new transaction files land in S3
Monitor Spark jobs running on EKS
Tasks:
Write a Lambda to process CSV and store results in DynamoDB
Deploy containerized pipeline using ECS
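The CSV-to-DynamoDB task can be split into two small steps: parse the CSV text into items, then write the items in batches. The sketch below assumes a boto3 DynamoDB `Table` resource is passed in and that the CSV has a header row containing the table's key attribute (here a hypothetical `transaction_id` column). Values are kept as strings to avoid type conversion issues.

```python
import csv
import io


def csv_to_items(csv_text: str) -> list:
    """Parse CSV text with a header row into a list of dict items."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def write_items(table, items: list) -> int:
    """Write items using the table's batch writer and return the item count.

    table is assumed to be a boto3 DynamoDB Table resource; its batch writer
    buffers the put requests and sends them in batches.
    """
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
    return len(items)


# Typical usage inside a Lambda (requires boto3; names are hypothetical):
#   import boto3
#   table = boto3.resource("dynamodb").Table("example-transactions")
#   body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
#   write_items(table, csv_to_items(body.decode("utf-8")))
```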
Challenges:
Debugging Lambda failures (timeout, memory limits)
Ensuring secure cross-service access via IAM roles
Cost tracking across environments
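For the Lambda timeout challenge, one defensive pattern is to check the remaining execution time and stop cleanly before the runtime kills the function. The sketch below relies on the Lambda context method `get_remaining_time_in_millis()`; the function name, the `handle_record` callback, and the 5-second safety margin are assumptions.

```python
SAFETY_MARGIN_MS = 5_000  # assumption: stop when under 5 seconds remain


def process_until_deadline(records, context, handle_record) -> dict:
    """Process records in order, stopping early when time is nearly up.

    context is the Lambda context object; handle_record is a hypothetical
    callback that does the per-record work.
    """
    processed = 0
    for record in records:
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break  # leave the rest for a retry or a follow-up invocation
        handle_record(record)
        processed += 1
    return {"processed": processed, "remaining": len(records) - processed}
```

Returning the count of unprocessed records gives the caller something to log or alarm on, which tends to be easier to debug than a bare timeout error.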