Open Dash Tutorial
Hello and welcome back to Open Dash! In this tutorial, we'll compare AWS database services and clarify where and how to use each for your cloud-native architectures.
Amazon RDS: SQL / Relational, for OLTP workloads
Amazon DynamoDB: NoSQL / Key-Value, for real-time applications
Amazon Redshift: Data Warehouse, for OLAP workloads
Understanding the AWS Database Ecosystem
Many developers get confused when choosing between AWS database services, especially when starting with cloud-native architectures. Let's break down the primary options:
Amazon RDS
Type: SQL / Relational
Traditional relational database service with full SQL support
Best For: OLTP workloads such as booking, point-of-sale, and finance applications

Amazon DynamoDB
Type: NoSQL / Key-Value & Document
Serverless, fully managed NoSQL database service
Best For: Real-time applications, serverless workloads, and IoT systems

Amazon Redshift
Type: Data Warehouse / Columnar
Fully managed, petabyte-scale data warehouse service
Best For: OLAP workloads such as business intelligence and reporting
SQL / Relational Database Solution
Amazon RDS is a managed relational database service that makes it easy to set up, operate, and scale a relational database in the cloud. It provides cost-efficient and resizable capacity while automating time-consuming administration tasks.
Ideal Use Cases:
Online Transaction Processing (OLTP), booking systems, point-of-sale systems, finance applications
Engine support: MySQL, PostgreSQL, SQL Server, Oracle, MariaDB, and Aurora
Deployment: Single-AZ (cost-effective) or Multi-AZ (automatic failover, high availability)
Read replicas: offload read-heavy workloads, analytics, or reporting (same-region or cross-region)
Operations: VPC integration, security groups, automated backups, and snapshots
[Diagram: an AWS Region containing a Primary DB in Availability Zone 1 and a Standby DB in Availability Zone 2]
Automatic failover in Multi-AZ deployments
Synchronous data replication for high durability
Deployment Options & High Availability
Single-AZ: cost-effective option with a single database instance in one Availability Zone. Consideration: potential downtime during maintenance or failures.
Multi-AZ: high-availability option with a primary DB in one Availability Zone replicating synchronously to a standby DB in another, with automatic failover to the standby. Benefit: automatic failover, enhanced availability, better durability.
Read replicas: improve read performance by offloading read queries to replicas, created in the same region or cross-region. The primary DB receives all writes and replicates asynchronously to one or more read replicas, which serve reads.
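To make the read/write split concrete, here is a minimal round-robin router sketch in Python. The endpoint hostnames are hypothetical placeholders; in practice a driver, connection pool, or RDS Proxy would typically handle this routing.

```python
import itertools

# Hypothetical endpoints -- real values come from the RDS console or API.
WRITER_ENDPOINT = "primary.example.us-east-1.rds.amazonaws.com"
READER_ENDPOINTS = [
    "replica-1.example.us-east-1.rds.amazonaws.com",
    "replica-2.example.us-east-1.rds.amazonaws.com",
]

# Cycle through replicas so reads are spread evenly (simple round-robin).
_reader_cycle = itertools.cycle(READER_ENDPOINTS)

def pick_endpoint(is_write: bool) -> str:
    """Route writes to the primary, reads to the next replica in the cycle."""
    return WRITER_ENDPOINT if is_write else next(_reader_cycle)
```

Because replication to replicas is asynchronous, reads routed this way may be slightly stale; reads that must see the latest write should go to the primary.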
NoSQL Database Service
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability. It's a key-value and document database that can handle any scale of application with consistent, single-digit millisecond latency.
Ideal Use Cases:
Real-time applications, serverless workloads, IoT systems, gaming leaderboards, session management
Serverless: no servers to provision, patch, or manage; AWS handles all of this for you
Performance: single-digit millisecond latency at any scale, handling millions of requests per second
Durability: data automatically replicated across multiple AZs
Elastic storage: no need to define disk size; storage scales automatically with your data
Simple primary key: Partition Key (Hash Key) only
Composite primary key: Partition Key + Sort Key
UserId (Partition Key) | GameId (Sort Key) | Attributes
---|---|---
user_123 | game_1 | Score: 94, Level: 5
user_123 | game_2 | Score: 82, Level: 3
user_456 | game_1 | Score: 77, Level: 4
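The composite-key table above can be sketched in plain Python to show what a DynamoDB Query can and cannot do: the partition key is mandatory, the sort key is optional. This is an in-memory illustration, not the DynamoDB API.

```python
from collections import defaultdict

# In-memory stand-in for the leaderboard table:
# items grouped by partition key, ordered by sort key.
table = defaultdict(dict)

def put_item(user_id, game_id, attrs):
    table[user_id][game_id] = attrs

def query(user_id, game_id=None):
    """Mimic a DynamoDB Query: partition key required, sort key optional."""
    games = table.get(user_id, {})
    if game_id is not None:
        return [games[game_id]] if game_id in games else []
    return [games[g] for g in sorted(games)]  # results ordered by sort key

put_item("user_123", "game_1", {"Score": 94, "Level": 5})
put_item("user_123", "game_2", {"Score": 82, "Level": 3})
put_item("user_456", "game_1", {"Score": 77, "Level": 4})
```

Note what is missing: there is no way to ask "all users who played game_1" without scanning everything, which is exactly why DynamoDB table design starts from access patterns.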
Provisioned capacity mode: you specify read and write capacity units in advance. Best for predictable workloads.
On-demand capacity mode: pay-per-request model with no capacity planning. Best for variable or unpredictable workloads.
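Provisioned mode is sized in capacity units. Based on DynamoDB's published sizing rules (1 RCU covers one strongly consistent read per second of an item up to 4 KB, eventually consistent reads cost half; 1 WCU covers one write per second of an item up to 1 KB), a rough sizing helper might look like:

```python
import math

def read_capacity_units(reads_per_sec, item_kb, strongly_consistent=True):
    """1 RCU = one strongly consistent read/sec of an item up to 4 KB;
    eventually consistent reads cost half as much."""
    units_per_read = math.ceil(item_kb / 4)
    rcu = reads_per_sec * units_per_read
    return rcu if strongly_consistent else math.ceil(rcu / 2)

def write_capacity_units(writes_per_sec, item_kb):
    """1 WCU = one write/sec of an item up to 1 KB."""
    return writes_per_sec * math.ceil(item_kb)
```

For example, 100 strongly consistent reads per second of 6 KB items needs 200 RCUs, since each read of a 6 KB item consumes 2 units.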
High Availability & Global Distribution
DynamoDB Global Tables provide a fully managed, multi-region, multi-active database solution for building globally distributed applications with low-latency data access.
[Diagram: two Multi-AZ-replicated DynamoDB tables in different regions connected by multi-region replication]
Benefits: Global read/write access with low latency, built-in conflict resolution, and regional fault tolerance
Time to Live (TTL): automatically expire and delete items based on a timestamp attribute. Perfect for session data, temporary data, or logs that should be removed automatically after a certain time.
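TTL works off a plain epoch-seconds number stored in an attribute you designate on the table. A small sketch of building a session item with an expiry; the attribute names here are illustrative, not required by DynamoDB.

```python
import time

def session_item(session_id, user_id, ttl_hours=24):
    """Build an item whose 'ExpiresAt' attribute holds an epoch-seconds
    timestamp; with TTL enabled on that attribute, DynamoDB deletes the
    item some time after the timestamp passes (deletion is not instant)."""
    return {
        "SessionId": session_id,   # partition key (illustrative)
        "UserId": user_id,
        "ExpiresAt": int(time.time()) + ttl_hours * 3600,
    }
```

Because TTL deletion is background and best-effort, applications that must never serve expired data should still filter on the timestamp at read time.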
DynamoDB Streams: capture item-level changes in your tables and send change records in near real-time to Lambda for event-driven processing, or for replication, analytics, and more.
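A stream consumer is usually a Lambda function that receives batches of change records. Here is a trimmed sketch of such a handler; the event below is a hand-built sample in the stream record shape (typed attribute values like `{"N": "94"}`), reduced to the fields the handler reads.

```python
def handler(event, context=None):
    """Sketch of a Lambda handler consuming a DynamoDB Streams event:
    collects the new Score values from INSERT and MODIFY records."""
    scores = []
    for record in event.get("Records", []):
        if record["eventName"] in ("INSERT", "MODIFY"):
            new_image = record["dynamodb"].get("NewImage", {})
            if "Score" in new_image:
                # Stream images use DynamoDB's typed format, e.g. {"N": "94"}.
                scores.append(int(new_image["Score"]["N"]))
    return scores

# A trimmed-down sample event (real records carry more metadata):
sample_event = {
    "Records": [
        {"eventName": "INSERT",
         "dynamodb": {"NewImage": {"Score": {"N": "94"}}}},
        {"eventName": "REMOVE", "dynamodb": {}},
    ]
}
```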
DynamoDB Accelerator (DAX): in-memory cache designed specifically for DynamoDB that delivers microsecond response times for read-heavy workloads. Access DynamoDB through the DAX client for up to a 10x performance improvement.
Transactions: coordinate all-or-nothing changes across multiple items, within and across tables, with ACID (atomicity, consistency, isolation, durability) guarantees.
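The actual API for this is TransactWriteItems (via an AWS SDK); the all-or-nothing semantics can be illustrated locally with a stage-then-commit sketch. This is an analogy for the guarantee, not how DynamoDB implements it.

```python
def transact_write(table, operations):
    """All-or-nothing writes: apply every operation to a staged copy and
    commit only if none fails -- loosely mirroring the guarantee that
    TransactWriteItems provides server-side."""
    staged = dict(table)  # work on a copy so failure leaves the table untouched
    for op in operations:
        if op["kind"] == "put":
            staged[op["key"]] = op["value"]
        elif op["kind"] == "delete":
            if op["key"] not in staged:
                raise KeyError(f"cannot delete missing key {op['key']!r}")
            del staged[op["key"]]
    table.clear()
    table.update(staged)  # commit: swap in the staged state

# Move 30 from alice to bob as a single unit:
accounts = {"alice": 100, "bob": 50}
transact_write(accounts, [
    {"kind": "put", "key": "alice", "value": 70},
    {"kind": "put", "key": "bob", "value": 80},
])
```

If any operation in the batch fails, nothing is applied, which is exactly the property you want for balance transfers and similar multi-item invariants.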
Gaming: leaderboards, player data, game state, and session management with low latency at any scale
IoT: device metadata, sensor data storage, and real-time event processing
Serverless: a natural complement to Lambda functions for truly serverless, scalable applications
Choosing the Right Database for Your Workload
Feature | Amazon RDS | Amazon DynamoDB
---|---|---
Database Type | Relational (SQL) | NoSQL (Key-Value & Document)
Performance | Good for complex transactional workloads | Single-digit millisecond latency at any scale
Scalability | Vertical scaling (instance size) with read replicas | Horizontal scaling with unlimited throughput capacity
Administration | Managed, but requires some administration (DB parameters, patching schedules) | Fully managed serverless experience with minimal administration
Data Model | Structured schema with tables, columns, and relationships | Schema-less with flexible attributes (requires partition key)
Query Capabilities | Complex joins, aggregations, subqueries | No joins; limited to key-based access patterns
Serverless | No (except Aurora Serverless) | Yes
Global Distribution | Read replicas across regions | Global Tables with multi-region replication
RDS Ideal For:
E-commerce platforms, content management systems, financial applications, ERP systems
DynamoDB Ideal For:
Mobile backends, gaming leaderboards, IoT data, session stores, real-time analytics
Data Warehousing on AWS
Amazon Redshift is a fully managed, petabyte-scale data warehouse service that makes it simple and cost-effective to analyze all your data using standard SQL and your existing business intelligence tools. It is optimized for Online Analytical Processing (OLAP) workloads and is designed to handle large-scale data analysis and complex queries.
Ideal Use Cases:
Business intelligence, reporting and dashboarding, complex aggregations and joins across large volumes of data
[Diagram: a leader node performs query planning and aggregation and distributes work across multiple compute nodes, each storing data column by column]
Columnar Storage Format
Stores data by columns instead of rows, making aggregations and scanning operations faster for analytics
Massively parallel processing (MPP): distributes query execution across multiple nodes, reducing query time drastically
Compression: automatically applies compression algorithms column-wise, reducing storage costs and improving I/O performance
Concurrency scaling: automatically adds extra capacity to handle bursts of queries without performance degradation
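The columnar layout is why analytic aggregations are cheap: a toy comparison of row versus column organization in Python, where a `SUM(amount)` only needs to touch one contiguous list in the columnar form instead of scanning every field of every record.

```python
# Row-oriented storage: each record kept together (how OLTP databases store data).
rows = [
    {"order_id": 1, "region": "EU", "amount": 120.0},
    {"order_id": 2, "region": "US", "amount": 80.0},
    {"order_id": 3, "region": "EU", "amount": 45.5},
]

# Column-oriented storage: each column kept together, as Redshift does on disk.
columns = {
    "order_id": [1, 2, 3],
    "region": ["EU", "US", "EU"],
    "amount": [120.0, 80.0, 45.5],
}

# An aggregate like SUM(amount) reads one contiguous column ...
total_columnar = sum(columns["amount"])
# ... while row storage must walk every record to pick out one field.
total_rows = sum(r["amount"] for r in rows)
```

The same layout is what makes column-wise compression so effective: values in one column are similar, so they compress far better than mixed rows.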
Data ingestion paths: Amazon S3 (via the COPY command), AWS Glue (ETL/ELT jobs), Amazon Kinesis (real-time data), and AWS DMS (database migration)
Performance Optimization & Integrations
Sort keys: define how data is sorted on disk, improving query performance by minimizing data scanned and enabling efficient range filtering.
Distribution keys: control how data is distributed across nodes; aligning them with join keys minimizes data movement between nodes during query execution.
Materialized views: pre-compute and store complex query results for faster access; Redshift automatically maintains and refreshes these views as the underlying data changes.
Concurrency scaling: automatically adds cluster capacity to handle increases in concurrent queries, maintaining consistent performance during peak usage periods.
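Why aligning distribution keys with join keys avoids data movement can be sketched with a toy hash-distribution function: rows sharing a key land on the same compute node, so a join on that key needs no cross-node shuffle. Redshift's actual hashing is internal; this is only an illustration.

```python
import hashlib

def node_for(dist_key_value, num_nodes=3):
    """Toy KEY-distribution: hash the distribution key value so every row
    with the same key is assigned to the same compute node."""
    digest = hashlib.md5(str(dist_key_value).encode()).hexdigest()
    return int(digest, 16) % num_nodes

# If both the orders table and the customers table are distributed on
# customer_id, matching rows are co-located and join locally:
orders_node = node_for("cust_42")
customers_node = node_for("cust_42")
```

If the two tables were distributed on different keys, matching rows would often sit on different nodes and the join would require shuffling data across the network.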
Redshift Spectrum: query external data directly in Amazon S3 without loading it into Redshift, and join it with internal Redshift tables.
Region-specific: Redshift clusters operate within a single AWS region
RA3 Nodes: Scale compute and storage independently for flexibility
RDS vs DynamoDB vs Redshift: Making the Right Choice
Choose the right database service based on your specific workload requirements and use cases
Features | Amazon RDS | Amazon DynamoDB | Amazon Redshift
---|---|---|---
Database Type | Relational (SQL) | NoSQL (Key-Value & Document) | Data Warehouse (Columnar)
Primary Workload | OLTP (Online Transaction Processing) | Real-time applications, high throughput | OLAP (Online Analytical Processing)
Scalability | Vertical scaling with read replicas | Unlimited horizontal scaling (serverless) | MPP architecture, scale by adding nodes
Performance | Millisecond response times | Single-digit millisecond latency | Optimized for complex queries over large datasets
Query Complexity | Complex SQL, joins, subqueries | Limited to key-based access | Complex analytical queries
Serverless Option | Aurora Serverless only | Fully serverless | Redshift Serverless
Administration | Managed service (some administration) | Zero administration | Managed service (requires tuning)
Capacity Planning | Instance size & storage planning | On-demand or provisioned capacity | Node type & count planning
Cost Model | Hourly instance charges + storage | Pay per request or provisioned capacity | Hourly node charges + storage
Global Distribution | Cross-region read replicas | Global Tables (multi-active) | Cross-region snapshots only
Common Patterns & Implementation Examples
Explore real-world architectures leveraging AWS database services for specific use cases
Three-Tier Web Application (Amazon RDS)
Traditional web application with separated presentation, business logic, and data layers
Example Use Case:
E-commerce platform with product catalogs, customer accounts, and order processing
Serverless Application (Amazon DynamoDB)
Event-driven, scalable architecture with no server management
Example Use Case:
Mobile app backend for a social media platform with user profiles, activity feeds, and notifications
Analytics Platform (Amazon Redshift)
Data warehouse solution for business intelligence and reporting
Example Use Case:
Retail company analyzing sales trends, customer behavior, and inventory optimization across thousands of stores
Modern applications often combine multiple database services for different workloads:
CQRS Pattern
Use RDS for write operations (commands) and DynamoDB for read operations (queries) to optimize for different access patterns
Data Pipeline Pattern
Operational data in RDS/DynamoDB with periodic ETL into Redshift for analytics and reporting
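The CQRS split above can be sketched with two in-memory stand-ins for the stores; the class and method names here are invented for illustration, with the relational store playing the RDS role (write model) and the key-value store playing the DynamoDB role (read model).

```python
class RelationalStore:
    """Stand-in for RDS: the normalized write model."""
    def __init__(self):
        self.orders = {}
    def insert_order(self, order_id, total):
        self.orders[order_id] = {"order_id": order_id, "total": total}

class KeyValueStore:
    """Stand-in for DynamoDB: the denormalized read model."""
    def __init__(self):
        self.items = {}
    def put(self, key, value):
        self.items[key] = value
    def get(self, key):
        return self.items.get(key)

class OrderService:
    """Commands write to the relational model; a projection step keeps the
    key-value read model in sync; queries never touch the write model."""
    def __init__(self):
        self.write_db = RelationalStore()
        self.read_db = KeyValueStore()
    def create_order(self, order_id, total):          # command path
        self.write_db.insert_order(order_id, total)
        self.read_db.put(order_id, {"order_id": order_id, "total": total})
    def get_order(self, order_id):                    # query path
        return self.read_db.get(order_id)
```

In a real deployment the projection step would usually be asynchronous (e.g., driven by change events) rather than a synchronous double-write, trading immediacy for resilience.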
Moving to AWS Database Services Efficiently
Strategies to efficiently migrate your current databases to AWS services
Use DynamoDB Streams or RDS with AWS DMS to trigger Lambda functions for real-time processing and cross-service updates.
AWS Glue jobs to transform and move data between services, such as from operational databases to Redshift for analysis.
Create microservices with API Gateway and Lambda to provide unified access to different database services.
Getting the Most from Your AWS Database Services
Implement these practices to optimize costs while maintaining performance
Amazon RDS
Right-size your instances
Monitor CloudWatch metrics and adjust instance sizes to match actual workload needs
Reserved Instances
Purchase Reserved Instances for predictable workloads to receive significant discounts
Storage optimization
Use gp3 volumes when possible and regularly clean up unused snapshots
Amazon DynamoDB
Choose the right capacity mode
Use on-demand for variable traffic and provisioned with auto-scaling for predictable traffic
Reserved Capacity
Purchase reserved capacity for stable, predictable workloads
Optimize TTL & item size
Configure TTL for temporary data and keep item sizes small
Amazon Redshift
Use RA3 nodes with managed storage
Scale compute and storage independently to optimize costs
Leverage pause and resume
Pause clusters during idle periods to reduce costs
Optimize storage with compression
Use appropriate compression encodings for columns
Monitor database costs, identify savings opportunities, and receive recommendations to optimize resource utilization and costs
Protecting Your Data in the Cloud
Implement these foundational security practices across all AWS database services
Enable encryption at rest and in transit for all database services
Implement least privilege IAM permissions and service-specific access controls
Use VPC, security groups, and network ACLs to restrict network access
Enable CloudTrail, AWS Config, and service-specific logging
Amazon RDS
Use Security Groups
Restrict inbound access to specific CIDR ranges and security groups
Enable SSL/TLS for connections
Force all database connections to use encryption in transit
Implement IAM Database Authentication
Use IAM roles and users for database authentication
Enable Automated Backups
Configure backups with appropriate retention periods
Security Note: Rotate database credentials regularly and avoid hardcoding them in application code
Amazon DynamoDB
Fine-grained access control
Use IAM policies with conditions to restrict access to specific items and attributes
VPC Endpoints
Use VPC endpoints to access DynamoDB without traversing the public internet
Enable Point-in-time Recovery
Protect against accidental writes or deletes with continuous backups
Use CMKs for enhanced encryption
Utilize customer-managed KMS keys for more control over encryption
Security Note: When using Global Tables, ensure IAM policies account for multi-region resources
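Fine-grained access control for DynamoDB is commonly written with the `dynamodb:LeadingKeys` condition key, which pins a caller to items whose partition key matches their identity. A sketch of such a policy document follows; the table ARN and account ID are placeholders, and the identity variable shown is the Cognito one.

```python
import json

# Hypothetical table ARN and placeholder account ID; the condition limits
# the caller to items whose partition key equals their Cognito identity.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query"],
        "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/GameScores",
        "Condition": {
            "ForAllValues:StringEquals": {
                "dynamodb:LeadingKeys": ["${cognito-identity.amazonaws.com:sub}"]
            }
        }
    }]
}
policy_json = json.dumps(policy, indent=2)
```

Attached to the role your authenticated users assume, a policy like this enforces per-user data isolation at the database layer rather than only in application code.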
Amazon Redshift
Configure cluster encryption
Enable encryption at rest for the entire cluster and specify KMS key
Enhanced VPC Routing
Enable to ensure traffic between your cluster and data repositories flows through your VPC
Column-level access controls
Restrict access to sensitive columns using view-based access control
Audit logging
Enable audit logging to track connection attempts, queries, and changes
Security Note: Regularly audit user permissions and rotate admin credentials
Security is Everyone's Responsibility
Remember that AWS provides the tools but security implementation follows the shared responsibility model. You are responsible for security in the cloud while AWS is responsible for security of the cloud.
How to Choose the Right Database Service for Your Workload
Cost management: unexpected costs from over-provisioning or inefficient usage
Solutions: right-size instances, use reserved capacity for predictable workloads, and monitor spend with cost-management tools
Performance: slow queries, throttling, or inefficient data access patterns
Solutions: design tables around your access patterns, offload reads to replicas or DAX, and tune sort and distribution keys in Redshift
Migration: challenges in moving data between database types
Solutions: use AWS DMS for database migration and AWS Glue for ETL between services
Modern applications often benefit from using multiple database services together, each optimized for specific workloads:
RDS for Transactional Data
Primary records, financial transactions, normalized data models
DynamoDB for High-Velocity Data
User sessions, real-time data, high-throughput access patterns
Redshift for Analytics
Historical aggregations, reporting, data warehousing needs
Emerging Technologies and Best Practices in AWS Database Services
The future of AWS databases is evolving rapidly with these key developments
Beyond Aurora Serverless and DynamoDB, expect more serverless database options with automatic scaling and pay-per-use models across the AWS database portfolio
Coming soon: Enhanced serverless analytics and more granular resource control
Machine learning capabilities embedded directly within database services for intelligent query optimization, anomaly detection, and predictive scaling
Watch for: Enhanced integration between Amazon SageMaker and database services
Preparation for quantum computing's impact on cryptography and database processing with quantum-resistant encryption and optimized algorithms
Research focus: Amazon Braket integration with data processing workflows
Sophisticated architectural approaches leveraging AWS database services
Polyglot Persistence: using multiple database types for different data needs within a single application
Example Implementation: RDS for transactional records, DynamoDB for sessions and high-velocity data, Redshift for analytics
Event-Driven Data Architecture: using database change events to trigger downstream processes and maintain data consistency
Example Implementation: DynamoDB Streams invoking Lambda functions for real-time, cross-service updates
Multi-Model Databases: leveraging services that support multiple data models to simplify architecture
Example Implementation: DynamoDB serving both key-value and document workloads from a single table
Follow AWS Database Blog, attend re:Invent sessions, and join AWS database webinars to learn about the latest features and best practices
Key Takeaways and Implementation Guidance
Assess your workload requirements
Determine your data structure needs, access patterns, scalability requirements, and performance SLAs
Create proof-of-concept deployments
Test your specific use cases with sample data in each relevant database service
Implement monitoring and cost controls
Set up CloudWatch alerts, performance monitoring, and budget thresholds before scaling
Establish backup and disaster recovery
Configure appropriate backup schedules, retention policies, and cross-region strategies
Over-provisioning resources
Start small and scale as needed, especially for DynamoDB provisioned capacity
Ignoring data access patterns
DynamoDB and Redshift performance depend heavily on understanding your access patterns
Neglecting security best practices
Always implement encryption, least-privilege access, and proper VPC controls
Forcing a single database for all workloads
Consider purpose-built database strategy for complex applications
AWS Database Services Comparison: RDS vs DynamoDB vs Redshift
Thank you for joining us on this journey through AWS database services. We hope this comparison helps you make the right choices for your cloud architecture.
Learn how to configure high-availability RDS databases
Create multi-region, low-latency database setups
Implement gaming leaderboards with DynamoDB
Don't forget to like, share, and subscribe to Open Dash for more cloud and DevOps content.