Secure and encrypt backups
Posted: December 20, 2024
Updated: March 22, 2025
By: Kevin McCaffrey
Securing and encrypting backups is critical to preserving data integrity and confidentiality, especially in a cloud environment. Encryption and access controls protect sensitive information against unauthorized access and tampering, so that the data you restore to meet your recovery time objectives (RTO) and recovery point objectives (RPO) can be trusted.
Best Practices
Implement Robust Backup Encryption
- Use strong encryption (e.g., AES-256 for data at rest and TLS for data in transit) to protect backup data. This ensures that even if storage or network traffic is exposed to an unauthorized party, the data remains unreadable.
- Use AWS Key Management Service (KMS) for key management to maintain control over your encryption keys. Rotate keys regularly to limit the impact of a compromised key.
- Ensure that your backup solution supports encryption natively; this minimizes the risk of human error in the encryption process.
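One way to enforce the encryption points above on an Amazon S3 backup bucket is a bucket policy that rejects uploads that do not request SSE-KMS and rejects any request made without TLS. The following is a minimal sketch, not a definitive policy: the function name `backup_bucket_policy` and the bucket name `example-backup-bucket` are hypothetical, and the policy is only built and printed here, not applied to any account.

```python
import json


def backup_bucket_policy(bucket_name: str) -> dict:
    """Build an S3 bucket policy document for a backup bucket.

    Hypothetical helper for illustration: it only returns a dict and
    does not call AWS.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Reject uploads that do not explicitly request SSE-KMS.
                "Sid": "DenyUploadsWithoutKmsEncryption",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "aws:kms"
                    }
                },
            },
            {
                # Reject any request that is not sent over TLS.
                "Sid": "DenyRequestsWithoutTls",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
        ],
    }


if __name__ == "__main__":
    # "example-backup-bucket" is a placeholder name, not a real bucket.
    print(json.dumps(backup_bucket_policy("example-backup-bucket"), indent=2))
```

Because the first statement checks the request header, this sketch would also deny uploads that rely solely on the bucket's default encryption setting; whether that is desirable depends on how your backup tooling writes objects.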
Establish Access Controls for Backups
- Implement strict IAM policies that define who can access backup data. Use roles and permissions to limit access to only those who need it.
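- Enable AWS CloudTrail to log access to backup data, including data events for the storage that holds your backups, since object-level access is not logged by default. This helps in detecting unauthorized access attempts, allowing you to respond quickly to potential threats.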
- Use Multi-Factor Authentication (MFA) for sensitive operations related to backup access, adding an additional layer of security.
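As an illustration of combining limited permissions with an MFA requirement, the sketch below builds an IAM policy document that allows read-only inspection of AWS Backup recovery points and denies destructive actions unless the caller authenticated with MFA. It is an assumption-laden example, not a recommended production policy: the function name `backup_operator_policy` is hypothetical, the action list is deliberately short, and `Resource` is left as `"*"` for brevity where a real policy would name specific vaults.

```python
import json


def backup_operator_policy() -> dict:
    """Build an IAM policy document for a read-mostly backup operator.

    Hypothetical helper for illustration: it only returns a dict and
    does not call AWS.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read-only visibility into recovery points.
                "Sid": "AllowReadOnlyBackupAccess",
                "Effect": "Allow",
                "Action": [
                    "backup:ListRecoveryPointsByBackupVault",
                    "backup:DescribeRecoveryPoint",
                ],
                # Assumption: scope this to specific vault ARNs in practice.
                "Resource": "*",
            },
            {
                # Block destructive operations when MFA was not used.
                "Sid": "DenyDestructiveActionsWithoutMfa",
                "Effect": "Deny",
                "Action": [
                    "backup:DeleteRecoveryPoint",
                    "backup:DeleteBackupVault",
                ],
                "Resource": "*",
                "Condition": {
                    "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(backup_operator_policy(), indent=2))
```

The explicit deny does not grant anything by itself; it only ensures that any other policy allowing deletion still requires an MFA-authenticated session.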
Regularly Test Backup Integrity
- Conduct regular restoration tests to ensure that backups can be successfully restored and that data integrity is maintained.
- Utilize checksums or hashes for backup files to verify data integrity and detect corruption or unauthorized changes. Regularly review and update your verification methods.
- Incorporate automated tools that regularly perform integrity checks on backups, alerting you to any issues promptly.
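The checksum approach described above can be sketched with Python's standard library. This example assumes that a SHA-256 digest was recorded when the backup was created and is stored separately from the backup file; the function names `sha256_of_file` and `verify_backup` and the file name `backup.bin` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks
    so that large backups do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path: Path, expected_hex: str) -> bool:
    """Compare the file's current digest with the digest recorded at backup time."""
    return sha256_of_file(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp) / "backup.bin"  # placeholder backup file
        backup.write_bytes(b"example backup contents")
        recorded = sha256_of_file(backup)  # digest recorded at backup time

        print("intact:", verify_backup(backup, recorded))

        backup.write_bytes(b"corrupted contents")  # simulate corruption
        print("after change:", verify_backup(backup, recorded))
```

A digest kept next to the backup detects accidental corruption, but someone who can modify the backup could also replace the digest, so storing the recorded values in a separately controlled location is a reasonable design choice. A matching checksum also does not replace restoration tests, since it says nothing about whether the backup can actually be restored.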
Questions to ask your team
- What methods are you using to encrypt your backups?
- How do you manage access controls for backups?
- Are you regularly testing your backup encryption to ensure its effectiveness?
- What measures do you have in place to detect unauthorized access to backups?
- How frequently do you perform integrity checks on your backups?
- Are your backup configurations documented and easily accessible?
- Do you have an incident response plan for compromised backups?
Who should be doing this?
- Backup Administrator
- Security Officer
What evidence shows this is happening in your organization?
- Backup and Encryption Policy: A comprehensive policy document outlining the guidelines for backing up data, applications, and configurations, including encryption standards, access controls, and compliance requirements.
- Backup Encryption Checklist: A step-by-step checklist to ensure that all backups are securely encrypted, detailing the processes for implementing encryption and validating data integrity.
- Disaster Recovery Plan: A formal plan that details the procedures for data backup, restoration processes, RTOs and RPOs, emphasizing secure and encrypted backups to meet recovery objectives.
- Backup Strategy Guide: A guide that outlines the strategies for securely backing up data, including the use of encryption techniques, access management, and monitoring for integrity violations.
- Data Backup Dashboard: An interactive dashboard that provides real-time metrics on backup status, encryption compliance, access logs, and integrity checks for backups across the organization.
Cloud Services
AWS
- AWS Backup: AWS Backup simplifies the backup process by enabling policy-based backup for AWS services, ensuring backups are secure and properly managed.
- Amazon S3 (with SSE): Amazon S3 can be used to store encrypted backups with server-side encryption (SSE) to protect data at rest.
- AWS IAM: AWS Identity and Access Management (IAM) controls access to backups through authentication and authorization mechanisms.
Azure
- Azure Backup: Azure Backup provides backup as a service to securely back up and restore data while ensuring compliance and enhanced data protection.
- Azure Blob Storage with Encryption: Azure Blob Storage supports encryption to help protect backup data at rest, offering multiple security options.
- Microsoft Entra ID (formerly Azure Active Directory): Microsoft Entra ID enables secure access management for backups, controlling who can access data and services.
Google Cloud Platform
- Google Cloud Storage with Encryption: Google Cloud Storage provides secure storage for backups with built-in encryption options for data at rest and in transit.
- Google Cloud Backup and DR: Google Cloud Backup and DR (Disaster Recovery) enables automated backups across cloud services, ensuring data integrity and security.
- Google Cloud IAM: Google Cloud Identity and Access Management (IAM) allows you to manage access to your backups securely, with fine-grained permissions.
Question: How do you back up data?
Pillar: Reliability (Code: REL)