Sunday, February 23, 2014

Cloud Guidelines

Area Title Description Platform specific
Architecture, Testing HA environment, Failover Testing Have the capability to artificially fail any system component. The system should recover from the failure and still meet its non-functional requirements (NFRs) All
Design Check feasibility of abstracting the Cloud provider API Helps support cloud-agnostic development and avoids vendor lock-in; shortens time to market since developers need not learn different APIs from multiple cloud vendors All
Coding Static analysis Make sure static analysis is incorporated into the build process; high-quality changes matter even more in a cloud environment than in a non-cloud one All
Architecture Multi-cloud usage Evaluate services from multiple cloud providers and choose the one that best suits the requirements; architect the application to handle multi-cloud (both private and public) environments All
Architecture Application bifurcation Evaluate if an application needs to be broken up to use private cloud for some functionality and public for others All
Security Use appropriate algorithms Use AES-256 symmetric encryption to encrypt sensitive data at rest; use RSA 2048 bits or higher for certificates; use SHA-256 (SHA-2 family) or higher for hashes and message digests All
Security Security boundary Divide security responsibilities clearly between cloud provider and consumer; Define trust boundary clearly too All
Security Key storage Do not store sensitive keys on your own if your cloud provider offers a service for key/certificate management All
Architecture Difference between claimed and actual SLAs Ascertain any gap between claimed and actual SLAs and architect with the lower number in mind. Don't blindly rely on numbers published by Cloud provider All
Architecture Conformance to standards Check whether a Cloud service meets a relevant industry standard and rate it higher than a service that does not All
Deployment Repeatable and automated Make deployment completely automated All
Security Security planning Plan how you will handle security breaches and recover from them. This plan may include notifying customers that a breach has occurred and how they are impacted by it All
Architecture Software fatigue Consider automatically replacing instances (that have been running for quite some time) with fresh instances to avoid software fatigue (undetected memory leaks etc.) All
Architecture Put dynamic data near computing infrastructure and static data near users Keep dynamic data near computing instances - (For example, move data to Cloud first before processing); Move static data closer to users to avoid latency (for example through AWS CloudFront) All
Security Protect cloud credentials Rotate keys on a regular basis; do not keep keys on cloud instances or in general-purpose cloud storage, and use the provider's key management service to store them wherever one is available; make sure certificates are renewed on an annual basis; use multi-factor authentication wherever feasible All
Security Reduce attack surface Keep the attack surface as small as possible by limiting the number of open ports and the IP addresses to which those ports are open All
Security Certifications Check and make sure the Cloud provider meets the minimum set of applicable certifications (such as HIPAA and security certifications) and audit requirements All
General Pricing Always check whether there is room to negotiate pricing with the Cloud provider. The provider may be ready to offer volume-based discounts beyond published rates All
Software development Guidelines Publish cloud specific coding guidelines to the team at the start of the development to avoid any surprises later All
Architecture Tagging cloud resources See if Cloud provider allows you to tag/name resources and make use of the tags to manage cloud infrastructure more easily and efficiently. Naming convention/requirements should be established as part of Cloud deployment plan before actual roll-out All
Architecture Disable MySQL binary logs in case of large database loads MySQL binary logging incurs significant costs and should be disabled during large data loads AWS
Testing Simulate customer distribution Use Cloud's datacenters spread across the world to simulate the customers spread across the globe and test for latency they are likely to experience. Having databases spread across the globe (and working as read replicas) and CDNs are some of the choices to provide better customer experience All
Architecture Handle possibility of longer disruptions The Cloud provider's implementation may have bugs, and the service may see interruptions lasting far longer than the committed SLAs allow. Sometimes, given the scale of the cloud provider, a service interruption may turn grave and cause a domino effect (with one service knocking out other services). Always architect with failure in mind. All
Security Consider using IDS/IPS in Cloud environments Use Snort (an open source IDS) to protect your infrastructure; Snort is usually the first outward-facing component and can in turn pass incoming requests to load balancers, web servers, etc. RESTful APIs should also be protected by an API gateway (such as Oracle API Gateway) that guards against DoS attacks, bad input, etc. All
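
The "abstracting the Cloud provider API" guideline can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: `BlobStore`, `InMemoryBlobStore`, and `archive_report` are hypothetical names, not part of any vendor SDK, and the in-memory backend stands in for a real provider-specific one.

```python
from abc import ABC, abstractmethod


class BlobStore(ABC):
    """Provider-neutral storage interface (hypothetical, for illustration)."""

    @abstractmethod
    def put(self, key, data):
        """Store bytes under a key."""

    @abstractmethod
    def get(self, key):
        """Return the bytes stored under a key."""


class InMemoryBlobStore(BlobStore):
    """Stand-in backend; a real backend would wrap one vendor's SDK."""

    def __init__(self):
        self._objects = {}  # maps key -> bytes

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


def archive_report(store, name, body):
    """Application code depends only on BlobStore, never on a vendor API."""
    key = f"reports/{name}.txt"
    store.put(key, body.encode("utf-8"))
    return key
```

Switching providers would then mean writing one more `BlobStore` subclass, while callers such as `archive_report` stay unchanged. Whether this is feasible depends on how much provider-specific behavior the application really needs, which is why the guideline asks to check feasibility first.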

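The "software fatigue" guideline (replacing long-running instances with fresh ones) could be driven by a simple age check. This is a minimal sketch, assuming launch times are already available as a mapping from instance ID to a timezone-aware `datetime`; the function name, the instance IDs, and the 14-day threshold are illustrative assumptions, and fetching launch times or replacing instances is left to the provider's own tooling.

```python
from datetime import datetime, timedelta, timezone


def instances_to_recycle(launch_times, max_age, now=None):
    """Return IDs of instances running longer than max_age, oldest first.

    launch_times: dict of instance ID -> timezone-aware launch datetime.
    """
    now = now or datetime.now(timezone.utc)
    stale = [
        (started, instance_id)
        for instance_id, started in launch_times.items()
        if now - started > max_age
    ]
    # Sorting by launch time puts the longest-running instance first.
    return [instance_id for _, instance_id in sorted(stale)]


# Hypothetical fleet used only to show the call.
reference = datetime(2014, 2, 23, tzinfo=timezone.utc)
fleet = {
    "web-1": reference - timedelta(days=30),
    "web-2": reference - timedelta(days=2),
    "web-3": reference - timedelta(days=45),
}
candidates = instances_to_recycle(fleet, timedelta(days=14), now=reference)
```

Replacing the candidates one at a time, oldest first, keeps capacity available while the fleet is refreshed.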