AWS CSAA Exam Preparation Notes
I recently wrote and passed my Amazon Web Services Solutions Architect Associate examination. I thought I'd share some of the notes I compiled during my studies. Note-taking was only one part of my preparation; it also included lots of hands-on experience in the AWS console and taking various practice exams. I also made an Anki flash card deck which I will share shortly.
Table of Contents
- Regions and Availability Zones
- IAM (Identity Access Management)
- S3
- EC2
- EBS
- AWS CLI
- EFS
- EC2 Placement Groups
- Elastic Network Interfaces (ENIs)
- Security Groups
- CloudWatch
- CloudTrail
- Databases
- Redshift
- Route 53
- VPCs
- VPC Flow Logs
- Bastions
- Direct Connect
- VPC Endpoints
- Gateway Endpoints
- VPC Endpoints Exam Tips
- Elastic Load Balancers
- Advanced Load Balancer Theory
- API Gateway
- Options for Data-at-Rest Encryption
- Identity Providers and Federation
- Lambda
- CloudFront
- What is a CDN and why use one?
- What is CloudFront?
- CloudFront Components
- Use CloudFront
- CloudFormation
- Test Axioms
- Whitepapers to READ
- Various Topics You Should Know
Regions and Availability Zones
Availability Zone = data center; sometimes 2 or 3 data centers if they are within a mile or so of each other
Region: consists of 2 or more Availability Zones
Edge locations are AWS endpoints used for caching content, typically via CloudFront, Amazon's CDN. There are many more edge locations than Regions (currently over 150).
IAM - Identity Access Management
– Users:
  - end users
  - people
  - employees of an organization
– Groups:
  - collections of Users
  - each user in the group will inherit the permissions of the group
– Policies:
  - policies are made up of documents called Policy Documents
  - JSON documents
  - they describe the permissions for what a User/Group/Role can do (see the sample policy document after this list)
– Roles:
  - you create Roles and then assign them to AWS resources
– you can change the permissions of a Role, even if that Role is already assigned to an existing EC2 instance, and these changes will take effect immediately
– IAM Roles do not have credentials associated with them (password or access keys); credentials are associated dynamically for short-term use
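As a rough sketch of what a policy document looks like, here is a minimal read-only S3 policy created via the CLI (the policy name, file name, and bucket name are placeholders):

```bash
# A minimal identity policy document (JSON) allowing read-only access
# to a single S3 bucket -- bucket name and paths are placeholders.
cat > s3-read-only.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:ListBucket"],
      "Resource": [
        "arn:aws:s3:::example-bucket",
        "arn:aws:s3:::example-bucket/*"
      ]
    }
  ]
}
EOF

# Create the policy so it can be attached to a User, Group, or Role
aws iam create-policy \
  --policy-name ExampleS3ReadOnly \
  --policy-document file://s3-read-only.json
```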
IAM Roles
An AWS Identity and Access Management (IAM) role is similar to a user, in that it is an AWS identity with permissions policies that determine what the identity can and cannot do in AWS. However, instead of being uniquely associated with one person, a role is intended to be assumable by anyone who needs it.
– roles are a secure way to delegate access to users, applications, and AWS services because they use temporary credentials
– roles have a many-to-one relationship, so you can enable many users and apps to assume the same role to grant the same set of permissions
Roles for EC2
– enables applications running on that instance to make API calls
– AWS manages your security credentials for you
  - rotates the credentials automatically
– you can now attach/detach a role from an existing instance
– add/update permissions without logging into the instance
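A minimal sketch of attaching a role to an already-running instance via the CLI; the role, profile, and instance ID are placeholders (a role is attached to EC2 through an instance profile):

```bash
# Wrap the role in an instance profile
aws iam create-instance-profile --instance-profile-name WebAppProfile
aws iam add-role-to-instance-profile \
  --instance-profile-name WebAppProfile \
  --role-name WebAppRole

# Attach the profile to an existing, running instance
aws ec2 associate-iam-instance-profile \
  --instance-id i-0123456789abcdef0 \
  --iam-instance-profile Name=WebAppProfile
```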
Service Roles
– A service role is a role that an AWS service assumes to perform actions on your behalf.
Service Linked Roles
A service-linked role is a unique type of IAM role that is linked directly to an AWS service. Service-linked roles are predefined by the service and include all the permissions that the service requires to call other AWS services on your behalf. The linked service also defines how you create, modify, and delete a service-linked role. A service might automatically create or delete the role.
Some services that have service-linked roles:
– EC2 – S3 – EFS – Elastic Beanstalk – ELB – EC2 Auto Scaling – Lambda@Edge – DynamoDB – ElastiCache – RDS – Redshift – KMS – IAM – and many more
Creating a Service-Linked Role
– The method that you use to create a service-linked role depends on the service. In some cases, you don't need to manually create a service-linked role. For example, when you complete a specific action (such as creating a resource) in the service, the service might create the service-linked role for you.
Service-Linked Roles vs Service Roles
– the main difference is that service-linked roles are predefined by the service
IAM Exam Tips
Always remember that you should associate IAM roles to EC2 instances and not an IAM user, for the purpose of accessing other AWS services. IAM roles are designed so that your applications can securely make API requests from your instances, without requiring you to manage the security credentials that the applications use. Instead of creating and distributing your AWS credentials, you can delegate permission to make API requests using IAM roles.
S3
– object-based storage
– unlimited storage
– files can be 0 bytes to 5 TB
– universal namespace; bucket names need to be globally unique
  - assigned a URL of the form: https://s3-<region>.amazonaws.com/<bucket-name>
– when you upload a file to an S3 bucket you will receive an HTTP 200 status code
– an object consists of:
  - key - name of the object
  - value - the data, a sequence of bytes
  - version ID - important for versioning
  - metadata
  - subresources
  - access control lists
  - torrent
How does data consistency work for S3?
– read-after-write consistency for PUTs of new objects
– eventual consistency for overwrite PUTs and DELETEs
S3 Guarantees
– built for 99.99% availability; the SLA guarantees 99.9%
– 99.999999999% (11 9s) durability guarantee
S3 features
– tiered storage
– lifecycle management
– versioning
– encryption
– MFA Delete
– secure your data using Access Control Lists and bucket policies
S3 Storage Classes
S3 Standard:
– 99.99% availability, 99.999999999% (11 9s) durability; stored redundantly across multiple devices in multiple facilities and designed to sustain the loss of 2 facilities concurrently
  - so likely copies in at least 3 facilities
S3 Standard - Infrequent Access (IA)
– for data that is accessed less often but requires rapid access when needed. Lower storage fee than S3 Standard, but you are charged a retrieval fee
S3 One Zone-IA
– infrequent access
– low cost
– single-AZ storage, so data resilience is lower
S3 Intelligent-Tiering
– uses AI to optimize costs by automatically moving data to the most cost-effective tier, without performance impact or operational overhead
S3 Cost
– some prices vary across Amazon S3 Regions
– billing prices are based on the location of your bucket; there is no data transfer charge for data transferred within an Amazon S3 Region via a COPY request
– data transferred via a COPY request between AWS Regions is charged at the rates specified on the Amazon S3 pricing page
S3 Glacier
Used for data archiving
– low cost
– store any amount of data
– retrieval times are configurable from minutes to hours
S3 Glacier Deep Archive
– lowest cost
– retrieval takes up to 12 hours
How are you charged in S3?
– storage size
– number of requests
– transfer acceleration
Cross-Region Replication
To guard against deletion and data corruption, enable S3 versioning. With versioning enabled, S3 never overwrites or deletes an object. Instead, modifying an object creates a new version of it that you can revert to if needed. Also, instead of deleting an object, S3 inserts a delete marker and removes the object from view. But the object and all of its versions still exist.
To protect your data against multiple Availability Zone failures, or the failure of an entire region, you can enable cross-region replication between a source bucket in one region and a destination bucket in another. Once enabled, S3 will asynchronously copy every object from the source bucket to the destination. Note that cross-region replication requires versioning to be enabled on both buckets. Also, deletes on the source bucket don't get replicated.
Exam Tips for S3
– object-based - upload files
– files can be 0 bytes to 5 TB
– there is unlimited storage
– files are stored in Buckets
– S3 has a universal namespace, so your bucket names must be globally unique
  - e.g. https://s3-<region>.amazonaws.com/<bucket-name>
– object storage, so not used for installing an OS or hosting a DB
– successful uploads will generate an HTTP 200 status code
– you can turn on MFA Delete for your buckets
– fundamental components:
  - key - the name of the object
  - value - the data itself
  - version ID
  - metadata
  - subresources
– read-after-write consistency for PUTs of new objects
– eventual consistency for overwrite PUTs
– Storage classes:
  - S3 Standard:
    - 99.99% availability, 99.999999999% (11 9s) durability
    - stored redundantly across multiple facilities
    - designed to withstand the loss of 2 facilities concurrently
  - S3 - IA (Infrequent Access)
  - S3 One Zone-IA
  - S3 Glacier
– Transfers between S3 buckets, or from Amazon S3 to any service(s) within the same AWS Region, are free.
S3 Encryption
– encryption in transit is achieved with SSL/TLS
– Server-Side Encryption:
  - S3-managed keys - SSE-S3
  - AWS Key Management Service managed keys - SSE-KMS
  - server-side encryption with customer-provided keys - SSE-C
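As a small sketch, a specific server-side encryption mode can be requested at upload time with the CLI (bucket, object key, and KMS key alias are placeholders):

```bash
# Request SSE-S3 (AES-256) on upload
aws s3 cp report.csv s3://example-bucket/report.csv --sse AES256

# Or request SSE-KMS with a specific CMK instead
aws s3 cp report.csv s3://example-bucket/report.csv \
  --sse aws:kms --sse-kms-key-id alias/example-key
```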
S3 Client-Side Encryption
Client-side encryption is the act of encrypting data before sending it to Amazon S3. To enable client-side encryption, you have the following options:
– use an AWS KMS-managed customer master key
– use a client-side master key
When using an AWS KMS-managed customer master key to enable client-side data encryption, you provide an AWS KMS customer master key ID (CMK ID) to AWS. On the other hand, when you use a client-side master key for client-side data encryption, your client-side master keys and your unencrypted data are never sent to AWS. It's important that you safely manage your encryption keys, because if you lose them, you can't decrypt your data.
S3 Cross-Region Replication
– versioning must be enabled on both the source and destination buckets
– must use unique regions
– files already in an existing bucket are not replicated automatically; you must copy previously existing files to the destination bucket manually, but all newly created/updated files will be replicated automatically
– delete markers are not replicated
– deleting individual versions or delete markers will NOT be replicated
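A minimal sketch of the setup via the CLI; bucket names, account ID, and the replication role are placeholders, and the IAM role must already exist with permission to replicate:

```bash
# Versioning must be enabled on BOTH buckets before replication can be configured
aws s3api put-bucket-versioning --bucket source-bucket \
  --versioning-configuration Status=Enabled
aws s3api put-bucket-versioning --bucket dest-bucket \
  --versioning-configuration Status=Enabled

# Minimal replication configuration: replicate everything to the destination bucket
cat > replication.json <<'EOF'
{
  "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
  "Rules": [
    {
      "Status": "Enabled",
      "Prefix": "",
      "Destination": { "Bucket": "arn:aws:s3:::dest-bucket" }
    }
  ]
}
EOF

aws s3api put-bucket-replication --bucket source-bucket \
  --replication-configuration file://replication.json
```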
EC2
Pricing Models
– On Demand:
  - allows you to pay a fixed rate per hour/second with no commitment
  - ideal for users that want flexibility without any up-front payment
  - applications with short-term, spiky, or unpredictable workloads that cannot be interrupted
  - good for developing or testing applications on EC2 for the first time
– Reserved:
  - provides you with a capacity reservation and offers a significant discount on the hourly charge for an instance. Contract terms are offered on 1- or 3-year terms.
  - useful for:
    - apps with steady-state or predictable usage
    - apps that require reserved capacity
  - 2 different pricing types:
    - Standard: offers up to 75% off On-Demand instances; the more you pay upfront and the longer the contract, the more you save
    - Convertible: allows you to change between instance types, but the savings aren't as large
– Spot:
  - enables you to bid whatever price you want for instance capacity, providing even greater savings if your applications have flexible start and end times
  - use cases:
    - apps that have flexible start/end times
    - apps that are only feasible at very low compute prices
    - users with urgent computing needs for large amounts of additional capacity
– Dedicated Hosts:
  - useful for regulatory requirements that may not support multi-tenant virtualization
  - e.g. government contracts, licensing constraints
  - can be purchased on-demand
EC2 Exam Tips
– Termination Protection is turned OFF by default
– on an EBS-backed instance, the default action is for the root EBS volume to be deleted once the instance is terminated
– EBS root volumes of your DEFAULT AMIs cannot be encrypted
  - additional volumes CAN be encrypted though
– changes to security group rules, e.g. disabling port 80, take effect immediately
– when you create an inbound HTTP rule, the HTTP response traffic is automatically allowed back out
  - i.e. security groups are stateful; Network Access Control Lists are stateless
– you can't blacklist IPs or ports using security groups; use NACLs for that
– in security groups all inbound traffic is blocked by default
– you can attach more than one security group to an EC2 instance
– all outbound traffic is allowed
  - security groups are stateful
– the instance retains its private IPv4 addresses and any IPv6 addresses when stopped and restarted
EBS
– termination protection is turned off by default
– on an EBS-backed instance, the default action is for the root EBS volume to be deleted on termination
– EBS root volumes of your DEFAULT AMIs cannot be encrypted
  - you can't use 3rd-party tools to encrypt the root volume
  - encryption can be done when creating AMIs via the AWS console or the API
– additional volumes can be encrypted
– you cannot delete a snapshot of an EBS volume that is used as the root device of a registered AMI
– provides persistent block storage volumes for use with EC2 instances
– each EBS volume is automatically replicated within its Availability Zone
– EBS is a multi-tenant block storage service; AWS employs rate limiting as a mechanism to avoid resource contention
Q: Will I be able to access my snapshots using the regular Amazon S3 API?
A: No, snapshots are only available through the Amazon EC2 API.
EBS Snapshots
– volumes exist on EBS; think of EBS as a virtual hard disk
– snapshots exist on S3; a snapshot is a point-in-time image of the disk state
– snapshots are incremental: only blocks that have changed since the last snapshot will be transferred to S3
– the first snapshot takes the longest… duh
– to create a snapshot of an Amazon EBS volume that serves as a root device, you should stop the instance before taking the snapshot
  - but you can take one of a running instance
– you can create AMIs from both snapshots and EBS volumes
– you can change EBS volumes on the fly, including the size and storage type
– volumes will always be in the same Availability Zone as the EC2 instance
Migrating EBS
– to move an EC2 instance from one AZ to another: take a snapshot, create an AMI from the snapshot, and then use the AMI to launch the EC2 instance in the new AZ
– to move an EC2 volume from one region to another: take a snapshot of it, create an AMI from the snapshot, then copy the AMI from one region to the other. Then use the copied AMI to launch the new EC2 instance in the new region.
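A minimal CLI sketch of the cross-region path; all IDs, names, and regions are placeholders:

```bash
# Snapshot the volume (optional if you image the instance directly)
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
  --description "root volume snapshot"

# Create an AMI from the running instance
aws ec2 create-image --instance-id i-0123456789abcdef0 \
  --name "my-server-ami" --no-reboot

# Copy the AMI to the target region, then launch from the copy there
aws ec2 copy-image --source-region us-east-1 \
  --source-image-id ami-0123456789abcdef0 \
  --region eu-west-1 --name "my-server-ami-copy"
```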
EBS Encryption
– snapshots of encrypted volumes are encrypted automatically
– volumes restored from encrypted snapshots are encrypted automatically
– you can share snapshots, but only if they are unencrypted
  - these snapshots can be shared with other AWS accounts or made public
– root device volumes can now be encrypted when you provision your EC2 instance
– to encrypt an existing root volume, the steps are the following (see the CLI sketch below):
  1. create a snapshot of the unencrypted root device volume
  2. create a copy of the snapshot and select the encryption option
  3. create an AMI from the encrypted snapshot
  4. use that AMI to launch new encrypted instances
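A sketch of the first two steps via the CLI (IDs and regions are placeholders); the encrypted snapshot copy can then be turned into an AMI from the console or with register-image:

```bash
# 1. Snapshot the unencrypted root volume
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 \
  --description "unencrypted root volume"

# 2. Copy the snapshot with encryption enabled (default EBS CMK shown)
aws ec2 copy-snapshot --source-region us-east-1 \
  --source-snapshot-id snap-0123456789abcdef0 \
  --encrypted --kms-key-id alias/aws/ebs
```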
Volume Types
SSD
– General Purpose SSD
  - use cases: average workloads
  - API name: gp2
  - volume size: 1 GB - 16 TB
  - max IOPS: 16,000
– Provisioned IOPS SSD
  - use cases: databases
  - API name: io1
  - volume size: 4 GB - 16 TB
  - max IOPS: 64,000
HDD
– Throughput Optimized HDD
  - use cases: big data and data warehousing
  - API name: st1
  - volume size: 500 GB - 16 TB
  - max IOPS: 500
– Cold HDD - lowest-cost HDD volume, for less frequently accessed workloads
  - API name: sc1
  - volume size: 500 GB - 16 TB
  - max IOPS: 250
  - use cases: file servers
– EBS Magnetic
  - API name: standard
  - volume size: 1 GB - 1 TB
  - IOPS: 40-200
  - use cases: workloads where data is infrequently accessed
Snapshots
– snapshots exist (are stored) on S3
– point-in-time copies of volumes
– snapshots are incremental: only the blocks that have changed since the last snapshot are moved to the S3 bucket containing your snapshots
– to create a snapshot of the EBS root volume of an instance, stop the instance first so files are not changing during snapshotting
– snapshots can be taken in real time while the volume is attached and in use
– snapshots only capture data that has been written to your Amazon EBS volume, which might exclude any data that has been locally cached by your application or OS
EBS vs Instance Store
– you can select your AMI based on:
  - region
  - OS
  - architecture (32/64-bit)
  - launch permissions
  - storage for the root device:
    - instance store - ephemeral storage
    - EBS-backed volumes
– instance store volumes are sometimes called Ephemeral Storage
– instance store-backed instances cannot be stopped
  - if the underlying host fails, you will lose your data
– EBS-backed instances can be stopped
  - you will not lose the data on this instance if it is stopped
– you can however reboot both and not lose your data
– by default, both ROOT volumes will be deleted on termination
  - however, with EBS volumes, you can tell AWS to keep the root device volume
All AMIs are categorized as either backed by Amazon EBS or backed by instance store.
EBS Volumes:
– the root device for an instance launched from the AMI is an Amazon EBS volume created from an Amazon EBS snapshot
Instance Store Volumes:
– the root device for an instance launched from the AMI is an instance store volume created from a template stored in Amazon S3
EBS Exam Tips
– instance store volumes are sometimes called ephemeral storage
– instance store-backed instances cannot be stopped; if the underlying host fails, you will lose your data
– EBS-backed instances can be stopped; you will not lose the data on the instance if it is stopped
– you can reboot both, and you will not lose your data
– by default, both ROOT volumes will be deleted on termination; however, with EBS volumes, you can tell AWS to keep the root device volume
– You can control whether an EBS root volume is deleted when its associated instance is terminated. The default delete-on-termination behaviour depends on whether the volume is a root volume or an additional volume. By default, the DeleteOnTermination attribute for root volumes is set to ‘true’. However, this attribute may be changed at launch using either the AWS Console or the command line. For an instance that is already running, the DeleteOnTermination attribute must be changed using the CLI.
– you cannot attach an EBS volume to more than one EC2 instance at the same time
Snapshots Exam Tips
– snapshots of encrypted volumes are encrypted automatically
– volumes restored from encrypted snapshots are encrypted automatically
– you can share snapshots, but only if they are unencrypted
EBS Main Points
– volumes exist on EBS; think of EBS as a virtual hard disk
– snapshots exist on S3
– you can create AMIs from snapshots OR directly from volumes
– you can change EBS volumes on the fly, including size and type
– volumes will always be in the same Availability Zone as the associated EC2 instance
– you can migrate EC2 volumes from one region to another; the steps are:
  1. take a snapshot of the EC2 volume
  2. create an AMI from the snapshot
  3. copy the new AMI from one region to the other
  4. use the copied AMI to launch the new EC2 instance in the new region
AWS CLI
– you can interact with AWS from anywhere in the world just by using the CLI
– you will need to set up access in IAM
– the commands themselves are not in the exam
Using IAM Roles with EC2
– instead of storing an access key ID and secret key on the instance, use roles instead
– roles are more secure and easier to manage
– roles can be assigned to an EC2 instance after it is created, using either the console or the command line
– roles are universal - you can use them in any region
Using Bootstrap Scripts
Get metadata about an instance:
– curl http://169.254.169.254/latest/meta-data/
– curl http://169.254.169.254/latest/user-data/
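A minimal bootstrap (user data) sketch; package names and the web page content are just illustrative. It runs as root on first boot when supplied at launch, e.g. with `aws ec2 run-instances ... --user-data file://bootstrap.sh`:

```bash
#!/bin/bash
# Example user-data script: install and start a web server,
# then publish the instance ID pulled from the metadata endpoint.
yum update -y
yum install -y httpd
systemctl start httpd
systemctl enable httpd
echo "<h1>Hello from $(curl -s http://169.254.169.254/latest/meta-data/instance-id)</h1>" \
  > /var/www/html/index.html
```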
EFS
Elastic File System (EFS) is an elastic NFS file system for EC2 instances.
– provides a simple interface to create and configure file systems quickly and easily
– with EFS, storage capacity is elastic… duh
– no need to pre-provision storage like with EBS
– you can mount an Amazon EFS file system on instances in only one VPC at a time
EFS Exam Tips
– supports the NFSv4 protocol
– you only pay for the storage you use
– can scale up to the petabyte level
– supports thousands of concurrent NFS connections
– data is stored across multiple AZs within a region
– read-after-write consistency
EC2 Placement Groups
Clustered Placement Group
– a grouping of instances within a single Availability Zone
– clustered placement groups are recommended for applications that need low network latency and/or high network throughput
– this strategy enables workloads to achieve the low-latency network performance necessary for the tightly coupled node-to-node communication that is typical of HPC applications
– only certain instance types can be launched into a Clustered Placement Group
Spread Placement Group
A group of instances that are each placed on distinct underlying hardware
– recommended for applications that have a small number of critical instances that should be kept separate from each other
– think individual EC2 instances placed on distinct hardware
– spread placement groups have a specific limitation: a maximum of 7 running instances per Availability Zone
– unlike Cluster Placement Groups, a Spread Placement Group can span multiple Availability Zones
Partitioned Placement Groups
– spreads your instances across logical partitions such that groups of instances in one partition do not share the underlying hardware with groups of instances in different partitions. This strategy is typically used by large distributed and replicated workloads, such as Hadoop, Cassandra, and Kafka.
– same idea as spread placement, but with groups of EC2 instances
Placement Groups Exam Tips
– a clustered placement group can't span multiple Availability Zones
– a spread placement group and a partitioned placement group can
– the name you specify for a placement group must be unique within your AWS account
– only certain types of instances can be launched in a placement group:
  - Compute Optimized
  - Memory Optimized
  - GPU
  - Storage Optimized
– AWS recommends homogeneous instances within clustered placement groups
– you cannot merge placement groups
– you also cannot move an existing instance into a placement group
  - instead, you'll have to create an AMI from your existing instance and then launch a new instance from the AMI into the placement group
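As a quick sketch (group names, AMI, and instance type are placeholders), each strategy is created and used like this from the CLI:

```bash
# One placement group per strategy
aws ec2 create-placement-group --group-name hpc-cluster --strategy cluster
aws ec2 create-placement-group --group-name critical-spread --strategy spread
aws ec2 create-placement-group --group-name kafka-partition \
  --strategy partition --partition-count 3

# Launch an instance into one of them at creation time
aws ec2 run-instances --image-id ami-0123456789abcdef0 \
  --instance-type c5.large --count 1 \
  --placement GroupName=hpc-cluster
```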
EC2 Summary
– 4 different pricing models:
  - On-Demand
  - Reserved - 1- or 3-year contract terms
  - Spot - bid on price
  - Dedicated Hosts
– every instance must have a primary network interface (also known as the primary ENI), which is connected to only one subnet. This is the reason you have to specify a subnet when launching an instance. You can't remove the primary ENI from an instance.
Elastic Network Interfaces (ENIs)
Each instance must have a primary private IP address from the range specified by the subnet CIDR. The primary private IP address is bound to the primary ENI of the instance. You can't change or remove this address, but you can assign secondary private IP addresses to the primary ENI. Any secondary addresses must come from the same subnet that the ENI is attached to. It's possible to attach additional ENIs to an instance. Those ENIs may be in a different subnet, but they must be in the same Availability Zone as the instance. As always, any addresses associated with the ENI must come from the subnet to which it is attached.
– every ENI must have at least one security group associated with it; one ENI can have multiple security groups attached, and the same security group can be attached to multiple ENIs
– every ENI must have a primary private IP address
– it can have a secondary IP address, but this address must come from the same subnet as its primary IP
– once created, the ENI cannot be moved to a different subnet
– an ENI can be created separately from an instance and attached later
ENI Best Practices
Best Practices for Configuring Network Interfaces
– You can attach a network interface to an instance when it’s running (hot attach), when it’s stopped (warm attach), or when the instance is being launched (cold attach).
– You can detach secondary network interfaces when the instance is running or stopped. However, you can’t detach the primary network interface (eth0).
– You can move a network interface from one instance to another, if the instances are in the same Availability Zone and VPC but in different subnets.
– When launching an instance using the CLI, API, or an SDK, you can specify the primary network interface (eth0) and additional network interfaces.
– Launching an Amazon Linux or Windows Server instance with multiple network interfaces automatically configures interfaces, private IPv4 addresses, and route tables on the operating system of the instance.
– A warm or hot attach of an additional network interface may require you to manually bring up the second interface, configure the private IPv4 address, and modify the route table accordingly. Instances running Amazon Linux or Windows Server automatically recognize the warm or hot attach and configure themselves.
– Attaching another network interface to an instance (for example, a NIC teaming configuration) cannot be used as a method to increase or double the network bandwidth to or from the dual-homed instance.
– If you attach two or more network interfaces from the same subnet to an instance, you may encounter networking issues such as asymmetric routing. If possible, use a secondary private IPv4 address on the primary network interface instead. For more information, see Assigning a Secondary Private IPv4 Address.
Security Groups
Security groups act as a firewall for associated Amazon EC2 instances, controlling both inbound and outbound traffic at the instance level. When you launch an instance, you can associate it with one or more security groups that you've created. Each instance in your VPC could belong to a different set of security groups. If you don't specify a security group when you launch an instance, the instance is automatically associated with the default security group for the VPC. For more information, see Security Groups for Your VPC.
– when you hear security group, think instance FIREWALL!
– ALL inbound traffic is blocked by default
– ALL outbound traffic is allowed
– changes to security groups take effect immediately
– you can have any number of EC2 instances within a security group
– you can have multiple security groups attached to an EC2 instance
– security groups are stateful: if you create an inbound rule allowing traffic in, that traffic is automatically allowed back out
– you cannot block specific IP addresses using security groups
  - use Network Access Control Lists for that
– you can specify allow rules, but not deny rules
– in practice, because most instances have only one ENI, people often think of a security group as being attached to an instance. When an instance has multiple ENIs, take care to note whether those ENIs use different security groups. When you create a security group, you must specify a name, a description, and the VPC it belongs to.
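A minimal sketch of creating a security group and an allow rule from the CLI (all IDs are placeholders); because security groups are stateful, return traffic for these connections is allowed out automatically:

```bash
# Create a security group in a VPC
aws ec2 create-security-group --group-name web-sg \
  --description "web servers" --vpc-id vpc-0123456789abcdef0

# Allow HTTP in from anywhere (allow rules only; there are no deny rules)
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 80 --cidr 0.0.0.0/0
```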
NACLs
– think subnet FIREWALL
– Network ACLs act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level
Security Groups vs NACLs
| Security Groups | NACLs |
|---|---|
| Operates at the instance level | Operates at the subnet level |
| Supports allow rules only | Supports allow rules and deny rules |
| Is stateful: return traffic is automatically allowed, regardless of any rules | Is stateless: return traffic must be explicitly allowed by rules |
| All rules are evaluated before deciding whether to allow traffic | Rules are processed in number order when deciding whether to allow traffic |
| Applies to an instance only if someone specifies the security group when launching the instance, or associates the security group with the instance later on | Automatically applies to all instances in the subnets it's associated with (therefore, an additional layer of defense if the security group rules are too permissive) |
CloudWatch
Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances. You can use the stop or terminate actions to help you save money when you no longer need an instance to be running. You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.
Monitors things like:
– Compute:
  - EC2 instances
  - Auto Scaling groups
  - Elastic Load Balancers
  - Route 53 health checks
– Storage and Content Delivery:
  - EBS volumes
  - storage gateways
  - CloudFront
– Host-level metrics:
  - CPU
  - Network
  - Disk
  - Status Checks
Tip: they often try to confuse you between CloudTrail and CloudWatch.
Performance == CloudWatch; Auditing/Recording/Logging users == CloudTrail
– CloudTrail is more of a usage/user tracker: it records which user/account called the API, created instances, etc.
– CloudWatch is used for monitoring performance; CloudTrail monitors API calls in the AWS platform
– CloudWatch can monitor most of AWS as well as your applications that run on AWS
– CloudWatch with EC2 will monitor events every 5 minutes by default
– you can have 1-minute intervals by turning on detailed monitoring
– you can create CloudWatch alarms which trigger notifications
CloudWatch Agent
The CloudWatch Agent collects logs from EC2 instances and on-premises servers running Linux or Windows operating systems. The agent can also collect performance metrics, including metrics that EC2 doesn't natively produce, such as memory utilization.
– metrics generated by the agent are custom metrics and are stored in a custom namespace that you specify
CloudWatch Alarms
An alarm can be in one of the three following states at any given time:
– ALARM: the data points to alarm have crossed and remained past a defined threshold for a period of time
– OK: the data points to alarm have not crossed and remained past a defined threshold for a period of time
– INSUFFICIENT_DATA: the alarm hasn't collected enough data to determine whether the data points to alarm have crossed a defined threshold
New alarms always start out in an INSUFFICIENT_DATA state. It's important to remember that an ALARM state doesn't necessarily indicate a problem, and an OK state doesn't necessarily indicate the absence of a problem. Alarm states track only whether the data points to alarm have crossed and remained past a threshold for a period of time.
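A small sketch of an alarm that uses the EC2 recover action when the system status check fails, tying back to the alarm actions mentioned above; the alarm name, instance ID, and region are placeholders:

```bash
aws cloudwatch put-metric-alarm \
  --alarm-name recover-web-server \
  --namespace AWS/EC2 \
  --metric-name StatusCheckFailed_System \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --statistic Maximum --period 60 --evaluation-periods 2 \
  --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:automate:us-east-1:ec2:recover
```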
CloudTrail
– CloudTrail logs 90 days of management events and stores them in a viewable, searchable, and downloadable database called the event history. The event history does not include data events.
– CloudTrail creates a separate event history for each region, containing only the activities that occurred in that region. However, events for global services such as IAM and Route 53 are included in the event history of every region.
CloudWatch, CloudTrail, AWS Config Exam Tips
– Know how to configure the different features of CloudWatch. CloudWatch receives and stores performance metrics from various AWS services. You can also send custom metrics to CloudWatch. You can configure alarms to take one or more actions based on a metric. CloudWatch Logs receives and stores logs from various resources and makes them searchable.
– Know the differences between CloudTrail and AWS Config. CloudTrail tracks events, while AWS Config tracks how those events ultimately affect the configuration of a resource.
– AWS Config organizes configuration states and changes by resource, rather than by event.
– Understand how CloudWatch Logs integrates with and complements CloudTrail. CloudTrail can send trail logs to CloudWatch Logs for storage, searching, and metric extraction.
– Understand how SNS works. CloudWatch and AWS Config send notifications to an Amazon SNS topic. The SNS topic passes these notifications on to a subscriber, which consists of a protocol and endpoint. Know the various protocols that SNS supports.
Databases
RDS
Amazon RDS enables you to run a fully featured relational database while offloading database administration.
RDS has two key features:
1. Multi-AZ: for disaster recovery
2. Read Replicas: for performance
RDS High Availability
On Amazon RDS, a Multi-AZ deployment creates a primary DB instance and a secondary standby DB instance in another Availability Zone for failover support. AWS recommends Multi-AZ deployments for production workloads to maintain high availability. For development and test purposes, you can use a deployment that isn't Multi-AZ.
For additional resiliency, you can use a Multi-AZ RDS deployment that maintains a primary instance in one Availability Zone and a standby instance in another. RDS replicates data synchronously from the primary to the standby. If the primary instance fails, either because of an instance failure or an Availability Zone failure, RDS automatically fails over to the standby.
Exam Tips
– RDS runs on virtual machines
– but you can't log in to these machines
– Amazon is responsible for patching the RDS OS and the databases
– RDS is not a serverless technology
  - Aurora Serverless, however, is an exception
  - everything else inside RDS is NOT serverless
RDS Backups
– automated backups are enabled by default
– the backup data is stored in S3 and you get free storage space equal to the size of your database
  - e.g. RDS instance size = 10 GB, so 10 GB of free backup storage
– backups are taken within a defined window
  - during this window, storage I/O may be suspended while your data is being backed up
  - this may lead to elevated latency
– database snapshots are done manually
  - user initiated
  - retained even after you delete the original RDS instance
– when restoring an automated backup or a snapshot, the restored version will be a new RDS instance with a new endpoint
Encryption at Rest
– supported for MySQL, Oracle, SQL Server, PostgreSQL, MariaDB, and Aurora
– encryption is done using the Key Management Service (KMS)
– once your RDS instance is encrypted, the data stored at rest in the underlying storage is encrypted, as are its automated backups, read replicas, and snapshots
Multi-AZ
With optional Multi-AZ deployments, Amazon RDS also manages synchronous data replication across Availability Zones with automatic failover.
– Multi-AZ is designed for disaster recovery, not performance improvement
– for performance improvement, use Read Replicas
– you can force a failover from one AZ to another by rebooting the RDS instance
Read Replicas
– read replicas make it easy to take advantage of supported engines' built-in replication functionality to elastically scale out beyond the capacity constraints of a single DB instance for read-heavy database workloads
– you can create a read replica with a few clicks in the AWS Management Console or using the CreateDBInstanceReadReplica API
– once the read replica is created, database updates on the source DB instance are replicated using the engine's native, asynchronous replication
– you can create multiple read replicas for a given source DB instance and distribute your application's read traffic amongst them
– used for scaling (performance), not disaster recovery
– you can have up to 5 read replica copies of any database
– you must have automated backups turned on for the source RDS instance in order to deploy a read replica
– you can have replicas of read replicas, but there will be additional replication latency
– each read replica has its own DNS endpoint
– read replicas can themselves be Multi-AZ, and you can create read replicas of Multi-AZ source databases
– read replicas can be in a different region than the source
– supported for MySQL, PostgreSQL, MariaDB, Oracle, and Aurora
– read replicas can be promoted to be their own databases; note that a promoted replica becomes its own instance and no longer replicates from its former source database
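A minimal CLI sketch (DB identifiers are placeholders; automated backups must already be enabled on the source):

```bash
# Create a read replica of an existing RDS instance
aws rds create-db-instance-read-replica \
  --db-instance-identifier mydb-replica-1 \
  --source-db-instance-identifier mydb

# Later, promote the replica to a standalone database,
# which breaks replication from the source
aws rds promote-read-replica --db-instance-identifier mydb-replica-1
```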
DynamoDB
– DynamoDB is a fast and flexible non-relational database service for any scale
– fully managed
  - DynamoDB automatically scales throughput capacity to meet workload demands, and partitions and repartitions your data as your table size grows
– DynamoDB synchronously replicates data across three facilities in an AWS Region, giving you high availability and data durability
DynamoDB Consistency Model
When reading data from DynamoDB, users can specify whether they want the read to be eventually consistent or strongly consistent:
– Eventually consistent reads (the default):
  - maximizes your read throughput
  - all copies of data usually reach consistency within a second
– Strongly consistent reads:
  - a strongly consistent read returns a result that reflects all writes that received a successful response before the read
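As a sketch (table name and key are placeholders), a strongly consistent read is requested per-call with the `--consistent-read` flag; omitting it gives the default eventually consistent read:

```bash
aws dynamodb get-item --table-name Users \
  --key '{"UserId": {"S": "42"}}' \
  --consistent-read
```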
DynamoDB Basics
– stored on SSD storage
– spread across 3 geographically distinct data centers
– eventually consistent reads by default
Redshift
– Amazon Redshift is a fast, fully managed data warehouse that makes it simple and cost-effective to analyze all your data using standard SQL and your existing Business Intelligence (BI) tools
– it allows you to run complex analytic queries against petabytes of structured data
OLTP vs OLAP
Redshift is OLAP:
– In contrast to an OLTP database, an OLAP database is designed to process large datasets quickly to answer questions about data. The name reflects this purpose: Online Analytic Processing.
Redshift Configuration
– single node (160 GB)
– multi-node:
  - Leader Node: manages client connections and receives queries
  - Compute Nodes: store data and perform queries and computations; you can have up to 128 compute nodes
Redshift Backups
– enabled by default with a 1-day retention period
– the maximum retention period is 35 days
– Redshift always attempts to maintain at least 3 copies of your data (the original and a replica on the compute nodes, plus a backup in S3)
– Redshift can also asynchronously replicate your snapshots to S3 in another region for disaster recovery
Redshift Pricing
– compute node hours
  - 1 unit per node per hour
  - not charged for leader node hours
– billed for backups
– billed for data transfer (within a VPC)
Redshift Security
– encrypted in transit using SSL
– encrypted at rest using AES-256
– Redshift handles key management by default
Redshift Availability
– currently only available in 1 AZ
– there is no Multi-AZ capability
– you can restore snapshots to a new AZ in the case of an outage
Redshift Exam Tips
– Redshift is used for business intelligence
– only available in 1 AZ
– backups are enabled by default with a 1-day retention period (max 35 days)
– attempts to maintain 3 copies of your data
  - the original and 1 replica on compute nodes, plus a backup in S3
Aurora
– Amazon Aurora is a relational database engine that is MySQL- and PostgreSQL-compatible
– Amazon Aurora MySQL delivers up to five times the performance of MySQL without requiring changes to most MySQL applications
– Amazon Aurora PostgreSQL delivers up to three times the performance of PostgreSQL
– fully managed
  - Amazon RDS manages your Amazon Aurora databases, handling time-consuming tasks such as provisioning, patching, backup, recovery, failure detection, and repair
Things to Know About Aurora
– starts with 10 GB and scales in 10 GB increments up to 64 TB (Storage Autoscaling)
– compute resources can be scaled up to 32 virtual CPUs and 244 GB of RAM
– 2 copies of your data are contained in each Availability Zone, with a minimum of 3 Availability Zones
  - 6 copies of your data in total
  - this implies that Aurora is only available in regions that have at least 3 AZs
– designed to handle the loss of 2 copies of data without affecting write availability, and up to 3 copies without affecting read availability
Databases Summary
AWS DB Types
RDS - OLTP (Online Transaction Processing)
– OLTP is involved in the operation of a particular system. OLTP is characterized by a large number of short online transactions (INSERT, UPDATE, DELETE). The main emphasis for OLTP systems is very fast query processing, maintaining data integrity in multi-access environments, and effectiveness measured in transactions per second. An OLTP database holds detailed, current data, and the schema used to store transactional data is the entity model (usually 3NF). It involves queries accessing individual records, like updating your email address in a company database.
– engines: MySQL, MariaDB, PostgreSQL, Oracle, Aurora
– Redshift
  - used for OLAP
– ElastiCache
Points to Remember
– RDS runs on virtual machines
  - you CANNOT LOG IN to these systems, however
  - because of this, security patching of the RDS OS and DB is Amazon's responsibility
– RDS is NOT serverless, with the exception of Aurora Serverless
– Multi-AZ is used only for disaster recovery
Route 53
Amazon Route 53 provides highly available and scalable Domain Name System (DNS), domain name registration, and health-checking web services.
You can combine your DNS with health-checking services to route traffic to healthy endpoints or to independently monitor and/or alarm on endpoints.
You can also register new domain names or transfer in existing domain names to be managed by Route 53.
DNS
– you can register domain names directly with Amazon
– it can take up to 3 days to register, depending on the circumstances
Routing Policies
– simple
– weighted
– latency-based
– failover
– geolocation
– geoproximity (traffic flow only)
– multivalue answer
Simple Routing Policy
– you have a single DNS record with multiple IP addresses
  - if you specify multiple values in a record, Route 53 returns all the values to the user in a random order
– use for a single resource that performs a given function for your domain, for example, a web server that serves content for the example.com website
Weighted Routing Policy
– Use to route traffic to multiple resources in proportions that you specify.
Latency Routing Policy
– Use when you have resources in multiple AWS Regions and you want to route traffic to the region that provides the best latency.
Geolocation Routing Policy
– Use when you want to route traffic based on the location of your users.
Multivalue Answer Routing Policy
– use when you want Route 53 to respond to DNS queries with up to eight healthy records selected at random
– basically Simple Routing with health checks
Route 53 Exam Tips
– ELBs never have a pre-defined IPv4 address; you resolve to them using a DNS name
– understand the difference between an Alias Record and a CNAME
  - on the exam, if given a choice between an Alias Record and a CNAME, always choose the Alias Record
– common DNS record types:
  - SOA records
  - NS records
  - A records
  - CNAMEs
  - MX records
  - PTR records
– Routing Policies:
  - simple
  - weighted
  - latency-based
  - failover
  - geolocation
  - geoproximity (advanced stuff, uses traffic flow with directed-graph routing)
  - multivalue answer = simple routing with health checks
– Health Checks:
  - you can set health checks on individual record sets
  - if a record set fails its health check, it will be removed from Route 53 responses
  - you can set an SNS notification to alert you when a record fails
VPCs
On the day of your exam make sure you can build out your own VPC!!
Check out cidr.xyz for an interactive subnetting tool.
VPC Features
– launch instances into a subnet of your choosing
– assign custom IP address ranges in each subnet
– configure route tables between subnets
– create an internet gateway and attach it to your VPC
– much better security control over your AWS resources
– instance-level security groups
– subnet-level network access control lists (ACLs)
Default vs Custom VPC
– the default VPC is user friendly, and allows you to immediately deploy instances without mucking about
– all subnets in the default VPC have a route out to the internet
– each EC2 instance in the default VPC has both a public and a private IP address
VPC Peering
– allows you to connect one VPC with another via a direct network route using private IPv4 addresses
– instances behave as if they were on the same private network
– you can peer VPCs with other AWS accounts as well as with VPCs in the same account
– you can peer across regions
– no transitive peering; the topology is always a star
  - always 1 central VPC that peers with the others
– you can create a VPC peering connection between your own VPCs, or with a VPC in another AWS account. The VPCs can be in different regions (also known as an inter-region VPC peering connection).
– AWS uses the existing infrastructure of a VPC to create a VPC peering connection; it is neither a gateway nor a VPN connection, and it does not rely on a separate piece of physical hardware
  - there is no single point of failure for communication or a bandwidth bottleneck
– if you have more than one AWS account, you can peer the VPCs across those accounts to create a file-sharing network. You can also use a VPC peering connection to allow other VPCs to access resources you have in one of your VPCs.
Inter-Region VPC Peering
– you can establish peering relationships between VPCs across different AWS Regions (also called Inter-Region VPC Peering)
– this allows VPC resources, including EC2 instances, Amazon RDS databases, and Lambda functions that run in different AWS Regions, to communicate with each other using private IP addresses, without requiring gateways, VPN connections, or separate network appliances
– the traffic remains in the private IP space
– all inter-region traffic is encrypted, with no single point of failure or bandwidth bottleneck
– traffic always stays on the global AWS backbone and never traverses the public internet, which reduces threats such as common exploits and DDoS attacks
– inter-region peering is not available in the China Regions
VPC Flow Logs
VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC. Flow log data can be published to Amazon CloudWatch Logs and Amazon S3. After you’ve created a flow log, you can retrieve and view its data in the chosen destination.
AWS uses S3 to store various logs, including CloudTrail logs, VPC flow logs, DNS query logs, and S3 server access logs. Athena lets you use the Structured Query Language (SQL) to search data stored in S3. Although you can use CloudWatch Logs to store and search logs, you can’t format the output to show you only specific data.
VPC Exam Tips
– think of a VPC as a logical datacenter in AWS
– consists of:
  - IGWs (or Virtual Private Gateways)
  - Route Tables
  - Network ACLs
  - Subnets
  - Security Groups
– Security Groups are stateful; Network ACLs are stateless
– no transitive peering
– you can't have a subnet stretched over multiple AZs, but you can have multiple subnets in the same Availability Zone
– 1 subnet = 1 AZ!!!!
  - each subnet must reside entirely within one Availability Zone and cannot span zones
By default, instances in new subnets in a custom VPC can communicate with each other across Availability Zones. In a custom VPC with new subnets in each AZ, there is a route that supports communication across all subnets/AZs, plus a default security group with an allow rule ‘All traffic, all protocols, all ports, from anything using this default SG’.
Creating a VPC from Scratch
– when you create a new VPC, it doesn't create a subnet
– it doesn't create a default internet gateway
– it does create a route table though
– it does create a network ACL
– it does create a security group
– Amazon always reserves 5 IP addresses within each subnet (for the VPC router, DNS server, etc.)
Steps (see the CLI sketch below):
– create the VPC
– create 2 subnets
– enable auto-assign public IP on the public subnet
– create an IGW and attach it to the VPC
– keep the main route table private; create a second route table to use for the public subnet
– add a new route to that table
  - give it destination 0.0.0.0/0 pointing at the IGW
  - subnet association: add the public subnet
– when adding EC2 instances, you'll have to create a new security group; the one created earlier was for the default VPC
  - security groups DO NOT span VPCs
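A minimal sketch of the same steps via the CLI; every ID, CIDR, and AZ below is a placeholder:

```bash
# VPC plus a public and a private subnet
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.1.0/24 \
  --availability-zone us-east-1a        # public subnet
aws ec2 create-subnet --vpc-id vpc-0abc --cidr-block 10.0.2.0/24 \
  --availability-zone us-east-1b        # private subnet

# Internet gateway, attached to the VPC
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc --vpc-id vpc-0abc

# Second route table for the public subnet, with a default route to the IGW
aws ec2 create-route-table --vpc-id vpc-0abc
aws ec2 create-route --route-table-id rtb-0abc \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0abc
aws ec2 associate-route-table --route-table-id rtb-0abc --subnet-id subnet-0abc

# Auto-assign public IPs in the public subnet
aws ec2 modify-subnet-attribute --subnet-id subnet-0abc --map-public-ip-on-launch
```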
NAT Instances and NAT Gateways Exam Tips
NAT Instances
– when creating a NAT instance, disable the source/destination check on the instance
– the NAT instance MUST be in a public subnet
– there must be a route out of the private subnet to the NAT instance in order for instances in your private subnet to use it
– NAT instances can be a bottleneck; the amount of traffic they can support depends on the size of the NAT instance
– you can create high availability using Auto Scaling Groups, multiple subnets in different AZs, and a script to automate failover
  - a pain in the ASS though! Use NAT Gateways instead
– NAT instances are always behind a security group
NAT Gateways
– redundant within an AZ for resiliency; you only see the one NAT gateway in your account with its associated IP, not a single EC2 instance like a NAT instance
– a NAT gateway lives in one AZ; it cannot span multiple AZs
– starts at 5 Gbps and scales up to 45 Gbps
– no patching needed, unlike NAT instances
– automatically assigned a public IP address
– you do need to manually update your route tables though
– no need to disable the source/destination check like on a NAT instance
– if you have public subnets in multiple AZs, you will need to set up another NAT gateway in each of the other public subnets
– will not accept any incoming traffic from the internet; it only accepts outbound communication from within your VPC
– if you have resources in multiple AZs and they share one NAT gateway, then if the AZ containing the NAT gateway goes down, the resources in the other AZs will lose internet access
  - generally you'll want to create a NAT gateway for each AZ
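A small sketch of the setup (IDs are placeholders): a NAT gateway needs an Elastic IP and lives in a public subnet, and the private subnet's route table then points 0.0.0.0/0 at it:

```bash
aws ec2 allocate-address --domain vpc
aws ec2 create-nat-gateway --subnet-id subnet-public \
  --allocation-id eipalloc-0abc

# Default route for the private subnet's route table
aws ec2 create-route --route-table-id rtb-private \
  --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0abc
```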
Network Access Control Lists vs Security Groups
– whenever you create a new custom NACL, all inbound rules are off (deny) by default
– lower-numbered rules take precedence
  - so if you want to deny a specific IP, give that deny rule a lower number than the allow rule
NACL Exam Tips
– a subnet can only be associated with 1 NACL at a time, but a NACL can be associated with multiple subnets
– rules take effect immediately and are evaluated in numerical order, e.g. if you make rule 100 allowing traffic on port 80 and then make rule 900 denying port 80, traffic on port 80 will still be allowed
– by default, each custom NACL denies all inbound and outbound traffic
– when creating a subnet, if you don't explicitly associate that subnet with a NACL, the subnet will be associated with the default Network Access Control List
– block IP addresses using NACLs, not Security Groups
– NACLs are stateless, so inbound and outbound traffic must both be specified explicitly
  - with security groups, return traffic is stateful and allowed automatically
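A minimal sketch of NACL rules via the CLI (IDs and addresses are placeholders); rules are evaluated lowest number first, so the deny at rule 90 wins over the allow at rule 100 for that one address:

```bash
# Allow HTTP in from anywhere
aws ec2 create-network-acl-entry --network-acl-id acl-0abc --ingress \
  --rule-number 100 --protocol tcp --port-range From=80,To=80 \
  --cidr-block 0.0.0.0/0 --rule-action allow

# Deny one specific address with a lower rule number
aws ec2 create-network-acl-entry --network-acl-id acl-0abc --ingress \
  --rule-number 90 --protocol tcp --port-range From=80,To=80 \
  --cidr-block 203.0.113.10/32 --rule-action deny
```

Remember that NACLs are stateless, so matching outbound (egress) rules are needed for the return traffic as well.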
VPC Flow Logs
– VPC Flow Logs is a feature that enables you to capture information about the IP traffic going to and from network interfaces in your VPC
– flow log data can be published to Amazon CloudWatch Logs and Amazon S3
– after you've created a flow log, you can retrieve and view its data in the chosen destination
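A sketch of enabling a flow log for a whole VPC and sending it to CloudWatch Logs; the VPC ID, log group, and IAM role are placeholders and must already exist:

```bash
aws ec2 create-flow-logs --resource-type VPC \
  --resource-ids vpc-0abc --traffic-type ALL \
  --log-group-name my-vpc-flow-logs \
  --deliver-logs-permission-arn arn:aws:iam::123456789012:role/flow-logs-role
```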
Flow Logs Exam Tips
– you cannot enable flow logs for VPCs that are peered with your VPC unless the peer VPC is in your account
– you cannot tag a flow log
– after you've created a flow log, you cannot change its configuration
  - e.g. you can't associate a different IAM role with the flow log
Traffic NOT monitored by Flow Logs
– not all traffic is monitored
  - e.g. traffic generated by instances when they contact the Amazon DNS server is not logged; however, if you use your own DNS server, then all traffic to that server is logged
– traffic generated by a Windows instance for AWS Windows licence activation is not monitored
– traffic to and from the instance metadata endpoint (169.254.169.254) is not monitored
– DHCP traffic is not monitored
– traffic to the reserved AWS IP addresses for the default VPC router is not monitored
Bastions
Bastion hosts are special-purpose instances that host a minimal number of administrative applications, such as Remote Desktop Protocol (RDP) for Windows or SSH (e.g. via PuTTY) for Linux-based distributions. All other unnecessary services are removed. Bastion hosts are typically placed in a segregated network. They're often protected with multi-factor authentication (MFA) and monitored with auditing tools.
Bastion hosts provide secure access to Linux instances located in the private and public subnets.
Bastion Host Exam Tips
– a NAT Gateway or NAT Instance is used to provide internet access to EC2 instances in a private subnet
– a Bastion is used to securely administer EC2 instances using SSH or RDP
– you cannot use a NAT gateway as a Bastion host
Direct Connect
– AWS Direct Connect links your internal network to an AWS Direct Connect location over a standard Ethernet fiber-optic cable
– one end of the cable is connected to your router, the other to an AWS Direct Connect router. With this connection, you can create virtual interfaces directly to public AWS services (for example, to Amazon S3) or to Amazon VPC, bypassing internet service providers in your network path.
– an AWS Direct Connect location provides access to AWS in the Region with which it is associated
– you can use a single connection in a public Region or AWS GovCloud (US) to access public AWS services in all other public Regions
– in simple terms: it connects your data center directly to AWS
  - useful for high-throughput workloads with lots of network traffic
– offers a reliable and secure connection
VPC Endpoints
– a VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by PrivateLink, without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection
– instances in your VPC do not require public IP addresses to communicate with resources in the service; traffic between your VPC and the other service does not leave the Amazon network
– endpoints are virtual devices
– they are horizontally scaled, redundant, and highly available VPC components that allow communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic
– the major benefit: in contrast to a NAT gateway, traffic between your VPC and the other service does not leave the Amazon network when using VPC endpoints
Gateway Endpoints
A gateway endpoint is a gateway that you specify as a target for a route in your route table, for traffic destined to a supported AWS service.
The following AWS services are supported:
– Amazon S3
– DynamoDB
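A minimal sketch of creating a gateway endpoint for S3 (VPC ID, route table ID, and region are placeholders); a route to the S3 prefix list is added to the chosen route tables, so S3 traffic stays on the AWS network:

```bash
aws ec2 create-vpc-endpoint --vpc-id vpc-0abc \
  --vpc-endpoint-type Gateway \
  --service-name com.amazonaws.us-east-1.s3 \
  --route-table-ids rtb-0abc
```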
Interface Endpoints
An interface endpoint is an elastic network interface with a private IP address from the IP address range of your subnet, which serves as an entry point for traffic destined to a supported service.
Almost all other AWS services are supported by interface endpoints.
PrivateLink
AWS PrivateLink simplifies the security of data shared with cloud-based applications by eliminating the exposure of data to the public Internet. AWS PrivateLink provides private connectivity between VPCs, AWS services, and on-premises applications, securely on the Amazon network. AWS PrivateLink makes it easy to connect services across different accounts and VPCs to significantly simplify the network architecture.
VPC Endpoints Exam Tips
– remember what a VPC endpoint is
– endpoints are virtual devices
– they are horizontally scaled
There are 2 types of VPC endpoints:
– Interface Endpoints
– Gateway Endpoints
VPC Summary
– think of a VPC as a logical datacenter in AWS
– consists of IGWs (or Virtual Private Gateways), route tables, network access control lists, subnets, and security groups
– 1 subnet == 1 AZ
– security groups are stateful (if you open a port, inbound and outbound traffic on that port is enabled); NACLs are stateless (can allow only inbound or only outbound traffic)
– NO TRANSITIVE PEERING
Remember:
– when you create a VPC, a default route table, a default NACL, and a default security group are created
– it won't create any subnets, nor will it create a default internet gateway
– AZ names like us-east-1a are randomly mapped per account, so us-east-1a in your account will not necessarily be the same physical AZ as us-east-1a in someone else's account
– Amazon reserves 5 IP addresses within each of your subnets
– you can only have 1 Internet Gateway per VPC
– security groups can't span VPCs
– security groups act like a firewall at the instance level, whereas NACLs are an additional layer of security that act at the subnet level
NAT Instances Exam Tips
– You can use a network address translation (NAT) instance in a public subnet in your VPC to enable instances in the private subnet to initiate outbound IPv4 traffic to the Internet or other AWS services, but prevent the instances from receiving inbound traffic initiated by someone on the Internet.
– when creating a NAT instance, disable the source/destination check on the instance (see the sketch after this list)
– NAT instances must be in a public subnet
– there must be a route out of the private subnet to the NAT instance for it to work
– the amount of traffic that NAT instances can support depends on the instance size
-
- if there is bottlenecking, increase the instance size
– you can create high availability using Auto Scaling groups with multiple subnets in different AZs and a script to automate failover
-
- if this sounds like a headache, that's because it is; in most real-world scenarios nowadays, companies use NAT Gateways instead because of this
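A minimal boto3 sketch of disabling the source/destination check on a NAT instance; the instance ID is a hypothetical placeholder:

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical NAT instance ID for illustration only.
nat_instance_id = "i-0123456789abcdef0"

# A NAT instance forwards traffic that is neither sourced from nor destined
# to itself, so the default source/destination check must be disabled.
ec2.modify_instance_attribute(
    InstanceId=nat_instance_id,
    SourceDestCheck={"Value": False},
)
```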
NAT Gateways Exam Tips
– are redundant within the AZ
– preferred by the enterprise now over NAT instances for ease of use
– starts at 5 Gbps and scales up to 45 Gbps
– no need to patch like with NAT instances, as they are managed by AWS
– not associated with any security group
– automatically assigned a public IP address
– remember to update your route tables after creating a NAT gateway
– no need to disable source/destination checks
-
- Each EC2 instance performs source/destination checks by default. This means that the instance must be the source or destination of any traffic it sends or receives. However, a NAT instance must be able to send and receive traffic when the source or destination is not itself. Therefore, you must disable source/destination checks on the NAT instance.
Network Access Control Lists (NACLs) Exam Tips
– your VPC automatically comes with a default NACL, which by default allows all inbound and outbound traffic
-
- this default is created because each subnet in your VPC must be associated with a NACL
– you can create custom NACLs
-
- by default, these custom NACLs deny all inbound and outbound traffic
– if you wish to block specific IP addresses, use NACLs, not security groups (security groups only support allow rules, so they cannot deny a specific IP) – see the sketch after this list
– you can associate a NACL with multiple subnets; however, a subnet can be associated with only one NACL at a time!!!
-
- when you associate a subnet with a NACL, any previously associated NACL is removed
– the assigned rules are evaluated in ORDER, and the lowest numbered matching rule takes precedence
– NACLs are stateless, and therefore have separate rules for inbound and outbound traffic (you can't do this with security groups, which are stateful)
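To illustrate blocking an IP range and the rule-numbering behaviour, here is a minimal boto3 sketch; the NACL ID and CIDR are hypothetical:

```python
import boto3

ec2 = boto3.client("ec2")

# Add an inbound DENY rule to a NACL. Because its rule number is lower than
# any later ALLOW rule, it is evaluated first and wins.
ec2.create_network_acl_entry(
    NetworkAclId="acl-0123456789abcdef0",   # hypothetical NACL ID
    RuleNumber=90,                          # evaluated before higher-numbered rules
    Protocol="-1",                          # all protocols
    RuleAction="deny",
    Egress=False,                           # inbound rule; outbound rules are separate
    CidrBlock="203.0.113.0/24",             # the IP range to block
)
```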
Flow Logs
– you cannot enable flow logs for VPCs that are peered with your VPC unless the peer VPC is in your account
– you cannot tag a flow log
– after you’ve created a flow log, you cannot change its configuration; for example, you can’t associate a different IAM role with the flow log after its creation (a creation sketch appears at the end of this section)
Not all IP traffic is monitored
not logged:
– instance traffic to the Amazon DNS server
– Windows license activation traffic for Windows AMIs
– traffic to the instance metadata IP (169.254.169.254)
– DHCP traffic
– traffic to the reserved Amazon IP address for the default VPC router
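A minimal boto3 sketch of creating a VPC flow log that publishes to CloudWatch Logs; the VPC ID, log group name, and role ARN are hypothetical placeholders:

```python
import boto3

ec2 = boto3.client("ec2")

# Hypothetical values for illustration only; remember the flow log
# configuration (including the IAM role) cannot be changed after creation.
ec2.create_flow_logs(
    ResourceIds=["vpc-0123456789abcdef0"],
    ResourceType="VPC",
    TrafficType="ALL",  # ACCEPT, REJECT, or ALL
    LogGroupName="my-vpc-flow-logs",
    DeliverLogsPermissionArn="arn:aws:iam::123456789012:role/flow-logs-role",
)
```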
Direct Connect
– just remember what it is: a dedicated Ethernet fiber-optic connection from your data center directly to AWS
Exam VPC Questions
Q: Having just created a new VPC and launched an instance into its public subnet, you realise that you have forgotten to assign a public IP to the instance during creation. What is the simplest way to make your instance reachable from the outside world?
A: Create an Elastic IP address and associate it with your instance (see the sketch below). By default, any user-created VPC subnet WILL NOT automatically assign public IPv4 addresses to instances – the only subnets that do this are the “default” VPC subnets automatically created by AWS in your account.
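A boto3 sketch of that answer; the instance ID is a hypothetical placeholder:

```python
import boto3

ec2 = boto3.client("ec2")

# Allocate a new Elastic IP in the VPC scope...
allocation = ec2.allocate_address(Domain="vpc")

# ...and associate it with the instance that was launched without a public IP.
ec2.associate_address(
    AllocationId=allocation["AllocationId"],
    InstanceId="i-0123456789abcdef0",  # hypothetical instance ID
)
```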
Q: Are you permitted to conduct your own vulnerability scans on your own VPC without alerting AWS first? A: It depends on the type of scan and which service is being scanned. Until recently, customers were not permitted to conduct penetration testing without AWS engagement. However, that has changed. There are still conditions, though.
– When you create a new security group, all outbound traffic is allowed by default.
– you are allowed 5 VPCs per AWS Region by default
– In VPC, security groups carry out stateful filtering whereas network ACLs perform stateless filtering.
Elastic Load Balancers
3 types:
– Application – Network – Classic Load Balancer
Application Load Balancers
A load balancer serves as the single point of contact for clients. The load balancer distributes incoming application traffic across multiple targets, such as EC2 instances, in multiple Availability Zones. This increases the availability of your application. You add one or more listeners to your load balancer.
Application Load Balancers are best suited for balancing of HTTP and HTTPS traffic:
– they operate at Layer 7 (application layer)
– application aware (application awareness is the capacity of a system to maintain information about connected applications to optimize their operation and that of any subsystems that they run or control)
– intelligent: you can create advanced request routing, sending specified requests to specific web servers
A listener checks for connection requests from clients, using the protocol and port that you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets. Each rule consists of a priority, one or more actions, and one or more conditions. When the conditions for a rule are met, then its actions are performed. You must define a default rule for each listener, and you can optionally define additional rules.
Network Load Balancers
– best suited for balancing of TCP traffic where extreme performance is required
– operate at Layer 4 (connection/transport layer)
– used for extreme performance
– can handle millions of requests per second
Classic Load Balancer
– legacy ELBs
– if the application stops responding, the Classic Load Balancer responds with a 504 error
-
- means that the gateway has timed out
X-Forwarded-For Header
– the EC2 instance sees incoming traffic as coming from the IP of the ELB; to see the client's/end user's actual public IP address, use the X-Forwarded-For header
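For example, a server behind an ELB might read the header like this minimal Python sketch; the header can contain a comma-separated chain of proxies, and the left-most entry is the original client:

```python
def client_ip(headers: dict) -> str:
    """Return the original client IP from X-Forwarded-For, if present.

    When a request passes through an ELB, the left-most address in the
    X-Forwarded-For header is the end user's public IP; the connection's
    peer address is the load balancer itself.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    if forwarded:
        return forwarded.split(",")[0].strip()
    return ""

# Example: a request forwarded through an ELB
print(client_ip({"X-Forwarded-For": "203.0.113.42, 10.0.1.25"}))  # 203.0.113.42
```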
ELB Exam Tips
– ELBs do not have a predefined IPv4 address; they get resolved using a DNS name
– a 504 error means the gateway has timed out; the application needs to be troubleshot
– instances monitored by an ELB are reported as InService or OutOfService
– health checks check the instance health by talking to it
– load balancers have their own DNS name
-
- you are never given an IP address for an ELB
– read the FAQs for classic load balancers
Advanced Load Balancer Theory
Sticky Sessions
Sticky sessions are a mechanism to route requests to the same target in a target group. This is useful for servers that maintain state information in order to provide a continuous experience to clients. To use sticky sessions, the clients must support cookies.
When a load balancer first receives a request from a client, it routes the request to a target, generates a cookie named AWSALB that encodes information about the selected target, encrypts the cookie, and includes the cookie in the response to the client. The client should include the cookie that it receives in subsequent requests to the load balancer. When the load balancer receives a request from a client that contains the cookie, if sticky sessions are enabled for the target group and the request goes to the same target group, the load balancer detects the cookie and routes the request to the same target. If the cookie is present but cannot be decoded, or if it refers to a target that was deregistered or is unhealthy, the load balancer selects a new target and updates the cookie with information about the new target.
-Application Load Balancers support load balancer-generated cookies only. The contents of these cookies are encrypted using a rotating key. You cannot decrypt or modify load balancer-generated cookies.
-You enable sticky sessions at the target group level. You can also set the duration for the stickiness of the load balancer-generated cookie, in seconds. The duration is set with each request. Therefore, if the client sends a request before each duration period expires, the sticky session continues.
By default, a Classic Load Balancer routes each request independently to the registered instance with the smallest load. However, you can use the sticky session feature (also known as session affinity), which enables the load balancer to bind a user’s session to a specific instance.
** You can enable sticky sessions for Application Load Balancers as well, but the stickiness is applied at the target group level, not the instance level.
Enable Sticky Sessions from the AWS CLI
-Use the modify-target-group-attributes command with the stickiness.enabled and stickiness.lb_cookie.duration_seconds attributes.
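The same attributes can also be set from boto3; a minimal sketch with a hypothetical target group ARN:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Hypothetical target group ARN for illustration only.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                   "targetgroup/example-tg/0123456789abcdef",
    Attributes=[
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.lb_cookie.duration_seconds", "Value": "300"},
    ],
)
```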
Cross Zone Load Balancing
– enables you to load balance across multiple AZs
Path Patterns
– allow you to direct traffic to different EC2 instances based on the URL contained in the request
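A minimal boto3 sketch of a path-based routing rule on an Application Load Balancer listener; the ARNs are hypothetical placeholders. Requests matching /images/* are forwarded to a dedicated target group:

```python
import boto3

elbv2 = boto3.client("elbv2")

# Hypothetical listener and target group ARNs for illustration only.
elbv2.create_rule(
    ListenerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                "listener/app/example-alb/0123456789abcdef/0123456789abcdef",
    Priority=10,  # lower numbers are evaluated first
    Conditions=[{"Field": "path-pattern", "Values": ["/images/*"]}],
    Actions=[{
        "Type": "forward",
        "TargetGroupArn": "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
                          "targetgroup/images-tg/0123456789abcdef",
    }],
)
```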
Auto Scaling Groups
– when you delete an Auto Scaling group, all the instances under it are also terminated
HA Architecture
– always design for failure
– use multiple AZs and multiple Regions wherever you can
– know the difference between Multi-AZ and Read Replicas for RDS
– know the difference between scaling out and scaling up
– read the question carefully and always consider the cost element
– know the different storage classes
API Gateway
Provide end users with the lowest possible latency for API requests and responses by taking advantage of our global network of edge locations using Amazon CloudFront. Throttle traffic and cache the output of API calls to ensure that backend operations withstand traffic spikes and backend systems are not unnecessarily called.
-You can improve the performance of specific API requests by using Amazon API Gateway to store responses in an optional in-memory cache. This approach not only provides performance benefits for repeated API requests, but it also reduces the number of times your Lambda functions are executed, which can reduce your overall cost.
– API Gateway also supports access logging with configurable reports, and AWS X-Ray tracing for debugging.
– Each resource/method combination that you create as part of your API is granted its own specific Amazon Resource Name (ARN) that can be referenced in AWS IAM policies.
– API Gateway supports throttling, rate limits, and burst rate limits for each method in your API.
Options for Data-at-Rest Encryption
Client-side Encryption
– you encrypt your data before it is submitted to an AWS service
– you supply the encryption keys or use keys in your AWS account
– available encryption clients:
-
- S3
-
- DynamoDB
-
- EMRFS
-
- AWS Encryption SDK
Server-side Encryption
– AWS encrypts data on your behalf after it is received by an AWS service
– many integrated services:
-
- S3, Snowball, EBS, RDS, Redshift, WorkSpaces, Kinesis Firehose, CloudTrail, and more
Server-side Encryption in Amazon S3
– S3 encrypts data at the object level as it writes it to disks and decrypts it when you access it
-
- as long as you make an authenticated request and have access permissions; 3 types:
-
- Customer provided key
-
- the key is used in memory on the S3 web server, and then deleted
-
- the customer must provide the same key when downloading, to allow S3 to decrypt the data
-
- Amazon S3-Managed Keys
-
- keys are stored on hosts that are separate from the hosts used to store data
-
- AWS KMS-Managed Keys
Identity Providers and Federation
If you already manage user identities outside of AWS, you can use IAM identity providers instead of creating IAM users in your AWS account. With an identity provider (IdP), you can manage your user identities outside of AWS and give these external user identities permissions to use AWS resources in your account. This is useful if your organization already has its own identity system, such as a corporate user directory. It is also useful if you are creating a mobile app or web application that requires access to AWS resources.
-When you use an IAM identity provider, you don’t have to create custom sign-in code or manage your own user identities. The IdP provides that for you. Your external users sign in through a well-known IdP, such as Login with Amazon, Facebook, or Google. You can give those external identities permissions to use AWS resources in your account. IAM identity providers help keep your AWS account secure because you don’t have to distribute or embed long-term security credentials, such as access keys, in your application.
To use an IdP, you create an IAM identity provider entity to establish a trust relationship between your AWS account and the IdP. IAM supports IdPs that are compatible with OpenID Connect (OIDC) or SAML 2.0 (Security Assertion Markup Language 2.0).
Lambda
AWS Lambda is a compute service that allows you to run arbitrary code functions in any of the supported languages (Node.js, Python, Ruby, Java, Go, .NET; for more information, see the Lambda FAQs) without provisioning, managing, or scaling servers. Lambda functions are executed in a managed, isolated container and are triggered in response to an event, which can be one of several programmatic triggers that AWS makes available, called an event source (see the Lambda FAQs for all event sources).
Many popular use cases for AWS Lambda revolve around event-driven data processing workflows, such as processing files stored in Amazon Simple Storage Service (Amazon S3) or streaming data records from Amazon Kinesis. When used in conjunction with Amazon API Gateway, an AWS Lambda function performs the functionality of a typical web service: it executes code in response to a client HTTPS request. Amazon API Gateway acts as the front door for your logic tier, and AWS Lambda executes the application code.
Lambda Security
To execute a Lambda function, it must be triggered by an event or service that is permitted to do so via an AWS Identity and Access Management (IAM) policy. Using IAM policies, you can create a Lambda function that cannot be executed at all unless it is invoked by an API Gateway resource that you define.
Each Lambda function itself assumes an IAM role that is assigned when the Lambda function is deployed. This IAM role defines the other AWS services and resources your Lambda function can interact with (for example, an Amazon DynamoDB table or Amazon S3).
– You should not store sensitive information inside a Lambda Function
-
- if you need to access other credentials (for example, database credentials or API keys) from inside your Lambda function, you can use AWS Key Management Service (AWS KMS) with environment variables, or use a service like AWS Secrets Manager to keep this information safe when not in use (see the sketch after this list)
– For your Lambda function to access resources that you cannot expose publicly, like a private database instance, you can place your AWS Lambda function inside an Amazon Virtual Private Cloud (Amazon VPC) and configure an Elastic Network Interface (ENI) to access your internal resources.
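A minimal sketch of the environment-variable approach inside a Lambda handler: an encrypted value is read from the environment and decrypted with KMS at invocation time. The variable name is hypothetical.

```python
import base64
import os

import boto3

kms = boto3.client("kms")

def lambda_handler(event, context):
    # Hypothetical environment variable holding a KMS-encrypted secret,
    # stored as base64 ciphertext rather than plaintext in the function config.
    ciphertext = base64.b64decode(os.environ["DB_PASSWORD_ENCRYPTED"])
    plaintext = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"].decode("utf-8")

    # ... use the decrypted credential to talk to the database ...
    return {"statusCode": 200}
```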
Lambda Performance at Scale
– Code uploaded to AWS Lambda is stored in Amazon S3 and runs in an isolated environment managed by AWS.
Lambda Exam Tips
– you’ll always have API Gateway at the front end
– Lambda functions scale out automatically
– want a serverless database? DynamoDB or Aurora Serverless
– what languages does Lambda support? Python, Go, Node.js, Ruby, Java, C#, PowerShell
– Lambda scales OUT
-
- each invocation via API Gateway will trigger a new Lambda function
– Lambda functions are independent; 1 event = 1 Lambda function
– know which AWS services are serverless:
– Compute: Lambda, Lambda@Edge, Fargate
– Storage: S3, EFS
– Data Stores: DynamoDB, Aurora Serverless
– API Proxy: API Gateway
– Application Integration: SQS, SNS, AppSync
– Orchestration: AWS Step Functions
– Analytics: Kinesis, Athena
– Lambda functions can trigger other Lambda functions; one event can start a tree of Lambda functions if it triggers a function that calls another, which calls three more, and so on
– can get very complex architectures that are difficult to debug:
-
- use AWS X-Ray to debug
– Lambda can do things globally
– know your triggers; know what can and can’t trigger Lambda (e.g. RDS can’t trigger Lambda at this point)
– services that can trigger Lambda include: API Gateway, IoT, Alexa Skills Kit and Alexa Smart Home, ALB, CloudFront, CloudWatch Events, CloudWatch Logs, CodeCommit, Cognito Sync Trigger, DynamoDB, Kinesis, S3, SNS, SQS
Lambda Pricing
– requests: to execute code
– duration: the length of time it takes the code to execute
– accessing data from other AWS services
Billed by:
– Number of requests: the first 1 million requests per month are free, then $0.20 per million requests thereafter
– Duration: calculated from the time your code begins executing until it returns or otherwise terminates, rounded up to the nearest 100ms; the price depends on the amount of memory you allocate to your function, at $0.00001667 for every GB-second used
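A quick worked example using the numbers above, for an assumed workload (not one from these notes): 3 million requests a month, 200 ms average duration, 512 MB of memory.

```python
# Assumed workload: 3M requests/month, 200 ms per invocation, 512 MB memory.
# Any monthly free compute tier is ignored for simplicity.
requests_per_month = 3_000_000
duration_seconds = 0.2          # already a multiple of 100 ms, no rounding needed
memory_gb = 512 / 1024          # 0.5 GB

# Request charge: first 1 million requests free, then $0.20 per million.
billable_requests = max(requests_per_month - 1_000_000, 0)
request_cost = billable_requests / 1_000_000 * 0.20

# Duration charge: GB-seconds consumed times the per-GB-second rate.
gb_seconds = requests_per_month * duration_seconds * memory_gb
duration_cost = gb_seconds * 0.00001667

print(f"request cost:  ${request_cost:.2f}")    # $0.40
print(f"duration cost: ${duration_cost:.2f}")   # ~$5.00
print(f"total:         ${request_cost + duration_cost:.2f}")
```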
Serverless Summary
– Lambda is a global service
– you can use it to back up S3 buckets to other S3 buckets, etc.
CloudFront
What is a CDN and why use one?
– A content delivery network is a large distribution of caching servers that are geographically distributed.
– These servers contain data that is usually held on your origin server.
– caches appropriate data (usually static content) at the Edge
– this allows your apps to scale globally
What is CloudFront?
Amazon CloudFront is a web service that speeds up distribution of your static and dynamic web content, such as .html, .css, .js, and image files, to your users. CloudFront delivers your content through a worldwide network of data centers called edge locations. When a user requests content that you’re serving with CloudFront, the user is routed to the edge location that provides the lowest latency (time delay), so that content is delivered with the best possible performance.
– If the content is already in the edge location with the lowest latency, CloudFront delivers it immediately. – If the content is not in that edge location, CloudFront retrieves it from an origin that you’ve defined–such as an Amazon S3 bucket, a MediaPackage channel, or an HTTP server (for example, a web server) that you have identified as the source for the definitive version of your content.
CloudFront Components
Distribution
Minimizing Cost
Recognize Opportunities to Use Serverless Architecture
API Gateway, Lambda, DynamoDB, S3
Lambda functions store session state in DynamoDB, S3 serves the static files, and API Gateway attaches a REST endpoint to the Lambda function.
Use CloudFront
– Avoid fetching data from S3 directly by caching it on CloudFront
– No data transfer charge for moving data between S3 and CloudFront
– CloudFront cost is determined by how much data is transferred out to viewers and the number of requests
CloudFormation
Template Sections
Templates include several major sections. The Resources section is the only required section. Some sections in a template can be in any order. However, as you build your template, it can be helpful to use the logical order shown in the following list because values in one section might refer to values from a previous section.
Format Version (optional)
The AWS CloudFormation template version that the template conforms to. The template format version is not the same as the API or WSDL version. The template format version can change independently of the API and WSDL versions.
Description (optional)
A text string that describes the template. This section must always follow the template format version section.
Metadata (optional)
Objects that provide additional information about the template.
Parameters (optional)
Values to pass to your template at runtime (when you create or update a stack). You can refer to parameters from the Resources and Outputs sections of the template.
Mappings (optional)
A mapping of keys and associated values that you can use to specify conditional parameter values, similar to a lookup table. You can match a key to a corresponding value by using the Fn::FindInMap intrinsic function in the Resources and Outputs sections.
Conditions (optional)
Conditions that control whether certain resources are created or whether certain resource properties are assigned a value during stack creation or update. For example, you could conditionally create a resource that depends on whether the stack is for a production or test environment.
Transform (optional)
-For serverless applications (also referred to as Lambda-based applications), specifies the version of the AWS Serverless Application Model (AWS SAM) to use. When you specify a transform, you can use AWS SAM syntax to declare resources in your template. The model defines the syntax that you can use and how it is processed.
You can also use AWS::Include transforms to work with template snippets that are stored separately from the main AWS CloudFormation template. You can store your snippet files in an Amazon S3 bucket and then reuse the functions across multiple templates.
Resources (required)
Specifies the stack resources and their properties, such as an Amazon Elastic Compute Cloud instance or an Amazon Simple Storage Service bucket. You can refer to resources in the Resources and Outputs sections of the template.
Outputs (optional)
-Describes the values that are returned whenever you view your stack’s properties. For example, you can declare an output for an S3 bucket name and then call the aws cloudformation describe-stacks AWS CLI command to view the name.
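To tie the sections together, here is a minimal, hypothetical template (expressed as a Python dict and launched with boto3) containing a Parameter, the required Resources section, and an Output; the stack and bucket names are placeholders:

```python
import json
import boto3

# A minimal illustrative template showing the sections most questions focus on:
# Parameters, Resources (the only required section), and Outputs.
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal example: one S3 bucket",
    "Parameters": {
        "BucketNameParam": {"Type": "String"}
    },
    "Resources": {
        "ExampleBucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"BucketName": {"Ref": "BucketNameParam"}}
        }
    },
    "Outputs": {
        "BucketName": {"Value": {"Ref": "ExampleBucket"}}
    }
}

cfn = boto3.client("cloudformation")
cfn.create_stack(
    StackName="example-stack",  # stack names must be unique within the account/region
    TemplateBody=json.dumps(template),  # a template can also be referenced by an S3 URL
    Parameters=[{"ParameterKey": "BucketNameParam",
                 "ParameterValue": "my-example-bucket-12345"}],
)
```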
CloudFormation Exam Tips
– stack names must be unique within the account (per region)
– templates referenced by URL must be stored in an S3 bucket (templates uploaded via the console are also stored in S3 behind the scenes)
– parameters in your template allow you to pass custom values to your stack upon creation, instead of hardcoding them
– you can’t delete a stack if another stack is referencing any of its outputs
– CloudFormation doesn’t preemptively check whether an update will violate a stack policy. If you attempt to update a stack in such a way that’s prevented by the stack policy, CloudFormation will still attempt to update the stack. The update will fail only when CloudFormation attempts to perform an update prohibited by the stack policy. Therefore, when updating a stack, it’s important to verify that the update succeeded. Don’t just start the update and walk away.
Test Axioms
Test Axioms for Resilient Systems
– expect that Single AZ will never be the right answer
– AWS managed services should always be preferred
– fault tolerance and high availability are not the same thing
– expect that everything will fail at some point and design accordingly
Test Axioms for Security
– lock down the root user
– security groups only have allow rules; they are stateful
– prefer IAM Roles to access keys
Test Axioms for Cost Optimization
– if you know it’s going to be on, reserve it
– any unused CPU time is a waste of money
– use the most cost-effective storage service and class that meets your needs
– determine the most cost-effective EC2 pricing model and instance type for each workload
Test Axioms for Operational Excellence
– IAM Roles are easier and safer than keys and passwords
– monitor metrics across the system
– automate responses to metrics where appropriate
– provide alerts for anomalous conditions
Whitepapers to READ
– AWS Security Best Practices
– AWS Well-Architected Framework
– Architecting for the Cloud: AWS Best Practices
– Practicing Continuous Integration and Continuous Delivery on AWS: Accelerating Software Delivery with DevOps
– Microservices on AWS
– Serverless Architectures with AWS Lambda
– Optimizing Enterprise Economics with Serverless Architectures
– Running Containerized Microservices on AWS
– Blue/Green Deployments on AWS
– IAM Best Practices
– When to use AWS STS
Various Topic You Should Know
– Run Command
– active-passive failover: you are still paying for resources, but you can have an Auto Scaling group ready to go in the event that the passive group starts receiving traffic because the primary group of resources goes down
– placement groups
– review the CloudWatch, CloudTrail, and AWS Config sections of the textbook
– API Gateway
– Lambda
– Elastic IPs
– when to use security groups over NACLs
– practice drawing out a VPC from scratch by hand
– security groups FAQs
– read the security best practices whitepaper
– read all FAQs
– can you use multiple security groups in a VPC?
– learn OLTP vs OLAP
– make a list of serverless vs server-based services
– make a list of managed vs unmanaged services
– signed URLs
– protecting against DDoS
– as many use cases as possible for VPC setup: with or without an internet gateway, with or without NAT
– different Reserved Instance types and their usage
– the 30-day constraint in the S3 lifecycle policy before transitioning to the S3-IA and S3 One Zone-IA storage classes
– enabling cross-region snapshot copy for an AWS KMS-encrypted cluster
– Redis AUTH / Amazon MQ / IAM DB Authentication
– know that FTP uses TCP and not UDP (helpful for questions where you are asked to troubleshoot network flow)
– difference between S3, EBS, and EFS
– Kinesis sharding
– handling SSL certificates in ELB (wildcard certificate vs SNI)
– difference between OAI, signed URLs (CloudFront), and pre-signed URLs (S3)
– default termination policy for Auto Scaling groups (oldest launch configuration vs instance protection)
– know when and why to use Auto Scaling
– SAML vs. Federation
Thanks for Reading! --- @avcourt
Questions? Join my Discord server => discord.gg/5PfXqqr
Follow me on Twitter! => @avcourt