Massive training corpus for AI coding models containing: - 10 JSONL training datasets (641+ examples across coding, reasoning, planning, architecture, communication, debugging, security, workflows, error handling, UI/UX) - 11 agent behavior specifications (explorer, planner, reviewer, debugger, executor, UI designer, Linux admin, kernel engineer, security architect, automation engineer, API architect) - 6 skill definition files (coding, API engineering, kernel, Linux server, security architecture, server automation, UI/UX) - Master README with project origin story and philosophy Built by Pony Alpha 2 to help AI models learn expert-level coding approaches.
# Server Automation Expert Skill

## Activation Criteria

Activate this skill when the user:

- Automates server provisioning and configuration
- Implements Infrastructure as Code (IaC)
- Designs CI/CD pipelines
- Automates deployment processes
- Manages container orchestration
- Implements configuration management
- Automates testing and quality assurance
- Designs monitoring and alerting automation
- Automates backup and disaster recovery
- Implements GitOps practices
- Automates security scanning and compliance
- Manages multi-environment deployments
- Implements blue-green or canary deployments
- Automates infrastructure scaling
- Needs reproducible infrastructure builds

## Core Methodology

### 1. Ansible Playbook Design

#### Complete Ansible Setup

```ini
# ansible.cfg - Ansible configuration (INI format)
[defaults]
inventory = ./inventory
# Disabling host key checking is convenient for ephemeral hosts but weakens SSH security
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400
stdout_callback = yaml
bin_ansible_callbacks = True
callbacks_enabled = profile_tasks, timer
jinja2_extensions = jinja2.ext.do
display_skipped_hosts = False

[ssh_connection]
pipelining = True
control_path = /tmp/ansible-ssh-%%h-%%p-%%r
```

```yaml
# inventory.yml - Static YAML inventory
---
all:
  children:
    production:
      hosts:
        prod-web-01:
          ansible_host: 10.0.1.10
          ansible_user: deploy
        prod-web-02:
          ansible_host: 10.0.1.11
          ansible_user: deploy
        prod-db-01:
          ansible_host: 10.0.20.10
          ansible_user: deploy
      vars:
        # Scoped to the production group so staging hosts are not labelled as production
        env: production
    staging:
      hosts:
        staging-web-01:
          ansible_host: 10.1.1.10
          ansible_user: deploy
    webservers:
      # Web hosts are listed explicitly; nesting the whole `production` group here
      # would also apply the web roles to prod-db-01
      hosts:
        prod-web-01:
        prod-web-02:
      vars:
        nginx_worker_processes: auto
        nginx_worker_connections: 1024
    databases:
      # Database hosts are listed explicitly for the same reason
      hosts:
        prod-db-01:
      vars:
        postgresql_version: 15
        postgresql_max_connections: 200
  vars:
    ansible_python_interpreter: /usr/bin/python3
```

```yaml
# site.yml - Master Playbook
---
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: base
      tags: ['base']
    - role: nginx
      tags: ['nginx', 'web']
    - role: application
      tags: ['application']
    - role: monitoring
      tags: ['monitoring']

- name: Configure database servers
  hosts: databases
  become: true
  roles:
    - role: base
      tags: ['base']
    - role: postgresql
      tags: ['database', 'postgresql']
    - role: monitoring
      tags: ['monitoring']
```

#### Production-Ready Roles

```yaml
# roles/base/tasks/main.yml
---
- name: Update apt cache
  apt:
    update_cache: true
    cache_valid_time: 3600
  tags: ['apt']

- name: Upgrade all packages
  apt:
    upgrade: dist
    autoremove: true
  tags: ['apt']

- name: Install base packages
  apt:
    name:
      - curl
      - wget
      - git
      - vim
      - htop
      - tmux
      - net-tools
      - tcpdump
      - strace
      - sysstat
      - fail2ban
      - ufw
    state: present
  tags: ['packages']

- name: Configure timezone
  timezone:
    name: UTC
  tags: ['system']

- name: Set hostname
  hostname:
    name: "{{ inventory_hostname }}"
  tags: ['system']

- name: Configure sysctl
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    reload: true
  loop:
    - { name: "net.ipv4.ip_forward", value: "0" }
    - { name: "net.ipv4.conf.all.send_redirects", value: "0" }
    - { name: "net.ipv4.conf.default.send_redirects", value: "0" }
    - { name: "net.ipv4.conf.all.accept_source_route", value: "0" }
    - { name: "net.ipv4.conf.default.accept_source_route", value: "0" }
    - { name: "net.ipv4.conf.all.accept_redirects", value: "0" }
    - { name: "net.ipv4.conf.default.accept_redirects", value: "0" }
    - { name: "net.ipv4.icmp_echo_ignore_broadcasts", value: "1" }
    - { name: "net.ipv4.tcp_syncookies", value: "1" }
    - { name: "net.ipv4.tcp_max_syn_backlog", value: "2048" }
    - { name: "net.core.somaxconn", value: "1024" }
  tags: ['system', 'security']

- name: Configure limits
  pam_limits:
    domain: "*"
    limit_type: "{{ item.type }}"
    limit_item: "{{ item.item }}"
    value: "{{ item.value }}"
  loop:
    - { type: "soft", item: "nofile", value: "65536" }
    - { type: "hard", item: "nofile", value: "65536" }
    - { type: "soft", item: "nproc", value: "65536" }
    - { type: "hard", item: "nproc", value: "65536" }
  tags: ['system']

- name: Configure fail2ban
  copy:
    src: jail.local
    dest: /etc/fail2ban/jail.local
    owner: root
    group: root
    mode: '0644'
  notify: restart fail2ban
  tags: ['security']

- name: Ensure fail2ban is running
  service:
    name: fail2ban
    state: started
    enabled: true
  tags: ['security']

# Allow rules are added before the firewall is enabled so that turning on
# the default-deny policy cannot cut off the SSH session Ansible is using
- name: Allow SSH through UFW
  ufw:
    rule: allow
    port: "22"
    proto: tcp
  tags: ['firewall']

- name: Allow HTTP/HTTPS through UFW
  ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "80"
    - "443"
  tags: ['firewall']

- name: Enable UFW with default-deny for incoming traffic
  ufw:
    state: enabled
    policy: deny
    direction: incoming
  tags: ['firewall']

- name: Create deploy user
  user:
    name: deploy
    shell: /bin/bash
    groups: sudo
    append: true
    state: present
  tags: ['users']

- name: Add SSH key for deploy user
  authorized_key:
    user: deploy
    key: "{{ deploy_ssh_public_key }}"
    state: present
  tags: ['users']
```

```yaml
# roles/nginx/tasks/main.yml
---
- name: Add NGINX repository
  apt_repository:
    repo: "ppa:ondrej/nginx"
    state: present
    update_cache: true
  tags: ['nginx', 'repository']

- name: Install NGINX
  apt:
    name: nginx
    state: present
  tags: ['nginx', 'packages']

- name: Create nginx directories
  file:
    path: "{{ item }}"
    state: directory
    owner: www-data
    group: www-data
    mode: '0755'
  loop:
    - /var/www/html
    - /etc/nginx/sites-available
    - /etc/nginx/sites-enabled
    - /var/log/nginx
  tags: ['nginx', 'config']

- name: Configure nginx main config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    # %s is replaced with the path of the temporary file being validated
    validate: 'nginx -t -c %s'
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Remove default nginx site
  file:
    path: /etc/nginx/sites-enabled/default
    state: absent
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Configure nginx site
  template:
    src: site.conf.j2
    dest: "/etc/nginx/sites-available/{{ application_name }}.conf"
    owner: root
    group: root
    mode: '0644'
    # No `validate` here: Ansible requires %s in the validate command, and a
    # server-block fragment cannot be checked on its own with `nginx -t -c`
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Enable nginx site
  file:
    src: "/etc/nginx/sites-available/{{ application_name }}.conf"
    dest: "/etc/nginx/sites-enabled/{{ application_name }}.conf"
    state: link
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Ensure nginx is running
  service:
    name: nginx
    state: started
    enabled: true
  tags: ['nginx', 'service']

- name: Configure logrotate for nginx
  copy:
    src: nginx-logrotate
    dest: /etc/logrotate.d/nginx
    owner: root
    group: root
    mode: '0644'
  tags: ['nginx', 'logging']
```

```nginx
# roles/nginx/templates/nginx.conf.j2
user www-data;
worker_processes {{ nginx_worker_processes }};
worker_rlimit_nofile 65535;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections {{ nginx_worker_connections }};
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    server_tokens off;

    client_body_buffer_size 128k;
    client_max_body_size 100m;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 4k;

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml text/javascript
               application/json application/javascript application/xml+rss
               application/rss+xml font/truetype font/opentype
               application/vnd.ms-fontobject image/svg+xml;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
```

```yaml
# roles/docker/tasks/main.yml
---
- name: Install prerequisites
  apt:
    name:
      - apt-transport-https
      - ca-certificates
      - curl
      - gnupg
      - lsb-release
    state: present
  tags: ['docker', 'prerequisites']

# Note: apt-key is deprecated on recent Debian/Ubuntu releases; a keyring file
# referenced with `signed-by` in the repository line is the current approach
- name: Add Docker GPG key
  apt_key:
    url: "https://download.docker.com/linux/{{ ansible_distribution | lower }}/gpg"
    state: present
  tags: ['docker', 'repository']

- name: Add Docker repository
  apt_repository:
    repo: "deb https://download.docker.com/linux/{{ ansible_distribution | lower }} {{ ansible_distribution_release }} stable"
    state: present
    update_cache: true
  tags: ['docker', 'repository']

- name: Install Docker
  apt:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - docker-compose-plugin
    state: present
  tags: ['docker', 'packages']

- name: Create Docker directory
  file:
    path: /etc/docker
    state: directory
    owner: root
    group: root
    mode: '0755'
  tags: ['docker', 'config']

- name: Configure Docker daemon
  copy:
    src: daemon.json
    dest: /etc/docker/daemon.json
    owner: root
    group: root
    mode: '0644'
  notify: restart docker
  tags: ['docker', 'config']

- name: Ensure deploy user can use Docker
  user:
    name: deploy
    groups: docker
    append: true
  tags: ['docker', 'users']

- name: Ensure Docker is running
  service:
    name: docker
    state: started
    enabled: true
  tags: ['docker', 'service']

- name: Install Python Docker SDK
  pip:
    name: docker
    state: present
  tags: ['docker', 'python']
```
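
The role tasks above notify handlers named `restart fail2ban`, `reload nginx`, and `restart docker`, but no handler files appear in this document. The following is a minimal sketch of what they might look like, assuming the conventional `roles/<role>/handlers/main.yml` layout and services managed through the `service` module. The file paths are assumptions; the handler names must match the `notify` strings exactly.

```yaml
# roles/base/handlers/main.yml (assumed path; not shown in the original)
---
- name: restart fail2ban
  service:
    name: fail2ban
    state: restarted

# roles/nginx/handlers/main.yml (assumed path)
---
- name: reload nginx
  service:
    name: nginx
    state: reloaded  # reload re-reads the config without dropping connections

# roles/docker/handlers/main.yml (assumed path)
---
- name: restart docker
  service:
    name: docker
    state: restarted
```

Handlers run once at the end of the play, however many tasks notified them, so several nginx config changes in one run result in a single reload.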

### 2. Terraform Infrastructure as Code

#### Production Terraform Configuration

```hcl
# terraform/terraform.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "terraform-state-prod"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# terraform/provider.tf
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "Terraform"
    }
  }
}

# terraform/variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

variable "project_name" {
  description = "Project name"
  type        = string
  default     = "myapp"
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]
}

variable "instance_types" {
  description = "Instance types by tier"
  type        = map(string)
  default = {
    web = "t3.medium"
    app = "t3.large"
    db  = "r6g.large"
  }
}

# terraform/vpc.tf
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${var.project_name}-${var.environment}-vpc"
    Tier = "Network"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}-${var.environment}-igw"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project_name}-${var.environment}-public-${var.availability_zones[count.index]}"
    Tier = "Public"
  }
}

resource "aws_subnet" "private" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.project_name}-${var.environment}-private-${var.availability_zones[count.index]}"
    Tier = "Private"
  }
}

resource "aws_subnet" "database" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 20)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.project_name}-${var.environment}-database-${var.availability_zones[count.index]}"
    Tier = "Database"
  }
}

resource "aws_eip" "nat" {
  count  = length(var.availability_zones)
  domain = "vpc"

  tags = {
    Name = "${var.project_name}-${var.environment}-nat-${count.index}"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_nat_gateway" "main" {
  count         = length(var.availability_zones)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name = "${var.project_name}-${var.environment}-nat-${count.index}"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-public-rt"
  }
}

resource "aws_route_table" "private" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-private-rt-${count.index}"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

resource "aws_route_table_association" "database" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.database[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

# terraform/security_groups.tf
resource "aws_security_group" "web" {
  name        = "${var.project_name}-${var.environment}-web"
  description = "Security group for web tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-web"
    Tier = "Web"
  }
}

resource "aws_security_group" "app" {
  name        = "${var.project_name}-${var.environment}-app"
  description = "Security group for application tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "Application port from web tier"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-app"
    Tier = "Application"
  }
}

resource "aws_security_group" "database" {
  name        = "${var.project_name}-${var.environment}-database"
  description = "Security group for database tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "PostgreSQL from application tier"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-database"
    Tier = "Database"
  }
}

# terraform/ec2.tf
# NOTE: data.aws_ami.amazon_linux_2 and aws_key_pair.main are referenced here
# but not defined in this listing; they must be declared elsewhere in the module.
resource "aws_launch_template" "web" {
  name_prefix   = "${var.project_name}-${var.environment}-web-"
  image_id      = data.aws_ami.amazon_linux_2.id
  instance_type = var.instance_types.web
  key_name      = aws_key_pair.main.key_name

  network_interfaces {
    associate_public_ip_address = true
    security_groups             = [aws_security_group.web.id]
  }

  user_data = base64encode(templatefile("${path.module}/templates/web_user_data.sh", {
    environment = var.environment
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.project_name}-${var.environment}-web"
      Tier = "Web"
    }
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "web" {
  desired_capacity    = 2
  max_size            = 4
  min_size            = 2
  vpc_zone_identifier = aws_subnet.public[*].id

  target_group_arns = [aws_lb_target_group.web.arn]

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "${var.project_name}-${var.environment}-web"
    propagate_at_launch = true
  }
}

# terraform/load_balancer.tf
# NOTE: aws_acm_certificate.main is referenced by the HTTPS listener but not
# defined in this listing; it must be declared elsewhere in the module.
resource "aws_lb" "main" {
  name               = "${var.project_name}-${var.environment}-lb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.web.id]
  subnets            = aws_subnet.public[*].id

  enable_deletion_protection = false

  tags = {
    Name = "${var.project_name}-${var.environment}-lb"
  }
}

resource "aws_lb_target_group" "web" {
  name     = "${var.project_name}-${var.environment}-web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    enabled             = true
    healthy_threshold   = 2
    interval            = 30
    matcher             = "200"
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 3
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  certificate_arn   = aws_acm_certificate.main.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
```

### 3. CI/CD Pipeline Design

#### GitHub Actions Production Pipeline

```yaml
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: myapp
  ECS_CLUSTER: production
  ECS_SERVICE: myapp-service

jobs:
  # CI Job
  test:
    name: Test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by the SARIF upload step
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run tests
        run: npm run test:coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info
          flags: unittests
          name: codecov-umbrella

      - name: Run security scan
        run: npm audit --audit-level=moderate

      - name: Run filesystem vulnerability scan (Trivy)
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload Trivy results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'

  # Build Job
  build:
    name: Build Docker Image
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # required by the SARIF upload step
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix={{branch}}-

      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILD_DATE=${{ github.event.repository.updated_at }}
            VCS_REF=${{ github.sha }}
            VERSION=${{ steps.meta.outputs.version }}

      - name: Image vulnerability scan
        uses: aquasecurity/trivy-action@master
        with:
          # Scan by digest: none of the tags generated above is the bare commit SHA,
          # so a ":<sha>" reference would not resolve to the pushed image
          image-ref: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}@${{ steps.build.outputs.digest }}
          format: 'sarif'
          output: 'trivy-image-results.sarif'

      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-image-results.sarif'

  # Deploy to Staging
  deploy-staging:
    name: Deploy to Staging
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy to ECS (Staging)
        run: |
          aws ecs update-service \
            --cluster staging \
            --service myapp-staging-service \
            --force-new-deployment \
            --region ${{ env.AWS_REGION }}

      - name: Wait for deployment
        run: |
          aws ecs wait services-stable \
            --cluster staging \
            --services myapp-staging-service \
            --region ${{ env.AWS_REGION }}

      - name: Run integration tests
        run: |
          npm run test:integration -- --env=staging

  # Deploy to Production
  deploy-production:
    name: Deploy to Production
    # Depends on build only: deploy-staging is skipped on main, and a job that
    # needs a skipped job is skipped as well
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: write  # required to create the release
    environment:
      name: production
      url: https://example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Create deployment record
        env:
          GH_TOKEN: ${{ github.token }}  # the gh CLI needs a token in the environment
        run: |
          gh release create ${{ github.sha }} \
            --title "Release ${{ github.sha }}" \
            --notes "Deploying to production"

      - name: Blue-green deployment
        # Assumes the ECS service uses the EXTERNAL deployment controller.
        # PRIVATE_SUBNETS and SECURITY_GROUP are not defined in this workflow's
        # env block and must be supplied before this step can run.
        run: |
          # Create the new (green) task set alongside the current one
          TASK_SET_ARN=$(aws ecs create-task-set \
            --cluster production \
            --service myapp-service \
            --task-definition myapp:${{ github.sha }} \
            --launch-type FARGATE \
            --network-configuration "awsvpcConfiguration={subnets=[${{ env.PRIVATE_SUBNETS }}],securityGroups=[${{ env.SECURITY_GROUP }}],assignPublicIp=DISABLED}" \
            --query 'taskSet.taskSetArn' \
            --output text)

          # Gradual rollout: scale the new task set up in steps
          for percentage in 10 25 50 75 100; do
            aws ecs update-task-set \
              --cluster production \
              --service myapp-service \
              --task-set "$TASK_SET_ARN" \
              --scale "value=${percentage},unit=PERCENT"
            sleep 30
          done

          # Promote the new task set to primary
          aws ecs update-service-primary-task-set \
            --cluster production \
            --service myapp-service \
            --primary-task-set "$TASK_SET_ARN"

      - name: Run smoke tests
        run: |
          npm run test:smoke -- --env=production

      - name: Rollback on failure
        if: failure()
        run: |
          # Rollback to previous task set
          # `myapp:previous` is a placeholder; substitute the last known-good revision
          aws ecs update-service \
            --cluster production \
            --service myapp-service \
            --task-definition myapp:previous \
            --force-new-deployment

      - name: Notify team
        if: success()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: |
            Production deployment successful!
            Commit: ${{ github.sha }}
            Author: ${{ github.actor }}
        env:
          # action-slack reads the webhook from this environment variable
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK }}
```

### 4. Kubernetes Deployment

#### Helm Chart for Production

```yaml
# helm/myapp/Chart.yaml
apiVersion: v2
name: myapp
description: A Helm chart for my application
type: application
version: 1.0.0
appVersion: "1.0"

# helm/myapp/values.yaml
replicaCount: 3

image:
  repository: myapp
  pullPolicy: IfNotPresent
  tag: "1.0.0"

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  create: true
  annotations: {}
  name: ""

podAnnotations: {}

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsUser: 1000
  capabilities:
    drop:
      - ALL

service:
  type: ClusterIP
  port: 80
  annotations: {}

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: myapp-tls
      hosts:
        - example.com

resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80

nodeSelector: {}

tolerations: []

affinity: {}

# helm/myapp/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "myapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "myapp.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "myapp.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
      - name: {{ .Chart.Name }}
        securityContext:
          {{- toYaml .Values.securityContext | nindent 10 }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
        volumeMounts:
        - name: temp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
      volumes:
      - name: temp
        emptyDir: {}
      - name: cache
        emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```
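
The values file enables `autoscaling`, and the Deployment template omits `replicas` when it is on, but the chart listing does not include the autoscaler itself. The following is a hedged sketch of a matching template, assuming a hypothetical `templates/hpa.yaml` file, the same `myapp.fullname` and `myapp.labels` helpers used above, and a cluster that serves the `autoscaling/v2` API.

```yaml
# helm/myapp/templates/hpa.yaml (hypothetical file; not part of the original chart listing)
{{- if .Values.autoscaling.enabled }}
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: {{ include "myapp.fullname" . }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  # Scale the Deployment rendered by templates/deployment.yaml
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {{ include "myapp.fullname" . }}
  minReplicas: {{ .Values.autoscaling.minReplicas }}
  maxReplicas: {{ .Values.autoscaling.maxReplicas }}
  metrics:
    {{- if .Values.autoscaling.targetCPUUtilizationPercentage }}
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetCPUUtilizationPercentage }}
    {{- end }}
    {{- if .Values.autoscaling.targetMemoryUtilizationPercentage }}
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: {{ .Values.autoscaling.targetMemoryUtilizationPercentage }}
    {{- end }}
{{- end }}
```

Utilization targets are computed against the container resource requests, so the `resources.requests` values in `values.yaml` need to be set for the autoscaler to work.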

### 5. Decision Trees

#### Automation Tool Selection

```
What to automate?
│
├─ Configuration management → Ansible, Chef, Puppet
├─ Infrastructure provisioning → Terraform, CloudFormation
├─ Container orchestration → Kubernetes, Docker Swarm
├─ CI/CD → Jenkins, GitLab CI, GitHub Actions
├─ Monitoring → Prometheus, Grafana, Datadog
├─ Log management → ELK Stack, Splunk
└─ Security scanning → Trivy, SonarQube
```

#### Deployment Strategy Selection

```
Deployment requirements?
│
├─ Zero downtime → Blue-green, Canary
├─ Quick rollback → Blue-green
├─ Gradual rollout → Canary, Rolling
├─ Simple infrastructure → Rolling
├─ Complex microservices → Canary
└─ Enterprise compliance → Blue-green with approvals
```
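
For the "Rolling" branch of the tree, a Kubernetes Deployment exposes the rollout pace directly in its spec. The fragment below is a minimal sketch; the specific values are illustrative assumptions, not recommendations taken from this document.

```yaml
# Fragment of a Deployment spec (illustrative values)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # at most one extra pod above the desired count during a rollout
      maxUnavailable: 0  # never drop below the desired count
  minReadySeconds: 10    # a new pod must stay ready this long before it counts as available
```

With `maxUnavailable: 0`, capacity is preserved throughout the rollout at the cost of temporarily running one additional pod.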

### 6. Anti-Patterns to Avoid

1. **Hard-coded secrets**: Store secrets in a vault or secrets manager; never hard-code them
2. **No testing**: Always test before deploying
3. **Manual deployments**: Automate every repeatable deployment step
4. **No rollback plan**: Always have a rollback strategy
5. **Missing monitoring**: You can't manage what you don't measure
6. **Large monoliths**: Break into smaller, deployable units
7. **No version control**: Everything must be in Git
8. **Tight coupling**: Design components so they can be deployed independently
9. **No documentation**: Document your automation
10. **Ignoring security**: Security must be built in from the start
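
As one way to address the first anti-pattern in the Ansible setup from section 1, secrets can be kept in a file encrypted with Ansible Vault and referenced indirectly from plain-text variables. This is a sketch under assumptions: the file paths, the `vault_` prefix convention, and the variable names are hypothetical and do not appear elsewhere in this document.

```yaml
# group_vars/production/vars.yml (plain text, safe to commit; hypothetical path)
---
postgresql_app_password: "{{ vault_postgresql_app_password }}"

# group_vars/production/vault.yml (hypothetical path)
# Encrypt before committing: ansible-vault encrypt group_vars/production/vault.yml
---
vault_postgresql_app_password: "replace-me"  # placeholder; never commit this file unencrypted
```

A playbook run then supplies the vault password at execution time, for example with `ansible-playbook site.yml --ask-vault-pass`. Keeping the indirection in a plain-text file means variable names stay searchable even though the values are encrypted.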

### 7. Quality Checklist

Before considering automation production-ready:

- [ ] All infrastructure codified
- [ ] Secrets management implemented
- [ ] Automated testing complete
- [ ] CI/CD pipeline tested
- [ ] Rollback procedure tested
- [ ] Monitoring and alerting configured
- [ ] Security scanning integrated
- [ ] Documentation complete
- [ ] Peer review completed
- [ ] Disaster recovery tested
- [ ] Configuration drift detection active
- [ ] Automation idempotent
- [ ] Error handling implemented
- [ ] Performance testing completed
- [ ] Compliance requirements met
- [ ] Backup automation configured
- [ ] Logging and auditing enabled
- [ ] Team training completed
- [ ] Runbooks documented
- [ ] SLA requirements met
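
Two of the items above, drift detection and idempotence, can be checked with the same mechanism: running the playbook in check mode and inspecting what it would change. The following scheduled workflow is a hedged sketch; the workflow file name and schedule are assumptions, and it presumes the runner has SSH access and credentials for the hosts in `inventory.yml`, which this document does not set up.

```yaml
# .github/workflows/drift-check.yml (hypothetical workflow)
name: Configuration drift check

on:
  schedule:
    - cron: '0 6 * * *'  # daily at 06:00 UTC

jobs:
  ansible-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install Ansible
        run: pipx install --include-deps ansible

      - name: Report drift without changing hosts
        # --check makes no changes; --diff prints what would be modified
        run: ansible-playbook -i inventory.yml site.yml --check --diff
```

A run that reports zero changed tasks suggests the hosts match the codified state; repeated "changed" results on an untouched host usually point to a non-idempotent task.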

This skill definition covers server automation with Ansible, Terraform, GitHub Actions, and Helm, along with guidance on tool selection, deployment strategy, and production readiness.