# Server Automation Expert Skill

## Activation Criteria

Activate this skill when the user:

- Automates server provisioning and configuration
- Implements Infrastructure as Code (IaC)
- Designs CI/CD pipelines
- Automates deployment processes
- Manages container orchestration
- Implements configuration management
- Automates testing and quality assurance
- Designs monitoring and alerting automation
- Automates backup and disaster recovery
- Implements GitOps practices
- Automates security scanning and compliance
- Manages multi-environment deployments
- Implements blue-green or canary deployments
- Automates infrastructure scaling
- Needs reproducible infrastructure builds

## Core Methodology

### 1. Ansible Playbook Design

#### Complete Ansible Setup

```ini
# ansible.cfg - Ansible configuration
[defaults]
inventory = ./inventory
host_key_checking = False
retry_files_enabled = False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible_facts
fact_caching_timeout = 86400
stdout_callback = yaml
bin_ansible_callbacks = True
callbacks_enabled = profile_tasks, timer
jinja2_extensions = jinja2.ext.do
display_skipped_hosts = False

[ssh_connection]
pipelining = True
control_path = /tmp/ansible-ssh-%%h-%%p-%%r
```

```yaml
# inventory.yml - Static inventory, grouped by environment and role
---
all:
  children:
    production:
      hosts:
        prod-web-01:
          ansible_host: 10.0.1.10
          ansible_user: deploy
        prod-web-02:
          ansible_host: 10.0.1.11
          ansible_user: deploy
        prod-db-01:
          ansible_host: 10.0.20.10
          ansible_user: deploy
    staging:
      hosts:
        staging-web-01:
          ansible_host: 10.1.1.10
          ansible_user: deploy
    webservers:
      children:
        production:
      vars:
        nginx_worker_processes: auto
        nginx_worker_connections: 1024
    databases:
      children:
        production:
      vars:
        postgresql_version: 15
        postgresql_max_connections: 200
  vars:
    ansible_python_interpreter: /usr/bin/python3
    env: production
```

```yaml
# site.yml - Master playbook
---
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: base
      tags: ['base']
    - role: nginx
```
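The inventory above is static; Ansible can also take its inventory from any executable that prints JSON when called with `--list`, which is how dynamic inventories work. A minimal sketch; in practice the hosts would come from a cloud API or CMDB rather than the hard-coded (illustrative) entries here:

```python
#!/usr/bin/env python3
"""Minimal Ansible dynamic inventory: emit group/host JSON on --list."""
import json
import sys


def inventory() -> dict:
    # In a real script these hosts would be fetched from a cloud API or CMDB.
    return {
        "webservers": {
            "hosts": ["prod-web-01", "prod-web-02"],
            "vars": {"nginx_worker_processes": "auto"},
        },
        "_meta": {
            "hostvars": {
                "prod-web-01": {"ansible_host": "10.0.1.10"},
                "prod-web-02": {"ansible_host": "10.0.1.11"},
            }
        },
    }


if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps(inventory()))
    else:
        # --host <name> lookups are served via _meta, so nothing extra here.
        print(json.dumps({}))
```

Make the script executable and point Ansible at it: `ansible-playbook -i ./inventory.py site.yml`.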
```yaml
# site.yml (continued)
      tags: ['nginx', 'web']
    - role: application
      tags: ['application']
    - role: monitoring
      tags: ['monitoring']

- name: Configure database servers
  hosts: databases
  become: true
  roles:
    - role: base
      tags: ['base']
    - role: postgresql
      tags: ['database', 'postgresql']
    - role: monitoring
      tags: ['monitoring']
```

#### Production-Ready Roles

```yaml
# roles/base/tasks/main.yml
---
- name: Update apt cache
  apt:
    update_cache: true
    cache_valid_time: 3600
  tags: ['apt']

- name: Upgrade all packages
  apt:
    upgrade: dist
    autoremove: true
  tags: ['apt']

- name: Install base packages
  apt:
    name:
      - curl
      - wget
      - git
      - vim
      - htop
      - tmux
      - net-tools
      - tcpdump
      - strace
      - sysstat
      - fail2ban
      - ufw
    state: present
  tags: ['packages']

- name: Configure timezone
  timezone:
    name: UTC
  tags: ['system']

- name: Set hostname
  hostname:
    name: "{{ inventory_hostname }}"
  tags: ['system']

- name: Configure sysctl
  sysctl:
    name: "{{ item.name }}"
    value: "{{ item.value }}"
    state: present
    reload: true
  loop:
    - { name: "net.ipv4.ip_forward", value: "0" }
    - { name: "net.ipv4.conf.all.send_redirects", value: "0" }
    - { name: "net.ipv4.conf.default.send_redirects", value: "0" }
    - { name: "net.ipv4.conf.all.accept_source_route", value: "0" }
    - { name: "net.ipv4.conf.default.accept_source_route", value: "0" }
    - { name: "net.ipv4.conf.all.accept_redirects", value: "0" }
    - { name: "net.ipv4.conf.default.accept_redirects", value: "0" }
    - { name: "net.ipv4.icmp_echo_ignore_broadcasts", value: "1" }
    - { name: "net.ipv4.tcp_syncookies", value: "1" }
    - { name: "net.ipv4.tcp_max_syn_backlog", value: "2048" }
    - { name: "net.core.somaxconn", value: "1024" }
  tags: ['system', 'security']

- name: Configure limits
  pam_limits:
    domain: "*"
    limit_type: "{{ item.type }}"
    limit_item: "{{ item.item }}"
    value: "{{ item.value }}"
  loop:
    - { type: "soft", item: "nofile", value: "65536" }
    - { type: "hard", item: "nofile", value: "65536" }
    - { type: "soft", item: "nproc", value: "65536" }
    - { type: "hard", item: "nproc", value: "65536" }
  tags: ['system']

- name: Configure fail2ban
  copy:
    src: jail.local
    dest: /etc/fail2ban/jail.local
    owner: root
    group: root
    mode: '0644'
  notify: restart fail2ban
  tags: ['security']

- name: Ensure fail2ban is running
  service:
    name: fail2ban
    state: started
    enabled: true
  tags: ['security']

- name: Configure UFW
  ufw:
    state: enabled
    policy: deny
    direction: incoming
  tags: ['firewall']

- name: Allow SSH through UFW
  ufw:
    rule: allow
    port: "22"
    proto: tcp
  tags: ['firewall']

- name: Allow HTTP/HTTPS through UFW
  ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "80"
    - "443"
  tags: ['firewall']

- name: Create deploy user
  user:
    name: deploy
    shell: /bin/bash
    groups: sudo
    append: true
    state: present
  tags: ['users']

- name: Add SSH key for deploy user
  authorized_key:
    user: deploy
    key: "{{ deploy_ssh_public_key }}"
    state: present
  tags: ['users']
```

```yaml
# roles/nginx/tasks/main.yml
---
- name: Add NGINX repository
  apt_repository:
    repo: "ppa:ondrej/nginx"
    state: present
    update_cache: true
  tags: ['nginx', 'repository']

- name: Install NGINX
  apt:
    name: nginx
    state: present
  tags: ['nginx', 'packages']

- name: Create nginx directories
  file:
    path: "{{ item }}"
    state: directory
    owner: www-data
    group: www-data
    mode: '0755'
  loop:
    - /var/www/html
    - /etc/nginx/sites-available
    - /etc/nginx/sites-enabled
    - /var/log/nginx
  tags: ['nginx', 'config']

- name: Configure nginx main config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    validate: 'nginx -t -c %s'
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Remove default nginx site
  file:
    path: /etc/nginx/sites-enabled/default
    state: absent
  notify: reload nginx
  tags: ['nginx', 'config']

# Note: a single site fragment cannot be validated in isolation with
# `nginx -t`; rely on the reload handler to run the full config check.
- name: Configure nginx site
  template:
    src: site.conf.j2
    dest: "/etc/nginx/sites-available/{{ application_name }}.conf"
    owner: root
    group: root
    mode: '0644'
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Enable nginx site
  file:
    src: "/etc/nginx/sites-available/{{ application_name }}.conf"
    dest: "/etc/nginx/sites-enabled/{{ application_name }}.conf"
    state: link
  notify: reload nginx
  tags: ['nginx', 'config']

- name: Ensure nginx is running
  service:
    name: nginx
    state: started
    enabled: true
  tags: ['nginx', 'service']

- name: Configure logrotate for nginx
  copy:
    src: nginx-logrotate
    dest: /etc/logrotate.d/nginx
    owner: root
    group: root
    mode: '0644'
  tags: ['nginx', 'logging']
```

```nginx
# roles/nginx/templates/nginx.conf.j2
user www-data;
worker_processes {{ nginx_worker_processes }};
worker_rlimit_nofile 65535;

error_log /var/log/nginx/error.log warn;
pid /var/run/nginx.pid;

events {
    worker_connections {{ nginx_worker_connections }};
    use epoll;
    multi_accept on;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    server_tokens off;

    client_body_buffer_size 128k;
    client_max_body_size 100m;
    client_header_buffer_size 1k;
    large_client_header_buffers 4 4k;

    gzip on;
    gzip_vary on;
    gzip_proxied any;
    gzip_comp_level 6;
    gzip_types text/plain text/css text/xml text/javascript application/json
               application/javascript application/xml+rss application/rss+xml
               font/truetype font/opentype application/vnd.ms-fontobject
               image/svg+xml;

    include /etc/nginx/conf.d/*.conf;
    include /etc/nginx/sites-enabled/*;
}
```

```yaml
# roles/docker/tasks/main.yml
---
- name: Install prerequisites
  apt:
    name:
      - apt-transport-https
      - ca-certificates
      - curl
      - gnupg
      - lsb-release
    state: present
  tags: ['docker', 'prerequisites']

- name: Add Docker GPG key
  apt_key:
    url: https://download.docker.com/linux/{{ ansible_distribution | lower }}/gpg
    state: present
  tags: ['docker', 'repository']

- name: Add Docker repository
  apt_repository:
    repo: "deb https://download.docker.com/linux/{{ ansible_distribution | lower }} {{ ansible_distribution_release }} stable"
    state: present
    update_cache: true
  tags: ['docker', 'repository']

- name: Install Docker
  apt:
    name:
      - docker-ce
      - docker-ce-cli
      - containerd.io
      - docker-compose-plugin
    state: present
  tags: ['docker', 'packages']

- name: Create Docker directory
  file:
    path: /etc/docker
    state: directory
    owner: root
    group: root
    mode: '0755'
  tags: ['docker', 'config']

- name: Configure Docker daemon
  copy:
    src: daemon.json
    dest: /etc/docker/daemon.json
    owner: root
    group: root
    mode: '0644'
  notify: restart docker
  tags: ['docker', 'config']

- name: Ensure deploy user can use Docker
  user:
    name: deploy
    groups: docker
    append: true
  tags: ['docker', 'users']

- name: Ensure Docker is running
  service:
    name: docker
    state: started
    enabled: true
  tags: ['docker', 'service']

- name: Install Python Docker SDK
  pip:
    name: docker
    state: present
  tags: ['docker', 'python']
```

### 2. Terraform Infrastructure as Code

#### Production Terraform Configuration

```hcl
# terraform/terraform.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "terraform-state-prod"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

# terraform/provider.tf
provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "Terraform"
    }
  }
}

# terraform/variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  default     = "production"
}

variable "project_name" {
  description = "Project name"
  type        = string
  default     = "myapp"
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

variable "availability_zones" {
  description = "List of availability zones"
  type        = list(string)
  default     = ["us-east-1a", "us-east-1b"]
}

variable "instance_types" {
  description = "Instance types by tier"
  type        = map(string)
  default = {
    web = "t3.medium"
    app = "t3.large"
    db  = "r6g.large"
  }
}

# terraform/vpc.tf
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "${var.project_name}-${var.environment}-vpc"
    Tier = "Network"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}-${var.environment}-igw"
  }
}

resource "aws_subnet" "public" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project_name}-${var.environment}-public-${var.availability_zones[count.index]}"
    Tier = "Public"
  }
}

resource "aws_subnet" "private" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 10)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.project_name}-${var.environment}-private-${var.availability_zones[count.index]}"
    Tier = "Private"
  }
}

resource "aws_subnet" "database" {
  count                   = length(var.availability_zones)
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index + 20)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = false

  tags = {
    Name = "${var.project_name}-${var.environment}-database-${var.availability_zones[count.index]}"
    Tier = "Database"
  }
}

resource "aws_eip" "nat" {
  count  = length(var.availability_zones)
  domain = "vpc"

  tags = {
    Name = "${var.project_name}-${var.environment}-nat-${count.index}"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_nat_gateway" "main" {
  count         = length(var.availability_zones)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name = "${var.project_name}-${var.environment}-nat-${count.index}"
  }

  depends_on = [aws_internet_gateway.main]
}

resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-public-rt"
  }
}

resource "aws_route_table" "private" {
  count  = length(var.availability_zones)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-private-rt-${count.index}"
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

resource "aws_route_table_association" "database" {
  count          = length(var.availability_zones)
  subnet_id      = aws_subnet.database[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}

# terraform/security_groups.tf
resource "aws_security_group" "web" {
  name        = "${var.project_name}-${var.environment}-web"
  description = "Security group for web tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTPS from anywhere"
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-web"
    Tier = "Web"
  }
}

resource "aws_security_group" "app" {
  name        = "${var.project_name}-${var.environment}-app"
  description = "Security group for application tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "Application port from web tier"
    from_port       = 8080
    to_port         = 8080
    protocol        = "tcp"
    security_groups = [aws_security_group.web.id]
  }

  egress {
    description = "All outbound traffic"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-app"
    Tier = "Application"
  }
}

resource "aws_security_group" "database" {
  name        = "${var.project_name}-${var.environment}-database"
  description = "Security group for database tier"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "PostgreSQL from application tier"
    from_port       = 5432
    to_port         = 5432
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  tags = {
    Name = "${var.project_name}-${var.environment}-database"
    Tier = "Database"
  }
}

# terraform/ec2.tf
# Assumes data.aws_ami.amazon_linux_2, aws_key_pair.main and
# aws_acm_certificate.main are defined elsewhere in the module.
resource "aws_launch_template" "web" {
  name_prefix   = "${var.project_name}-${var.environment}-web-"
  image_id      = data.aws_ami.amazon_linux_2.id
  instance_type = var.instance_types.web
  key_name      = aws_key_pair.main.key_name

  network_interfaces {
    associate_public_ip_address = true
    security_groups             = [aws_security_group.web.id]
  }

  user_data = base64encode(templatefile("${path.module}/templates/web_user_data.sh", {
    environment = var.environment
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.project_name}-${var.environment}-web"
      Tier = "Web"
    }
  }

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "web" {
  desired_capacity    = 2
  max_size            = 4
  min_size            = 2
  vpc_zone_identifier = aws_subnet.public[*].id
  target_group_arns   = [aws_lb_target_group.web.arn]

  launch_template {
    id      = aws_launch_template.web.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "${var.project_name}-${var.environment}-web"
    propagate_at_launch = true
  }
}

# terraform/load_balancer.tf
resource "aws_lb" "main" {
  name                       = "${var.project_name}-${var.environment}-lb"
  internal                   = false
  load_balancer_type         = "application"
  security_groups            = [aws_security_group.web.id]
  subnets                    = aws_subnet.public[*].id
  enable_deletion_protection = false

  tags = {
    Name = "${var.project_name}-${var.environment}-lb"
  }
}

resource "aws_lb_target_group" "web" {
  name     = "${var.project_name}-${var.environment}-web-tg"
  port     = 80
  protocol = "HTTP"
  vpc_id   = aws_vpc.main.id

  health_check {
    enabled             = true
    healthy_threshold   = 2
    interval            = 30
    matcher             = "200"
    path                = "/health"
    port                = "traffic-port"
    protocol            = "HTTP"
    timeout             = 5
    unhealthy_threshold = 3
  }
}

resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = "443"
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS-1-2-2017-01"
  certificate_arn   = aws_acm_certificate.main.arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.web.arn
  }
}

resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = "80"
  protocol          = "HTTP"

  default_action {
    type = "redirect"

    redirect {
      port        = "443"
      protocol    = "HTTPS"
      status_code = "HTTP_301"
    }
  }
}
```

### 3. CI/CD Pipeline Design

#### GitHub Actions Production Pipeline

```yaml
# .github/workflows/ci-cd.yml
name: CI/CD Pipeline

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main, develop]

env:
  AWS_REGION: us-east-1
  ECR_REPOSITORY: myapp
  ECS_CLUSTER: production
  ECS_SERVICE: myapp-service

jobs:
  # CI job
  test:
    name: Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run tests
        run: npm run test:coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info
          flags: unittests
          name: codecov-umbrella

      - name: Run security scan
        run: npm audit --audit-level=moderate

      - name: Run SAST
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: 'fs'
          scan-ref: '.'
```
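The Trivy scan above can emit a SARIF report for GitHub's security tab; the same report can also gate a pipeline directly. A sketch over the generic SARIF shape (`runs[].results[].level`); the threshold policy and sample data are assumptions, so adjust the levels to match your scanner's mapping:

```python
import json


def count_findings(sarif: dict, levels: tuple = ("error",)) -> int:
    """Count SARIF results whose `level` is in `levels`."""
    total = 0
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # SARIF defines "warning" as the default level when omitted.
            if result.get("level", "warning") in levels:
                total += 1
    return total


# Illustrative report; a real one would be loaded with json.load(open(path)).
sample = {"runs": [{"results": [
    {"ruleId": "CVE-0000-0001", "level": "error"},
    {"ruleId": "CVE-0000-0002", "level": "note"},
]}]}

if __name__ == "__main__":
    findings = count_findings(sample)
    print(json.dumps({"blocking_findings": findings}))
```

A CI step would call this on `trivy-results.sarif` and exit non-zero when the count is positive.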
```yaml
# .github/workflows/ci-cd.yml (continued)
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload Trivy results to GitHub Security tab
        uses: github/codeql-action/upload-sarif@v2
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'

  # Build job
  build:
    name: Build Docker Image
    needs: test
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.meta.outputs.tags }}
      image-digest: ${{ steps.build.outputs.digest }}
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=semver,pattern={{version}}
            type=semver,pattern={{major}}.{{minor}}
            type=sha,prefix={{branch}}-

      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
```
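For a plain branch push, the `metadata-action` configuration above yields roughly a branch tag plus a branch-prefixed short-SHA tag. A plain-Python approximation to make the naming concrete; the registry value is illustrative and the authoritative rules live in the action itself:

```python
def image_tags(registry: str, repo: str, branch: str, sha: str) -> list:
    """Approximate the branch-push tags from the metadata-action config above."""
    base = f"{registry}/{repo}"
    return [
        f"{base}:{branch}",            # type=ref,event=branch
        f"{base}:{branch}-{sha[:7]}",  # type=sha,prefix={{branch}}- (short SHA)
    ]


# Illustrative registry/commit values:
tags = image_tags("registry.example.com", "myapp", "main", "0123abcdeffedcba")
print(tags)
```

Semver tags (`type=semver,...`) additionally fire on version tags, which is why a release push produces `1.2.3` and `1.2` style tags as well.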
```yaml
# .github/workflows/ci-cd.yml (continued)
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILD_DATE=${{ github.event.repository.updated_at }}
            VCS_REF=${{ github.sha }}
            VERSION=${{ steps.meta.outputs.version }}

      - name: Image vulnerability scan
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ steps.login-ecr.outputs.registry }}/${{ env.ECR_REPOSITORY }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-image-results.sarif'

      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v2
        if: always()
        with:
          sarif_file: 'trivy-image-results.sarif'

  # Deploy to staging
  deploy-staging:
    name: Deploy to Staging
    needs: build
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/develop'
    environment:
      name: staging
      url: https://staging.example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Deploy to ECS (staging)
        run: |
          aws ecs update-service \
            --cluster staging \
            --service myapp-staging-service \
            --force-new-deployment \
            --region ${{ env.AWS_REGION }}

      - name: Wait for deployment
        run: |
          aws ecs wait services-stable \
            --cluster staging \
            --services myapp-staging-service \
            --region ${{ env.AWS_REGION }}

      - name: Run integration tests
        run: |
          npm run test:integration -- --env=staging

  # Deploy to production
  deploy-production:
    name: Deploy to Production
    needs: [build, deploy-staging]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    environment:
      name: production
      url: https://example.com
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: ${{ env.AWS_REGION }}

      - name: Create deployment record
        run: |
          gh release create ${{ github.sha }} \
            --title "Release ${{ github.sha }}" \
            --notes "Deploying to production"

      - name: Blue-green deployment
        run: |
          # Create the new (green) task set. PRIVATE_SUBNETS and
          # SECURITY_GROUP are expected as repository or environment variables.
          TASK_SET_ARN=$(aws ecs create-task-set \
            --cluster production \
            --service myapp-service \
            --task-definition myapp:${{ github.sha }} \
            --launch-type FARGATE \
            --network-configuration "awsvpcConfiguration={subnets=[${{ env.PRIVATE_SUBNETS }}],securityGroups=[${{ env.SECURITY_GROUP }}],assignPublicIp=DISABLED}" \
            --query 'taskSet.taskSetArn' \
            --output text)

          # Promote the green task set to primary. Note this switch moves all
          # traffic at once; a gradual shift (e.g. 10/25/50/75/100 percent)
          # requires weighted ALB target groups or a CodeDeploy deployment.
          aws ecs update-service-primary-task-set \
            --cluster production \
            --service myapp-service \
            --primary-task-set "$TASK_SET_ARN"

      - name: Run smoke tests
        run: |
          npm run test:smoke -- --env=production

      - name: Rollback on failure
        if: failure()
        run: |
          # Roll back; replace "previous" with the last known-good revision.
          aws ecs update-service \
            --cluster production \
            --service myapp-service \
            --task-definition myapp:previous \
            --force-new-deployment

      - name: Notify team
        if: success()
        uses: 8398a7/action-slack@v3
        with:
          status: ${{ job.status }}
          text: |
            Production deployment successful!
            Commit: ${{ github.sha }}
            Author: ${{ github.actor }}
          webhook_url: ${{ secrets.SLACK_WEBHOOK }}
```

### 4. Kubernetes Deployment

#### Helm Chart for Production

```yaml
# helm/myapp/Chart.yaml
apiVersion: v2
name: myapp
description: A Helm chart for my application
type: application
version: 1.0.0
appVersion: "1.0"

# helm/myapp/values.yaml
replicaCount: 3

image:
  repository: myapp
  pullPolicy: IfNotPresent
  tag: "1.0.0"

imagePullSecrets: []
nameOverride: ""
fullnameOverride: ""

serviceAccount:
  create: true
  annotations: {}
  name: ""

podAnnotations: {}

podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000

securityContext:
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  runAsUser: 1000
  capabilities:
    drop:
      - ALL

service:
  type: ClusterIP
  port: 80
  annotations: {}

ingress:
  enabled: true
  className: "nginx"
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "true"
  hosts:
    - host: example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: myapp-tls
      hosts:
        - example.com

resources:
  limits:
    cpu: 1000m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 128Mi

autoscaling:
  enabled: true
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
  targetMemoryUtilizationPercentage: 80

nodeSelector: {}
tolerations: []
affinity: {}
```

```yaml
# helm/myapp/templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "myapp.fullname" . }}
  labels:
    {{- include "myapp.labels" . | nindent 4 }}
spec:
  {{- if not .Values.autoscaling.enabled }}
  replicas: {{ .Values.replicaCount }}
  {{- end }}
  selector:
    matchLabels:
      {{- include "myapp.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "myapp.selectorLabels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "myapp.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
      - name: {{ .Chart.Name }}
        securityContext:
          {{- toYaml .Values.securityContext | nindent 10 }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        ports:
        - name: http
          containerPort: 8080
          protocol: TCP
        livenessProbe:
          httpGet:
            path: /health
            port: http
          initialDelaySeconds: 30
          periodSeconds: 10
          timeoutSeconds: 5
          failureThreshold: 3
        readinessProbe:
          httpGet:
            path: /ready
            port: http
          initialDelaySeconds: 10
          periodSeconds: 5
          timeoutSeconds: 3
          failureThreshold: 3
        resources:
          {{- toYaml .Values.resources | nindent 10 }}
        volumeMounts:
        - name: temp
          mountPath: /tmp
        - name: cache
          mountPath: /app/cache
      volumes:
      - name: temp
        emptyDir: {}
      - name: cache
        emptyDir: {}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
```

### 5. Decision Trees

#### Automation Tool Selection

```
What to automate?
│
├─ Configuration management → Ansible, Chef, Puppet
├─ Infrastructure provisioning → Terraform, CloudFormation
├─ Container orchestration → Kubernetes, Docker Swarm
├─ CI/CD → Jenkins, GitLab CI, GitHub Actions
├─ Monitoring → Prometheus, Grafana, Datadog
├─ Log management → ELK Stack, Splunk
└─ Security scanning → Trivy, SonarQube
```

#### Deployment Strategy Selection

```
Deployment requirements?
│
├─ Zero downtime → Blue-green, Canary
├─ Quick rollback → Blue-green
├─ Gradual rollout → Canary, Rolling
├─ Simple infrastructure → Rolling
├─ Complex microservices → Canary
└─ Enterprise compliance → Blue-green with approvals
```

### 6. Anti-Patterns to Avoid

1. **Hard-coded secrets**: Use a secrets vault; never hard-code credentials
2. **No testing**: Always test before deploying
3. **Manual deployments**: Automate everything
4. **No rollback plan**: Always have a rollback strategy
5. **Missing monitoring**: You can't manage what you don't measure
6. **Large monoliths**: Break into smaller, independently deployable units
7. **No version control**: Everything must be in git
8. **Tight coupling**: Design for independence
9. **No documentation**: Document your automation
10. **Ignoring security**: Security must be built in

### 7. Quality Checklist

Before considering automation production-ready:

- [ ] All infrastructure codified
- [ ] Secrets management implemented
- [ ] Automated testing complete
- [ ] CI/CD pipeline tested
- [ ] Rollback procedure tested
- [ ] Monitoring and alerting configured
- [ ] Security scanning integrated
- [ ] Documentation complete
- [ ] Peer review completed
- [ ] Disaster recovery tested
- [ ] Configuration drift detection active
- [ ] Automation idempotent
- [ ] Error handling implemented
- [ ] Performance testing completed
- [ ] Compliance requirements met
- [ ] Backup automation configured
- [ ] Logging and auditing enabled
- [ ] Team training completed
- [ ] Runbooks documented
- [ ] SLA requirements met

This skill definition provides end-to-end guidance for server automation across modern infrastructure.
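The canary strategy recommended above reduces to a small control loop: shift a slice of traffic, wait, verify health, and abort (for rollback) on failure. A language-agnostic sketch; `shift_traffic` and `healthy` are hypothetical callbacks standing in for real ALB-weight, service-mesh, or ECS calls:

```python
import time
from typing import Callable, Iterable


def staged_rollout(shift_traffic: Callable[[int], None],
                   healthy: Callable[[], bool],
                   stages: Iterable[int] = (10, 25, 50, 75, 100),
                   pause_seconds: float = 30.0) -> bool:
    """Shift traffic in stages, returning False (trigger rollback) on a
    failed health check and True once 100% of traffic is shifted."""
    for pct in stages:
        shift_traffic(pct)          # e.g. set ALB target-group weight to pct
        time.sleep(pause_seconds)   # let metrics accumulate before checking
        if not healthy():
            return False            # caller rolls back to the previous version
    return True


# Example with stub callbacks and no pause:
applied = []
ok = staged_rollout(applied.append, lambda: True, pause_seconds=0)
print(ok, applied)
```

Keeping the loop generic like this makes the rollout policy (stages, pause, health criteria) testable on its own, separate from the cloud API calls.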