feat: major stack update: paths, bootstrap, etcd, cert-manager, custom errors, nodes

## Path renaming (rancher → kubernetes)

- All /var/lib/rancher/k3s paths → /var/lib/kubernetes/k3s
- All /etc/rancher/k3s paths   → /etc/kubernetes/k3s
- Added the k3s_config_dir, k3s_data_dir, and k3s_kubeconfig_path variables
- The K3S installer now receives the --data-dir and K3S_CONFIG_FILE options
- k3s-server-config.yaml.j2: added the write-kubeconfig and data-dir keys
- All roles (csi-nfs, ingress-nginx, cert-manager, prometheus, istio, cni)
  switched to {{ k3s_kubeconfig_path }} instead of the hardcoded path
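
After the rename, the server config template renders the relocated paths directly; a minimal sketch of the resulting /etc/kubernetes/k3s/config.yaml (only the two keys named above, values taken from the new defaults):

```yaml
# Sketch: fragment of config.yaml rendered by k3s-server-config.yaml.j2
write-kubeconfig: /etc/kubernetes/k3s/k3s.yaml
data-dir: /var/lib/kubernetes/k3s
```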

## Bootstrap (new)

- bootstrap.yml: playbook for initial node provisioning
- roles/bootstrap/: creates the ansible user, configures sudoers,
  and deploys the SSH public key using the password from vault
- host_vars/*/vault.yml.example: templates with bootstrap_user/bootstrap_password
- make bootstrap, make vault-bootstrap-create NODE=..., make vault-bootstrap-edit NODE=...
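
The vault template holds the one-time credentials used for the very first connection; a hypothetical host_vars/master01/vault.yml based on the variable names above (values are placeholders; keep the real file encrypted with ansible-vault):

```yaml
# Placeholder values; encrypt the real file with ansible-vault
bootstrap_user: root
bootstrap_password: "change-me"
```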

## Adding/removing nodes (new)

- add-node.yml: joins a master or worker to an existing cluster via the VIP
- remove-node.yml: cordon → drain → delete → uninstall → cleanup
- inventory/hosts.ini: added the [k3s_workers] group, updated [k3s_cluster:children]
- roles/k3s/tasks/main.yml: install_agent.yml for workers
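
The removal sequence maps onto kubectl operations roughly like this (a sketch, not the actual remove-node.yml; the node_to_remove variable and task names are illustrative):

```yaml
# Sketch of the cordon → drain → delete steps (illustrative task names)
- name: Drain the node (cordons it as a side effect)
  ansible.builtin.command: >
    k3s kubectl drain {{ node_to_remove }}
    --ignore-daemonsets --delete-emptydir-data --timeout=120s
- name: Delete the node object from the cluster
  ansible.builtin.command: k3s kubectl delete node {{ node_to_remove }}
```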

## etcd backup/restore (new)

- etcd-backup.yml / etcd-restore.yml: top-level playbooks
- roles/etcd/tasks/backup.yml: k3s etcd-snapshot save plus retention cleanup
- roles/etcd/tasks/restore.yml: cluster-reset and restart of all nodes
- make etcd-backup, make etcd-restore SNAPSHOT=..., make etcd-list-snapshots
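
The backup role exposes a handful of variables; a group_vars override that keeps more snapshots and pulls each one to the control machine might look like this (values illustrative, variable names from the roles/etcd defaults):

```yaml
etcd_backup_retention: 10        # keep ten snapshots instead of five
etcd_backup_copy_to_local: true  # fetch each snapshot to the Ansible host
etcd_backup_local_dir: "./etcd-backups"
```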

## cert-manager addon (new)

- roles/cert-manager/: installation via Helm plus an optional ClusterIssuer
- Supported issuers: none | selfsigned | letsencrypt
- ClusterIssuer templates for a self-signed CA and ACME HTTP-01
- Gated by the cert_manager_enabled: false flag
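
Enabling the addon with an ACME issuer is a matter of flipping the defaults; a group_vars sketch (the staging server is a safe first choice; the email is a placeholder):

```yaml
cert_manager_enabled: true
cert_manager_issuer: "letsencrypt"
cert_manager_acme_email: "admin@example.com"   # placeholder address
cert_manager_acme_server: "staging"            # switch to "prod" once verified
```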

## Custom error backend for ingress-nginx (new)

- custom-error-page.html.j2: dark, Kubernetes-styled error page
- custom-error-backend.yaml.j2: ConfigMap + Deployment (nginx) + Service
- nginx uses sub_filter to substitute X-Code/X-Message dynamically
- ingress-nginx Helm values: custom-http-errors, default-backend-service
- Gated by the ingress_nginx_custom_errors_enabled: true flag
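
With the flag enabled, the rendered Helm values gain roughly this fragment (assuming the default ingress-nginx namespace; exact rendering depends on the template):

```yaml
controller:
  config:
    custom-http-errors: "400,401,403,404,405,408,413,429,500,502,503,504"
  extraArgs:
    default-backend-service: "ingress-nginx/ingress-nginx-errors"
```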

## Hostname and packages (new)

- prereqs.yml: sets the hostname from inventory_hostname (skipped under Molecule)
- prereqs.yml: installs k3s_common_packages (nfs-common, mc, htop, vim, wget, etc.)
- molecule_test: true in converge.yml excludes the hostname step from tests

## Molecule improvements

- 3 platforms: master01 (Ubuntu 22.04) + worker01 (Ubuntu 22.04) + rpi01 (Debian 12)
- Molecule runs inside a Docker container via /var/run/docker.sock (DinD)
- All paths in converge.yml and verify.yml updated to /etc/kubernetes/k3s

## Component enable flags

- kube_vip_enabled, nfs_server_enabled, csi_nfs_enabled, ingress_nginx_enabled
- cert_manager_enabled, istio_enabled, kiali_enabled, prometheus_stack_enabled
- Each role skips installation via meta: end_play when disabled
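
A minimal profile can therefore be described with flags alone; for example, a cluster with ingress and NFS storage but no mesh or monitoring (a sketch; per-role defaults may differ):

```yaml
kube_vip_enabled: true
nfs_server_enabled: false
csi_nfs_enabled: true
ingress_nginx_enabled: true
cert_manager_enabled: false
istio_enabled: false
kiali_enabled: false
prometheus_stack_enabled: false
```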

## Documentation

- README fully reworked: every new feature documented with examples
- New sections: node management, etcd backup/restore, cert-manager, bootstrap
- Makefile and docker/entrypoint.sh updated for all the new commands
This commit is contained in:
Sergey Antropoff
2026-04-23 06:32:14 +03:00
parent d9a35478a6
commit 24846d2e52
46 changed files with 1860 additions and 67 deletions

View File

@@ -0,0 +1,24 @@
---
# ─── Bootstrap: create the admin user and deploy SSH keys ────────────────────
# The user that will be created for cluster management
k3s_admin_user: ansible
k3s_admin_shell: /bin/bash
k3s_admin_comment: "K3S Ansible Admin"
# Extra groups for k3s_admin_user (sudo access is granted separately)
k3s_admin_groups: []
# Path to the SSH public key inside the container (mounted from ~/.ssh)
# Multiple keys are supported: list files or literal key strings
k3s_admin_ssh_public_key_files:
- /root/.ssh/id_ed25519.pub
# Additional public keys (strings), added on top of the files above
k3s_admin_ssh_additional_keys: []
# Disable SSH password login after the key is deployed (recommended)
k3s_admin_disable_password_auth: false
# Restart SSH after the configuration changes
k3s_admin_restart_sshd: true

View File

@@ -0,0 +1,7 @@
---
- name: Restart sshd
ansible.builtin.systemd:
name: "{{ 'ssh' if ansible_os_family == 'Debian' else 'sshd' }}"
state: restarted
become: true
when: k3s_admin_restart_sshd | bool

View File

@@ -0,0 +1,117 @@
---
# ─────────────────────────────────────────────────────────────────────────────
# Bootstrap: create the cluster management user and deploy SSH keys
# Runs once with the initial credentials (login/password from vault)
# After that, every playbook connects over the SSH key without a password
# ─────────────────────────────────────────────────────────────────────────────
- name: Gather minimal facts
ansible.builtin.setup:
gather_subset:
- min
- name: Create admin group (if not exists)
ansible.builtin.group:
name: "{{ k3s_admin_user }}"
state: present
become: true
- name: Create k3s admin user
ansible.builtin.user:
name: "{{ k3s_admin_user }}"
comment: "{{ k3s_admin_comment }}"
shell: "{{ k3s_admin_shell }}"
groups: "{{ ([k3s_admin_user] + k3s_admin_groups) | unique }}"
append: true
create_home: true
state: present
become: true
- name: Configure passwordless sudo for admin user
ansible.builtin.copy:
dest: /etc/sudoers.d/{{ k3s_admin_user }}
content: "{{ k3s_admin_user }} ALL=(ALL) NOPASSWD:ALL\n"
mode: '0440'
validate: visudo -cf %s
become: true
- name: Ensure .ssh directory exists
ansible.builtin.file:
path: "/home/{{ k3s_admin_user }}/.ssh"
state: directory
owner: "{{ k3s_admin_user }}"
group: "{{ k3s_admin_user }}"
mode: '0700'
become: true
- name: Deploy SSH public keys from files
ansible.posix.authorized_key:
user: "{{ k3s_admin_user }}"
key: "{{ lookup('file', item) }}"
state: present
loop: "{{ k3s_admin_ssh_public_key_files }}"
become: true
loop_control:
label: "{{ item | basename }}"
ignore_errors: true
- name: Deploy additional SSH public keys (from vault strings)
ansible.posix.authorized_key:
user: "{{ k3s_admin_user }}"
key: "{{ item }}"
state: present
loop: "{{ k3s_admin_ssh_additional_keys }}"
become: true
when: k3s_admin_ssh_additional_keys | length > 0
- name: Disable SSH password authentication
ansible.builtin.lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?PasswordAuthentication"
line: "PasswordAuthentication no"
state: present
validate: /usr/sbin/sshd -t -f %s
become: true
when: k3s_admin_disable_password_auth | bool
notify: Restart sshd
- name: Ensure PermitRootLogin is disabled
ansible.builtin.lineinfile:
path: /etc/ssh/sshd_config
regexp: "^#?PermitRootLogin"
line: "PermitRootLogin no"
state: present
become: true
when: k3s_admin_disable_password_auth | bool
notify: Restart sshd
- name: Flush handlers (restart sshd if config changed)
ansible.builtin.meta: flush_handlers
- name: Verify SSH key authentication works
ansible.builtin.command: >
ssh -o StrictHostKeyChecking=no
-o BatchMode=yes
-o ConnectTimeout=10
-i {{ k3s_admin_ssh_public_key_files[0] | regex_replace('\.pub$', '') }}
{{ k3s_admin_user }}@{{ ansible_host | default(inventory_hostname) }}
echo ok
delegate_to: localhost
become: false
register: ssh_test
changed_when: false
failed_when: false
when: k3s_admin_ssh_public_key_files | length > 0
- name: Show bootstrap result
ansible.builtin.debug:
msg: >
Node {{ inventory_hostname }}:
User '{{ k3s_admin_user }}' created.
SSH key deployed.
{% if ssh_test is defined and ssh_test.rc | default(1) == 0 %}
✓ SSH key login confirmed.
{% else %}
⚠ SSH key login not verified (the key may not be loaded in ssh-agent yet).
{% endif %}
Add to the inventory: ansible_user={{ k3s_admin_user }}

View File

@@ -0,0 +1,26 @@
---
# Enable cert-manager installation
cert_manager_enabled: false
cert_manager_version: "v1.15.3"
cert_manager_namespace: "cert-manager"
cert_manager_chart_repo: "https://charts.jetstack.io"
# ClusterIssuer: none | selfsigned | letsencrypt
cert_manager_issuer: "selfsigned"
# Let's Encrypt (required when cert_manager_issuer: letsencrypt)
cert_manager_acme_email: "admin@example.com"
cert_manager_acme_server: "prod" # prod | staging
cert_manager_acme_servers:
prod: "https://acme-v02.api.letsencrypt.org/directory"
staging: "https://acme-staging-v02.api.letsencrypt.org/directory"
cert_manager_resources:
requests:
cpu: 10m
memory: 32Mi
limits:
cpu: 100m
memory: 128Mi

View File

@@ -0,0 +1,88 @@
---
- name: Skip cert-manager if not enabled
ansible.builtin.meta: end_play
when: not cert_manager_enabled | default(false) | bool
- name: Add Jetstack Helm repo
kubernetes.core.helm_repository:
name: jetstack
repo_url: "{{ cert_manager_chart_repo }}"
environment:
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Install cert-manager via Helm
kubernetes.core.helm:
name: cert-manager
chart_ref: jetstack/cert-manager
chart_version: "{{ cert_manager_version }}"
release_namespace: "{{ cert_manager_namespace }}"
create_namespace: true
wait: true
timeout: "5m0s"
values:
installCRDs: true
resources: "{{ cert_manager_resources }}"
webhook:
resources: "{{ cert_manager_resources }}"
cainjector:
resources: "{{ cert_manager_resources }}"
environment:
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
register: cert_manager_deploy
- name: Wait for cert-manager webhook to be ready
ansible.builtin.command: >
k3s kubectl -n {{ cert_manager_namespace }} rollout status
deployment/cert-manager-webhook --timeout=120s
register: cm_webhook_status
changed_when: false
retries: 5
delay: 10
until: cm_webhook_status.rc == 0
- name: Create self-signed ClusterIssuer
ansible.builtin.template:
src: clusterissuer-selfsigned.yaml.j2
dest: /tmp/cert-manager-selfsigned-issuer.yaml
mode: '0644'
when: cert_manager_issuer == 'selfsigned'
- name: Apply self-signed ClusterIssuer
ansible.builtin.command: >
k3s kubectl apply -f /tmp/cert-manager-selfsigned-issuer.yaml
register: selfsigned_apply
changed_when: true
retries: 5
delay: 10
until: selfsigned_apply.rc == 0
when: cert_manager_issuer == 'selfsigned'
- name: Create Let's Encrypt ClusterIssuer
ansible.builtin.template:
src: clusterissuer-letsencrypt.yaml.j2
dest: /tmp/cert-manager-letsencrypt-issuer.yaml
mode: '0644'
when: cert_manager_issuer == 'letsencrypt'
- name: Apply Let's Encrypt ClusterIssuer
ansible.builtin.command: >
k3s kubectl apply -f /tmp/cert-manager-letsencrypt-issuer.yaml
register: letsencrypt_apply
changed_when: true
retries: 5
delay: 10
until: letsencrypt_apply.rc == 0
when: cert_manager_issuer == 'letsencrypt'
- name: Verify cert-manager pods
ansible.builtin.command: k3s kubectl -n {{ cert_manager_namespace }} get pods
register: cm_pods
changed_when: false
- name: Show cert-manager pods
ansible.builtin.debug:
msg: "{{ cm_pods.stdout_lines }}"
- name: Show ClusterIssuers
ansible.builtin.command: k3s kubectl get clusterissuer
register: cm_issuers
changed_when: false
failed_when: false
- name: Display ClusterIssuers
ansible.builtin.debug:
msg: "{{ cm_issuers.stdout_lines }}"

View File

@@ -0,0 +1,17 @@
---
# Let's Encrypt ClusterIssuer ({{ cert_manager_acme_server }})
# Requires: a public domain + HTTP-01 challenge via ingress-nginx
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-{{ cert_manager_acme_server }}
spec:
acme:
server: "{{ cert_manager_acme_servers[cert_manager_acme_server] }}"
email: "{{ cert_manager_acme_email }}"
privateKeySecretRef:
name: letsencrypt-{{ cert_manager_acme_server }}-key
solvers:
- http01:
ingress:
ingressClassName: "{{ ingress_nginx_class_name | default('nginx') }}"

View File

@@ -0,0 +1,35 @@
---
# Self-signed root CA for internal certificates
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: selfsigned-issuer
spec:
selfSigned: {}
---
# Self-signed cluster CA certificate
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: cluster-ca
namespace: cert-manager
spec:
isCA: true
commonName: cluster-ca
secretName: cluster-ca-secret
privateKey:
algorithm: ECDSA
size: 256
issuerRef:
name: selfsigned-issuer
kind: ClusterIssuer
group: cert-manager.io
---
# CA Issuer: use this one to issue certificates to applications
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: cluster-ca-issuer
spec:
ca:
secretName: cluster-ca-secret

View File

@@ -11,7 +11,7 @@
chart_version: "{{ cilium_version }}"
release_namespace: "{{ cilium_namespace }}"
create_namespace: false
kubeconfig: /etc/rancher/k3s/k3s.yaml
kubeconfig: "{{ k3s_kubeconfig_path }}"
values:
k8sServiceHost: "{{ cilium_k8s_service_host }}"
k8sServicePort: "{{ cilium_k8s_service_port }}"

View File

@@ -1,4 +1,7 @@
---
# Enable CSI NFS Driver + StorageClass installation
csi_nfs_enabled: true
# CSI NFS Driver version
csi_nfs_version: "v4.8.0"

View File

@@ -1,4 +1,8 @@
---
- name: Skip CSI NFS if not enabled
ansible.builtin.meta: end_play
when: not csi_nfs_enabled | default(true) | bool
- name: Install NFS client on all K3S nodes
ansible.builtin.apt:
name: nfs-common
@@ -20,7 +24,7 @@
run_once: true
delegate_to: "{{ groups['k3s_master'][0] }}"
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
become: true
- name: Deploy CSI NFS Driver via Helm
@@ -55,7 +59,7 @@
run_once: true
delegate_to: "{{ groups['k3s_master'][0] }}"
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
become: true
- name: Deploy NFS StorageClass

View File

@@ -0,0 +1,13 @@
---
# Directory for etcd snapshots on the server
etcd_backup_dir: "{{ k3s_data_dir | default('/var/lib/kubernetes/k3s') }}/server/db/snapshots"
# Number of snapshots to keep (older ones are deleted beyond this limit)
etcd_backup_retention: 5
# Copy the snapshot to the local machine (where Ansible is run from)
etcd_backup_copy_to_local: false
etcd_backup_local_dir: "./etcd-backups"
# Snapshot name (empty = auto-generated from the date)
etcd_backup_name: ""

View File

@@ -0,0 +1,56 @@
---
- name: Generate snapshot name
ansible.builtin.set_fact:
_etcd_snapshot_name: "{{ etcd_backup_name if etcd_backup_name | length > 0 else 'k3s-etcd-' + ansible_date_time.iso8601_basic_short }}"
- name: Create etcd snapshot
ansible.builtin.command: >
k3s etcd-snapshot save --name {{ _etcd_snapshot_name }}
become: true
register: etcd_snapshot_result
changed_when: true
- name: Show snapshot result
ansible.builtin.debug:
msg: "{{ etcd_snapshot_result.stdout_lines }}"
- name: List all snapshots
ansible.builtin.command: k3s etcd-snapshot ls
become: true
register: etcd_snapshot_list
changed_when: false
- name: Show snapshots
ansible.builtin.debug:
msg: "{{ etcd_snapshot_list.stdout_lines }}"
- name: Get snapshot files sorted by date
ansible.builtin.find:
paths: "{{ etcd_backup_dir }}"
patterns: "*.db"
age_stamp: mtime
register: snapshot_files
become: true
- name: Remove old snapshots beyond retention limit
ansible.builtin.file:
path: "{{ item.path }}"
state: absent
loop: "{{ (snapshot_files.files | sort(attribute='mtime'))[:-etcd_backup_retention] }}"
when: snapshot_files.files | length > etcd_backup_retention
become: true
- name: Copy snapshot to local machine
ansible.builtin.fetch:
src: "{{ etcd_backup_dir }}/{{ _etcd_snapshot_name }}.db"
dest: "{{ etcd_backup_local_dir }}/"
flat: true
become: true
when: etcd_backup_copy_to_local | bool
- name: Summary
ansible.builtin.debug:
msg: >
Snapshot created: {{ _etcd_snapshot_name }}.db
Path: {{ etcd_backup_dir }}/{{ _etcd_snapshot_name }}.db
{% if etcd_backup_copy_to_local %}Copied to: {{ etcd_backup_local_dir }}/{% endif %}

View File

@@ -0,0 +1,104 @@
---
- name: Validate snapshot name is provided
ansible.builtin.assert:
that:
- etcd_restore_snapshot is defined
- etcd_restore_snapshot | length > 0
fail_msg: >
Provide a snapshot name: make etcd-restore SNAPSHOT=k3s-etcd-20250101T120000.db
List the available snapshots: make etcd-list-snapshots
- name: Check snapshot file exists
ansible.builtin.stat:
path: "{{ etcd_backup_dir }}/{{ etcd_restore_snapshot }}"
register: snapshot_stat
become: true
- name: Fail if snapshot not found
ansible.builtin.fail:
msg: >
Snapshot not found: {{ etcd_backup_dir }}/{{ etcd_restore_snapshot }}
List the available snapshots: make etcd-list-snapshots
when: not snapshot_stat.stat.exists
- name: "ВНИМАНИЕ: восстановление удалит текущие данные etcd"
ansible.builtin.pause:
prompt: >
Восстановление etcd из: {{ etcd_restore_snapshot }}
Все текущие данные кластера будут ЗАМЕНЕНЫ данными из снимка.
Введи 'yes' для продолжения (Ctrl+C для отмены)
register: restore_confirm
when: not (etcd_restore_force | default(false) | bool)
- name: Check confirmation
ansible.builtin.fail:
msg: "Восстановление отменено"
when:
- not (etcd_restore_force | default(false) | bool)
- restore_confirm.user_input | default('') != 'yes'
# ── Stop k3s on all master nodes ─────────────────────────────────────────────
- name: Stop k3s on all master nodes
ansible.builtin.systemd:
name: k3s
state: stopped
become: true
delegate_to: "{{ item }}"
loop: "{{ groups['k3s_master'] }}"
failed_when: false
# ── Stop k3s-agent on all worker nodes (if any) ──────────────────────────────
- name: Stop k3s-agent on all worker nodes
ansible.builtin.systemd:
name: k3s-agent
state: stopped
become: true
delegate_to: "{{ item }}"
loop: "{{ groups['k3s_workers'] | default([]) }}"
failed_when: false
when: groups['k3s_workers'] is defined and groups['k3s_workers'] | length > 0
- name: Wait for k3s to fully stop
ansible.builtin.pause:
seconds: 15
# ── Restore on the first master (k3s exits after the reset) ──────────────────
- name: Restore etcd snapshot via cluster-reset
ansible.builtin.command: >
k3s server
--cluster-reset
--cluster-reset-restore-path={{ etcd_backup_dir }}/{{ etcd_restore_snapshot }}
become: true
environment:
K3S_TOKEN: "{{ k3s_token }}"
register: restore_result
changed_when: true
failed_when: restore_result.rc != 0
- name: Show restore output
ansible.builtin.debug:
msg: "{{ restore_result.stdout_lines }}"
# ── Start k3s on the first master ────────────────────────────────────────────
- name: Start k3s on first master
ansible.builtin.systemd:
name: k3s
state: started
become: true
- name: Wait for API server after restore
ansible.builtin.uri:
url: "https://127.0.0.1:6443/healthz"
validate_certs: false
status_code: 200
register: api_health
until: api_health.status == 200
retries: 30
delay: 10
- name: Summary
ansible.builtin.debug:
msg: >
etcd restored from {{ etcd_restore_snapshot }}.
k3s started on {{ inventory_hostname }}.
The playbook starts k3s on the remaining masters and workers automatically.

View File

@@ -1,4 +1,7 @@
---
# Enable ingress-nginx installation
ingress_nginx_enabled: true
# ingress-nginx version
ingress_nginx_version: "4.10.1" # Helm chart version
ingress_nginx_namespace: "ingress-nginx"
@@ -38,6 +41,26 @@ ingress_nginx_extra_args: {}
ingress_nginx_class_name: "nginx"
ingress_nginx_set_default_class: true
# ─── Custom error backend ─────────────────────────────────────────────────────
# Deploys an nginx pod with a custom error page and replaces the default backend
ingress_nginx_custom_errors_enabled: true
# Backend error codes intercepted by the controller and routed to the error backend
ingress_nginx_custom_http_errors: "400,401,403,404,405,408,413,429,500,502,503,504"
# Cluster name shown on the error page
ingress_nginx_error_cluster_name: "K3S Cluster"
# Cluster domain or description (optional)
ingress_nginx_error_cluster_domain: ""
# nginx image tag for the error-backend pod
ingress_nginx_error_backend_nginx_tag: "1.27-alpine"
# Number of error-backend replicas
ingress_nginx_error_backend_replicas: 1
# ─── Controller resources ─────────────────────────────────────────────────────
ingress_nginx_resources:
requests:

View File

@@ -1,4 +1,8 @@
---
- name: Skip ingress-nginx if not enabled
ansible.builtin.meta: end_play
when: not ingress_nginx_enabled | default(true) | bool
- name: Disable K3S built-in Traefik (required before ingress-nginx)
ansible.builtin.lineinfile:
path: "{{ k3s_config_dir }}/config.yaml"
@@ -30,7 +34,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Template Helm values
ansible.builtin.template:
@@ -55,7 +59,37 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Deploy custom error backend
when: ingress_nginx_custom_errors_enabled | bool
block:
- name: Render custom error backend manifest
ansible.builtin.template:
src: custom-error-backend.yaml.j2
dest: /tmp/ingress-nginx-error-backend.yaml
mode: '0644'
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
- name: Apply custom error backend
ansible.builtin.command: >
k3s kubectl apply -f /tmp/ingress-nginx-error-backend.yaml
become: true
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
changed_when: true
environment:
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Wait for error backend to be ready
ansible.builtin.command: >
k3s kubectl -n {{ ingress_nginx_namespace }}
rollout status deployment/ingress-nginx-errors --timeout=120s
become: true
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
changed_when: false
- name: Wait for ingress-nginx controller to be ready
ansible.builtin.command: >

View File

@@ -0,0 +1,163 @@
---
# Custom error backend for ingress-nginx
# Deployed by Ansible; do not edit manually
apiVersion: v1
kind: ConfigMap
metadata:
name: ingress-nginx-errors-config
namespace: {{ ingress_nginx_namespace }}
labels:
app.kubernetes.io/name: ingress-nginx-errors
app.kubernetes.io/part-of: ingress-nginx
data:
nginx.conf: |
worker_processes 1;
error_log /dev/stderr warn;
pid /tmp/nginx.pid;
events { worker_connections 256; }
http {
include /etc/nginx/mime.types;
default_type text/html;
access_log /dev/stdout;
client_body_temp_path /tmp/client_body;
proxy_temp_path /tmp/proxy;
fastcgi_temp_path /tmp/fastcgi;
# Map the error code to a human-readable message
map $http_x_code $error_message {
default "An error occurred";
"400" "Bad request";
"401" "Authorization required";
"403" "Access forbidden";
"404" "Page not found";
"405" "Method not allowed";
"408" "Request timeout";
"413" "Request entity too large";
"429" "Too many requests";
"500" "Internal server error";
"502" "Service unavailable";
"503" "Service temporarily unavailable";
"504" "Gateway timeout";
}
# If X-Code is empty (direct access to the backend), show 404
map $http_x_code $display_code {
"" "404";
default $http_x_code;
}
server {
listen 8080;
server_name _;
root /usr/share/nginx/html;
# Health check: ingress-nginx probes the backend via this endpoint
location /healthz {
return 200 "ok\n";
add_header Content-Type text/plain;
}
location / {
try_files /error.html =200;
# Substitute the code and message into the HTML via sub_filter
sub_filter '%%CODE%%' $display_code;
sub_filter '%%MESSAGE%%' $error_message;
sub_filter_once off;
add_header Cache-Control "no-cache, no-store, must-revalidate";
add_header X-Error-Code $display_code;
}
}
}
error.html: |
{{ lookup('template', 'custom-error-page.html.j2') | indent(4) }}
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: ingress-nginx-errors
namespace: {{ ingress_nginx_namespace }}
labels:
app.kubernetes.io/name: ingress-nginx-errors
app.kubernetes.io/part-of: ingress-nginx
spec:
replicas: {{ ingress_nginx_error_backend_replicas }}
selector:
matchLabels:
app.kubernetes.io/name: ingress-nginx-errors
template:
metadata:
labels:
app.kubernetes.io/name: ingress-nginx-errors
app.kubernetes.io/part-of: ingress-nginx
spec:
securityContext:
runAsNonRoot: true
runAsUser: 101
runAsGroup: 101
containers:
- name: error-backend
image: nginx:{{ ingress_nginx_error_backend_nginx_tag }}
ports:
- containerPort: 8080
volumeMounts:
- name: config
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
- name: config
mountPath: /usr/share/nginx/html/error.html
subPath: error.html
- name: tmp
mountPath: /tmp
resources:
requests:
cpu: 10m
memory: 16Mi
limits:
cpu: 50m
memory: 32Mi
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 5
periodSeconds: 10
readinessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 3
periodSeconds: 5
volumes:
- name: config
configMap:
name: ingress-nginx-errors-config
- name: tmp
emptyDir: {}
tolerations:
- key: "node-role.kubernetes.io/control-plane"
operator: "Exists"
effect: "NoSchedule"
---
apiVersion: v1
kind: Service
metadata:
name: ingress-nginx-errors
namespace: {{ ingress_nginx_namespace }}
labels:
app.kubernetes.io/name: ingress-nginx-errors
app.kubernetes.io/part-of: ingress-nginx
spec:
selector:
app.kubernetes.io/name: ingress-nginx-errors
ports:
- name: http
port: 80
targetPort: 8080
type: ClusterIP

View File

@@ -0,0 +1,162 @@
<!DOCTYPE html>
<html lang="ru">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>%%CODE%% — {{ ingress_nginx_error_cluster_name }}</title>
<style>
*, *::before, *::after { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto,
'Helvetica Neue', Arial, sans-serif;
background: #0d1117;
color: #e6edf3;
min-height: 100vh;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
padding: 1.5rem;
}
.card {
background: #161b22;
border: 1px solid #21262d;
border-radius: 12px;
padding: 3rem 3.5rem;
max-width: 520px;
width: 100%;
text-align: center;
box-shadow: 0 16px 48px rgba(0,0,0,0.4);
}
.icon {
width: 64px;
height: 64px;
margin: 0 auto 1.75rem;
}
.code {
font-size: 5.5rem;
font-weight: 800;
line-height: 1;
letter-spacing: -0.04em;
background: linear-gradient(135deg, #326CE5 0%, #00b4d8 100%);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
margin-bottom: 0.5rem;
}
.divider {
width: 40px;
height: 3px;
background: linear-gradient(90deg, #326CE5, #00b4d8);
border-radius: 2px;
margin: 1.25rem auto;
}
.message {
font-size: 1.2rem;
font-weight: 500;
color: #c9d1d9;
margin-bottom: 0.5rem;
line-height: 1.4;
}
.meta {
font-size: 0.8rem;
color: #484f58;
font-family: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, monospace;
margin-bottom: 2rem;
}
.meta span {
color: #30a46c;
}
.actions {
display: flex;
gap: 0.75rem;
justify-content: center;
flex-wrap: wrap;
}
.btn {
display: inline-flex;
align-items: center;
gap: 0.4rem;
padding: 0.55rem 1.2rem;
border-radius: 6px;
font-size: 0.875rem;
font-weight: 500;
text-decoration: none;
cursor: pointer;
transition: background 0.15s, border-color 0.15s, color 0.15s;
border: 1px solid transparent;
}
.btn-primary {
background: #326CE5;
color: #fff;
border-color: #1a56c9;
}
.btn-primary:hover { background: #1a56c9; }
.btn-ghost {
background: transparent;
color: #8b949e;
border-color: #30363d;
}
.btn-ghost:hover { color: #e6edf3; border-color: #8b949e; }
footer {
margin-top: 2rem;
font-size: 0.75rem;
color: #30363d;
}
</style>
</head>
<body>
<div class="card">
<!-- Kubernetes wheel icon -->
<svg class="icon" viewBox="0 0 64 64" fill="none" xmlns="http://www.w3.org/2000/svg">
<circle cx="32" cy="32" r="30" stroke="#326CE5" stroke-width="3" opacity="0.4"/>
<circle cx="32" cy="32" r="6" fill="#326CE5"/>
<!-- 7 spokes (360/7 ≈ 51.4°) -->
<line x1="32" y1="32" x2="32" y2="6" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="54" y2="18" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="58" y2="43" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="43" y2="59" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="21" y2="59" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="6" y2="43" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<line x1="32" y1="32" x2="10" y2="18" stroke="#326CE5" stroke-width="3" stroke-linecap="round"/>
<circle cx="32" cy="6" r="3" fill="#326CE5"/>
<circle cx="54" cy="18" r="3" fill="#326CE5"/>
<circle cx="58" cy="43" r="3" fill="#326CE5"/>
<circle cx="43" cy="59" r="3" fill="#326CE5"/>
<circle cx="21" cy="59" r="3" fill="#326CE5"/>
<circle cx="6" cy="43" r="3" fill="#326CE5"/>
<circle cx="10" cy="18" r="3" fill="#326CE5"/>
</svg>
<div class="code">%%CODE%%</div>
<div class="divider"></div>
<div class="message">%%MESSAGE%%</div>
<div class="meta">
<span>{{ ingress_nginx_error_cluster_name }}</span>
{% if ingress_nginx_error_cluster_domain %}
· {{ ingress_nginx_error_cluster_domain }}
{% endif %}
</div>
<div class="actions">
<a href="javascript:history.back()" class="btn btn-ghost">← Назад</a>
<a href="/" class="btn btn-primary">На главную</a>
</div>
</div>
<footer>{{ ingress_nginx_error_cluster_name }} · K3S Kubernetes Cluster</footer>
</body>
</html>

View File

@@ -37,6 +37,9 @@ controller:
# JSON log format for easier parsing
config:
{% if ingress_nginx_custom_errors_enabled %}
custom-http-errors: "{{ ingress_nginx_custom_http_errors }}"
{% endif %}
log-format-upstream: >-
{"time":"$time_iso8601","remote_addr":"$remote_addr",
"x_forwarded_for":"$http_x_forwarded_for","request_id":"$req_id",
@@ -54,8 +57,11 @@ controller:
proxy-read-timeout: "600"
proxy-send-timeout: "600"
{% if ingress_nginx_extra_args %}
{% if ingress_nginx_extra_args or ingress_nginx_custom_errors_enabled %}
extraArgs:
{% if ingress_nginx_custom_errors_enabled %}
default-backend-service: "{{ ingress_nginx_namespace }}/ingress-nginx-errors"
{% endif %}
{% for key, value in ingress_nginx_extra_args.items() %}
{{ key }}: "{{ value }}"
{% endfor %}
@@ -83,6 +89,9 @@ controller:
failurePolicy: Fail
defaultBackend:
{% if ingress_nginx_custom_errors_enabled %}
enabled: false # the custom error backend is deployed separately
{% else %}
enabled: true
image:
registry: registry.k8s.io
@@ -94,3 +103,4 @@ defaultBackend:
requests:
cpu: 10m
memory: 20Mi
{% endif %}

View File

@@ -19,7 +19,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Create istio-system namespace
ansible.builtin.command: >
@@ -45,7 +45,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Template istiod values
ansible.builtin.template:
@@ -70,7 +70,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Wait for istiod to be ready
ansible.builtin.command: >
@@ -109,7 +109,7 @@
run_once: true
when: istio_install_gateway
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Apply default PeerAuthentication (mTLS mode)
ansible.builtin.template:
@@ -159,7 +159,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Create kiali-admin ServiceAccount
ansible.builtin.command: >
@@ -240,7 +240,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
KUBECONFIG: /etc/rancher/k3s/k3s.yaml
KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Wait for Kiali to be ready
ansible.builtin.command: >

View File

@@ -17,13 +17,15 @@
k3s_flannel_backend: "vxlan"
k3s_cni: "flannel"
k3s_install_dir: /usr/local/bin
k3s_config_dir: /etc/rancher/k3s
k3s_data_dir: /var/lib/rancher/k3s
k3s_config_dir: /etc/kubernetes/k3s
k3s_data_dir: /var/lib/kubernetes/k3s
k3s_kubeconfig_path: /etc/kubernetes/k3s/k3s.yaml
k3s_disable_traefik: true
k3s_disable_servicelb: false
k3s_disable_local_storage: false
k3s_extra_server_args: ""
k3s_api_url: "https://127.0.0.1:6443"
molecule_test: true
pre_tasks:
- name: Mock k3s binary (simulates an already installed k3s)
@@ -34,14 +36,14 @@
- name: Create k3s server data directory
ansible.builtin.file:
path: /var/lib/rancher/k3s/server
path: /var/lib/kubernetes/k3s/server
state: directory
mode: '0700'
- name: Mock k3s node-token
ansible.builtin.copy:
content: "K10::server:molecule-test-node-token\n"
-dest: /var/lib/rancher/k3s/server/node-token
+dest: /var/lib/kubernetes/k3s/server/node-token
mode: '0600'
tasks:
@@ -53,5 +55,5 @@
- name: Test server config template rendering
ansible.builtin.template:
src: "{{ playbook_dir }}/../../templates/k3s-server-config.yaml.j2"
-dest: /etc/rancher/k3s/config.yaml
+dest: /etc/kubernetes/k3s/config.yaml
mode: '0600'

View File

@@ -8,24 +8,24 @@
# ── Directory checks ────────────────────────────────────────────────────────
- name: Check k3s config directory exists
ansible.builtin.stat:
-path: /etc/rancher/k3s
+path: /etc/kubernetes/k3s
register: config_dir
- name: Assert config directory
ansible.builtin.assert:
that: config_dir.stat.isdir
-fail_msg: "Directory /etc/rancher/k3s was not created"
+fail_msg: "Directory /etc/kubernetes/k3s was not created"
# ── Config file checks ──────────────────────────────────────────────────────
- name: Check config file exists
ansible.builtin.stat:
-path: /etc/rancher/k3s/config.yaml
+path: /etc/kubernetes/k3s/config.yaml
register: config_file
- name: Assert config file exists
ansible.builtin.assert:
that: config_file.stat.exists
-fail_msg: "File /etc/rancher/k3s/config.yaml was not created"
+fail_msg: "File /etc/kubernetes/k3s/config.yaml was not created"
- name: Check config file permissions (0600)
ansible.builtin.assert:
@@ -34,7 +34,7 @@
- name: Read config file
ansible.builtin.slurp:
-src: /etc/rancher/k3s/config.yaml
+src: /etc/kubernetes/k3s/config.yaml
register: config_raw
- name: Parse config as YAML

View File

@@ -17,8 +17,9 @@
set -o pipefail
curl -sfL https://get.k3s.io | \
INSTALL_K3S_VERSION="{{ k3s_version }}" \
-INSTALL_K3S_EXEC="server" \
+INSTALL_K3S_EXEC="server --data-dir {{ k3s_data_dir }}" \
K3S_TOKEN="{{ k3s_token }}" \
+K3S_CONFIG_FILE="{{ k3s_config_dir }}/config.yaml" \
sh -
args:
executable: /bin/bash
@@ -36,13 +37,13 @@
- name: Wait for K3S node-token to be generated
ansible.builtin.wait_for:
-path: /var/lib/rancher/k3s/server/node-token
+path: "{{ k3s_data_dir }}/server/node-token"
timeout: 120
become: "{{ k3s_become }}"
- name: Read node-token
ansible.builtin.slurp:
-src: /var/lib/rancher/k3s/server/node-token
+src: "{{ k3s_data_dir }}/server/node-token"
register: node_token_raw
become: "{{ k3s_become }}"
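As a sanity check on the new install step, the following sketch prints what the templated command expands to. The version string and token here are hypothetical placeholders; the real values come from inventory and vault.

```shell
# Sketch only: the install command the template above expands to,
# with placeholder values for k3s_version and k3s_token.
K3S_DATA_DIR=/var/lib/kubernetes/k3s
K3S_CONFIG_DIR=/etc/kubernetes/k3s
CMD="curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION=v1.30.0+k3s1 INSTALL_K3S_EXEC=\"server --data-dir $K3S_DATA_DIR\" K3S_TOKEN=example-token K3S_CONFIG_FILE=$K3S_CONFIG_DIR/config.yaml sh -"
echo "$CMD"
```

Note that `--data-dir` goes through `INSTALL_K3S_EXEC` (a CLI flag), while the config path goes through the `K3S_CONFIG_FILE` environment variable; both end up honoured by the installed systemd unit.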

View File

@@ -1,7 +1,7 @@
---
- name: Read kubeconfig from master
ansible.builtin.slurp:
-src: /etc/rancher/k3s/k3s.yaml
+src: "{{ k3s_kubeconfig_path }}"
register: kubeconfig_raw
become: "{{ k3s_become }}"

View File

@@ -6,10 +6,16 @@
ansible.builtin.include_tasks: rpi_cgroups.yml
when: ansible_architecture in ['armv7l', 'aarch64'] and rpi_cgroup_enable
-- name: Install K3S server
+- name: Install K3S server (master node)
ansible.builtin.include_tasks: install_server.yml
when: inventory_hostname in groups['k3s_master']
+- name: Install K3S agent (worker node)
+ansible.builtin.include_tasks: install_agent.yml
+when:
+- groups['k3s_workers'] is defined
+- inventory_hostname in groups['k3s_workers']
- name: Configure node labels and taints
ansible.builtin.include_tasks: node_config.yml
when: k3s_node_labels | length > 0 or k3s_node_taints | length > 0

View File

@@ -13,6 +13,15 @@
- apt-transport-https
- gnupg
- iptables
+- ipset
+- socat
+- conntrack
state: present
become: "{{ k3s_become }}"
+- name: Install common utility packages
+ansible.builtin.apt:
+name: "{{ k3s_common_packages }}"
+state: present
+become: "{{ k3s_become }}"
@@ -65,3 +74,19 @@
state: directory
mode: '0755'
become: "{{ k3s_become }}"
+- name: Set hostname to match inventory name
+ansible.builtin.hostname:
+name: "{{ inventory_hostname }}"
+use: systemd
+become: "{{ k3s_become }}"
+when: not (molecule_test | default(false) | bool)
+- name: Update /etc/hosts with inventory hostname
+ansible.builtin.lineinfile:
+path: /etc/hosts
+regexp: '^127\.0\.1\.1'
+line: "127.0.1.1 {{ inventory_hostname }}"
+state: present
+become: "{{ k3s_become }}"
+when: not (molecule_test | default(false) | bool)
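The `lineinfile` task above replaces any existing `127.0.1.1` entry rather than appending a second one. A quick shell illustration of the same behaviour, using a hypothetical hostname `k3s-worker-1` and a temp file in place of /etc/hosts:

```shell
# Mimic the lineinfile task: regexp '^127\.0\.1\.1' matched -> line replaced.
HOSTS_FILE=$(mktemp)
printf '127.0.0.1 localhost\n127.0.1.1 old-name\n' > "$HOSTS_FILE"
sed -i 's/^127\.0\.1\.1.*/127.0.1.1 k3s-worker-1/' "$HOSTS_FILE"
grep '^127\.0\.1\.1' "$HOSTS_FILE"   # prints: 127.0.1.1 k3s-worker-1
rm -f "$HOSTS_FILE"
```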

View File

@@ -12,13 +12,15 @@ flannel-backend: "none"
disable-network-policy: true
{% endif %}
+write-kubeconfig: "{{ k3s_kubeconfig_path }}"
write-kubeconfig-mode: "0644"
+data-dir: "{{ k3s_data_dir }}"
# HA embedded etcd: the first server initialises the cluster, the rest join it
-{% if inventory_hostname == groups['k3s_master'][0] %}
+{% if inventory_hostname == groups['k3s_master'][0] and not k3s_force_join | default(false) %}
cluster-init: true
{% else %}
-server: "https://{{ hostvars[groups['k3s_master'][0]]['ansible_host'] }}:6443"
+server: "https://{{ k3s_join_address | default(hostvars[groups['k3s_master'][0]]['ansible_host']) }}:6443"
{% endif %}
{% if k3s_disable_traefik or k3s_disable_servicelb or k3s_disable_local_storage %}
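With the two new keys in place, the template should render roughly as follows on the first master (illustrative only; the flannel and disable sections are omitted here):

```yaml
# First master: initialises the embedded etcd cluster
write-kubeconfig: "/etc/kubernetes/k3s/k3s.yaml"
write-kubeconfig-mode: "0644"
data-dir: "/var/lib/kubernetes/k3s"
cluster-init: true
```

On a joining server (or when `k3s_force_join` is set), `cluster-init: true` is replaced by a `server: "https://<join-address>:6443"` line, which is what lets add-node.yml join through the VIP.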

View File

@@ -1,4 +1,7 @@
---
+# Enable kube-vip installation (VIP + LoadBalancer for Services)
+kube_vip_enabled: true
# kube-vip version
kube_vip_version: "v0.8.3"
@@ -23,5 +26,5 @@ kube_vip_rbac_url: "https://kube-vip.io/manifests/rbac.yaml"
kube_vip_image: "ghcr.io/kube-vip/kube-vip"
# Path for the static pod
-kube_vip_manifest_dir: /var/lib/rancher/k3s/server/manifests
+kube_vip_manifest_dir: "{{ k3s_data_dir | default('/var/lib/kubernetes/k3s') }}/server/manifests"
kube_vip_pod_manifest: "{{ kube_vip_manifest_dir }}/kube-vip.yaml"

View File

@@ -1,4 +1,8 @@
---
+- name: Skip kube-vip if not enabled
+ansible.builtin.meta: end_play
+when: not kube_vip_enabled | default(true) | bool
- name: Resolve kube-vip network interface
ansible.builtin.set_fact:
_kube_vip_iface: "{{ kube_vip_interface if kube_vip_interface | length > 0 else ansible_default_ipv4.interface | default('eth0') }}"
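The `set_fact` above implements a three-step fallback. A shell sketch of the same chain (function name `pick_iface` and the interface names are hypothetical): an explicitly set `kube_vip_interface` wins, otherwise the default-route interface reported by facts, otherwise `eth0`.

```shell
# pick_iface EXPLICIT DEFAULT_ROUTE -> echoes the interface kube-vip would bind to
pick_iface() {
  if [ -n "$1" ]; then echo "$1"      # kube_vip_interface set by the operator
  elif [ -n "$2" ]; then echo "$2"    # ansible_default_ipv4.interface
  else echo "eth0"                    # last-resort default
  fi
}
pick_iface "br0" "ens18"   # prints br0
pick_iface "" "ens18"      # prints ens18
pick_iface "" ""           # prints eth0
```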

View File

@@ -1,4 +1,7 @@
---
+# Enable NFS server installation
+nfs_server_enabled: true
# NFS exports: list of mount points
nfs_exports:
- path: /srv/nfs/k8s

View File

@@ -1,4 +1,8 @@
---
+- name: Skip NFS server if not enabled
+ansible.builtin.meta: end_play
+when: not nfs_server_enabled | default(true) | bool
- name: Install NFS server packages
ansible.builtin.apt:
name: "{{ nfs_server_packages }}"

View File

@@ -19,7 +19,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
-KUBECONFIG: /etc/rancher/k3s/k3s.yaml
+KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Update Helm repos
ansible.builtin.command: helm repo update
@@ -28,7 +28,7 @@
run_once: true
changed_when: false
environment:
-KUBECONFIG: /etc/rancher/k3s/k3s.yaml
+KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Create monitoring namespace
ansible.builtin.command: >
@@ -62,7 +62,7 @@
delegate_to: "{{ groups['k3s_master'][0] }}"
run_once: true
environment:
-KUBECONFIG: /etc/rancher/k3s/k3s.yaml
+KUBECONFIG: "{{ k3s_kubeconfig_path }}"
- name: Wait for Prometheus to be ready
ansible.builtin.command: >