# Implementation Plan: Consolidated Homelab Secrets and Access Management

## Phase 1: Documentation and Runbook Baseline

### Changes

#### 1. Secrets management runbook
**File:** `runbooks/secrets-management.md`
**Action:** create

- Document the approved target model:
  - Vaultwarden/Bitwarden-compatible vault is the primary human/admin secrets vault.
  - Vault is for passwords, app passwords, API tokens, recovery codes, break-glass notes, SSH credential metadata, and secure operational notes.
  - Assistant direct vault access is not part of MVP.
  - User-mediated secret release remains the initial operating model.
  - Secret values, private keys, passwords, recovery codes, and API tokens must never be committed to git.
- Add credential metadata template/checklist:
  - Credential name/reference
  - Owner
  - Purpose
  - Scope/permissions
  - Created date
  - Rotation due date
  - Storage location, without value
  - Dependent services
  - Revocation path
  - Verification steps after grant/rotate/revoke
- Add sections for:
  - SSH credentials
  - Service app passwords/API tokens
  - Vault users
  - Break-glass/recovery material
  - Backup/recovery expectations
  - Prohibition on public exposure without a separate review
- Include an explicit statement that recovery material must not exist only inside Vaultwarden.
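
The credential metadata checklist above could be captured as a small record type so that incomplete entries are easy to spot. This is a minimal sketch only: the class name, field names, and helper methods are illustrative assumptions rather than an existing schema, and the record deliberately has no field for the secret value.

```python
from dataclasses import dataclass, fields
from datetime import date


# Field names mirror the checklist above. The class and its helpers are
# hypothetical; there is intentionally no attribute for the secret value.
@dataclass(frozen=True)
class CredentialRecord:
    reference: str          # credential name/reference, never the value
    owner: str
    purpose: str
    scope: str
    created: date
    rotation_due: date
    storage_location: str   # where the value lives, without the value
    revocation_path: str
    dependent_services: tuple[str, ...] = ()
    verification_steps: tuple[str, ...] = ()

    def missing_fields(self) -> list[str]:
        """Return the names of text fields that were left blank."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]

    def rotation_overdue(self, today: date) -> bool:
        """True when the rotation due date has passed."""
        return today > self.rotation_due
```

A record with a blank `owner` would be reported by `missing_fields()`, which gives a reviewer a quick way to reject incomplete metadata before it is committed.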

#### 2. Assistant SSH access runbook
**File:** `runbooks/configure-assistant-ssh-access.md`
**Action:** modify

- Add reusable checklists for:
  - Grant assistant SSH access to a Debian/Ubuntu host.
  - Verify non-interactive assistant SSH access.
  - Rotate assistant SSH key access.
  - Revoke assistant SSH access.
  - Verify revoked/old access fails.
- Preserve existing safety rules:
  - Never paste private keys.
  - Install public keys only.
  - Prefer dedicated `piagent` account.
  - Lock the `piagent` account password.
  - Use least-privilege sudo/service permissions.
  - Require confirmation before destructive actions.
- Add explicit pilot-only guidance:
  - First lifecycle test must use a disposable Debian/Ubuntu VM or LXC.
  - Do not test grant/rotate/revoke flows first on Nextcloud, AMP, Proxmox, OPNsense, Home Assistant, or other production hosts.

#### 3. Server change log guidance
**File:** `docs/server-change-log.md`
**Action:** modify

- Add or update a short section/template for secrets and access-management changes.
- Include fields for:
  - Date/time
  - Host/service
  - Reason
  - Access/credential changed, by reference only
  - Actions taken
  - Files/services changed
  - Verification
  - Rollback/revocation notes
  - Confirmation that no secret values were logged
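
One way to keep change-log entries consistent is to render them from the field list above. The sketch below is an assumption about layout, not the existing format of `docs/server-change-log.md`; the function name and the Markdown shape are hypothetical. Missing fields are rendered as `TBD` so that gaps stay visible.

```python
from datetime import datetime

# Labels follow the field list above; the heading and bullet layout are
# illustrative assumptions.
FIELDS = (
    "Host/service",
    "Reason",
    "Access/credential changed (reference only)",
    "Actions taken",
    "Files/services changed",
    "Verification",
    "Rollback/revocation notes",
)


def render_change_entry(
    when: datetime, values: dict[str, str], no_secrets_confirmed: bool
) -> str:
    """Render one change-log entry; unknown fields stay visibly TBD."""
    lines = [f"### {when:%Y-%m-%d %H:%M}"]
    for name in FIELDS:
        lines.append(f"- **{name}:** {values.get(name, 'TBD')}")
    status = "yes" if no_secrets_confirmed else "NOT CONFIRMED"
    lines.append(f"- **No secret values logged:** {status}")
    return "\n".join(lines)
```

The confirmation flag is passed in explicitly rather than defaulted, so the operator has to make that statement each time.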

#### 4. Systems inventory conventions
**File:** `systems/inventory.md`
**Action:** modify

- Add conventions for documenting vault and access-management hosts without secrets.
- Add placeholder entries only where approved facts are known.
- Do not invent VM IDs, hostnames, IPs, Proxmox nodes, storage pools, or network bridges.

#### 5. Ticket artifact index/notes
**File:** `tickets/artifacts/2026-05-18-consolidated-homelab-secrets-management/05-plan.md`
**Action:** create

- This plan is the only artifact written by the planner subagent.

### Verification
Automated:
- [x] `test -f runbooks/secrets-management.md`
- [x] `test -f runbooks/configure-assistant-ssh-access.md`
- [x] `test -f docs/server-change-log.md`
- [x] `test -f systems/inventory.md`
- [x] Run ``grep -RInE '(password|token|secret|private key|recovery code) *= *[^`< ]' runbooks docs systems tickets/artifacts/2026-05-18-consolidated-homelab-secrets-management || true`` and manually confirm that any matches are policy text or placeholders, not real secrets.
- [ ] If available, run repo markdown/link checks. (No repo-specific markdown/link checker found in this pass.)

Manual:
- [x] Confirm `runbooks/secrets-management.md` matches the design decisions in `03-design.md`.
- [x] Confirm assistant direct vault access remains explicitly deferred.
- [x] Confirm all credential examples are metadata/placeholders only.
- [ ] Confirm user understands secrets stay outside git.
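
For reuse in later phases, the grep review above can be approximated in Python. This sketch uses the same pattern and is case-sensitive, like `grep` without `-i`. It is an illustration under one added assumption: it only reads `*.md` files, whereas `grep -R` reads every file. Matches still need manual review.

```python
import re
from pathlib import Path

# Same pattern as the grep check above: a keyword, "=", then a character
# that is not a backtick, "<", or a space.
PATTERN = re.compile(
    r"(password|token|secret|private key|recovery code) *= *[^`< ]"
)


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs that look like `name = value`."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if PATTERN.search(line)
    ]


def scan_tree(root: Path) -> dict[Path, list[tuple[int, str]]]:
    """Scan Markdown files under root (assumption: only *.md is checked)."""
    hits: dict[Path, list[tuple[int, str]]] = {}
    for path in sorted(root.rglob("*.md")):
        text = path.read_text(encoding="utf-8", errors="replace")
        found = find_suspect_lines(text)
        if found:
            hits[path] = found
    return hits
```

A pattern check like this only catches `name = value` shapes. It does not detect pasted key blocks or values written in other forms, so it supplements the manual review rather than replacing it.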

### Rollback
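- Review `git status --short` and `git diff` before rollback to avoid discarding unrelated work.
- Revert uncommitted edits to the modified files with `git checkout -- runbooks/configure-assistant-ssh-access.md docs/server-change-log.md systems/inventory.md` if no other intentional changes exist in those files.
- This phase creates `runbooks/secrets-management.md`. While that file is untracked, `git checkout --` fails on it with a pathspec error, so delete the file instead.
- If the changes were already committed, roll back with `git revert` on the relevant commit rather than `git checkout --`.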

## Phase 2: Disposable SSH Lifecycle Pilot Preparation

### Stop Conditions Before Deployment Actions

- [ ] Stop before creating a VM/LXC, modifying Proxmox, or touching any host until the user provides and approves:
  - Proxmox node name
  - Whether pilot target is VM or LXC
  - VMID/CTID or permission to allocate one
  - Hostname
  - IP/DHCP plan
  - Network bridge/VLAN
  - Storage pool
  - OS/template or ISO
  - CPU/RAM/disk sizing
  - Snapshot/rollback method
  - Whether passwordless sudo is allowed for `piagent`
- [ ] Stop if the proposed target is not disposable.
- [ ] Stop if the only available target is production Nextcloud, AMP, Proxmox, OPNsense, Home Assistant, or a user workstation.
- [ ] Stop if there is no approved rollback path such as snapshot or destroy/recreate.

### Changes

#### 1. SSH hosts configuration
**File:** `.pi/ssh/hosts.json`
**Action:** modify only after pilot host details are approved

- Add a temporary named alias, for example `ssh-lifecycle-pilot`, using approved host details.
- Keep `allowRawHosts` disabled.
- Enable destructive-command confirmation for the pilot alias.
- Use a non-secret key path reference only; do not copy key contents.
- Remove or disable the alias after revoke/destroy if the pilot is temporary.
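
The rules above could be checked mechanically once the real schema of `.pi/ssh/hosts.json` is confirmed. In the sketch below, only `allowRawHosts` and the example alias `ssh-lifecycle-pilot` come from this plan. The `hosts`, `confirmDestructive`, and `identityFile` keys are hypothetical placeholders and must be replaced with whatever the actual file uses.

```python
import json


def check_pilot_alias(
    config_text: str, alias: str = "ssh-lifecycle-pilot"
) -> list[str]:
    """Return a list of problems; an empty list means the checks passed.

    Only `allowRawHosts` is named in the plan. The `hosts`,
    `confirmDestructive`, and `identityFile` keys are assumed names.
    """
    config = json.loads(config_text)
    problems = []
    if config.get("allowRawHosts", False):
        problems.append("allowRawHosts must stay disabled")
    entry = config.get("hosts", {}).get(alias)
    if entry is None:
        return problems + [f"alias {alias!r} is not defined"]
    if not entry.get("confirmDestructive", False):
        problems.append("destructive-command confirmation is not enabled")
    key_ref = entry.get("identityFile", "")
    # A key reference should be a path, never pasted key contents.
    if "PRIVATE KEY" in key_ref or "\n" in key_ref:
        problems.append("identityFile must be a path, not key contents")
    return problems
```

Running this after `python3 -m json.tool` would add a policy check on top of the syntax check, assuming the key names are adjusted to match the real file.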

#### 2. Assistant SSH runbook pilot notes
**File:** `runbooks/configure-assistant-ssh-access.md`
**Action:** modify

- Add a worked pilot checklist for the disposable Debian/Ubuntu target:
  - Confirm target is disposable.
  - Install assistant public key using existing bootstrap procedure or documented manual commands.
  - Verify login.
  - Add new key for rotation.
  - Verify new key.
  - Remove old key.
  - Verify old key fails.
  - Revoke remaining key and sudo/account if appropriate.
  - Verify revoked access fails.
  - Destroy or revert pilot target if desired.
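
The grant, rotate, and revoke steps above come down to adding and removing lines in the account's `authorized_keys` file. The following sketch works on the file contents as text so it can be tested without touching a host. The function names are hypothetical, and the parser is simplified: it assumes key options do not contain whitespace-separated tokens that look like key types.

```python
from typing import Optional

KEY_TYPE_PREFIXES = ("ssh-", "ecdsa-", "sk-")


def key_blob(line: str) -> Optional[str]:
    """Return the base64 key blob of an authorized_keys line, or None."""
    parts = line.split()
    for index, part in enumerate(parts[:-1]):
        if part.startswith(KEY_TYPE_PREFIXES):
            return parts[index + 1]
    return None


def _require_public_key(public_key: str) -> str:
    """Reject private key material and unrecognizable lines."""
    if "PRIVATE KEY" in public_key:
        raise ValueError("refusing to handle private key material")
    blob = key_blob(public_key)
    if blob is None:
        raise ValueError("not a recognizable public key line")
    return blob


def add_key(authorized_keys: str, public_key: str) -> str:
    """Append public_key unless the same key blob is already present."""
    blob = _require_public_key(public_key)
    lines = authorized_keys.splitlines()
    if blob not in {key_blob(line) for line in lines}:
        lines.append(public_key.strip())
    return "\n".join(lines) + "\n"


def remove_key(authorized_keys: str, public_key: str) -> str:
    """Drop every line carrying the same key blob as public_key."""
    blob = _require_public_key(public_key)
    kept = [
        line for line in authorized_keys.splitlines()
        if key_blob(line) != blob
    ]
    return "\n".join(kept) + "\n" if kept else ""
```

Rotation in this model is `add_key` for the new key, a successful login test with it, and only then `remove_key` for the old key, which matches the order of the checklist.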

#### 3. Server change log entries
**File:** `docs/server-change-log.md`
**Action:** modify during pilot execution only

- Add entries for pilot host creation/access changes, grant, rotation, revocation, verification, and rollback/destroy.
- Record host, reason, actions, files/services changed, verification, and rollback.
- Do not record private key, password, token, or recovery values.

#### 4. Systems inventory pilot entry
**File:** `systems/inventory.md`
**Action:** modify only after pilot host exists or is approved

- Add a clearly marked temporary/disposable pilot host entry.
- Include hostname, IP, role, access alias, and disposal/rollback notes.
- Remove or mark retired after pilot destruction.

#### 5. Ticket notes
**File:** relevant ticket under `tickets/` if one exists for implementation
**Action:** modify

- Record pilot status, decisions, and verification outcomes.
- Do not create new scope beyond SSH lifecycle proof.

### Verification
Automated:
- [x] `python3 -m json.tool .pi/ssh/hosts.json >/dev/null`
- [x] Run the approved SSH connection test after grant, using only configured key paths.
- [x] Run the same connection test with the old key after rotation and confirm it fails.
- [x] Run the same connection test after revocation and confirm it fails.

Manual:
- [x] Confirm pilot host is disposable before access changes.
- [x] Confirm login succeeds after grant.
- [x] Confirm rotated key works before removing old key.
- [x] Confirm old/revoked access fails.
- [x] Confirm server change log contains actions and rollback notes without secrets.
- [x] Confirm rollback by destruction is available and completed.
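
The connection tests above need to be non-interactive and must offer only the key under test, otherwise an agent-held key can make an "old key fails" check pass or fail for the wrong reason. The sketch below assumes plain OpenSSH rather than whatever wrapper reads `.pi/ssh/hosts.json`. The function names are hypothetical; the `BatchMode`, `IdentitiesOnly`, and `ConnectTimeout` options are standard `ssh` options.

```python
import subprocess


def ssh_test_argv(host: str, key_path: str, user: str = "piagent") -> list[str]:
    """Build a non-interactive ssh command that runs `true` on the host."""
    return [
        "ssh",
        "-o", "BatchMode=yes",        # never prompt; fail instead
        "-o", "IdentitiesOnly=yes",   # offer only the key given with -i
        "-o", "ConnectTimeout=10",
        "-i", key_path,
        f"{user}@{host}",
        "true",
    ]


def classify(returncode: int, stderr: str) -> str:
    """Separate an auth refusal from other failures such as timeouts."""
    if returncode == 0:
        return "granted"
    if "Permission denied" in stderr:
        return "denied"
    return "inconclusive"


def run_access_test(host: str, key_path: str) -> str:
    """Run the test; only call this against an approved pilot host."""
    result = subprocess.run(
        ssh_test_argv(host, key_path), capture_output=True, text=True
    )
    return classify(result.returncode, result.stderr)
```

Treating timeouts and DNS failures as `inconclusive` matters for the revocation checks: a host that is simply unreachable should not be recorded as proof that the old key was rejected.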

### Rollback
- Revert `.pi/ssh/hosts.json` to remove the pilot alias.
- Remove/revoke assistant public keys from the pilot host or destroy/revert the pilot VM/LXC.
- Remove/disable `piagent` sudo/account on the pilot host if not destroying it.
- Update `systems/inventory.md` to mark the pilot retired or remove the temporary entry.
- Log rollback in `docs/server-change-log.md`.

## Phase 3: Vaultwarden VM Deployment Planning

### Stop Conditions Before Deployment Actions

- [ ] Stop before creating any Vaultwarden VM until the user provides and approves:
  - Proxmox node name
  - VMID allocation
  - VM hostname/FQDN
  - Static IP or DHCP reservation
  - Network bridge/VLAN and LAN/Tailscale reachability plan
  - Storage pool
  - CPU/RAM/disk sizing
  - OS image/template
  - Admin SSH access method
  - TLS strategy/internal DNS name
  - Backup destination and encryption approach
  - Snapshot plan
- [ ] Stop before configuring public DNS, port forwarding, public reverse proxy, or internet exposure; this requires a separate security review.
- [ ] Stop before granting assistant direct vault access; this is explicitly out of MVP.
- [ ] Stop before migrating critical secrets if backup and recovery steps are not approved.

### Changes

#### 1. Vaultwarden deployment runbook
**File:** `runbooks/vaultwarden-deployment.md`
**Action:** create

- Document dedicated Proxmox VM approach.
- Include sections for:
  - Required user/Proxmox details before implementation
  - VM sizing placeholders to be filled only from user-approved values
  - Hostname/FQDN and internal DNS plan
  - LAN/Tailscale-only exposure
  - TLS approach
  - Admin SSH access model
  - Service deployment approach
  - Update/maintenance model
  - Snapshot before major changes
  - No public exposure without separate review
  - No assistant direct vault access in MVP
- Include a preflight checklist and explicit go/no-go gate.
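
The go/no-go gate could be expressed as a check that every detail listed in this phase's stop conditions has an approved value. This is a sketch only: the key names are hypothetical shorthand for those stop-condition items, and any value that is blank or starts with `TBD` is treated as unresolved.

```python
# Hypothetical keys, one per item in the Phase 3 stop-condition list.
REQUIRED_DETAILS = (
    "proxmox_node",
    "vmid",
    "hostname_fqdn",
    "ip_plan",
    "network_bridge_vlan",
    "storage_pool",
    "sizing",
    "os_image",
    "admin_ssh_method",
    "tls_strategy",
    "backup_destination",
    "snapshot_plan",
)


def gate(details: dict[str, str]) -> tuple[bool, list[str]]:
    """Return (go, unresolved); go is True only if nothing is unresolved."""
    unresolved = [
        key
        for key in REQUIRED_DETAILS
        if not details.get(key, "").strip()
        or details[key].strip().upper().startswith("TBD")
    ]
    return (not unresolved, unresolved)
```

The gate never fills in a default, in keeping with the rule that unknown values are left as `TBD` rather than guessed.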

#### 2. Vaultwarden operations runbook
**File:** `runbooks/vaultwarden-operations.md`
**Action:** create

- Document routine operations:
  - Start/stop/restart/status checks
  - Update procedure
  - Health checks
  - User/admin account management at metadata level only
  - Credential migration checklist
  - Revocation and rotation expectations
  - Incident/lockout response pointers
- Do not include secret values or recovery codes.

#### 3. Systems inventory vault entry
**File:** `systems/inventory.md`
**Action:** modify after user approves concrete details

- Add Vaultwarden VM entry with approved hostname/IP/role/access boundary.
- Mark as critical infrastructure.
- Document that vault is LAN/Tailscale-only.
- Reference `runbooks/vaultwarden-deployment.md` and `runbooks/vaultwarden-operations.md`.

#### 4. Network plan update
**File:** `systems/network-plan.md`
**Action:** modify

- Add planned vault network placement.
- Document LAN/Tailscale-only access boundary.
- Explicitly state no public port forwarding by default.
- Leave unknown VLAN/firewall values as `TBD - user/proxmox approval required`, not guessed values.

#### 5. Server change log template/entry
**File:** `docs/server-change-log.md`
**Action:** modify during actual deployment only

- Add deployment entries when infrastructure is actually touched.
- Include snapshot, VM creation, package/service installation, TLS/reverse proxy, firewall/network changes, verification, and rollback notes.

### Verification
Automated:
- [x] `test -f runbooks/vaultwarden-deployment.md`
- [x] `test -f runbooks/vaultwarden-operations.md`
- [x] If structured snippets are added, validate their syntax with appropriate tools. (No structured snippets added.)
- [ ] If available, run markdown/link checks. (No repo-specific markdown/link checker found in this pass.)

Manual:
- [x] User approves VM instead of LXC.
- [ ] User approves hostname, IP/DNS, network boundary, and TLS strategy. (Partially complete: `vm.dropcutstud.io`, DHCP, default bridge/storage, default/next VMID, minimum sizing, and no public exposure approved; user will handle LAN DNS via Unbound, Tailscale is separate, final TLS remains unresolved.)
- [x] User confirms no public port forwarding is planned.
- [ ] User confirms recovery material will be stored outside the vault. (Follow-up ticket created.)
- [ ] User confirms backup approach before critical secrets are migrated. (Follow-up ticket created.)

### Rollback
- Planning-only rollback: revert documentation changes.
- Deployment rollback, when later implemented: stop Vaultwarden service, revert Proxmox snapshot or destroy VM, remove DNS/reverse-proxy/firewall entries, remove inventory entry or mark retired, and log rollback.

## Phase 4: Basic Backup and Restore Test Planning

### Stop Conditions Before Deployment Actions

- [ ] Stop before configuring backup jobs until the user approves:
  - Backup destination
  - Encryption method
  - Retention policy
  - Off-host/offline copy expectation
  - Operator responsible for recovery material
- [ ] Stop before relying on Vaultwarden for critical secrets until at least one basic backup has completed.
- [ ] Stop before treating Vaultwarden as authoritative until a restore test succeeds or the user explicitly accepts the risk.
- [ ] Stop before restore testing if there is no isolated restore target or if the test could overwrite production data.

### Changes

#### 1. Vaultwarden backup runbook
**File:** `runbooks/vaultwarden-backup.md`
**Action:** create

- Document backup scope:
  - Vaultwarden database/data directory
  - Configuration
  - Attachments/sends/icons if enabled
  - Admin token material, by storage reference only
  - TLS/reverse-proxy config
  - Backup encryption material, by external storage reference only
- Document backup frequency, destination, encryption, retention, and verification approach once approved.
- Include manual backup and restore-prep checklists.
- Include rule that backup decryption/recovery material must not live only inside Vaultwarden.

#### 2. Vaultwarden restore test runbook
**File:** `runbooks/vaultwarden-restore-test.md`
**Action:** create

- Document isolated restore target requirements.
- Include restore test checklist:
  - Prepare isolated target.
  - Restore backup copy.
  - Start service isolated from production.
  - Verify login/data availability without modifying production.
  - Record test date/result.
  - Destroy isolated restore target or preserve as approved.
- Include failure handling and rollback.

#### 3. Server change log backup/restore entries
**File:** `docs/server-change-log.md`
**Action:** modify during actual backup/restore testing only

- Log backup configuration, first backup result, restore test, verification, and rollback/destruction of restore target.
- Do not log backup encryption keys, master passwords, recovery codes, or tokens.

#### 4. Systems inventory update
**File:** `systems/inventory.md`
**Action:** modify after backup/restore targets are approved

- Add backup/restore target references at metadata level.
- Mark any restore-test host as disposable/temporary.

### Verification
Automated:
- [x] `test -f runbooks/vaultwarden-backup.md`
- [x] `test -f runbooks/vaultwarden-restore-test.md`
- [ ] Later implementation only: backup command/job exits successfully.
- [ ] Later implementation only: expected backup artifacts exist and are non-empty.

Manual:
- [ ] Confirm backup scope covers all Vaultwarden-critical data/config.
- [ ] Confirm backup encryption/recovery material is accessible outside the vault.
- [ ] Confirm restore test target is isolated from production.
- [ ] Confirm restored data is usable in the isolated restore target.
- [ ] Confirm failed restore test leaves production vault unchanged.
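
The later "artifacts exist and are non-empty" check could look like the following sketch. The function name is hypothetical and the artifact paths would come from the approved backup design, which is not decided yet.

```python
from pathlib import Path


def empty_or_missing(paths: list[Path]) -> list[Path]:
    """Return the artifacts that are absent or zero bytes long."""
    return [
        path for path in paths
        if not path.is_file() or path.stat().st_size == 0
    ]
```

An empty result only shows that files were written. It does not show that they can be decrypted or restored, which is why the restore test remains a separate gate.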

### Rollback
- Planning-only rollback: revert runbook/docs changes.
- Backup configuration rollback, when later implemented: disable backup job, remove invalid backup config, preserve any last known-good backups, and log rollback.
- Restore test rollback: stop/destroy isolated restore target and confirm production Vaultwarden remains unchanged.

## Phase 5: Initial Credential Migration and Operating Practice

### Stop Conditions Before Deployment Actions

- [ ] Stop before migrating critical credentials until Vaultwarden basic backup exists.
- [ ] Stop before mass migration; MVP migration should be small and reversible.
- [ ] Stop before entering any secret values into repo, chat transcript, ticket, or command history.
- [ ] Stop before changing production service passwords/tokens without a service-specific rollback/revocation plan.

### Changes

#### 1. Secrets management runbook migration checklist
**File:** `runbooks/secrets-management.md`
**Action:** modify

- Add initial migration checklist for a small set of low-risk credentials or metadata records.
- Include per-credential steps:
  - Confirm owner.
  - Confirm purpose/scope.
  - Store value in Vaultwarden only.
  - Record metadata in repo if useful, without value.
  - Record revocation path.
  - Confirm backup coverage.
- Include guidance for break-glass/recovery materials and what must stay outside assistant access.

#### 2. Vaultwarden operations runbook migration section
**File:** `runbooks/vaultwarden-operations.md`
**Action:** modify

- Add operational procedure for creating vault entries and secure notes.
- Add user/admin account revocation checklist.
- Add service token/app-password rotation checklist at metadata level.

#### 3. Systems inventory references
**File:** `systems/inventory.md`
**Action:** modify as credentials are migrated

- Add metadata references for services whose credential storage location changes to Vaultwarden.
- Do not record values.

#### 4. Server change log entries
**File:** `docs/server-change-log.md`
**Action:** modify during actual service credential changes only

- Log credential migrations only when they affect server-side services or operational access.
- Include credential reference, purpose, dependent service, verification, and rollback/revocation path.
- Do not log values.

### Verification
Automated:
- [ ] Run the plaintext-secret grep review from Phase 1 after documentation updates.
- [ ] If available, run markdown/link checks.

Manual:
- [ ] Confirm each migrated credential has owner/purpose/scope/storage-location/revocation metadata.
- [ ] Confirm no credential value was written to git.
- [ ] Where break-glass access is required, confirm migrated credentials remain recoverable if the vault is unavailable.
- [ ] Confirm dependent services still work after any credential rotation.

### Rollback
- For documentation-only migration metadata, revert docs or mark entries obsolete.
- For service credential migration, use the documented per-service rollback:
  - Restore previous credential if safe and approved, or
  - Issue a new replacement credential, update dependent service, verify, and revoke failed/old credential.
- Log rollback without secret values.

## Phase 6: Later Assistant Authorization Design Gate

### Changes

#### 1. Assistant vault authorization design artifact
**File:** `tickets/artifacts/2026-05-18-consolidated-homelab-secrets-management/06-assistant-authorization-design.md` or a future ticket artifact
**Action:** create later only after MVP deployment/backup/restore baseline

- Design whether and how the assistant may request or receive secrets.
- Include:
  - Explicit user approval flow
  - Least-privilege role/task boundaries
  - Audit logging
  - Session/expiry controls
  - Revocation process
  - Emergency stop path
  - Alternatives to direct vault access
- Compare user-mediated release, scoped app passwords/API tokens, SSH certificates, Tailscale ACLs, SOPS/age, and OpenBao/Vault.

#### 2. Assistant role architecture
**File:** `docs/assistant-role-architecture.md`
**Action:** modify later only if assistant authorization is approved

- Add approved assistant access boundary and role-specific constraints.

#### 3. Agentic engineering controls
**File:** `docs/agentic-engineering-lite.md`
**Action:** modify later only if assistant authorization is approved

- Add vault/secret access control gates, audit requirements, and stop/rollback paths.

#### 4. Secrets and vault runbooks
**File:** `runbooks/secrets-management.md`, `runbooks/vaultwarden-operations.md`
**Action:** modify later only if assistant authorization is approved

- Add final approved assistant authorization operations.
- Do not add implementation steps until design is approved.

### Verification
Automated:
- [ ] Documentation checks only.
- [ ] Confirm no automation for assistant vault access is introduced before approval.

Manual:
- [ ] User reviews and approves assistant authorization model before implementation.
- [ ] Confirm blast radius is acceptable.
- [ ] Confirm revocation/disable path is clear.
- [ ] Confirm auditability is sufficient.

### Rollback
- If assistant authorization is not approved, leave direct vault access unimplemented and document the decision.
- If documentation was added prematurely, revert it or mark as rejected/deferred.
- Any future implemented assistant access must have a tested disable/revoke path before use.

## Git and Review Notes

- [x] Run `git status --short` before implementation begins.
- [x] Keep changes scoped by phase where practical.
- [x] Review `git diff` before committing or rolling back.
- [x] Do not commit secret values, private keys, recovery codes, API tokens, app passwords, or copied credential material.
- [ ] Prefer separate commits for documentation baseline, pilot SSH lifecycle, Vaultwarden planning, backup/restore planning, and later authorization design.
- [ ] If unrelated dirty work exists, avoid broad checkout/reset commands and coordinate before rollback.
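
To decide whether a rollback is safe, the output of `git status --short` can be compared against the files a phase is expected to touch. The sketch below parses that output as text; the function name is hypothetical, and it handles only the common `XY path` and rename (`old -> new`) forms.

```python
def unrelated_dirty_paths(status_short: str, scoped: set[str]) -> list[str]:
    """Paths from `git status --short` output that are outside `scoped`."""
    dirty = []
    for line in status_short.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]               # skip the two status columns and space
        if " -> " in path:            # rename entry: keep the new path
            path = path.split(" -> ", 1)[1]
        dirty.append(path.strip('"'))
    return [path for path in dirty if path not in scoped]
```

If the returned list is not empty, a broad checkout or reset would discard work that does not belong to the phase, which is the case the note above asks to coordinate on first.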

## Global Acceptance Checklist

- [x] Documentation baseline exists before touching any host.
- [x] Disposable SSH lifecycle pilot is completed before production-host access lifecycle changes.
- [x] User/Proxmox details are obtained before the disposable pilot VM/LXC is created.
- [ ] Vaultwarden deployment plan is approved before VM creation or service installation.
- [ ] Vault remains LAN/Tailscale-only by default.
- [ ] No public exposure is configured without separate security review.
- [ ] Basic backup exists before critical secrets are migrated.
- [ ] Restore test passes before Vaultwarden is treated as authoritative, unless user explicitly accepts risk.
- [x] Assistant direct vault access remains deferred until later authorization/security design is approved.
- [x] All operational changes are logged in `docs/server-change-log.md` without secret values.
