Puppet Pocket Book — Uplatz

50 Expanded Cards • One-Column Colorful Layout • Fundamentals · DSL · Hiera · Modules · Bolt · PuppetDB · Testing · Security · 20 Interview Q&A

1) What is Puppet?

Puppet is a declarative configuration management and automation platform. You describe the desired end state of infrastructure (packages, files, services, users, cloud resources); Puppet converges systems to that state idempotently across Linux/Unix/Windows and network devices.

2) Architecture Overview

  • Puppet Server (Master/Primary): compiles catalogs from code + data.
  • Puppet Agent: runs on nodes, applies catalogs locally.
  • Facter: gathers facts (OS, IP, memory).
  • PuppetDB: stores facts, catalogs, reports, exported resources.
  • Code Delivery: r10k/Code Manager sync control repo → environments.

3) Catalog Compilation Flow

Agent sends facts → Server merges site code + modules + Hiera data + classification → compiles a catalog (ordered resources) → agent enforces locally → reports back (events, changes, metrics).

4) Declarative DSL & Idempotency

package { 'nginx': ensure => installed }
service { 'nginx': ensure => running, enable => true }
file { '/etc/nginx/nginx.conf':
  ensure  => file,
  content => template('profile_nginx/nginx.conf.erb'),
  notify  => Service['nginx'],
}

Re-applying configurations yields no-op if already in desired state.

5) Resources & Metaparameters

  • Ordering: require, before, notify, subscribe
  • Tags for selective runs; refresh events trigger service restarts.
  • Providers implement platform-specific actions (apt, yum, windows, systemd).
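
A minimal sketch of the relationship metaparameters listed above. The ntp package, file path, tag name, and inline content are illustrative assumptions, not taken from a specific module.

```puppet
# Hypothetical NTP example: names and paths are illustrative only.
package { 'ntp':
  ensure => installed,
  before => File['/etc/ntp.conf'],     # install the package before managing its config
}

file { '/etc/ntp.conf':
  ensure  => file,
  content => "server 0.pool.ntp.org iburst\n",
  tag     => ['ntp_config'],           # lets a run be limited to this tag
}

service { 'ntp':
  ensure    => running,
  enable    => true,
  require   => Package['ntp'],         # ordering only, no refresh
  subscribe => File['/etc/ntp.conf'],  # ordering plus a refresh (restart) on change
}
```

A run limited to the tagged resource would look like: puppet agent -t --tags ntp_config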

6) Classes & Defined Types

Class — singleton configuration unit; defined type — reusable “constructor” for many instances.

class role::web { include profile::nginx }
define profile::vhost($docroot) { ... }
profile::vhost { 'site1': docroot => '/var/www/site1' }
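
The body of the defined type is elided above. One possible shape, as a sketch only: the port parameter, the conf.d path, and the inline server block are assumptions, and Service['nginx'] is assumed to be declared elsewhere in the catalog.

```puppet
# Hypothetical body for profile::vhost; file locations assume a stock nginx layout.
define profile::vhost (
  String            $docroot,
  Integer[1, 65535] $port = 80,   # illustrative extra parameter
) {
  file { $docroot:
    ensure => directory,
  }

  # $title is the instance name, so every declaration gets its own config file.
  file { "/etc/nginx/conf.d/${title}.conf":
    ensure  => file,
    content => "server {\n  listen ${port};\n  server_name ${title};\n  root ${docroot};\n}\n",
    require => File[$docroot],
    notify  => Service['nginx'],
  }
}

# Each title creates an independent instance.
profile::vhost { 'blog': docroot => '/var/www/blog' }
profile::vhost { 'shop': docroot => '/var/www/shop', port => 8080 }
```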

7) Roles & Profiles Pattern

Profiles wrap reusable module logic for a component; Roles compose profiles to represent a server role. Keeps site code clean and environment-agnostic.
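
A small sketch of the pattern. profile::base and the chrony package are hypothetical; a real profile would usually wrap a component module rather than raw resources.

```puppet
# Profile: site-specific wrapper around one component.
class profile::nginx {
  package { 'nginx': ensure => installed }
  -> service { 'nginx': ensure => running, enable => true }
}

# Profile: baseline settings every server gets (hypothetical).
class profile::base {
  package { 'chrony': ensure => installed }
}

# Role: composes profiles only, with no resources or logic of its own.
class role::web {
  include profile::base
  include profile::nginx
}
```

Each node is then assigned exactly one role, which keeps classification to a single line per node.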

8) Environments & Control Repo

Multiple environments (dev/test/prod) isolate code. A control repo (Git) holds environment branches; r10k/Code Manager deploys to /etc/puppetlabs/code/environments/<env>.

# r10k example: deploy every environment and the modules listed in its Puppetfile
# (r10k does not resolve module dependencies; list them explicitly)
r10k deploy environment -p

9) Hiera: Data Outside Code

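Place per-environment and per-site data in YAML. Hiera searches the hierarchy from most specific to least specific (node → role → env → org); by default the first match wins, and a merge strategy (unique, hash, deep) can be requested per key.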

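# hiera.yaml (v5)
version: 5
defaults:
  datadir: data          # relative to the environment directory
  data_hash: yaml_data   # backend for every level below
hierarchy:
  - name: "Per-node"
    path: "nodes/%{trusted.certname}.yaml"
  - name: "Per-role"     # assumes a custom 'role' fact exists
    path: "roles/%{facts.role}.yaml"
  - name: "Common"
    path: "common.yaml"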
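
How code consumes this data, as a hedged sketch. The class profile::webapp, its keys, and the sample values are hypothetical; lookup() and automatic parameter lookup are standard Puppet features.

```puppet
# Assumed data in data/common.yaml (keys and values are hypothetical):
#   profile::webapp::listen_port: 8080
#   profile::webapp::extra_packages:
#     - curl

class profile::webapp (
  # Filled by automatic parameter lookup of 'profile::webapp::listen_port';
  # the default applies only when no hierarchy level defines the key.
  Integer[1, 65535] $listen_port = 80,
) {
  # Explicit lookup with a merge strategy: combine values from every
  # hierarchy level into one de-duplicated array, defaulting to [].
  $extra_packages = lookup('profile::webapp::extra_packages', Array[String], 'unique', [])

  package { $extra_packages: ensure => installed }

  notice("webapp will listen on port ${listen_port}")
}
```

To see which level supplied a value: puppet lookup profile::webapp::listen_port --explain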

10) Secure Data with eyaml & Sensitive

Encrypt secrets with hiera-eyaml and mark parameters as Sensitive to avoid leaking in logs.

# Hiera value: password: ENC[PKCS7, ...]
class { 'db':
  password => Sensitive(lookup('db::password')),  # lookup() replaces the deprecated hiera()
}
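
On the receiving side, a class can require the Sensitive type so the value stays redacted end to end. This is a sketch: the class name, file path, and key are hypothetical, and the lookup_options shown in the comment assume Puppet 5.5 or later.

```puppet
# Assumed Hiera data (hypothetical key):
#   lookup_options:
#     profile::db_client::password:
#       convert_to: 'Sensitive'
#   profile::db_client::password: ENC[PKCS7, ...]
class profile::db_client (
  Sensitive[String] $password,
) {
  file { '/etc/app/db.conf':
    ensure  => file,
    owner   => 'root',
    mode    => '0600',
    # Unwrap only to render the line, then re-wrap so the diff and
    # the report stay redacted.
    content => Sensitive("password=${$password.unwrap}\n"),
  }
}
```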

11) Files, Templates (ERB/EPP)

Generate configs with variables and facts via ERB/EPP. Notify services on change.

# ERB
user  nginx;
worker_processes <%= @facts['processors']['count'] %>;

12) Packages & Services (Cross-Platform)

# Pick the Apache package/service name once, then reuse it
$apache = $facts['os']['family'] ? {
  'Debian' => 'apache2',
  default  => 'httpd',   # RedHat family and anything else
}

package { $apache: ensure => present }
service { $apache: ensure => running, enable => true, require => Package[$apache] }

13) Users, Groups, SSH

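user { 'deploy': ensure => present, shell => '/bin/bash', managehome => true }
file { '/home/deploy/.ssh':
  ensure => directory, owner => 'deploy', mode => '0700',
}
file { '/home/deploy/.ssh/authorized_keys':
  ensure  => file, owner => 'deploy', mode => '0600', content => lookup('deploy::keys'),
}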

14) Cron, Systemd, SELinux

cron { 'logrotate-hourly': minute => '5', hour => '*/1', command => '/usr/sbin/logrotate /etc/logrotate.conf' }
service { 'nginx': provider => 'systemd', ensure => running }
augeas { 'set-selinux':
  context => '/files/etc/selinux/config',
  changes => 'set SELINUX permissive',
}

15) Windows: Packages, Services, Registry

package { '7zip': ensure => installed, provider => 'chocolatey' }
service { 'Spooler': ensure => running, enable => true }
registry_value { 'HKLM\Software\Uplatz\Key':
  ensure => present, type => string, data => 'value',
}

16) PowerShell & DSC

# Requires the puppetlabs-powershell module for the powershell provider
exec { 'set-exec-policy':
  provider => powershell,
  command  => 'Set-ExecutionPolicy RemoteSigned -Force',
  # exit non-zero only when the policy still needs changing
  unless   => 'if ((Get-ExecutionPolicy) -ne "RemoteSigned") { exit 1 }',
}

Use DSC resources via Puppet DSC modules where deeper Windows control is required.

17) Ordering & Refresh

file { '/etc/app.cfg': content => template('app/cfg.erb'), notify => Service['appd'] }
service { 'appd': ensure => running, enable => true }

On change, notify sends refresh to restart/reload services.

18) Resource Collectors & Exported Resources

Collect resources across nodes (requires PuppetDB).

# export a resource on web nodes
@@sshkey { $trusted['certname']: type => 'ssh-rsa', key => $facts['ssh']['rsa']['key'] }
# collect on bastion (exported resources need the <<| |>> collector; <| |> only realizes virtual ones)
Sshkey <<| |>>

19) Staging, Fileserver, Filebucket

  • Filebucket stores file backups for rollback.
  • Fileserver serves static files/binaries to agents.
  • Stage large deployments by roles/environments to minimize blast radius.

20) Node Classification

Assign classes via site.pp, an External Node Classifier (ENC), or the PE Console. At scale, assign one role per node based on $trusted['certname'], trusted facts, or node groups rather than listing classes node by node.
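
A sketch of role-based classification in site.pp. It assumes nodes were provisioned with the pp_role certificate extension; role::base is a hypothetical fallback.

```puppet
# manifests/site.pp (sketch)
node default {
  # pp_role is a registered certificate extension; it is present only if
  # the node's CSR included it (an assumption about provisioning).
  $role = $trusted['extensions']['pp_role']

  if $role {
    include "role::${role}"
  } else {
    include role::base   # hypothetical fallback role
  }
}
```

Using trusted data rather than plain facts means a compromised node cannot claim a different role.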

21) Control Repo Layout

control-repo/
  ├── environment.conf
  ├── hiera.yaml
  ├── data/ (Hiera)
  ├── site/ (roles, profiles)
  └── Puppetfile (module sources)

Keep site logic (roles/profiles) here; reference third-party modules in Puppetfile.

22) r10k & Code Manager

r10k deploys Git branches to environments; Code Manager (PE) adds webhook-driven deployments, RBAC, and file sync across compilers.

23) PDK (Puppet Development Kit)

pdk new module profile_nginx
pdk validate
pdk test unit

Standardizes module skeletons, metadata, RuboCop/puppet-lint rules, and rspec scaffolding.

24) Linting & Style

puppet parser validate site.pp
puppet-lint site/
rubocop

Automate in CI to prevent bad code reaching production.

25) Unit Tests with rspec-puppet

require 'spec_helper'

describe 'profile::nginx' do
  it 'compiles and manages the nginx service' do
    is_expected.to compile.with_all_deps
    is_expected.to contain_service('nginx').with_ensure('running')
  end
end

26) Integration Tests with Beaker/Test Kitchen

Spin ephemeral VMs/containers to run puppet apply and assert system state. Gate merges on passing suites.

27) CI/CD Pipeline (Example)

  • PR: lint + unit
  • Merge to dev: r10k deploy → canary agent run
  • Promote to prod: r10k deploy → phased rollout

28) Environment Isolation & Pins

Pin module versions per environment via Puppetfile. Avoid floating versions to keep runs reproducible.

29) Performance Tuning (Server)

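  • Right-size JVM heap; tune JRuby pool count (max-active-instances)
  • Enable environment caching (environment_timeout); in PE, File Sync distributes code to compilers
  • Use compilers (compile masters) behind a load balancer in larger estates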

30) Agent Scale & Scheduling

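Enable splay (bounded by splaylimit) so agents do not all check in at once (thundering herd); set runinterval to match how much drift you can tolerate; cache facts; on the server, set environment_timeout to balance freshness and compile performance.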

31) Q: Puppet vs Ansible vs Chef — when pick Puppet?

A: Choose Puppet for large, long-lived fleets needing strongly typed, declarative, idempotent state with robust classification, reporting, and compliance. Its compiled catalogs, Hiera, and PuppetDB excel at scaled governance.

32) Q: Class vs Defined Type?

A: Class is singleton (declared once per node). Defined type is a template you can instantiate many times (e.g., multiple vhosts). Use classes for roles/profiles, defined types for repeatable components.

33) Q: Order resources safely?

A: Prefer relationship metaparameters (require, before, notify, subscribe) or dependency chains; avoid relying on declaration order.
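
A dependency chain, sketched with chaining arrows. The appd package name and the file content are illustrative assumptions.

```puppet
package { 'appd': ensure => installed }
file { '/etc/app.cfg': ensure => file, content => "log_level = info\n" }
service { 'appd': ensure => running, enable => true }

# -> orders only; ~> orders and also sends a refresh when the left side changes
Package['appd'] -> File['/etc/app.cfg'] ~> Service['appd']
```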

34) Q: Hiera hierarchy design?

A: Highest specificity → lowest: node → role/profile → environment → common. Keep secrets in eyaml; keep data flat and predictable; document keys.

35) Q: Prevent secret leakage in reports?

A: Use the Sensitive type for secret parameters, keep secrets out of notify resource messages and file diffs (show_diff => false), and disable debug logging on secret rendering paths.

36) Q: What is PuppetDB used for?

A: Central store for facts, catalogs, reports, and exported resources. Enables PQL queries, inventory, orchestration, and cross-node resource collection.
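
A PQL query can also be issued during compilation. This sketch assumes the compiler is connected to PuppetDB (which provides puppetdb_query) and that a custom 'role' fact exists; the output path is hypothetical.

```puppet
# Ask PuppetDB for the certnames of all nodes whose role fact is "web".
$web_nodes = puppetdb_query('inventory[certname] { facts.role = "web" }').map |$row| {
  $row['certname']
}

# Write the list to a file, one certname per line.
$body = join($web_nodes, "\n")

file { '/etc/web_nodes.txt':
  ensure  => file,
  content => "${body}\n",
}
```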

37) Q: Exported resources & collectors?

A: Nodes export resources (e.g., ssh keys) to PuppetDB; other nodes collect them via resource collectors. Great for bastion known-hosts, monitoring registrations, etc.
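
A sketch of export and filtered collection using host entries. The web_tier tag is a hypothetical name; on Puppet 6 and later the host type comes from the host_core module.

```puppet
# On every web node: export (do not apply locally) a hosts entry.
@@host { $facts['networking']['fqdn']:
  ip  => $facts['networking']['ip'],
  tag => 'web_tier',   # hypothetical tag used to filter on collection
}

# On the node that should know about them: collect only the tagged entries.
Host <<| tag == 'web_tier' |>>
```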

38) Q: PE vs Open Source Puppet?

A: Puppet Enterprise adds RBAC, Console GUI, Code Manager, Orchestrator, compliance/reporting, and supported modules. OSS provides the engine and community modules.

39) Q: r10k vs Code Manager?

A: r10k is CLI-driven Git deployer; Code Manager (PE) adds webhook integration, RBAC, file-sync, and orchestration for multi-compiler architectures.

40) Q: Testing strategy?

A: Lint + parser validate → rspec-puppet (units) → Beaker/Test Kitchen (acceptance) → canary environment → phased rollout. Gate merges on tests.

41) Q: Handle “poison” configs safely?

A: Use staging and canaries; apply to a small node group; monitor reports; roll back via Git revert/r10k. For runtime failures, guard with conditionals and unless/onlyif in exec.
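
Guarding an exec, as a sketch. The migrate command and marker file are hypothetical, and the example assumes the tool itself writes the marker when it finishes.

```puppet
# Hypothetical one-time migration; command and marker paths are illustrative.
exec { 'app-db-migrate':
  command => '/opt/app/bin/migrate --to-latest',
  path    => ['/usr/bin', '/bin'],
  # Run only when the tool is actually installed ...
  onlyif  => 'test -x /opt/app/bin/migrate',
  # ... and skip once the marker file shows the work is already done.
  unless  => 'test -f /var/lib/app/.migrated',
  timeout => 300,
}
```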

42) Q: Custom facts vs external facts?

A: Custom facts (Ruby) live in modules (lib/facter); external facts are simple scripts or files under /etc/puppetlabs/facter/facts.d. Prefer external for simplicity; custom for logic.
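
An external fact can itself be managed by Puppet. A sketch that drops a static YAML fact; the role value is illustrative, and the fact becomes visible on the run after the file is written.

```puppet
# Make sure the external facts directory exists.
file { ['/etc/puppetlabs/facter', '/etc/puppetlabs/facter/facts.d']:
  ensure => directory,
}

# Static external fact: Facter reports role => web on later runs.
file { '/etc/puppetlabs/facter/facts.d/role.yaml':
  ensure  => file,
  owner   => 'root',
  mode    => '0644',
  content => "role: web\n",
}
```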

43) Q: When to write a custom type/provider?

A: When no built-in resource covers your system object, and you need idempotent lifecycle management (create/destroy/exists?). Keep providers minimal and testable.

44) Q: ENC vs site.pp classification?

A: ENC centralizes node→class mapping (often from CMDB). For small sites, site.pp may suffice. At scale, ENC/Console enables role-based, fact-based grouping and auditing.

45) Q: How to speed up compiles?

A: Profile JRuby pool, reduce Hiera I/O, cache template results, trim giant hierarchies, split monolith profiles, use additional compile masters behind a load balancer.

46) Q: Agent run interval & drift?

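A: Default is 30 minutes (runinterval); decrease it for tighter drift control, increase it to reduce server load at scale. For urgent fixes in PE, trigger on-demand runs, tasks, or plans through the Orchestrator, which reaches nodes via pxp-agent.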

47) Q: Windows package management approaches?

A: Chocolatey provider, MSI package resources, PowerShell DSC integration for complex resources, Registry resources for settings.

48) Q: Secrets patterns?

A: Hiera eyaml + Sensitive; alternatively external secret stores (Vault) via lookup functions. Never log secrets; template carefully.

49) Q: Puppet Bolt vs Agent runs?

A: Bolt is agentless orchestration (tasks/plans) for ad-hoc or day-2 ops; agent runs enforce declared state continuously. Use both: Bolt for one-off changes, Puppet for desired state.
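
A minimal Bolt plan, written in the Puppet language. The module name profile and the plan name are hypothetical; run_command and the _run_as option are standard Bolt.

```puppet
# plans/restart_web.pp inside a hypothetical 'profile' module
plan profile::restart_web (
  TargetSpec $targets,
) {
  # One-off action: restart nginx on the given targets as root.
  $results = run_command('systemctl restart nginx', $targets, '_run_as' => 'root')

  # Return the ResultSet so the caller can inspect per-target outcomes.
  return $results
}
```

It could be invoked as: bolt plan run profile::restart_web --targets web1.example.com (the target name is a placeholder).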

50) Q: Quick “Web Tier” design?

A: Role role::web → profiles profile::nginx, profile::php, profile::hardening. Hiera for env data (ports, certs). Canary rollout, DL for failure reports, metrics via PuppetDB. Secrets: eyaml + Sensitive. CI: PDK lint/tests; CD: r10k deploy + phased agent runs.