Puppet Pocket Book — Uplatz
50 Expanded Cards • One-Column Colorful Layout • Fundamentals · DSL · Hiera · Modules · Bolt · PuppetDB · Testing · Security · 20 Interview Q&A
1) What is Puppet?
Puppet is a declarative configuration management and automation platform. You describe the desired end state of infrastructure (packages, files, services, users, cloud resources); Puppet converges systems to that state idempotently across Linux/Unix/Windows and network devices.
2) Architecture Overview
- Puppet Server (Master/Primary): compiles catalogs from code + data.
- Puppet Agent: runs on nodes, applies catalogs locally.
- Facter: gathers facts (OS, IP, memory).
- PuppetDB: stores facts, catalogs, reports, exported resources.
- Code Delivery: r10k/Code Manager sync control repo → environments.
3) Catalog Compilation Flow
Agent sends facts → Server merges site code + modules + Hiera data + classification → compiles a catalog (ordered resources) → agent enforces locally → reports back (events, changes, metrics).
4) Declarative DSL & Idempotency
package { 'nginx': ensure => installed }
service { 'nginx': ensure => running, enable => true }
file { '/etc/nginx/nginx.conf':
ensure => file,
content => template('profile_nginx/nginx.conf.erb'),
notify => Service['nginx'],
}
Re-applying configurations yields no-op if already in desired state.
5) Resources & Metaparameters
- Ordering: require, before, notify, subscribe
- Tags for selective runs; refresh events trigger service restarts.
- Providers implement platform-specific actions (apt, yum, windows, systemd).
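The metaparameters above can be sketched in one small manifest. This is a minimal illustration, not a reference implementation: the chrony package, the file source, and the tag name are assumptions, and the service name assumes a RedHat-family node.

```puppet
# Hypothetical example: package, paths, and tag are assumptions.
package { 'chrony':
  ensure => installed,
}

file { '/etc/chrony.conf':
  ensure  => file,
  source  => 'puppet:///modules/profile/chrony.conf', # assumed module file
  require => Package['chrony'],  # apply only after the package
  tag     => 'time',             # selectable with: puppet agent -t --tags time
}

service { 'chronyd':
  ensure    => running,
  enable    => true,
  subscribe => File['/etc/chrony.conf'], # ordered after the file; restarts on change
}
```

require and before only order resources; notify and subscribe order them and also send a refresh event.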
6) Classes & Defined Types
Class — singleton configuration unit; defined type — reusable “constructor” for many instances.
class role::web { include profile::nginx }
define profile::vhost($docroot) { ... }
profile::vhost { 'site1': docroot => '/var/www/site1' }
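A fuller version of the defined type might look like the sketch below. The port parameter, the nginx conf.d path, and the inline server block are assumptions made for illustration; a real module would normally render the config from a template.

```puppet
# Hypothetical defined type; parameters and paths are assumptions.
define profile::vhost (
  String[1]         $docroot,
  Integer[1, 65535] $port = 80,
) {
  file { $docroot:
    ensure => directory,
  }

  # $title is the instance name given at declaration time
  file { "/etc/nginx/conf.d/${title}.conf":
    ensure  => file,
    content => "server {\n  listen ${port};\n  server_name ${title};\n  root ${docroot};\n}\n",
    require => File[$docroot],
  }
}

# Two independent instances of the same type on one node
profile::vhost { 'site1': docroot => '/var/www/site1' }
profile::vhost { 'site2': docroot => '/var/www/site2', port => 8080 }
```

Declaring a class twice with different parameters fails compilation, whereas each defined type instance only needs a unique title.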
7) Roles & Profiles Pattern
Profiles wrap reusable module logic for a component; Roles compose profiles to represent a server role. Keeps site code clean and environment-agnostic.
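One way to lay this out is sketched below. The class names other than role::web and profile::nginx, and the package_ensure parameter, are hypothetical.

```puppet
# Hypothetical profile: wraps one component and takes its data as parameters.
class profile::nginx (
  String $package_ensure = 'installed',
) {
  package { 'nginx':
    ensure => $package_ensure,
  }

  service { 'nginx':
    ensure  => running,
    enable  => true,
    require => Package['nginx'],
  }
}

# Hypothetical baseline profile shared by every role.
class profile::base {
  package { 'curl': ensure => installed }
}

# Role: composes profiles only, with no resources or logic of its own.
class role::web {
  include profile::base
  include profile::nginx
}
```

Each node gets exactly one role, so changing what "a web server" means is a change to role::web rather than to node definitions.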
8) Environments & Control Repo
Multiple environments (dev/test/prod) isolate code. A control repo (Git) holds environment branches; r10k/Code Manager deploys to /etc/puppetlabs/code/environments/<env>.
# r10k example
r10k deploy environment -p # deploy all environments plus the modules listed in each Puppetfile
9) Hiera: Data Outside Code
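Place per-environment and per-site data in YAML. Lookups walk the hierarchy from most to least specific (node → role → common); by default the first match wins, and merging across levels is opt-in.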
# hiera.yaml (v5)
version: 5
hierarchy:
  - name: "Per-node"
    path: "nodes/%{trusted.certname}.yaml"
  - name: "Per-role"
    path: "roles/%{facts.role}.yaml"
  - name: "Common"
    path: "common.yaml"
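On the Puppet side, data reaches code either through automatic class parameter lookup or through an explicit lookup() call. The sketch below assumes Puppet 5.5 or later (for the built-in join function); the class name and Hiera keys are hypothetical.

```puppet
# Hypothetical profile; class name and Hiera keys are assumptions.
class profile::ntp (
  # Automatic parameter lookup: filled from the key 'profile::ntp::servers'
  Array[String] $servers = ['pool.ntp.org'],
) {
  # Explicit lookup with an opt-in merge: 'unique' combines arrays from
  # every hierarchy level instead of stopping at the first match.
  $extra_packages = lookup('profile::base::packages', Array[String], 'unique', [])

  package { $extra_packages:
    ensure => installed,
  }

  $lines = $servers.map |$s| { "server ${s} iburst\n" }

  file { '/etc/ntp.conf':
    ensure  => file,
    content => join($lines, ''),
  }
}
```

Keeping the default in the class signature means the class still compiles on nodes where no hierarchy level defines the key.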
10) Secure Data with eyaml & Sensitive
Encrypt secrets with hiera-eyaml and mark parameters as Sensitive to keep them out of logs and reports.
# Hiera value: db::password: ENC[PKCS7, ...]
class { 'db':
  # lookup() replaces the deprecated hiera() function
  password => Sensitive(lookup('db::password')),
}
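On the receiving side, a class can require the Sensitive type and keep the value wrapped when it is written to disk. This is a sketch under assumptions: the class name and file path are hypothetical, and the caller is expected to wrap the value as in the declaration above.

```puppet
# Hypothetical class; name and path are assumptions.
class profile::db_client (
  Sensitive[String] $password,  # caller passes Sensitive(lookup(...))
) {
  # Unwrap only where the plain value is needed
  $plain = $password.unwrap

  file { '/etc/myapp/db.conf':
    ensure    => file,
    owner     => 'root',
    mode      => '0600',
    # Re-wrap the rendered content so it stays redacted in logs and reports
    content   => Sensitive("password=${plain}\n"),
    show_diff => false,         # never print a diff of this file
  }
}
```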
11) Files, Templates (ERB/EPP)
Generate configs with variables and facts via ERB/EPP. Notify services on change.
# ERB
user nginx;
worker_processes <%= @facts['processors']['count'] %>;
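The EPP equivalent uses Puppet syntax inside the tags and takes explicit, typed parameters instead of reading instance variables. A minimal sketch using inline_epp with a heredoc follows; in a real module the template would more likely live in a templates/ directory and be rendered with epp().

```puppet
# EPP template held in a heredoc (no interpolation happens in the heredoc itself)
$nginx_tpl = @(END)
<%- | Integer $workers | -%>
user nginx;
worker_processes <%= $workers %>;
END

file { '/etc/nginx/nginx.conf':
  ensure  => file,
  # Parameters are passed explicitly as a hash
  content => inline_epp($nginx_tpl, { 'workers' => $facts['processors']['count'] }),
}
```

Because the parameter is typed, a missing or non-integer value fails at compile time rather than producing a broken config file.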
12) Packages & Services (Cross-Platform)
# Resolve the platform-specific name once, then reuse it
$apache = $facts['os']['family'] ? {
  'RedHat' => 'httpd',
  'Debian' => 'apache2',
  default  => 'httpd',
}
package { $apache: ensure => present }
service { $apache: ensure => running, enable => true, require => Package[$apache] }
13) Users, Groups, SSH
user { 'deploy': ensure => present, shell => '/bin/bash' }
file { '/home/deploy/.ssh/authorized_keys':
  # lookup() replaces the deprecated hiera() function
  ensure => file, owner => 'deploy', mode => '0600', content => lookup('deploy::keys'),
}
14) Cron, Systemd, SELinux
cron { 'logrotate-hourly': minute => '5', hour => '*/1', command => '/usr/sbin/logrotate /etc/logrotate.conf' }
service { 'nginx': provider => 'systemd', ensure => running }
augeas { 'set-selinux':
context => '/files/etc/selinux/config',
changes => 'set SELINUX permissive',
}
15) Windows: Packages, Services, Registry
package { '7zip': ensure => installed, provider => 'chocolatey' }
service { 'Spooler': ensure => running, enable => true }
registry_value { 'HKLM\Software\Uplatz\Key':
ensure => present, type => string, data => 'value',
}
16) PowerShell & DSC
# Requires the puppetlabs-powershell module for the powershell exec provider
exec { 'set-exec-policy':
  command  => 'Set-ExecutionPolicy RemoteSigned -Force',
  # unless must exit non-zero for the command to run; printing True/False is not enough
  unless   => 'if ((Get-ExecutionPolicy) -ne "RemoteSigned") { exit 1 }',
  provider => powershell,
}
Use DSC resources via Puppet DSC modules where deeper Windows control is required.
17) Ordering & Refresh
file { '/etc/app.cfg': content => template('app/cfg.erb'), notify => Service['appd'] }
service { 'appd': ensure => running, enable => true }
On change, notify sends a refresh to restart/reload services.
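The same relationships can be written with chaining arrows, which keeps the ordering in one place. A minimal sketch, with an assumed appd package and literal file content standing in for the template:

```puppet
package { 'appd': ensure => installed }

file { '/etc/app.cfg':
  ensure  => file,
  content => "log_level = info\n",  # assumed content, stands in for the template
}

service { 'appd':
  ensure => running,
  enable => true,
}

# -> orders only; ~> orders and sends a refresh (restart) on change
Package['appd'] -> File['/etc/app.cfg'] ~> Service['appd']
```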
18) Resource Collectors & Exported Resources
Collect resources across nodes (requires PuppetDB).
# export a resource on web nodes
@@sshkey { $trusted['certname']: type => 'ssh-rsa', key => $facts['ssh']['rsa']['key'] }
# collect on bastion (exported collectors use double angle brackets)
Sshkey <<| |>>
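Virtual and exported resources differ only in scope: single angle brackets collect within the local catalog, double angle brackets collect from PuppetDB. The sketch below contrasts the two; the tag name is hypothetical, and it assumes the node has an ed25519 host key.

```puppet
# Virtual resource: declared in the catalog but not applied until realized.
@user { 'deploy':
  ensure => present,
  shell  => '/bin/bash',
}
# Local collector (single angle brackets) realizes it on this node only.
User <| title == 'deploy' |>

# Exported resource: stored in PuppetDB for other nodes to collect.
@@sshkey { $trusted['certname']:
  type => 'ssh-ed25519',
  key  => $facts['ssh']['ed25519']['key'],
  tag  => 'bastion_known_hosts',  # hypothetical tag used for filtering
}
# Exported collector (double angle brackets), typically on a different node.
Sshkey <<| tag == 'bastion_known_hosts' |>>
```

Filtering by tag keeps a collector from pulling in every exported resource of that type in the estate.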
19) Staging, Fileserver, Filebucket
- Filebucket stores file backups for rollback.
- Fileserver serves static files/binaries to agents.
- Stage large deployments by roles/environments to minimize blast radius.
20) Node Classification
Assign classes via site.pp, an External Node Classifier (ENC), or the PE Console. At scale, assign roles by $trusted['certname'], facts, or node groups.
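For the site.pp route, node definitions can match on certname, on a regex, or fall back to a default. In this sketch the hostnames and the role::bastion and role::base classes are assumptions, and the default block assumes certificates were issued with the pp_role trusted extension.

```puppet
# manifests/site.pp (sketch; hostnames and role classes are assumptions)

# Exact match on the certname
node 'bastion01.example.com' {
  include role::bastion
}

# Regex match for a group of nodes
node /^web\d+\.example\.com$/ {
  include role::web
}

# Fallback: take the role from the pp_role certificate extension when set
node default {
  $role = $trusted['extensions']['pp_role']
  if $role {
    include "role::${role}"
  } else {
    include role::base
  }
}
```

Trusted data comes from the signed certificate, so a node cannot change it the way it can spoof an ordinary fact.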
21) Control Repo Layout
control-repo/
├── environment.conf
├── hiera.yaml
├── data/ (Hiera)
├── site/ (roles, profiles)
└── Puppetfile (module sources)
Keep site logic (roles/profiles) here; reference third-party modules in Puppetfile.
22) r10k & Code Manager
r10k deploys Git branches to environments; Code Manager (PE) adds webhook-driven deployments, RBAC, and file sync across compilers.
23) PDK (Puppet Development Kit)
pdk new module profile_nginx
pdk validate
pdk test unit
Standardizes module skeletons, metadata, RuboCop/puppet-lint rules, and rspec scaffolding.
24) Linting & Style
puppet parser validate site.pp
puppet-lint site/
rubocop
Automate in CI to prevent bad code reaching production.
25) Unit Tests with rspec-puppet
require 'spec_helper'
describe 'profile::nginx' do
  it 'compiles and manages the nginx service' do
    is_expected.to compile.with_all_deps
    is_expected.to contain_service('nginx').with_ensure('running')
  end
end
26) Integration Tests with Beaker/Test Kitchen
Spin up ephemeral VMs/containers to run puppet apply and assert system state. Gate merges on passing suites.
27) CI/CD Pipeline (Example)
- PR: lint + unit
- Merge to dev: r10k deploy → canary agent run
- Promote to prod: r10k deploy → phased rollout
28) Environment Isolation & Pins
Pin module versions per environment via Puppetfile. Avoid floating versions to keep runs reproducible.
29) Performance Tuning (Server)
- Right-size JVM heap; tune JRuby pool count
- Enable Code Cache/File Sync
- Use compile masters/load balancer in larger estates
30) Agent Scale & Scheduling
Stagger agent check-ins (for example with splay) so a uniform runinterval does not produce a thundering herd; cache facts; tune environment_timeout on the server to balance freshness and performance.
31) Q: Puppet vs Ansible vs Chef — when pick Puppet?
A: Choose Puppet for large, long-lived fleets needing strongly typed, declarative, idempotent state with robust classification, reporting, and compliance. Its compiled catalogs, Hiera, and PuppetDB excel at scaled governance.
32) Q: Class vs Defined Type?
A: Class is singleton (declared once per node). Defined type is a template you can instantiate many times (e.g., multiple vhosts). Use classes for roles/profiles, defined types for repeatable components.
33) Q: Order resources safely?
A: Prefer relationship metaparameters (require, before, notify, subscribe) or dependency chains; avoid relying on declaration order.
34) Q: Hiera hierarchy design?
A: Highest specificity → lowest: node → role/profile → environment → common. Keep secrets in eyaml; keep data flat and predictable; document keys.
35) Q: Prevent secret leakage in reports?
A: Use the Sensitive type for parameters, avoid echoing secrets through notify resources, set show_diff => false on files that hold secrets, and keep debug logging off on code paths that render secrets.
36) Q: What is PuppetDB used for?
A: Central store for facts, catalogs, reports, and exported resources. Enables PQL queries, inventory, orchestration, and cross-node resource collection.
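PQL can also be used from Puppet code during compilation. The sketch below assumes the compiling server is connected to PuppetDB so that the puppetdb_query function is available, and the query itself is only an illustration.

```puppet
# Assumes the compiling server is configured to talk to PuppetDB.
$rows = puppetdb_query('inventory[certname] { facts.os.family = "Debian" }')

# Each row is a hash; keep only the certname field
$debian_nodes = $rows.map |$row| { $row['certname'] }

$node_list = join($debian_nodes, ', ')
notice("Debian-family nodes: ${node_list}")
```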
37) Q: Exported resources & collectors?
A: Nodes export resources (e.g., ssh keys) to PuppetDB; other nodes collect them via resource collectors. Great for bastion known-hosts, monitoring registrations, etc.
38) Q: PE vs Open Source Puppet?
A: Puppet Enterprise adds RBAC, Console GUI, Code Manager, Orchestrator, compliance/reporting, and supported modules. OSS provides the engine and community modules.
39) Q: r10k vs Code Manager?
A: r10k is CLI-driven Git deployer; Code Manager (PE) adds webhook integration, RBAC, file-sync, and orchestration for multi-compiler architectures.
40) Q: Testing strategy?
A: Lint + parser validate → rspec-puppet (units) → Beaker/Test Kitchen (acceptance) → canary environment → phased rollout. Gate merges on tests.
41) Q: Handle “poison” configs safely?
A: Use staging and canaries; apply to a small node group; monitor reports; roll back via Git revert/r10k. For runtime failures, guard with conditionals and unless/onlyif in exec.
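A guarded exec might look like the sketch below. The command, paths, and marker file are hypothetical; the point is that both guards must pass before the command runs.

```puppet
# Hypothetical commands and paths, for illustration only.
exec { 'init-app-db':
  command => '/opt/app/bin/init-db --config /etc/app.cfg',
  # onlyif: run only when this check exits 0 (the binary is present and executable)
  onlyif  => '/usr/bin/test -x /opt/app/bin/init-db',
  # unless: skip when this check exits 0 (initialization already happened)
  unless  => '/usr/bin/test -f /var/lib/app/.initialized',
  timeout => 120,
}
```

Without a guard, an exec runs on every agent run, which turns one bad command into a repeated failure across the fleet.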
42) Q: Custom facts vs external facts?
A: Custom facts (Ruby) live in modules (lib/facter); external facts are simple scripts or files under /etc/puppetlabs/facter/facts.d. Prefer external for simplicity; custom for logic.
43) Q: When to write a custom type/provider?
A: When no built-in resource covers your system object, and you need idempotent lifecycle management (create/destroy/exists?). Keep providers minimal and testable.
44) Q: ENC vs site.pp classification?
A: ENC centralizes node→class mapping (often from CMDB). For small sites, site.pp may suffice. At scale, ENC/Console enables role-based, fact-based grouping and auditing.
45) Q: How to speed up compiles?
A: Profile JRuby pool, reduce Hiera I/O, cache template results, trim giant hierarchies, split monolith profiles, use additional compile masters behind a load balancer.
46) Q: Agent run interval & drift?
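A: The default runinterval is 30 minutes; decrease it for tighter drift control, increase it to reduce server load at scale. For critical drift, use the PE Orchestrator (which drives pxp-agent on the nodes) to trigger agent runs, tasks, or plans on demand.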
47) Q: Windows package management approaches?
A: Chocolatey provider, MSI package resources, PowerShell DSC integration for complex resources, Registry resources for settings.
48) Q: Secrets patterns?
A: Hiera eyaml + Sensitive; alternatively external secret stores (Vault) via lookup functions. Never log secrets; template carefully.
49) Q: Puppet Bolt vs Agent runs?
A: Bolt is agentless orchestration (tasks/plans) for ad-hoc or day-2 ops; agent runs enforce declared state continuously. Use both: Bolt for one-off changes, Puppet for desired state.
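Bolt plans can be written in the Puppet language, which makes it possible to mix ad-hoc commands with a declarative apply block. A minimal sketch follows; the module name profile and the plan name are assumptions.

```puppet
# plans/restart_web.pp in a hypothetical module named 'profile'
plan profile::restart_web (
  TargetSpec $targets,
) {
  # Ad-hoc action over SSH/WinRM, with no resident agent involved
  $results = run_command('systemctl restart nginx', $targets)

  # Desired state on demand: prepare the targets, then apply a manifest block
  apply_prep($targets)
  apply($targets) {
    service { 'nginx':
      ensure => running,
      enable => true,
    }
  }

  return $results
}
```

It could be invoked with something like: bolt plan run profile::restart_web targets=web01.example.com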
50) Q: Quick “Web Tier” design?
A: Role role::web → profiles profile::nginx, profile::php, profile::hardening. Hiera for env data (ports, certs). Canary rollout, DL for failure reports, metrics via PuppetDB. Secrets: eyaml + Sensitive. CI: PDK lint/tests; CD: r10k deploy + phased agent runs.