About Matt Ridpath

I am a systems administrator from central North Carolina. My preferred platform is Linux, but I will also get my hands dirty with Windows from time to time. I particularly enjoy working with Puppet, as I find it makes systems easier and more fun to manage, eliminating so many of the repetitive tasks. I fully admit that I'm not the best coder out there, and that my preferred language is Perl. I derive the most satisfaction from the problem-solving part of my job, and from making systems more easily configurable and less static. My interests include kayaking, hiking, car repair, and journal writing.

Set up Percona pt-heartbeat for Monitoring of MySQL Replication

In general, for monitoring standard MySQL replication, it is common practice to check the Seconds_Behind_Master value reported by SHOW SLAVE STATUS. There are cases, however, where Seconds_Behind_Master shows a low number even though replication is in fact broken. To check this more accurately, Percona has written a program called pt-heartbeat, which continuously updates a timestamp in a single row of a single table on the master; the slave's copy of that row is then compared against the current time. More information about the program itself can be found in the guide here. pt-heartbeat is compatible with all forks of MySQL, including MariaDB, Percona Server, and MySQL Community Edition. In this post, I will explain how you can configure this using Puppet and Nagios on a CentOS 6 system (it should be noted that Zabbix can be used for this as well).

One of the critical prerequisites for setting up pt-heartbeat monitoring is that both the master and the slave(s) have their clocks properly synchronized with NTP. If you’re using Puppet, an easy way of managing this is to use the puppetlabs/ntp Forge module. An out-of-sync system clock will skew the reported replication delay, because the check compares a timestamp written on the master with the clock on the slave. This guide also uses Nagios and NRPE with exported configurations through PuppetDB in the example code. However, this is not necessary in order to monitor pt-heartbeat; you would just leave the “@@nagios_service” resources out of your Puppet manifests. Finally, this assumes that you are using the puppetlabs/mysql Forge module to manage MySQL on the server. It is possible, however, to use this example with a manually configured MySQL server; you will need to add the user account and database manually, though.

For this example, I have placed all Puppet files and manifests for pt-heartbeat under a “percona” module. Note: for this example, init.pp is not actually in use and is empty. Below is the directory tree:

percona
├── files
│   ├── heartbeat_master_cfg
│   ├── nrpe_pt_heartbeat_proc
│   └── pt-heartbeat_init
├── manifests
│   ├── heartbeat
│   │   ├── master.pp
│   │   └── slave.pp
│   ├── heartbeat.pp
│   ├── init.pp
│   ├── params.pp
│   └── repo.pp
└── templates
    ├── heartbeat_setup.sql.erb
    └── nrpe_check_mysql_repl.erb

Before setting up anything, you will need Percona’s Yum repository installed on the master and on the slaves. In this guide, it is managed by the percona::repo manifest:

class percona::repo {

  yumrepo { 'percona':
    baseurl    => "http://repo.percona.com/centos/${::operatingsystemmajrelease}/os/x86_64/",
    mirrorlist => absent,
    descr      => 'Percona',
    gpgcheck   => 0,
  }

}

The sole purpose of the percona::heartbeat manifest is to install the required packages, in this case the Percona Toolkit, which contains the pt-heartbeat program, and Percona’s Nagios Plugins:

class percona::heartbeat {

  include percona::repo

  package { [ 'percona-toolkit', 'percona-nagios-plugins' ]:
    ensure  => present,
    require => Yumrepo['percona'],
  }

}

Now create a percona::params class that reads the required parameters in from Hiera:

class percona::params {

  $master_server_id   = hiera('master_server_id', undef)
  $heartbeat_mysql_pw = hiera('heartbeat_mysql_pw', undef)
  $mysql_repl_pw      = hiera('mysql_repl_pw', undef)
  $server_id          = hiera('mysql_server_id', undef)

}

Before continuing, I will attempt to explain the purpose of the above parameters. The $heartbeat_mysql_pw parameter is the password of the heartbeat account that will be used to insert rows into and query the Heartbeat database and table. The $mysql_repl_pw parameter is the password of the account that you use for replication between your master and slave. I chose this account because the pmp-check-mysql-replication-running Nagios plugin requires either the SUPER or REPLICATION CLIENT privilege and this account has the latter. If you so choose, you can also use the root account for this, but I personally try to minimize usage of the root account. As recommended in previous posts, you should encrypt your passwords that you store in Hiera with hiera-eyaml. The $server_id parameter is the server_id of the master running the pt-heartbeat daemon, while $master_server_id is the server_id of the master the Nagios plugin will be checking against on the slave. So why are these not one and the same? This is because there are a number of scenarios where a master may also be a slave, and you may also want to check the replication lag on it. However, if you just have a single master in your server topology, you can probably combine these two parameters.

Now that the params class has been created, we can then proceed to creating the classes for the master and slave configurations. The class percona::heartbeat::master contains all of the resources required to configure pt-heartbeat on your master:

class percona::heartbeat::master (
  $heartbeat_mysql_pw = $percona::params::heartbeat_mysql_pw,
  $server_id          = $percona::params::server_id,
) inherits percona::params {

  include percona::heartbeat

  file { '/usr/local/etc/heartbeat_setup.sql':
    ensure  => file,
    content => template('percona/heartbeat_setup.sql.erb'),
  }

  ::mysql::db { 'heartbeat':
    ensure   => present,
    user     => 'heartbeat',
    password => $heartbeat_mysql_pw,
    host     => 'localhost',
    grant    => 'SELECT',
    sql      => '/usr/local/etc/heartbeat_setup.sql',
    require  => File['/usr/local/etc/heartbeat_setup.sql'],
  }

  file { '/etc/pt-heartbeat':
    ensure  => file,
    source  => 'puppet:///modules/percona/heartbeat_master_cfg',
    owner   => 'root',
    group   => 'root',
    mode    => '0600',
    require => Mysql::Db['heartbeat'],
    notify  => Service['pt-heartbeat'],
  }

  file { '/etc/init.d/pt-heartbeat':
    ensure  => file,
    mode    => '0755',
    owner   => 'root',
    group   => 'root',
    source  => 'puppet:///modules/percona/pt-heartbeat_init',
    require => Package['percona-toolkit'],
  }

  service { 'pt-heartbeat':
    ensure  => running,
    enable  => true,
    require => File['/etc/init.d/pt-heartbeat'],
  }

  file { '/etc/nrpe.d/check_pt_heartbeat_proc.cfg':
    ensure  => file,
    source  => 'puppet:///modules/percona/nrpe_pt_heartbeat_proc',
    owner   => 'root',
    group   => 'nrpe',
    mode    => '0640',
    require => File['/etc/nrpe.d'],
    notify  => Service['nrpe'],
  }

  @@nagios_service { "check_pt_heartbeat_proc_${::hostname}":
    check_command       => 'check_nrpe!check_pt_heartbeat_proc',
    use                 => 'generic-service',
    host_name           => $::fqdn,
    notification_period => '24x7',
    service_description => 'pt-heartbeat Process',
    max_check_attempts  => 3,
  }

}

First, the class reads in the necessary parameters. Second, it pushes out the SQL file that creates the database and table that are required for pt-heartbeat. In this example, the template for this is located at templates/heartbeat_setup.sql.erb, with the below content:

DROP TABLE IF EXISTS `heartbeat`;

CREATE TABLE `heartbeat` (
  `ts` varchar(26) NOT NULL,
  `server_id` int(10) unsigned NOT NULL,
  `file` varchar(255) DEFAULT NULL,
  `position` bigint(20) unsigned DEFAULT NULL,
  `relay_master_log_file` varchar(255) DEFAULT NULL,
  `exec_master_log_pos` bigint(20) unsigned DEFAULT NULL,
  PRIMARY KEY (`server_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

LOCK TABLES `heartbeat` WRITE;
INSERT INTO `heartbeat` (ts, server_id) VALUES (NOW(), <%= @server_id %>);
UNLOCK TABLES;

Once this SQL file is available on the master, the mysql::db defined type can then create the heartbeat database and user, and execute the SQL. Following this, the manifest pushes out the configuration file used by the pt-heartbeat script itself. For my example, this will be located at /etc/pt-heartbeat on the master and will contain the following content:

update
socket=/var/lib/mysql/mysql.sock
database=heartbeat
table=heartbeat
pid=/var/run/pt-heartbeat.pid

You can also specify these settings as command line options when running pt-heartbeat. I prefer, however, to have these in a separate file when using an init script. When starting up pt-heartbeat on boot, you could have it set to start in /etc/rc.local or anacron. However, the proper way of doing this would be to have it run as an init script, as in this example. Below is my simple init script (parts of this script were borrowed from the article here):

#!/bin/bash
#
# chkconfig: 2345 90 10
# description: pt-heartbeat server init script
#
# The chkconfig line above is needed for "chkconfig --add" (and therefore for
# enable => true in Puppet); the priorities are chosen to start after mysqld.
#
# Get functions from the functions library
. /etc/init.d/functions

PIDFILE=/var/run/pt-heartbeat.pid
LOCKFILE=/var/lock/subsys/pt-heartbeat

# Start the service pt-heartbeat
start() {
        if [ ! -f '/etc/pt-heartbeat' ] ; then
            echo "Configuration file not found. Exiting ..."
            exit 1
        fi
        echo -n "Starting pt-heartbeat service: "
        pt-heartbeat --config /etc/pt-heartbeat --defaults-file=/root/.my.cnf --daemonize
        RETVAL=$?
        if [ $RETVAL -eq 0 ] ; then
            ### Create the lock file ###
            touch "$LOCKFILE"
            success $"pt-heartbeat service startup"
        else
            failure $"pt-heartbeat service startup"
        fi
        echo
        return $RETVAL
}
# Stop the service pt-heartbeat
stop() {
        echo -n "Stopping pt-heartbeat service: "
        # Only signal the daemon if a pid file is actually present
        if [ -f "$PIDFILE" ] ; then
            kill "$(cat "$PIDFILE")"
        fi
        ### Now, delete the lock and pid files ###
        rm -f "$PIDFILE"
        rm -f "$LOCKFILE"
        success $"pt-heartbeat service shutdown"
        echo
}
### main logic ###
case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  status)
        status -p "$PIDFILE" -l pt-heartbeat pt-heartbeat
        ;;
  restart)
        stop
        start
        ;;
  *)
        echo $"Usage: $0 {start|stop|restart|status}"
        exit 1
esac
# Pass on the result of the action, so that "status" reports a stopped service
exit $?

Note that this example requires that the root credentials be stored in /root/.my.cnf, which in this example is managed by the puppetlabs/mysql Forge module.

Optionally, you can have NRPE and Nagios check periodically for the pt-heartbeat process. Note, however, that the replication delay check on your slave will also alert if pt-heartbeat isn’t running, since the timestamp stops advancing. Below is the NRPE check, located in /etc/nrpe.d/check_pt_heartbeat_proc.cfg:

command[check_pt_heartbeat_proc]=/usr/lib64/nagios/plugins/check_procs -c 1:1 -a '/usr/bin/pt-heartbeat'
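
For anyone unfamiliar with Nagios range syntax, the "-c 1:1" in the command above means "critical unless exactly one matching process exists". Below is a minimal shell sketch of that rule. The function name check_range is hypothetical and exists only for illustration; it is not part of check_procs.

```shell
#!/bin/bash
# check_range COUNT MIN MAX
# Mimics how a Nagios "MIN:MAX" critical range such as 1:1 is evaluated:
# the result is OK only while COUNT stays inside the range.
# Hypothetical helper for illustration; not part of the Nagios plugins.
check_range() {
    local count="$1" min="$2" max="$3"
    if [ "$count" -lt "$min" ] || [ "$count" -gt "$max" ]; then
        echo "CRITICAL"
        return 2
    fi
    echo "OK"
    return 0
}
```

With a range of 1:1, a count of 0 (daemon not running) and a count of 2 (a second copy was started by accident) are both reported as critical.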

Once you have created the above files, templates, and manifests, include the percona::heartbeat::master class in your master’s catalog and trigger a Puppet run on it. If everything was created successfully and the pt-heartbeat service is running, you should be able to connect to MySQL and run the below query. The timestamp in the “ts” column should be constantly updating:

mysql> SELECT ts FROM heartbeat.heartbeat;
+----------------------------+
| ts                         |
+----------------------------+
| 2015-09-15T09:03:00.003470 |
+----------------------------+
1 row in set (0.01 sec)
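
To make the value in the “ts” column more concrete, here is a minimal shell sketch of the arithmetic behind a delay check: subtract the heartbeat timestamp from the current time. The function name heartbeat_age is hypothetical, the sketch assumes GNU date (as shipped with CentOS), and it ignores the fractional seconds. It is not how Percona's plugin is actually implemented.

```shell
#!/bin/bash
# heartbeat_age TS [NOW]
# Prints the whole seconds between a pt-heartbeat timestamp and NOW.
# Both arguments look like 2015-09-15T09:03:00.003470 and are read in the
# local time zone; NOW defaults to the current time.
# Assumes GNU date. Hypothetical helper, not part of Percona Toolkit.
heartbeat_age() {
    local ts="${1%%.*}"          # drop the fractional seconds
    local ts_epoch now_epoch
    ts_epoch=$(date -d "${ts/T/ }" +%s) || return 1
    if [ -n "$2" ]; then
        local now="${2%%.*}"
        now_epoch=$(date -d "${now/T/ }" +%s) || return 1
    else
        now_epoch=$(date +%s)
    fi
    echo $(( now_epoch - ts_epoch ))
}
```

On the master the result should stay close to zero while the daemon is running. On a slave it approximates the replication delay, which is also why an unsynchronized clock distorts the number.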

Now we can proceed to setting up the checks on the slave. First, create the class containing all of the resources needed to configure the slave:

class percona::heartbeat::slave (
  $heartbeat_mysql_pw = $percona::params::heartbeat_mysql_pw,
  $mysql_repl_pw      = $percona::params::mysql_repl_pw,
  $master_server_id   = $percona::params::master_server_id,
) inherits percona::params {

  include percona::heartbeat

  file { '/etc/nrpe.d/check_mysql_repl.cfg':
    ensure  => file,
    content => template('percona/nrpe_check_mysql_repl.erb'),
    owner   => 'root',
    group   => 'nrpe',
    mode    => '0640',
    require => File['/etc/nrpe.d'],
    notify  => Service['nrpe'],
  }

  @@nagios_service { "check_mysql_repl_delay_${::hostname}":
    check_command       => 'check_nrpe!check_mysql_repl_delay',
    use                 => 'generic-service',
    host_name           => $::fqdn,
    notification_period => '24x7',
    service_description => 'MySQL Replication Delay',
    max_check_attempts  => 3,
  }

  @@nagios_service { "check_mysql_repl_running_${::hostname}":
    check_command       => 'check_nrpe!check_mysql_repl_running',
    use                 => 'generic-service',
    host_name           => $::fqdn,
    notification_period => '24x7',
    service_description => 'MySQL Replication Running',
    max_check_attempts  => 3,
  }

}

Unlike the class for the master, only a single file is needed to configure the slave, /etc/nrpe.d/check_mysql_repl.cfg. The template for this contains the below content:

command[check_mysql_repl_delay]=/usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay -H localhost -l heartbeat -p <%= @heartbeat_mysql_pw %> -T heartbeat.heartbeat -s <%= @master_server_id %>
command[check_mysql_repl_running]=/usr/lib64/nagios/plugins/pmp-check-mysql-replication-running -H localhost -l repl -p <%= @mysql_repl_pw %>

The top NRPE command, check_mysql_repl_delay, executes Percona’s replication delay plugin. This compares the timestamp in the heartbeat table to the system time and raises a warning or a critical alert depending on how many seconds behind it is. You can optionally specify the warning and critical thresholds, in seconds, with the -w and -c options, respectively. The defaults are 300 seconds for the warning threshold and 600 seconds for the critical threshold. The bottom command, check_mysql_repl_running, checks whether replication is running and, if not, alerts Nagios.
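
As a rough illustration of how the two thresholds map a delay onto a Nagios state, here is a small shell sketch. The function name classify_delay is hypothetical, and whether a delay exactly equal to a threshold triggers the alert is my assumption rather than something taken from the plugin's source.

```shell
#!/bin/bash
# classify_delay DELAY [WARN] [CRIT]
# Prints a Nagios-style status word and returns the matching exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL). The defaults mirror the 300 and 600
# second thresholds described above.
# Hypothetical helper; the boundary handling (>=) is an assumption.
classify_delay() {
    local delay="$1" warn="${2:-300}" crit="${3:-600}"
    if [ "$delay" -ge "$crit" ]; then
        echo "CRITICAL"
        return 2
    elif [ "$delay" -ge "$warn" ]; then
        echo "WARNING"
        return 1
    fi
    echo "OK"
    return 0
}
```

For example, classify_delay 45 30 60 corresponds to running the plugin with -w 30 -c 60 against a slave that is 45 seconds behind, which yields a warning.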

One additional note: if you are running MySQL version 5.6 or later, the pmp-check-mysql-replication-delay will complain if you specify the password in the NRPE command; in Nagios, it will display the message, “Warning: Using a password on the command line interface can be insecure”, even if the check itself is green. The solution is to create a file located at /etc/nagios/my.cnf that contains the user name and password of the heartbeat account:

# /etc/nagios/my.cnf file resource
  file { '/etc/nagios/my.cnf':
    ensure  => file,
    content => template('percona/nagios_my_cnf.erb'),
    owner   => 'root',
    group   => 'nrpe',
    mode    => '0640',
    require => Package['nrpe'],
  }

# templates/nagios_my_cnf.erb
[client]
user=heartbeat
host=localhost
password='<%= @heartbeat_mysql_pw %>'
socket=/var/lib/mysql/mysql.sock

# check_mysql_repl_delay command
command[check_mysql_repl_delay]=/usr/lib64/nagios/plugins/pmp-check-mysql-replication-delay --defaults-file /etc/nagios/my.cnf -T heartbeat.heartbeat -s <%= @master_server_id %>

Include the percona::heartbeat::slave class in your slave’s catalog and trigger a Puppet run on it. If you’re using exported resources to manage your Nagios configuration, the two Nagios checks should be automatically configured on your Nagios server. MySQL replication is now being monitored more accurately with pt-heartbeat and Nagios.

Configure a MariaDB Galera Cluster with Puppet

One of the ways in which you can have multi-master replication with high-availability in MySQL is to set up a Galera cluster. As an interesting project in order to learn more about Galera, I decided to see if it would be possible to build out a cluster using automation with Puppet and the MySQL Forge module. I was able to test this successfully using either MariaDB 5.5 or 10.0 on CentOS 6. In this example, I configured this on three systems, which is the recommended minimum number of cluster members for Galera.

The first thing you will need when installing this is the MariaDB 5.5 or 10.0 Yum repository. To configure this, create a class similar to below:

class galera::repo {

  yumrepo { 'mariadb55':
    baseurl  => "http://yum.mariadb.org/5.5/centos${::operatingsystemmajrelease}-amd64/",
    descr    => 'MariaDB 5.5',
    gpgcheck => 0,
  }

}

The only package that you need to have Puppet install with a specific package resource is the Galera package itself. The other packages are installed by the MySQL Forge module. I’ve created an “install” class to manage this single resource below:

class galera::install {

  package { 'galera':
    ensure  => present,
    require => Yumrepo['mariadb55']
  }

}

You need to ensure that the puppetlabs/mysql module is available on your Puppet master. If you’re using R10K to manage your modules, just add the following to your Puppetfile:

mod 'puppetlabs/mysql'

If not, you would install it by running sudo puppet module install puppetlabs-mysql. If you’re using the puppetlabs/firewall module to manage iptables on your nodes, you will also need to create a class that opens the necessary TCP ports for Galera:

class galera::firewall {

  firewall { '102 open ports for galera cluster':
    state  => 'NEW',
    dport  => ['3306', '4567', '4568', '4444'],
    proto  => 'tcp',
    action => 'accept',
  }

}

Next, create a class for your parameters that are read in from Hiera:

class galera::params {

  $galera_cluster = hiera('galera_cluster')
  $galera_servers = any2array(hiera('galera_servers'))
  $sst_pw         = hiera('sst_pw')
  $root_password  = hiera('mysql_root_pw')

}

Before going on to the next step, I should explain the purpose of the above parameters. $galera_cluster is the name assigned to the cluster itself. If you would like, you can leave this out altogether and just set it statically in your configuration file. However, if you are going to have multiple Galera clusters that use this module, you will need this parameter. $galera_servers is an array of the servers in the cluster, excluding the IP of the server itself. So if you have three servers in your cluster with IPs of 192.168.1.50, 192.168.1.51, and 192.168.1.52, then in the host Hiera YAML file for 192.168.1.50 you will specify galera_servers as the following:

galera_servers:
  - '192.168.1.51'
  - '192.168.1.52'

And so forth for 192.168.1.51 and 192.168.1.52. $root_password is of course the MySQL root account password. Since this is not hashed and is stored as plain text, I recommend setting up EYAML to encrypt it. More information on that can be found here. Finally, $sst_pw is the plain-text password for the SST user that Galera uses to sync data between the nodes in the cluster. Simply using the root account as the SST user would work as well, but for security purposes I recommend setting up a separate SST account that only has the permissions required to perform replication.

Next, create a configuration class that sets up the MariaDB client and server, creates the required account, and configures the Galera cluster:

class galera::config (
  $galera_cluster = $galera::params::galera_cluster,
  $galera_servers = $galera::params::galera_servers,
  $sst_pw         = $galera::params::sst_pw,
  $root_password  = $galera::params::root_password,
) inherits galera::params {

  class { '::mysql::client':
    package_name => 'MariaDB-client',
    require      => Yumrepo['mariadb55'],
  }

  class { '::mysql::server':
    package_name            => 'MariaDB-Galera-server',
    remove_default_accounts => true,
    service_enabled         => true,
    service_manage          => true,
    service_name            => 'mysql',
    root_password           => $root_password,
    override_options        => {
      'mysqld' => {
        'bind-address'           => '0.0.0.0',
        'pid-file'               => '/var/lib/mysql/mysqld.pid',
        'binlog-format'          => 'ROW',
        'default-storage-engine' => 'innodb',
      },
    },
    require                 => Yumrepo['mariadb55'],
  }

  ::mysql_user { 'sst@%':
    ensure        => present,
    password_hash => mysql_password($sst_pw),
  }

  ::mysql_grant { 'sst@%/*.*':
    ensure     => present,
    options    => ['GRANT'],
    privileges => [ 'RELOAD', 'LOCK TABLES', 'REPLICATION CLIENT' ],
    table      => '*.*',
    user       => 'sst@%',
  }

  file { '/etc/my.cnf.d/server.cnf':
    ensure  => file,
    content => template('galera/server.cnf.erb'),
    owner   => 'mysql',
    group   => 'mysql',
    mode    => '0640',
    require => [ Package['mysql-server'], Package['galera'], ],
  }

}

First, this class reads in the necessary parameters from the galera::params class. It then calls the MySQL Forge module to install and configure the MariaDB client and server. The SST user is added and configured with the minimum privileges required. Finally, the template for /etc/my.cnf.d/server.cnf, which contains the MariaDB/Galera-specific settings, is pushed out with the appropriate permissions.

If you have read the code closely, you might notice that this file resource is missing a “notify => Service['mysqld']” that would refresh the service each time the file is changed (the puppetlabs/mysql module titles its service resource mysqld, whatever the service_name is). This is deliberate: in order to start a Galera cluster, you need to bootstrap the first server in the cluster, and a notify would produce an error on that first server because it calls /etc/init.d/mysql restart, which in turn errors out.

For the purposes of this example, I’ve started up the cluster manually on each server. For the first system, after running Puppet, you would run sudo service mysql bootstrap. For each subsequent system, after running Puppet, you would execute sudo service mysql restart. However, if you’re super-obsessed with making the whole setup automated, you can add code that does this. First add the following to your params class:

  $bootstrap = hiera('mysql_bootstrap', false)

  $init_command = $bootstrap ? {
    true  => 'service mysql stop && service mysql bootstrap',
    false => 'service mysql restart',
  }

For the host hiera YAML file for the first server in the cluster, add this parameter: mysql_bootstrap: true. Then add the below to your configuration class:

# Additional parameter at the top
  $init_command   = $galera::params::init_command,
  ...
  exec { 'init_galera':
    command => $init_command,
    unless  => 'test -f /var/lib/mysql/galera.cache',
    path    => '/bin:/sbin:/usr/bin:/usr/sbin',
    require => [ Service['mysqld'], Mysql_user['sst@%'], Mysql_grant['sst@%/*.*'], File['/etc/my.cnf.d/server.cnf'], ],
  }

Again, I don’t recommend this approach for building a production system, but I wanted to show that it’s possible.

Next, create a template for /etc/my.cnf.d/server.cnf. I’ve placed this particular template in galera/templates/server.cnf.erb.

[mariadb]
query_cache_size=0
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://<%= @galera_servers.join(',') %>
wsrep_cluster_name='<%= @galera_cluster %>'
wsrep_node_address='<%= @ipaddress %>'
wsrep_node_name='<%= @hostname %>'
wsrep_sst_method=rsync
wsrep_sst_auth=sst:<%= @sst_pw %>
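
Since every node needs its own list of peers, it is easy to make a copy-and-paste mistake in the per-host Hiera files. Below is a minimal shell sketch that prints the wsrep_cluster_address value a given node should end up with, so you can compare it against the rendered server.cnf. The function name galera_peers is hypothetical; the Puppet template above takes the list from Hiera rather than computing it.

```shell
#!/bin/bash
# galera_peers OWN_IP IP [IP...]
# Prints the gcomm:// address for one node: every cluster member except the
# node itself, comma-separated.
# Hypothetical helper for checking the rendered configuration by hand.
galera_peers() {
    local own="$1" ip peers=""
    shift
    for ip in "$@"; do
        if [ "$ip" != "$own" ]; then
            peers="${peers:+$peers,}$ip"
        fi
    done
    echo "gcomm://$peers"
}
```

For the three example servers, galera_peers 192.168.1.50 192.168.1.50 192.168.1.51 192.168.1.52 prints the address expected on 192.168.1.50.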

Finally, create a base manifest that includes all of the above classes:

class galera {

  include galera::config
  include galera::firewall
  include galera::install
  include galera::repo

}

You can now commit the code to the repository that contains your Puppet code, include the class in your server’s manifest, and execute Puppet on each one of the Galera nodes. If you’ve chosen to manage the service manually, the order in which you run Puppet on the nodes does not matter. To bootstrap the cluster, on any one of the nodes execute sudo service mysql bootstrap. On all remaining servers, execute sudo service mysql restart. On any of the servers, the commands below should then return results similar to these:

matthew@mariadb1:~$ sudo mysql -e "show status like 'wsrep_connected'\G"
*************************** 1. row ***************************
Variable_name: wsrep_connected
        Value: ON
matthew@mariadb1:~$ sudo mysql -e "show status like 'wsrep_incoming_addresses'\G;"
*************************** 1. row ***************************
Variable_name: wsrep_incoming_addresses
        Value: 192.168.87.53:3306,192.168.87.56:3306,192.168.87.52:3306

Your Galera cluster is now ready to accept connections. When you insert data on any one of the servers, it should replicate to the other servers in the cluster. In future blog posts, I will explore different methods for load balancing a cluster.
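
If you would rather script the check than read the output by eye, the following shell sketch extracts the Value field from the \G-style output and counts the members listed in wsrep_incoming_addresses. Both function names are hypothetical, and the sketch assumes the output layout shown in the example above.

```shell
#!/bin/bash
# wsrep_value
# Reads "SHOW STATUS ... \G" output on stdin and prints the Value field.
# Hypothetical helper; assumes the vertical output layout of the mysql client.
wsrep_value() {
    awk -F': ' '/^ *Value:/ { print $2 }'
}

# count_members ADDRESSES
# Prints how many comma-separated host:port entries ADDRESSES contains.
count_members() {
    local IFS=','
    local -a members
    read -r -a members <<< "$1"
    echo "${#members[@]}"
}
```

A possible use is to pipe the second command above into wsrep_value and pass the result to count_members; on a healthy three-node cluster that should print 3. Galera also exposes the member count directly through the wsrep_cluster_size status variable.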

Author’s note: in a previous revision of this post, I had separate parameters for the SST user’s password and password hash. Now, having learned about the mysql_password() function in the puppetlabs/mysql Forge module, I’ve removed the password hash parameter to simplify the directions somewhat.

Monitoring Windows with Nagios and Exported Resources in Puppet

Suppose you would like to monitor your Windows systems alongside your Linux systems in Nagios, but want to avoid configuring every single one manually. Also, you want these systems to be removed from Nagios automatically when you decommission them. Puppet, NSClient++, and Chocolatey provide an excellent means for accomplishing this. In this post I will explain how I got this up and running on my CentOS 6 Nagios server.

This article assumes that you’re already using PuppetDB with exported resources in your Puppet environment. As this is outside the scope of this guide, I’m not going to go into how to set this up. However, it is fairly simple to do, especially with the puppetlabs/puppetdb Forge module. If you’re using Puppet Enterprise, you’re likely already using it. This also assumes that you’re using Puppet to manage your Nagios server. In the class that manages your Nagios server, you will need these two lines, to collect the exported resources from your nodes:

  Nagios_host    <<||>> { notify => Service['nagios'] }
  Nagios_service <<||>> { notify => Service['nagios'] }

You will need to ensure that you have Chocolatey available as a package provider for your Windows nodes. My previous post explains how to configure Chocolatey with Puppet. Chocolatey will be used to install NSClient++, which is a monitoring agent for Windows that includes an NRPE server with which Nagios can interface. The documentation on NSClient++ can be found here.

To prepare your Nagios server for managing Windows hosts, first ensure that you have a Nagios command that can initiate NRPE commands against Windows clients. Because the check_nrpe plugin will throw SSL errors against NSClient++, I’ve defined a nagios_command resource like below for Windows systems that does not use SSL:

  nagios_command { 'check_nrpe_win':
    command_name => 'check_nrpe_win',
    command_line => '$USER1$/check_nrpe -H $HOSTADDRESS$ -n -c $ARG1$ -t 30',
  }

Next, define a standard Nagios hostgroup for Windows that includes basic items to check, such as disk space. You may need to alter this for your own environment. I’ve put this under its own class, nagios::server::hostgroup_windows.

class nagios::server::hostgroup_windows {

  nagios_hostgroup { 'windows_hosts':
    alias => 'Windows Hosts',
  }

  nagios_service { 'check_win_cpu':
    check_command       => 'check_nrpe_win!check_cpu',
    use                 => 'generic-service',
    hostgroup_name      => 'windows_hosts',
    notification_period => '24x7',
    service_description => 'CPU Load',
    max_check_attempts  => 3,
  }

  nagios_service { 'check_win_mem':
    check_command       => 'check_nrpe_win!alias_memory',
    use                 => 'generic-service',
    hostgroup_name      => 'windows_hosts',
    notification_period => '24x7',
    service_description => 'Memory Usage',
    max_check_attempts  => 3,
  }

  nagios_service { 'check_win_drives':
    check_command       => 'check_nrpe_win!alias_space',
    use                 => 'generic-service',
    hostgroup_name      => 'windows_hosts',
    notification_period => '24x7',
    service_description => 'Disk Usage',
    max_check_attempts  => 3,
  }

  nagios_service { 'check_rdp':
    check_command       => 'check_rdp',
    use                 => 'generic-service',
    hostgroup_name      => 'windows_hosts',
    notification_period => '24x7',
    service_description => 'RDP',
    max_check_attempts  => 3,
  }

}

Next, create a template for nsclient.ini, which is the main configuration file for NSClient++. The one I’ve created for this guide is relatively simple; you can refer to the NSClient++ documentation for more options. The template, in my case, is located at nagios/templates/nsclient.ini.erb.

[/modules]
CheckSystem=enabled
CheckDisk=enabled
CheckExternalScripts=enabled
NRPEServer=enabled

[/settings/default]
allowed hosts = your_nagios_server_IP

[/settings/NRPE/server]
use ssl = false
allow arguments = true
allow nasty characters = false
port = 5666

[/settings/external scripts/alias]
alias_memory = check_memory "warn=free < 10%" "crit=free < 5%"
alias_space = check_drivesize "warn=free < 10%" "crit=free < 5%" drive=*

Finally, create a class that installs NSClient++ and manages the service on the Windows systems, and exports the nagios_host resource to the Nagios server.

class nagios::windows {

  package { 'nscp':
    ensure => present,
  }

  service { 'nscp':
    ensure  => running,
    enable  => true,
    require => Package['nscp'],
  }

  file { 'C:/Program Files/NSClient++/nsclient.ini':
    ensure  => file,
    content => template('nagios/nsclient.ini.erb'),
    require => Package['nscp'],
    notify  => Service['nscp'],
  }

  @@nagios_host { $::fqdn:
    ensure             => present,
    alias              => $::hostname,
    address            => $::ipaddress,
    hostgroups         => 'windows_hosts',
    use                => 'generic-host',
    max_check_attempts => 3,
    check_command      => 'check_ping!100.0,20%!500.0,60%',
  }

}

Once you have committed these changes to your Puppet repository, you would then include nagios::windows in the catalog for your Windows host and trigger a Puppet run on it to install NSClient++, enable the service, and export the resource to your Nagios server. Then, execute a Puppet run on your Nagios server. The result should hopefully be like below.

[Image: Nagios Windows]

And now your host should be available in Nagios for monitoring.

One additional note: if you include the Windows Hosts host group in the catalog of your Nagios server, you must then have at least one Windows nagios_host exported. Otherwise, Nagios will not start, as it does not allow empty host groups.

How I Configured Chocolatey with Puppet

Chocolatey is an apt-like package manager for Windows (https://chocolatey.org) that greatly simplifies the installation of software, especially with Puppet (versus having to call MSI packages with obscure switches that may or may not work). Many of my future tutorials that involve managing Windows with Puppet will require that Chocolatey be configured. Here I will explain how I’ve gotten Chocolatey up and running on Windows with Puppet.

This guide assumes that you have Puppet already installed on Windows. If you’re familiar with installing Puppet on Linux systems, it’s about the same for Windows. You would download and install the MSI package from the link here and afterwards sign the certificate request on your master server. You will also need to install the chocolatey/chocolatey and puppetlabs/powershell Forge modules. If you’re using R10K to manage your modules, just add the following to your Puppetfile:

mod 'puppetlabs/powershell'
mod 'chocolatey/chocolatey'

Otherwise, just install them using sudo puppet module install puppetlabs-powershell and sudo puppet module install chocolatey-chocolatey. Once these have been installed, I would then recommend defining some default parameters for the package and file resources at the top scope, in site.pp.

if $::kernel == 'windows' {
  File {
    owner              => undef,
    group              => undef,
    source_permissions => 'ignore',
  }

  Package {
    provider => 'chocolatey',
  }
}

These tell Puppet not to attempt to apply *nix-style permissions to Windows file resources and to use Chocolatey as the default provider for packages. Now create a class that installs Chocolatey itself. Since the chocolatey/chocolatey module currently is not capable of installing Chocolatey, your class will need to install it using an exec resource. I’ve named my class windows::chocolatey and have created it under windows/manifests/chocolatey.pp.

class windows::chocolatey {

  exec { 'install_chocolatey':
    command  => "set-executionpolicy unrestricted -force -scope process; (iex ((new-object net.webclient).DownloadString('https://chocolatey.org/install.ps1')))>\$null 2>&1",
    provider => 'powershell',
    creates  => 'C:/ProgramData/chocolatey',
  }

}

The above command for installing Chocolatey is from the Chocolatey installation guide. It may change in the future, so you should refer to that page before setting up your exec. If this is for a lab or evaluation environment, you may also want to have Puppet use Chocolatey to keep itself up to date with the latest Chocolatey release by adding the following to the class:

  package { 'chocolatey':
    ensure  => latest,
    require => Exec['install_chocolatey'],
  }

Once you have created this module and committed it to the repository containing your custom modules, include the Chocolatey class (windows::chocolatey) in the catalog for your Windows node and initiate a Puppet run on it to apply the class. Now you can use Puppet to manage packages that have been made available by contributors to the Chocolatey project. A full listing can be found here. To manage a particular package with Puppet, declare it the same way you would a package on Linux:

class windows::git {

  package { 'git':
    ensure => installed,
  }

}
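
Packages managed this way depend on Chocolatey already being present, so it can help to state that ordering explicitly. A minimal sketch, assuming a hypothetical windows::sevenzip class and the 7zip package name from the Chocolatey community repository (neither appears above):

```puppet
# Hypothetical class; the class name and the '7zip' package are assumptions.
class windows::sevenzip {

  # Make sure the class that bootstraps Chocolatey is in the catalog.
  include windows::chocolatey

  package { '7zip':
    ensure   => installed,
    # Spelled out for clarity; the Package default in site.pp already sets this.
    provider => 'chocolatey',
    # Do not attempt the install until Chocolatey itself has been applied.
    require  => Class['windows::chocolatey'],
  }

}
```

The explicit require only matters on the first run, when Chocolatey and the package are applied in the same catalog.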

Why You Should Always Test Out New Puppet Forge Modules Before Upgrading

Modules from the Puppet Forge are a great resource. They simplify getting your systems managed under Puppet and save you from having to start from scratch in many cases. I have the utmost gratitude to the individuals who write or contribute to these modules. With that said, the people who contribute to them will sometimes introduce new features that diverge significantly from the previous behavior, causing things to break. Because of this, in my lab environment I set most of the modules in my Puppetfile to ensure that the latest version is installed at all times, so that I can spot any potential issues. DO NOT DO THIS IN A PRODUCTION ENVIRONMENT, HOWEVER!

mod 'puppetlabs/concat', :latest
mod 'puppetlabs/firewall', :latest
mod 'puppetlabs/git', :latest

Prior to upgrading a Forge module, you should review the release notes for every version between the one you’re running and the one you’re upgrading to. You should apply this level of caution to “Puppet Labs-supported” modules as well as those created by community members. One example I recently encountered was with the puppetlabs/puppetdb module. As noted in the change log, the latest version, 5.0.0, has the manage_pg_repo parameter set to true by default, so it will install a version of Postgres from the Postgres Yum repo, while the previous version, 4.3.0, had this off by default, and thus used CentOS’s Postgres package. The demonstration below, in a Vagrant box running CentOS 6, shows the result of me blindly upgrading.

First, I create a Vagrantfile that builds the initial environment:

$shell = <<SHELL
puppet module install puppetlabs/puppetdb --version=4.3.0 \
--modulepath=/vagrant/modules
SHELL

Vagrant.configure('2') do |config|
  config.vm.box = 'puppetlabs/centos-6.6-64-puppet'
  
  config.vm.provision "shell", inline: $shell

  config.vm.provision "puppet" do |puppet|
    puppet.module_path = "modules"
  end

end

In my Vagrant project folder, I create a simple manifests/default.pp manifest that just includes the PuppetDB module:

node default {

  include puppetdb

}

Issue a vagrant up and watch while it installs a PuppetDB server running on the OS’s Postgres package. Once it completes, vagrant ssh in and start up PuppetDB. It should be functioning properly.

Now I’m going to break PuppetDB with the updated module. First, run sudo puppet module install puppetlabs/puppetdb --modulepath=/etc/puppet/modules, so that it installs the latest version of the module inside the environment. Then run sudo puppet apply /vagrant/manifests/default.pp to apply the manifest you created when provisioning the VM.

[Screenshot: PuppetDB module failures]

Guess what? If this were a production system, you would now be fixing PuppetDB. The module installed PostgreSQL 9.4 alongside the OS’s 8.x Postgres, causing a conflict and preventing it from starting. It’s quite apparent that the individuals who put this in did not actually test for this scenario. They probably assumed that everyone installing their module would be doing so on a clean system, which is often not the case.

The lesson here: Puppet Forge modules are an excellent resource, but you should research and, if possible, test the changes made in the latest versions before upgrading your production environment. Otherwise, there’s always the possibility you may break something even more critical than PuppetDB.
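
In the PuppetDB case above, one way to upgrade the module without switching Postgres packages would be to turn the repository management back off when declaring the class. A minimal sketch, assuming the puppetdb class exposes this setting as manage_package_repo (that name is an assumption here; confirm it against the README of the version you install):

```puppet
node default {

  # Assumed parameter name: manage_package_repo. Setting it to false should
  # keep the module from adding the upstream Postgres Yum repo, so the
  # OS-provided Postgres package stays in use.
  class { 'puppetdb':
    manage_package_repo => false,
  }

}
```

Applying this in the same Vagrant box before touching production would confirm whether it behaves that way.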

How to Set Up an OpenVPN Server with Puppet

Recently I decided that I wanted to set up an OpenVPN server so that I could access my lab environment at home over an encrypted connection. In order to automate the configuration of this, I decided to use the luxflux/openvpn Forge module. These directions are for a CentOS/Enterprise Linux 6 system.

One thing I noticed in the directions on the GitHub page for the module is that they mention a class called openvpn::servers, which supposedly allows you to set up the module using class parameters in Hiera. This class does not actually exist in the current version of the module, which I quickly learned after attempting to add it in my ENC. Instead, I had to create my own module with its own set of parameters that called the openvpn::server defined type to configure my VPN server.

First, I created a module called site_openvpn in my set of custom modules, with a params class that contains the list of parameters to be read in from Hiera.

class site_openvpn::params {

  $vpn_clients = hiera('vpn_clients')
  $vpn_lcl_net = hiera('vpn_lcl_net')
  $vpn_rem_ip  = hiera('vpn_rem_ip', $::ipaddress)
  $vpn_subnet  = hiera('vpn_subnet')

}

The meanings of the above parameters are:

  • $vpn_clients: an array of your VPN client names, for example test_laptop.
  • $vpn_lcl_net: the network address of your LAN, such as your home or office network. The manifests below assume a /24 netmask.
  • $vpn_rem_ip: the IP your clients will connect to. This may differ from the IP of your VPN server if it resides behind a NAT. It defaults to the server's own IP address if not set in Hiera.
  • $vpn_subnet: the network your VPN clients will connect on, also assumed to be a /24.

Create a firewall manifest that opens the necessary ports for OpenVPN:

class site_openvpn::firewall (
  $vpn_subnet = $site_openvpn::params::vpn_subnet,
) inherits site_openvpn::params {

  firewall { '120 allow 1194/TCP for OpenVPN':
    state  => 'NEW',
    dport  => '1194',
    proto  => 'tcp',
    action => 'accept',
  }

  firewall { '121 allow TUN connections':
    chain   => 'INPUT',
    proto   => 'all',
    iniface => 'tun+',
    action  => 'accept',
  }

  firewall { '122 forward TUN forward connections':
    chain   => 'FORWARD',
    proto   => 'all',
    iniface => 'tun+',
    action  => 'accept',
  }

  firewall { '123 tun+ to eth0':
    chain    => 'FORWARD',
    proto    => 'all',
    iniface  => 'tun+',
    outiface => 'eth0',
    state    => [ 'ESTABLISHED', 'RELATED' ],
    action   => 'accept',
  }

  firewall { '124 eth0 to tun+':
    chain    => 'FORWARD',
    proto    => 'all',
    iniface  => 'eth0',
    outiface => 'tun+',
    state    => [ 'ESTABLISHED', 'RELATED' ],
    action   => 'accept',
  }

  firewall { '125 POSTROUTING':
    table    => 'nat',
    proto    => 'all',
    chain    => 'POSTROUTING',
    source   => "${vpn_subnet}/24",
    outiface => 'eth0',
    jump     => 'MASQUERADE',
  }

}

Create an init.pp manifest that declares the openvpn::server defined type from luxflux/openvpn with your parameters. Note that for dhcp-option DNS, I am using the OpenVPN server itself as a caching/forwarding DNS server, because I’ve run into problems with DNS traffic not forwarding correctly. You can point this to another server, however.

class site_openvpn (
  $vpn_clients = $site_openvpn::params::vpn_clients,
  $vpn_lcl_net = $site_openvpn::params::vpn_lcl_net,
  $vpn_rem_ip  = $site_openvpn::params::vpn_rem_ip,
  $vpn_subnet  = $site_openvpn::params::vpn_subnet,
) inherits site_openvpn::params {

  include ::openvpn
  include site_openvpn::firewall

  ::openvpn::server { $::fqdn:
    country      => 'US',
    province     => 'NC',
    city         => 'Your City',
    organization => $::domain,
    email        => "root@${::domain}",
    server       => "${vpn_subnet} 255.255.255.0",
    route        => [
      "${vpn_lcl_net} 255.255.255.0",
    ],
    push         => [
      "route ${vpn_lcl_net} 255.255.255.0",
      'redirect-gateway def1 bypass-dhcp',
      "dhcp-option DNS ${::ipaddress_tun0}",
    ],
    c2c          => true,
  }

  # Create the VPN client configs
  ::openvpn::client { $vpn_clients:
    server      => $::fqdn,
    remote_host => $vpn_rem_ip,
  }

}

You will also need to set net.ipv4.ip_forward to 1 in /etc/sysctl.conf, so that the traffic can be forwarded through your OpenVPN server to the local LAN. I have a separate class that manages this, but you can include the code in your site_openvpn manifest as well:

sysctl/manifests/init.pp:

class sysctl {

  file { '/etc/sysctl.conf':
    ensure => file,
  }

  exec { 'sysctl -p':
    command     => '/sbin/sysctl -p',
    refreshonly => true,
    subscribe   => File['/etc/sysctl.conf'],
  }

}

sysctl/manifests/ip_forward.pp:

class sysctl::ip_forward {

  include sysctl

  augeas { 'sysctl_ip_forward':
    context => '/files/etc/sysctl.conf',
    onlyif  => "get net.ipv4.ip_forward == '0'",
    changes => "set net.ipv4.ip_forward '1'",
    notify  => Exec['sysctl -p'],
  }

}

Finally, create the Hiera parameters for the host. I’m creating these in a .yaml file for the host:

---
vpn_rem_ip: '192.168.1.67'
vpn_subnet: '192.168.100.0'
vpn_lcl_net: '192.168.1.0'
vpn_clients:
  - 'iphone6'
  - 'macbook'

Include the site_openvpn and sysctl::ip_forward classes in your node definition or ENC if applicable, and run Puppet on the OpenVPN server. This will install OpenVPN, configure the server, and generate the client configs.
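
As a minimal sketch of the node definition approach, assuming a hypothetical hostname of vpn.example.com:

```puppet
# 'vpn.example.com' is a placeholder; use your OpenVPN server's certname.
node 'vpn.example.com' {

  include site_openvpn       # VPN server, client configs, firewall rules
  include sysctl::ip_forward # sets net.ipv4.ip_forward to 1

}
```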

Once installation is complete, the .ovpn client config files will be accessible at /etc/openvpn//download-configs. You can SCP these files to your device and import them into a Mac (such as Tunnelblick), Windows, Linux, or mobile OpenVPN client.

Author’s note: in my previous post, I forgot to include proto => 'all' in some of the firewall resources, which would prevent UDP traffic such as DNS from forwarding correctly.