Using Certbot To Generate Certificates Without SSH Access
This Blog has until now not had a publicly verifiable SSL certificate. This means we've avoided publicising it and it means all logins have been in the clear. The only HTTPS on the domain name has been provided by a privately signed certificate with our own root certificate that no one else has. We have not been prepared to pay £140 + VAT for a wildcarded SSL certificate for a privately owned website. So we've investigated Let's Encrypt's software to see if it can provide us with a solution. Initially, no. We did not meet the pre-requisites since our cPanel did not have the Let's Encrypt plugin, nor were our hosting providers likely to install it since they make money from selling certificates, nor do we have SSH access.
Out of curiosity, my son decided to play with certbot anyway. The thought was that we might be able to FTP the certificate to the web server and install it by a back door. Actually, we got lucky and discovered we had cPanel API access. This meant we could automate the installation from a UNIX host we owned, that UNIX host being one of our Raspberry Pis. We just needed to customise the execution of certbot, create an API key and learn how to make the API interface work. Here's our solution.
- Requirements
- The Plan
- Implementation
- Getting started with certbot
- Automating DNS TXT Records
- Serial Numbers
- Line Index
- Get TXT Record
- Update TXT Record
- Remove TXT Record
- Helper Functions
- Automating Certificate Installations
- Integration for Certificate Creation
- Configuration
- --manual-auth-hook
- --manual-cleanup-hook
- Initial Certificate Creation
- DNS Propagation Concerns
- Script for Initial Certificate Creation
- Wildcarded Domains
- Renewal
- Library Functions for cPanel API
- Script for Certificate Renewal
- Conclusions
- Acknowledgements
- References
Requirements
The prerequisite is access to an always-on Linux machine that can be configured to host the certbot software and some of our own. We have a Raspberry Pi to hand for this. The Pi can be configured to provide the following:
- Provide a certificate with wildcarded domains
- Install to multiple domains via cPanel
- Automatically renew on expiry (since these certificates last only 3 months)
The Plan
There was a choice to make about Python versus Bash for the scripting language. In this case, we decided that Bash was more appropriate for the additional scripting required since Bash is closer to the operating system for file and process commands, and we already had utilities such as curl (for HTTP requests) and jq for JSON response parsing installed. (And Dad needs to learn Python first.)
Implementation
Getting started with certbot
You will need to install certbot and register for an account (at the command line). The recent version (1.21.0 for us) uses the Snap installer rather than yum or apt. We did have one problem with our Snap installation on a Raspberry Pi. We had to prevent a shared library preloading so that the Snap installation of hello-world worked without a warning message. That meant commenting out the one and only line in /etc/ld.so.preload as the library referenced could not be found.
certbot register --email your@email.address --no-eff-email
Using the service is subject to terms and conditions, which require you to revoke certificates on compromise etc. What you really want to be aware of and watch are the rate limits. Between us, we failed to keep track of those and of what we were doing, so we were locked out for a short while.
Automating DNS TXT Records
For all the DNS routines two things are necessary:
- A serial number: Serial numbers in DNS zone files provide a way for the server to verify that the contents of a particular zone file are up-to-date. You need to have the correct serial number in order to effect the desired change.
- A line index: It turns out you are editing a text zone file keyed by line number not by the name of a TXT record
Serial Numbers
The article linked above has very usefully given a method of discovering the serial number that must be used to edit the DNS records. Firstly retrieve the Start of Authority from DNS, then parse out the correct number for the serial number. I found I had to query the name server or Google's 8.8.8.8 addresses as others were not reliable, so let's go with a name server for the required DNS zone.
# Return a list of name servers for a given DNS zone.
#
# Usage:
# declare -a auth_ns=(
# $(get_name_servers ${ZONE})
# )
#
function get_name_servers() {
local zone=${1}
if [ -z "${zone}" ]; then
error_msg "The first argument must be a DNS zone."
exit 1
fi
dig ${zone} NS +short | sed 's/\.$//'
}
# Query a DNS zone's "start of authority" (SOA) record for the serial number
# to use on the next zone update.
#
# Usage: SERIAL=$(get_serial_num abbey1.org.uk ns4.xilo.net)
#
function get_serial_num() {
local zone=${1}
local nameserver=${2}
if [ -z "${zone}" ]; then
error_msg "The first argument must be a DNS zone."
exit 1
fi
if [ -z "${nameserver}" ]; then
error_msg "The second argument must be the DNS zone's name server address."
exit 1
fi
dig @${nameserver} -t soa ${zone} +short | awk '{print $3}'
}
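To see what the awk step in get_serial_num picks out, here is a small offline sketch. The SOA answer is a made-up sample in the layout that dig prints with +short, and serial_from_soa is a name we have invented for the illustration; it is not part of the library above.

```bash
# Hypothetical SOA answer in the layout printed by "dig +short -t soa":
#   primary-ns  contact-mailbox  serial  refresh  retry  expire  minimum
sample_soa="ns1.example.net. hostmaster.example.net. 2022012801 3600 1800 1209600 86400"

# Print the third whitespace-separated field (the serial) of an SOA answer.
serial_from_soa() {
    printf '%s\n' "${1}" | awk '{print $3}'
}

serial_from_soa "${sample_soa}"
```

The serial is always the third field, so the same awk expression works whichever name server answers the query.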
The following Bash script extract will then seed a serial number variable to be used with any future function that edits the DNS zone. We decided that the serial number should be retained between DNS operations, so a global variable was used.
# Lookup the authoritative name servers for ZONE
declare -a AUTH_NS=(
$(get_name_servers ${ZONE})
)
# Seed the serial number from DNS's SOA record
SERIAL="$(get_serial_num ${ZONE} ${AUTH_NS[0]})"
There's a second method, which involves parsing a textual error message, included in the DNS zone update functions below. We used it originally, until we were shown the method above. The parsing method is expected to be more fragile since it is probably specific to the cPanel API setup, and the string format could vary.
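As a rough, standalone illustration of the error-parsing idea, the sketch below pulls the zone's current serial out of a mismatch message. The message wording is taken from the sed pattern used in the update functions; the serial numbers in it and the function name parse_zone_serial are invented for the example.

```bash
# Extract the zone's current serial from a cPanel "serial mismatch" error.
# Prints the serial and returns 0 on success; prints nothing and returns 1
# when the message does not have the expected shape.
parse_zone_serial() {
    local err serial
    err=${1}
    serial=$(printf '%s\n' "${err}" |
        sed -n 's/.*does not match .* serial number (\([0-9][0-9]*\)).*/\1/p')
    [ -n "${serial}" ] || return 1
    printf '%s\n' "${serial}"
}

# Sample message with made-up serial numbers.
msg='The given serial number (2022012801) does not match the DNS zone’s serial number (2022012805). Refresh your view of the DNS zone, then resubmit.'
parse_zone_serial "${msg}"
```

Matching only the tail of the message keeps the pattern a little less sensitive to wording changes, but it is still tied to the text of one particular error.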
Line Index
The cPanel API query https://${CPANEL_HOST}/execute/DNS/parse_zone returns a nested JSON object of all DNS field entries with base64 encoded values. So the JSON is parsed by jq to select the TXT records matching the base64 encoding of the required dname. Note that all dnames get their DNS zone appended automatically, so strip that off before passing it as a parameter. When the queried TXT record already exists, the line number is determined by the base64 match and returned. If it is not found, then the safe thing to do is use the maximum line index + 1. The return code reflects whether it is a new line or an existing one, and the line index is echoed to stdout.
# Lookup the line_index of a TXT record for a DNAME.
#
# Parameters:
# * The DNAME for the TXT record to search for
#
# STDOUT:
# The line_index of the TXT field for the DNAME.
#
# Returns:
# 0 (success) for the line index of an existing record
# 1 (fail) for a failed search, the line index is the existing
# maximum line index found + 1 to be used to add a new
# TXT record.
#
# Usage:
# get_line_index "_acme-challenge"
#
# The API call will search for "_acme-challenge.${ZONE}" because the DNS zone
# is appended by the cPanel API, presumably for security, so that you can
# only edit your own DNS entries.
#
function get_line_index() {
local name=${1}
if [ -z "${name}" ]; then
error_msg "The first argument must be a string for the TXT record's DNAME."
exit 1
fi
local b64name=$(echo -n "${name}" | base64)
local json=$(
curl -s \
--data "zone=${ZONE}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/DNS/parse_zone"
)
# null returned if no match
local li=$(
jq --arg b64name "${b64name}" '
.data | map(
select(.record_type == "TXT" and .dname_b64 == $b64name)
| .line_index
) | .[0]
' <<< "${json}"
)
if [ "${li}" == "null" ]; then
li=$(jq '.data | map(.line_index) | max' <<< ${json})
# .data | max_by(.line_index) | .line_index
(( li++ ))
echo ${li}
return 1
else
echo ${li}
return 0
fi
}
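The two preparation steps that get_line_index depends on (dropping the zone from the record name, then base64 encoding what is left) can be tried without touching the API. This is only a sketch; the zone and record name are placeholders.

```bash
ZONE="example.org"                  # hypothetical zone
fqdn="_acme-challenge.www.${ZONE}"  # full name of the TXT record

# cPanel appends the zone itself, so pass only the part in front of it.
name=${fqdn%.${ZONE}}

# Encode without a trailing newline; "echo" without -n would encode the
# newline too and the comparison against dname_b64 would never match.
b64name=$(printf '%s' "${name}" | base64)

echo "dname:     ${name}"
echo "dname_b64: ${b64name}"
```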
Get TXT Record
This is a useful test that a change has taken effect at the cPanel end when DNS queries do not yet reflect it.
# Lookup the TXT record for a DNAME.
#
# Parameters:
# * The DNAME for the TXT record to search for
#
# STDOUT:
# The TXT value for the DNAME.
#
# Returns:
# 0 (success) if the TXT record was found
# 1 (fail) if the TXT record was not found
#
# Usage:
# get_dns_txt "_acme-challenge"
#
# The API call will look up "_acme-challenge.${ZONE}" because the DNS zone
# is appended by the cPanel API, presumably for security, so that you can
# only edit your own DNS entries.
#
function get_dns_txt() {
local name=${1}
if [ -z "${name}" ]; then
error_msg "The first argument must be a string for the TXT record's DNAME."
exit 1
fi
local b64name=$(echo -n "${name}" | base64)
local json=$(
curl -s \
--data "zone=${ZONE}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/DNS/parse_zone"
)
# null returned if no match
local txt=$(
jq -r --arg b64name "${b64name}" '
.data | map(
select(.record_type == "TXT" and .dname_b64 == $b64name)
| .data_b64[0]
) | .[0]
' <<< "${json}"
)
debug_msg "Base 64 encoded TXT value '${txt}'"
if [ "${txt}" == "null" ]; then
return 1
else
base64 -d <<< "${txt}"
return 0
fi
}
Update TXT Record
This covers both additions and changes to DNS TXT records. Get the line index required to edit or add, and pass in the TXT value supplied by certbot. If the API returns a serial number error, parse out the new serial number and try once more. Then fetch the value and verify the change has been made.
# Set the value of a TXT record for a DNAME.
#
# Parameters:
# * The DNAME for the TXT record to edit or add. Note the ${ZONE} part gets
# automatically appended by cPanel, so this parameter must omit it from the
# end of the value.
# * The TXT record's value
#
# Usage:
# update_dns_txt "_acme-challenge" "ABCDEF1234567890"
#
# The API call will look up "_acme-challenge.${ZONE}" because the DNS zone
# is appended by the cPanel API, presumably for security, so that you can
# only edit your own DNS entries.
#
function update_dns_txt() {
local name=${1}
local txt=${2}
if [ -z "${name}" ]; then
error_msg "The first argument must be a string for the TXT record's DNAME."
exit 1
fi
if [ -z "${txt}" ]; then
error_msg "The second argument must be a string for the TXT record's value."
exit 1
fi
# Check SERIAL has been set globally
if [ -z ${SERIAL+x} ]; then
error_msg "The global variable SERIAL is not set, cannot proceed without it."
exit 1
fi
local func="edit"
local li
li=$(get_line_index "${name}")
EC=${?}
if [ "${EC}" -gt 0 ]; then
func="add"
debug_msg "Line index for '${name}.${ZONE}': ${li}, Add"
else
func="edit"
debug_msg "Line index for '${name}.${ZONE}': ${li}, Edit"
fi
json='{
"dname" : "'${name}'",
"ttl" : '${TTL}',
"record_type" : "TXT",
"line_index" : '${li}',
"data" : ["'${txt}'"]
}'
debug_msg "Request\n$(jq . <<< "${json}")"
loop=2
while [ ${loop} -gt 0 ]; do
response=$(curl -s \
--data "zone=${ZONE}" \
--data "serial=${SERIAL}" \
--data-urlencode "${func}=${json}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/DNS/mass_edit_zone"
)
debug_msg "Response\n$(jq . <<< "${response}")"
err=$(echo "${response}" | jq '.errors[0]')
if [ "${err}" != "null" ]; then
SERIAL=$(sed 's/^"The given serial number ([0-9]\+) does not match the DNS zone’s serial number (\([0-9]\+\))\. Refresh your view of the DNS zone, then resubmit\."$/\1/' <<< "${err}")
((loop--))
if [[ ! (${SERIAL} =~ ^[0-9]+$) ]]; then
error_msg "Unable to parse error '${err}' for a serial number."
exit 1
fi
debug_msg "Looping with serial=${SERIAL}"
else
SERIAL=$(echo "${response}" | jq -r '.data.new_serial')
loop=0
debug_msg "Setting serial=${SERIAL}"
fi
done
fetchval=$(get_dns_txt "${name}")
if [ "${txt}" == "${fetchval}" ]; then
debug_msg "TXT record for '${name}.${ZONE}' was set correctly to '${fetchval}'"
return 0
else
warning_msg "TXT record for '${name}.${ZONE}' was set incorrectly, '${txt}' != '${fetchval}'"
return 1
fi
}
Remove TXT Record
Get the line index required to remove the TXT value from the DNS zone. If the API returns a serial number error, parse out the new serial number and try once more. Then fetch the value and verify the deletion has been made.
# Remove the value of a TXT record for a DNAME.
#
# Parameters:
# * The DNAME for the TXT record to remove. Note the ${ZONE} part gets
# automatically appended by cPanel, so this parameter must omit it from the
# end of the value.
#
# Usage:
# remove_dns_txt "_acme-challenge"
#
# The API call will look up "_acme-challenge.${ZONE}" because the DNS zone
# is appended by the cPanel API, presumably for security, so that you can
# only edit your own DNS entries.
#
function remove_dns_txt() {
local name=${1}
if [ -z "${name}" ]; then
error_msg "The first argument must be a string for the TXT record's DNAME."
exit 1
fi
# Check SERIAL has been set globally
if [ -z ${SERIAL+x} ]; then
error_msg "The global variable SERIAL is not set, cannot proceed without it."
exit 1
fi
local li
li=$(get_line_index "${name}")
EC=${?}
if [ "${EC}" -gt 0 ]; then
debug_msg "TXT record '${name}.${ZONE}' does not exist"
return 1
else
debug_msg "Line index for '${name}.${ZONE}': ${li}"
fi
loop=2
while [ ${loop} -gt 0 ]; do
response=$(curl -s \
--data "zone=${ZONE}" \
--data "serial=${SERIAL}" \
--data-urlencode "remove=${li}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/DNS/mass_edit_zone"
)
debug_msg "Response\n$(jq . <<< "${response}")"
err=$(echo "${response}" | jq '.errors[0]')
if [ "${err}" != "null" ]; then
SERIAL=$(sed 's/^"The given serial number ([0-9]\+) does not match the DNS zone’s serial number (\([0-9]\+\))\. Refresh your view of the DNS zone, then resubmit\."$/\1/' <<< "${err}")
((loop--))
if [[ ! (${SERIAL} =~ ^[0-9]+$) ]]; then
error_msg "Unable to parse error '${err}' for a serial number."
exit 1
fi
debug_msg "Looping with serial=${SERIAL}"
else
SERIAL=$(echo "${response}" | jq -r '.data.new_serial')
loop=0
debug_msg "Setting serial=${SERIAL}"
fi
done
fetchval=$(get_dns_txt "${name}")
if [ -z "${fetchval}" ]; then
debug_msg "TXT record for '${name}.${ZONE}' was deleted."
return 0
else
warning_msg "TXT record for '${name}.${ZONE}' still set to '${fetchval}'."
return 1
fi
}
Helper Functions
These functions have been used above and their intent should be obvious, but here they are.
# Seed the serial number with Year+Month+0
SERIAL="$(date '+%Y%m0')"
# Print a debug message
#
# Usage:
# debug_msg "Something you might like to know about"
#
function debug_msg() {
[ ${DEBUG} -gt 0 ] && echo -e "$(date '+%F %T') $(hostname) DEBUG: ${FUNCNAME[1]} - ${*}" >> ${LOGFILE}
}
# Print an information message (or note)
#
# Usage:
# info_msg "Update on something"
#
function info_msg() {
echo -e "$(date '+%F %T') $(hostname) INFO: ${FUNCNAME[1]} - ${*}" | tee -a ${LOGFILE}
}
# Print a warning message
#
# Usage:
# warning_msg "Something a bit odd"
#
function warning_msg() {
echo -e "$(date '+%F %T') $(hostname) WARNING: ${FUNCNAME[1]} - ${*}" | tee -a ${LOGFILE} >&2
}
# Print an error message
#
# Usage:
# error_msg "Something went wrong"
#
function error_msg() {
echo -e "$(date '+%F %T') $(hostname) ERROR: ${FUNCNAME[1]} - ${*}" | tee -a ${LOGFILE} >&2
}
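The authentication hook further down calls a helper named sleep_msg that is not listed above. A minimal stand-in, assuming it does nothing more than log the pause and then sleep, might look like this. The body is our guess for illustration; only the name comes from the hook script.

```bash
# Hypothetical stand-in for the sleep_msg helper used by the auth hook.
# Fall back to a plain echo when info_msg has not been defined yet.
if ! command -v info_msg > /dev/null 2>&1; then
    info_msg() { echo "INFO: ${*}"; }
fi

# Log the pause, then sleep for the given duration (e.g. 30s, 1m).
sleep_msg() {
    local duration
    duration=${1}
    info_msg "Sleeping for ${duration}"
    sleep "${duration}"
}

sleep_msg 0   # logs the message and returns immediately
```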
Automating Certificate Installations
Nothing much to explain here, the code illustrates what needs to be done via the cPanel API. Clearly I need to improve the error checking and return a sensible value.
# Install an SSL Certificate on cPanel
#
# Parameters:
# * domain
# - The domain name to which the SSL certificate will be associated
# * certificate file name
# - The path name to the Let's Encrypt generated certificate
# * private key file name
# - The path name to the Let's Encrypt generated private key
# * Certificate Authority bundle file name
# - The path name to the Let's Encrypt CA bundle
#
# STDOUT:
# Confirmation text from the API call.
#
# Usage: Example for a personal Root CA
# cert_install \
# ${ZONE} \
# /root/CertificateAuthority/signed/www.${ZONE}.crt \
# /etc/ssl/private/${ZONE}.key \
# /etc/ssl/certs/RootCA.pem
#
function cert_install() {
local domain=${1}
local cert_file=${2}
local key_file=${3}
# This might be optional and hence we need to amend the checks for this value?
local ca_file=${4}
if [ -z "${domain}" ]; then
error_msg "The first argument must be the cPanel zone."
exit 1
fi
if [ ! -f "${cert_file}" ]; then
error_msg "The second argument must be the certificate file name, the file '${cert_file}' cannot be found."
exit 1
fi
if [ ! -f "${key_file}" ]; then
error_msg "The third argument must be the private key file name, the file '${key_file}' cannot be found."
exit 1
fi
if [ ! -f "${ca_file}" ]; then
error_msg "The fourth argument must be the Certification Authority (CA) bundle file name, the file '${ca_file}' cannot be found."
exit 1
fi
json=$(curl -s \
--data "zone=${ZONE}" \
--data "domain=${domain}" \
--data-urlencode "cert=$(cat ${cert_file})" \
--data-urlencode "key=$(cat ${key_file})" \
--data-urlencode "cabundle=$(cat ${ca_file})" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/SSL/install_ssl"
)
debug_msg "Certificate\n$(cat ${cert_file})"
debug_msg "Private Key\n$(cat ${key_file})"
debug_msg "CA Bundle\n$(cat ${ca_file})"
debug_msg "Response\n$(jq . <<< "${json}")"
jq -r '.data.message' <<< ${json}
}
Integration for Certificate Creation
So the biggest issue is that certbot needs to query the DNS TXT records to check you have the rights to create a certificate, but DNS changes take time to propagate. The problem is, how much time? If certbot cannot get the TXT record back from its DNS query you get caught up in a failed attempt rate limit.
There is a Failed Validation limit of 5 failures per account, per hostname, per hour. This limit is higher on our staging environment, so you can use that environment to debug connectivity problems. Exceeding the Failed Validations limit is reported with the error message too many failed authorizations recently.
Let's Encrypt Rate Limits
We know about this limit. We need to put sufficient pause in any scripts to allow for DNS changes to propagate. Perhaps inside a cloud provider's infrastructure the local DNS is queried (via a plugin?) and hence quicker and more reliable. But we're outside such environments, and experiments suggest we need to wait about 10-12 minutes to get multiple DNS servers to agree. Later we explain a reliable method to ensure that DNS has been updated. The TXT record value to use is provided at the time of certbot invocation, and is valid for just the one session of checking. Next time you verify ownership of your domain you will need to set a different value.
There are two scripts that you need to write, in whatever scripting language you like. certbot then sets environment variables with the TXT dname and value you should use, for each (sub-)domain name listed in your certificate request. The domains are handled one at a time, so you can't set all the TXT fields, then check them all and then delete them all. You have to wait for DNS propagation to occur multiple times, once for each (sub-)domain you request. I assume this is a penalty of needing to use environment variables for compatibility with multiple scripting languages.
- --manual-auth-hook ./script-auth.sh
- --manual-cleanup-hook ./script-clean.sh
Configuration
The Bash scripts have so far assumed some values, not least the API key for cPanel. These values, some of which are sensitive or just not for blog publication, have been separated into a configuration file to be amended on a per instance basis.
# certbot automation global values
#
# This file should be source'd by related Bash scripts.
# If this file has already been source'd, bail out now to avoid reading a subsequent time.
[ -n "${_CERT_UPDATE_CONFIG_}" ] && return 0
_CERT_UPDATE_CONFIG_=1
# DNS Editor Zone
ZONE="your_domain.tld"
# cPanel API key inside the HTTP Header for curl
AUTHHDR="Authorization: cpanel username:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
# cPanel Host Domain Name
# Format is "domain_name:port"
CPANEL_HOST="server.domain.tld:2083"
# This is the Email address you registered with https://letsencrypt.org/ when using:
# certbot register --email <email-address> --no-eff-email
EMAIL="<email-address>"
# TTL (time to live) is the amount of time (in seconds) a DNS server will cache the record for
declare -i TTL=300 # 300s = 5m
# Maximum number of times to verify the DNS TXT records have been updated
declare -i TXT_VERIFY=100
# Valid Values for DEBUG
# 0 - No Debug Printed
# 1 - Debug Printed
declare -i DEBUG=0
# Create a session log file in the parent directory
LOGFILE="$(dirname $(dirname ${BASH_SOURCE[0]}))/cert-create.log"
--manual-auth-hook
For a given domain the plan is:
- See if all the name servers are clear for a ${ZONE} (but don't error out);
- Add a TXT record;
- Verify it has been set via cPanel;
- Verify all the name servers have been updated for a ${ZONE};
- (Throughout) do any logging;
- Email logs to those that care.
#!/bin/bash
#
# certbot creates a variety of environment variables when calling this script.
#
THIS=$(realpath ${0})
SCRIPT=$(basename ${0})
THISDIR=$(realpath $(dirname ${0}))
CONFIG="${THISDIR}/config"
cd ${THISDIR}
source ${CONFIG}/certbot.config.bash
source ${THISDIR}/dns-record-lib.bash
errs=0
# Expect these variables to have been defined
for v in ZONE TXT_VERIFY DEBUG LOGFILE; do
if [ -z "${!v}" ]; then
error_msg "Global variable '${v}' is not defined."
((errs++))
fi
done
if [ "${errs}" -gt 0 ]; then
exit 1
fi
# Internal global variables
# Get the list of DNS name servers for ZONE
declare -a AUTH_NS=(
$(get_name_servers ${ZONE})
)
# Seed the serial number
SERIAL="$(get_serial_num ${ZONE} ${AUTH_NS[0]})"
txt_record="_acme-challenge.${CERTBOT_DOMAIN}"
name=${txt_record%.${ZONE}}
value=${CERTBOT_VALIDATION}
info_msg "Domain: ${CERTBOT_DOMAIN}"
info_msg "Remaining challenges: ${CERTBOT_REMAINING_CHALLENGES}"
info_msg "Setting domain '${txt_record}' TXT record to '${CERTBOT_VALIDATION}'."
info_msg "cPanel dname: ${name}"
# Start from clean
if remove_dns_txt "${name}"; then
info_msg "Old TXT record removed"
else
info_msg "No old TXT record needed removing"
fi
debug_msg "Performing name server checks to ensure no old TXT records."
pass=0
for ns in ${AUTH_NS[@]}; do
txt_check=$(dig @${ns} -t txt ${txt_record} +short +norecurse | sed 's/^"\(.*\)"$/\1/')
if [ -z "${txt_check}" ]; then
debug_msg "TXT record for '${txt_record}' in '${ns}' is clear."
((pass++))
else
debug_msg "TXT record for '${txt_record}' in '${ns}' is set."
fi
done
info_msg "${pass} of ${#AUTH_NS[@]} name servers clear."
if [ ${pass} -lt ${#AUTH_NS[@]} ]; then
warning_msg "Remove previous TXT records from all authoritative name servers."
else
debug_msg "Good to proceed to setting TXT record up."
fi
# Perform the DNS TXT record update
line_index=$(get_line_index "${name}")
EC=${?}
if [ "${EC}" -gt 0 ]; then
debug_msg "Line Index for '${name}': ${line_index}, Add"
else
debug_msg "Line Index for '${name}': ${line_index}, Edit"
fi
# Includes 'get_dns_txt "${name}"' to verify the update
update_dns_txt "${name}" "${value}"
debug_msg "Checking TXT set correctly in authoritative name servers (non-recursive)."
for ((loop=1; loop<=${TXT_VERIFY}; loop++)); do
sleep_msg 1m
pass=0
for ns in ${AUTH_NS[@]}; do
txt_check=$(dig @${ns} -t txt ${txt_record} +short +norecurse | sed 's/^"\(.*\)"$/\1/')
if [ "${txt_check}" == "${value}" ]; then
debug_msg "Name server check loop ${loop} for '${txt_record}' in '${ns}' PASSED"
((pass++))
else
debug_msg "Name server check loop ${loop} for '${txt_record}' in '${ns}' FAILED"
fi
done
info_msg "Loop ${loop} of ${TXT_VERIFY}: ${pass} of ${#AUTH_NS[@]} name servers updated."
if [ ${pass} -eq ${#AUTH_NS[@]} ]; then
break
fi
done
--manual-cleanup-hook
For a given domain the plan is:
- Remove a TXT record;
- (Throughout) do any logging;
- Email logs to those that care.
#!/bin/bash
THIS=$(realpath ${0})
SCRIPT=$(basename ${0})
THISDIR=$(realpath $(dirname ${0}))
CONFIG="${THISDIR}/config"
cd ${THISDIR}
source ${CONFIG}/cert-update.config.bash
source ${THISDIR}/dns-record-lib.bash
errs=0
# Expect these variables to have been defined
for v in ZONE DEBUG LOGFILE; do
if [ -z "${!v}" ]; then
error_msg "Global variable '${v}' is not defined."
((errs++))
fi
done
if [ "${errs}" -gt 0 ]; then
exit 1
fi
# Get the list of DNS name servers for ZONE
declare -a AUTH_NS=(
$(get_name_servers ${ZONE})
)
info_msg "Clean up script run started"
name="_acme-challenge.${CERTBOT_DOMAIN}"
name=${name%.${ZONE}}
# Need to avoid all cleanup operations trying to happen concurrently and failing by
# using the serial number. So use this variable value to slip each one in time.
sleep "${CERTBOT_REMAINING_CHALLENGES}s"
# Seed the serial number
SERIAL="$(get_serial_num ${ZONE} ${AUTH_NS[0]})"
if remove_dns_txt "${name}"; then
info_msg "TXT record removed"
else
warning_msg "No TXT record removed"
fi
# Hook '--manual-cleanup-hook' for domain.tld reported error code 1
exit 0
Initial Certificate Creation
Before detailing the controlling code, we need a reliable way to check whether certbot's ownership test will pass by DNS.
DNS Propagation Concerns
Over a week of trial runs, many at great length, we tried to establish when it was safe to let certbot finally perform its test, and we think we have now identified a safe test for when DNS has been updated sufficiently. It turns out that certbot (strictly, the Let's Encrypt service it talks to) queries your domain's authoritative name servers to check on proof of ownership. Knowing this is a massive help as we can now predict what certbot will see. There is no need to verify propagation to anywhere else (e.g. OpenDNS, Google); those resolvers recursively make their way back to your name servers anyway, and in practice their query results were quite random. We put the randomness down to one of our three authoritative name servers being much slower to update than the other two. We've managed to ensure a positive proof of ownership by DNS by looping until all our domain's authoritative name servers had updated. Well, actually we loop up to 100 times with a 1 minute sleep between loops. It can take an hour to get all three name servers updated, but it can be as quick as 1 minute.
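The agreement check at the heart of that loop can be sketched as a function that counts how many name servers return the expected value. To make it runnable without touching real DNS, the lookup command is passed in as a parameter and a canned lookup stands in for dig. The names count_updated, dig_txt and fake_txt, and the example hosts, are ours for illustration.

```bash
# Strip the surrounding double quotes that dig prints around TXT values.
strip_quotes() {
    sed 's/^"\(.*\)"$/\1/'
}

# Count how many name servers return the expected TXT value.
# Arguments: <lookup-command> <record> <expected-value> <name-server>...
# The lookup command is called as: <lookup-command> <name-server> <record>
count_updated() {
    local lookup record expected pass ns
    lookup=${1}; record=${2}; expected=${3}
    shift 3
    pass=0
    for ns in "$@"; do
        if [ "$("${lookup}" "${ns}" "${record}" | strip_quotes)" = "${expected}" ]; then
            pass=$((pass + 1))
        fi
    done
    echo "${pass}"
}

# Real lookup: ask one authoritative server directly, without recursion.
dig_txt() {
    dig "@${1}" -t txt "${2}" +short +norecurse
}

# Canned lookup for a dry run: pretend ns3 is still serving the old value.
fake_txt() {
    case ${1} in
        ns3.*) echo '"old-token"' ;;
        *)     echo '"new-token"' ;;
    esac
}

count_updated fake_txt "_acme-challenge.example.org" "new-token" \
    ns1.example.net ns2.example.net ns3.example.net
```

With the canned lookup this reports 2 of the 3 servers as updated, which is the situation where the hook should keep waiting. Swapping fake_txt for dig_txt gives the live check.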
Script for Initial Certificate Creation
The script runs the appropriate certbot command line, which includes the references to the two scripts above for the manual hooks. Then, if successful, it uploads the certificate via the cPanel API to all the domains that need to reference the certificate. Finally, it emails out notification of what happened during execution. You will need to ensure your local device and account are correctly set up to send emails.
#!/bin/bash
THIS=$(realpath ${0})
THISDIR=$(realpath $(dirname ${0}))
SCRIPT=$(basename ${0})
CONFIG="${THISDIR}/config"
cd ${THISDIR}
source ${CONFIG}/certbot.config.bash
source ${THISDIR}/dns-record-lib.bash
errs=0
# Expect these variables to have been defined
for v in ZONE EMAIL DEBUG LOGFILE; do
if [ -z "${!v}" ]; then
error_msg "Global variable '${v}' is not defined."
((errs++))
fi
done
[ "${errs}" -gt 0 ] && exit 1
# Whether to verify the domain ownership ONLY or to request a certificate for real and upload.
#
# 0 - Verify the domain names and request a proper certificate which is then uploaded
# 1 - Verify the domain names only using '--test-cert' but do not upload a staging certificate
VERIFYONLY=0
POSITIONAL=()
while [[ ${#} -gt 0 ]]; do
case ${1} in
--verify)
shift # past argument
VERIFYONLY=1
;;
--help)
shift # past argument
echo "${SCRIPT} [--verify] [--help]"
exit 0
;;
*)
error_msg "Unknown command line parameter '${1}'."
shift
exit 1
;;
esac
done
# Truncate the log file at the start of each run
echo "$(date '+%F %T') $(hostname) INFO: Run started" > ${LOGFILE}
if [ "${VERIFYONLY}" -gt 0 ]; then
echo "$(date '+%F %T') $(hostname) INFO: Running domain verification mode, no certificate will be requested." >> ${LOGFILE}
fi
# certonly - Makes certbot not try to install the certs locally.
# --expand - Makes certbot add more URLs to the cert without getting upset.
# --manual-auth-hook - Script executed to perform authentication
# --non-interactive - Allows automation in certbot
# --manual - Allows automation in certbot
# --preferred-challenges=dns - Makes certbot use the dns challenge to allow wild carded domains
# --email - Your account email address
#
# -d * - All lines after this are the selected domains, which should be
# -d * - generated from a file.
#
# WHEN IN PRODUCTION REMOVE THESE SWITCHES
# --test-cert - Only for testing to avoid the 5 failed authentications
# - per hour rule.
# --dry-run - Test "renew" or "certonly" without saving any certificates
# to disk
#
# certbot penalties:
#
# Up to 5 failed TXT record verification attempts per hour, unless using the --test-cert switch
# 1 credit per certificate
#
CMDSW=""
if [ "${VERIFYONLY}" -gt 0 ]; then
CMDSW="--dry-run"
fi
output=$(
(
certbot \
certonly \
--agree-tos \
--expand \
--manual-auth-hook "${THISDIR}/dns-record-auth.bash" \
--manual-cleanup-hook "${THISDIR}/dns-record-cleanup.bash" \
--non-interactive \
--manual \
--preferred-challenges=dns \
--email "${EMAIL}" \
--server "https://acme-v02.api.letsencrypt.org/directory" \
--cert-name "${ZONE}" \
${CMDSW} \
$(domains_arg "${CONFIG}/domains.altnames.txt")
EC=${?}
if [[ ${EC} -eq 0 ]]; then
if [ "${VERIFYONLY}" -eq 0 ]; then
# Check the three files required for a certificate upload are present
err=0
for f in ${THISDIR}/live/${ZONE}/cert.pem ${THISDIR}/live/${ZONE}/privkey.pem ${THISDIR}/live/${ZONE}/chain.pem; do
if [ ! -f ${f} ]; then
error_msg "Certificate upload missing file '${f}'."
((err++))
fi
done
[ "${err}" -gt 0 ] && exit 1
declare -a installs=(
$(grep -v '^#' "${CONFIG}/domains.install.txt")
)
for c in ${installs[@]}; do
cert_install \
${c} \
${THISDIR}/live/${ZONE}/cert.pem \
${THISDIR}/live/${ZONE}/privkey.pem \
${THISDIR}/live/${ZONE}/chain.pem
done
info_msg "Installed certificate."
else
info_msg "Skipping installation as verify mode is on."
fi
else
warning_msg "Not installing certificate as certbot failed."
fi | tee -a ${LOGFILE}
) 2>&1
)
echo ""
echo "Certbot Create Output:"
echo ""
echo "${output}"
(
cat << EOF
Subject: Let's Encrypt Certificate Create Log
To: $(grep -v '^#' "${CONFIG}/email_addresses.txt" | tr "\n" ",")
From: "${HOSTNAME^} Let's Encrypt Updater" <${USER}@${HOSTNAME}.${ZONE}>
Content-Type: text/html
EOF
if [ "${VERIFYONLY}" -gt 0 ]; then
VERIFYMSG="<p>From a domain verification run only.</p>"
fi
cat << EOF
<h1>Output</h1>
<pre>
${output}
</pre>
EOF
if [ "${DEBUG}" -gt 0 ]; then
cat << EOF
<h2>Log File</h2>
<p>Debug is on, this is the log file.</p>
<pre>
$(cat ${LOGFILE})
</pre>
EOF
fi
echo "<h2>Certificate</h2>"
if [[ ("${VERIFYONLY}" -eq 0) && (-f "${THISDIR}/live/${ZONE}/cert.pem") ]]; then
echo "<pre>"
openssl x509 -noout -text -in ${THISDIR}/live/${ZONE}/cert.pem
echo "</pre>"
else
echo "<p>'${THISDIR}/live/${ZONE}/cert.pem' does not exist.</p>"
fi
) | sendmail -t
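The script above relies on a domains_arg helper that is not listed in this post. A plausible minimal version, assuming the file holds one domain per line with # comment lines (the same convention that grep -v '^#' implies for the install list), is sketched below. The body and the file format are assumptions; only the function name comes from the script.

```bash
# Turn a file of domain names into "-d <domain>" switches for certbot.
# Blank lines and lines starting with '#' are skipped.
domains_arg() {
    local file domain
    file=${1}
    # The "|| [ -n ... ]" keeps a final line that lacks a trailing newline.
    while read -r domain || [ -n "${domain}" ]; do
        case ${domain} in
            ''|'#'*) continue ;;
        esac
        printf '%s %s\n' "-d" "${domain}"
    done < "${file}"
}

# Demonstration with a throwaway file of placeholder names.
list=$(mktemp)
printf '%s\n' '# alternative names' 'example.org' '' 'www.example.org' > "${list}"
domains_arg "${list}"
rm -f "${list}"
```

Because the script expands $(domains_arg ...) unquoted, a wildcard entry such as *.example.org is also subject to filename globbing in the current directory. It normally stays as typed because no file matches, but it is worth knowing about.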
Wildcarded Domains
This might be obvious to others, but we found conflicting information on the Internet (what a surprise) about what a wildcarded domain covers. A certificate with a wildcarded domain name such as *.domain.tld does not cover domain.tld. So you might think to just add the non-wildcarded version to the list that certbot includes with -d switches. That causes a different problem, where certbot then tries to verify both domain.tld for the wildcarded entry and domain.tld for the non-wildcarded entry in the same run, but with different TXT record values. This means one of those tests will always fail domain name verification. If the domain is already verified you can get away with both entries as the domain is not re-verified. We chose to set up a redirect from domain.tld to www.domain.tld instead so we could ignore the problem.
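To make the coverage rule concrete, here is a small sketch of how a wildcard name is matched: the * stands for exactly one label, so neither the bare domain nor a two-level subdomain is covered. The function names are ours, and the check ignores upper and lower case differences for simplicity.

```bash
# Return 0 if the certificate name ${1} covers the host name ${2}.
wildcard_covers() {
    local pattern host base label
    pattern=${1}; host=${2}
    case ${pattern} in
        '*.'*)
            base=${pattern#'*.'}
            case ${host} in
                *".${base}") label=${host%".${base}"} ;;
                *) return 1 ;;
            esac
            # The wildcard stands for exactly one non-empty label.
            case ${label} in
                ''|*.*) return 1 ;;
                *) return 0 ;;
            esac
            ;;
        *)
            [ "${pattern}" = "${host}" ]
            ;;
    esac
}

# Print "covered" or "not covered" for a certificate name and a host.
coverage() {
    if wildcard_covers "${1}" "${2}"; then
        echo "covered"
    else
        echo "not covered"
    fi
}

for host in www.domain.tld domain.tld a.b.domain.tld; do
    echo "${host}: $(coverage '*.domain.tld' "${host}")"
done
```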
Renewal
Our installation has set up automatic renewals via systemd timers. About 2 months after the date of this initial blog post, the renewal duly ran unattended, just hours after cPanel emailed me to say the certificates were expiring.
root@melrose:/etc # systemctl list-timers
NEXT                        LEFT          LAST                        PASSED       UNIT                         ACTIVATES
Fri 2022-01-28 17:09:00 GMT 14min left    Fri 2022-01-28 16:39:15 GMT 15min ago    phpsessionclean.timer        phpsessionclean.service
Fri 2022-01-28 21:03:23 GMT 4h 8min left  Fri 2022-01-28 14:24:15 GMT 2h 30min ago snapd.refresh.timer          snapd.refresh.service
Fri 2022-01-28 22:30:34 GMT 5h 35min left Fri 2022-01-28 09:05:15 GMT 7h ago       apt-daily.timer              apt-daily.service
Sat 2022-01-29 06:18:54 GMT 13h left      Fri 2022-01-28 06:47:15 GMT 10h ago      apt-daily-upgrade.timer      apt-daily-upgrade.service
Sat 2022-01-29 11:04:00 GMT 18h left      Fri 2022-01-28 14:47:15 GMT 2h 7min ago  snap.certbot.renew.timer     snap.certbot.renew.service
Sat 2022-01-29 15:39:15 GMT 22h left      Fri 2022-01-28 15:39:15 GMT 1h 15min ago systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service
6 timers listed.
Pass --all to see loaded but inactive timers, too.
Library Functions for cPanel API
Presently "invalid" == "out of date", but there could be other features that are used to decide if certificates are invalid in the future too.
# List all 'invalid' certificates stored in cPanel. List them all then filter by
# now > "Valid to" date. Return a shell compatible list that can be put into an
# array.
#
# Parameters:
# 1) Domain name
#
# Usage: invalid_cert_list ${ZONE}
#
# Returns:
# some_domain_tld_a7c92_689e3_1645124194_b435e0bef1a2c2cbbfca4594b0e8180d \
# some_domain_tld_aae1e_d196f_1823802814_6af2586d735a925bf5ce5cca053b693a
#
function invalid_cert_list() {
local domain=${1}
if [ -z "${domain}" ]; then
error_msg "The first argument must be the cPanel zone."
exit 1
fi
# List the certificates stored in cPanel
json=$(curl -s \
--data "zone=${ZONE}" \
--data "domain=${domain}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/SSL/list_certs"
)
jq '.data | map(
select(now > .not_after) | .id
) | @sh' <<< "${json}" | tr -d \"\'
}
# Delete a certificate via the cPanel API.
#
# Parameters:
# 1) The certificate ID
#
# Usage: cert_delete "some_domain_tld_a7c92_689e3_1645124194_b435e0bef1a2c2cbbfca4594b0e8180d"
#
function cert_delete() {
local certid=${1}
if [ -z "${certid}" ]; then
error_msg "The first argument must be the certificate ID."
exit 1
fi
# Delete the certificate
json=$(curl -s \
--data-urlencode "id=${certid}" \
-H "${AUTHHDR}" \
"https://${CPANEL_HOST}/execute/SSL/delete_cert"
)
jq '.' <<< "${json}"
}
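The "now > not_after" test that invalid_cert_list applies can be tried out without cPanel or jq. This is a minimal sketch, not part of our library: expired_ids is a hypothetical helper, and the certificate IDs and timestamps in the usage comment are made up. It assumes not_after is in epoch seconds, which is what the jq comparison against now implies.

```shell
# Hypothetical helper: read "id not_after" pairs on stdin, one per line, and
# print the id of every certificate whose not_after is earlier than 'now'.
#
# Parameters:
# 1) The current time in epoch seconds, e.g. $(date +%s)
#
expired_ids() {
    local now="${1}" id not_after
    while read -r id not_after; do
        # Skip blank lines
        [ -n "${id}" ] || continue
        if [ "${now}" -gt "${not_after}" ]; then
            printf '%s\n' "${id}"
        fi
    done
}

# Example use with made-up data:
#   printf 'cert_old 1000\ncert_new 3000\n' | expired_ids 2000
#
# In bash the result can be read into an array one id per line:
#   mapfile -t invalid_certs < <(expired_ids "$(date +%s)" < pairs.txt)
```

Emitting one ID per line keeps the quoting simple, so the caller does not have to strip quote characters before building the array.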
Script for Certificate Renewal
The intention is that this script can be run by a weekly or monthly cron job. If the certificate does not require renewal, certbot says so and there is no new certificate to upload, so that step is skipped. Run this script when the systemd timers method fails or simply does not run, which has been the case in one of our two renewals so far.
#!/bin/bash
#
# References:
# * cPanel API https://api.docs.cpanel.net/cpanel/introduction/
#
# Cron does not have a default path, this needs to be set up specially for this script.
PATH="/bin:/usr/bin:/usr/sbin:/snap/bin"
THIS=$(realpath ${0})
THISDIR=$(realpath $(dirname ${0}))
SCRIPT=$(basename ${0})
CONFIG="${THISDIR}/config"
cd ${THISDIR}
source ${CONFIG}/certbot.config.bash
source ${THISDIR}/dns-record-lib.bash
errs=0
# Expect these variables to have been defined
for v in ZONE EMAIL DEBUG LOGFILE; do
if [ -z "${!v}" ]; then
error_msg "Global variable '${v}' is not defined."
((errs++))
fi
done
[ "${errs}" -gt 0 ] && exit 1
output=$(
(
certbot renew
EC=${?}
if [ "${EC}" -gt 0 ]; then
warning_msg "Non-zero exit code from certbot, skipping the certificate upload step."
else
# Check the three files required for a certificate upload are present
err=0
for f in ${THISDIR}/live/${ZONE}/cert.pem ${THISDIR}/live/${ZONE}/privkey.pem ${THISDIR}/live/${ZONE}/chain.pem; do
if [ ! -f ${f} ]; then
error_msg "Certificate upload missing file '${f}'."
((err++))
fi
done
[ "${err}" -gt 0 ] && exit 1
declare -a installs=(
$(grep -v '^#' "${CONFIG}/domains.install.txt")
)
for c in ${installs[@]}; do
cert_install \
${c} \
${THISDIR}/live/${ZONE}/cert.pem \
${THISDIR}/live/${ZONE}/privkey.pem \
${THISDIR}/live/${ZONE}/chain.pem
done
info_msg "Installed certificate."
# Delete invalid (out of date) certificates
declare -a invalid_certs=(
$(invalid_cert_list "${ZONE}")
)
for c in ${invalid_certs[@]}; do
cert_delete "${c}"
done
fi
) 2>&1
)
cat << EOF
Certbot Renew Output:
${output}
EOF
(
cat << EOF
Subject: Let's Encrypt Certificate Renew Log
To: $(grep -v '^#' "${CONFIG}/email_addresses.txt" | tr "\n" ",")
From: "${HOSTNAME^} Let's Encrypt Updater" <${USER}@${HOSTNAME}.${ZONE}>
Content-Type: text/html

<h1>Output</h1>
<pre>
${output}
</pre>
EOF
echo "<h2>Certificate</h2>"
if [[ ("${VERIFYONLY}" -eq 0) && (-f "${THISDIR}/live/${ZONE}/cert.pem") ]]; then
echo "<pre>"
openssl x509 -noout -text -in ${THISDIR}/live/${ZONE}/cert.pem
echo "</pre>"
else
echo "<p>'${THISDIR}/live/${ZONE}/cert.pem' does not exist.</p>"
fi
) | sendmail -t
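One way to make the "skip the upload when nothing was renewed" behaviour explicit is to compare cert.pem before and after the renewal command runs, rather than relying on the exit code alone. This is a minimal sketch under that assumption, not what our script does: file_sum and ran_and_changed are hypothetical helpers.

```shell
# Hypothetical helper: print a checksum for a file, or "missing" if absent.
file_sum() {
    if [ -f "${1}" ]; then
        cksum < "${1}"
    else
        echo "missing"
    fi
}

# Hypothetical wrapper: run a command and report whether a watched file changed.
#
# Parameters:
# 1) The file to watch, e.g. ${THISDIR}/live/${ZONE}/cert.pem
# 2...) The command to run, e.g. certbot renew
#
# Returns:
# 0 if the command succeeded and the file changed
# 1 if the command succeeded but the file is unchanged
# 2 if the command failed
#
ran_and_changed() {
    local watched="${1}" before after
    shift
    before=$(file_sum "${watched}")
    "$@" || return 2
    after=$(file_sum "${watched}")
    [ "${before}" != "${after}" ]
}

# Example use (not executed here):
#   if ran_and_changed "${THISDIR}/live/${ZONE}/cert.pem" certbot renew; then
#       info_msg "New certificate issued, installing."
#   fi
```

Because the check reads the file contents, it still works when cert.pem is a symbolic link to the latest issued certificate.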
Conclusions
The key to success is finding a way to guarantee that certbot's DNS queries will read the TXT field correctly. Much care has been taken to make this work reliably, but it has only been tested on our own domain name.
After much searching of the web for solutions, we came across a number of existing libraries that automate the above. Thankfully, they share many of the same ideas. I would like to think ours is minimal and fully explained, so that we've removed the fear of a black box. If you are more interested in a fully fledged solution, take a look at ACME Client Implementations, which has many contributing authors across a wide range of services. acme.sh's cPanel code looks like it nicely covers the above.
Acknowledgements
Whilst I've taken on the job of writing this up, the bulk of the initial investigation of this solution was done by my 13-year-old son. He researched how to use certbot and only paused when I was needed for access permissions to cPanel. My contribution has been some expertise in Bash scripting, playing the role of customer, and refining the means of testing whether DNS updates have successfully propagated. Now we have the solution, I need to record what we did for reference. He's not so keen on documentation...