Thursday, September 13, 2012

Quartz scheduler instance is still active but was recovered by another instance in the cluster

When scheduling tasks, we should be sure clocks are synchronized across servers. I have seen so many issues caused by unsynchronized clocks that I am still surprised by how little attention is paid to server time synchronization. From security vulnerability exploits to serious business logic errors resulting in money loss, the error always hits you hidden behind words like:
2012-09-12 19:13:09,193 WARN [org.springframework.scheduling.quartz.LocalDataSourceJobStore] - This scheduler instance (bhub-test11347491427777) is still active but was recovered by another instance in the cluster. This may cause inconsistent behavior.
And of course a simple ntpdate configuration will solve the issue. This ntpdate configuration can be deployed remotely to all your servers using Remoto-IT and a POB Recipe similar to the one below:
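#!/bin/bash -e
# ntp.sh

cd /etc/cron.daily
# Assuming you keep your configuration files in SVN (you are versioning your changes, aren't you?)
# --force overwrites an existing copy so the recipe can be re-run without failing
svn export --force http://svn.sample.com/environment/common/scripts/etc/cron.daily/ntpdate
chmod 755 ntpdate
# Sync once right away instead of waiting for the next daily cron run
ntpdate ntp.ubuntu.com pool.ntp.org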
Here is what the ntpdate file looks like:
#!/bin/sh
#Please do use an internal NTP server for security reasons!
ntpdate ntp.ubuntu.com pool.ntp.org > /dev/null
Of course, if the cron job fails you should get an email, so be sure MAILTO is set and working in cron.
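If you also want cron to tell you when a server has drifted, a small check along these lines might help. This is only a sketch: the helper names extract_offset and offset_exceeds are made up for illustration, and it assumes that the output of ntpdate -q contains the word "offset" followed by the value in seconds. The 1 second threshold in the usage comment is likewise an assumption, so pick whatever tolerance your cluster needs.

```shell
#!/bin/sh
# check_offset.sh (hypothetical helper, not part of the recipe above)

# Read "ntpdate -q" output on stdin and print the value that follows the
# last "offset" word, with any trailing comma removed.
extract_offset() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "offset") last = $(i + 1) }
         END { sub(/,$/, "", last); print last }'
}

# Exit 0 when the absolute value of offset ($1) is greater than threshold ($2).
offset_exceeds() {
    awk -v o="$1" -v t="$2" 'BEGIN { if (o < 0) o = -o; exit !(o > t) }'
}

# Possible usage from a cron script (left commented out; the threshold is an assumption):
# offset=$(ntpdate -q ntp.ubuntu.com | extract_offset)
# if offset_exceeds "$offset" 1; then
#     echo "Clock offset is ${offset}s on $(hostname)" >&2
# fi
```

Because the warning goes to stderr, cron would mail it to whoever is configured as the recipient, just like a failed ntpdate run.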
