TroubleShootingQuattor


Troubleshooting and FAQs for Quattor/RunCheck

undefined variable: DB_IP[mynodexx_2emygrid_2emydoamin]

Add the IP address for the DNS name shown in the error (each _2e is an escaped ".") to DB_IP in the file cfg/sites/mysite/site/database.tpl.
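To see which host name an escaped key refers to, the _2e sequences can be decoded by hand or with a few lines of Python. This is a minimal sketch, not part of Quattor: the function name unescape_key is hypothetical, and it assumes that every underscore followed by two hex digits is an escape sequence.

```python
import re


def unescape_key(key: str) -> str:
    """Replace each _XX (two hex digits) with the character it encodes.

    Assumption: any "_" followed by two hex digits is an escape,
    so "_2e" becomes ".".
    """
    return re.sub(
        r"_([0-9a-fA-F]{2})",
        lambda match: chr(int(match.group(1), 16)),
        key,
    )


if __name__ == "__main__":
    # Key copied from the error message above.
    print(unescape_key("mynodexx_2emygrid_2emydoamin"))
    # prints: mynodexx.mygrid.mydoamin
```

The decoded name is the one that needs an entry in DB_IP.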

user-initiated error: ce1.grid.ing.ha.be : hardware not found in machine database

Add the machine to DB_IP in the file cfg/sites/mysite/site/database.tpl.

[panc] name: zlib version: 1.2.1.2-1.2 arch: x86_64

Add a version in cfg/sites/mysite/site/os_version.tpl for the machine.

During basic tests: "su: user betest000 does not exist"

The post-installation step failed. Check ~/ks-post-install.log and look for relevant errors.

ccm-fetch fails with "sslv3 alert certificate expired"

Wait an hour for Apache to generate a new certificate.

When post-ks has failed, how can I begin to debug it?

    • The following programs can be used to retry a step in the KS file:
      • ncm-ncd --configure <module>
      • (Several missing, please add them)

How / when are the profiles updated?

(The FAQ needs attention.)

>>> (BTW. How long does it take for the configuration files to be applied on the nodes ?)
>> what are you doing or trying to do?
> Well assume I made a mistake in my quattor (for example I just corrected my SITE_NAME).
> How do I push the update without reinstalling the node from scratch ?
run runcheck (if you check stdout, at the end you will see that it notifies some machines).
> Or does it even work automatically? (as I understood it, it would happen automatically, however I don't know how fast it would update).
it's started at random in a 20sec interval after the notification is received.
in /var/log/messages there should be an entry that it received the notification
e.g.
Apr 16 14:20:57 mon /usr/sbin/cdp-listend[1584]: Received UDP packet (ccm|1208348393) from 193.190.246.219
Apr 16 14:20:59 mon /usr/sbin/cdp-listend[1584]: /usr/sbin/ccm-fetch will be called in 20 seconds
Apr 16 14:21:17 mon /usr/sbin/cdp-listend[1584]: Calling /usr/sbin/ccm-fetch with unix time 1208348393 (after 20 seconds)

then ccm-fetch will pull in the profile, compare it with what it has, and if something is different, start the components that are involved (well, sort of). this start is also at random within a 5 minute interval.
this is then logged in /var/log/ncm-cdispd.log


if nothing is logged there, please check if cdp-listend and ncm-cdispd are running.
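One way to check both daemons at once is to scan the process table for their names. This is a minimal sketch that assumes a Linux /proc filesystem and that the daemon name appears in the process command line; the function name running_daemons is hypothetical and not a Quattor tool.

```python
import os


def running_daemons(names, proc_root="/proc"):
    """Map each name to True if some process command line contains it."""
    found = {name: False for name in names}
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no readable /proc: report nothing as running
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        path = os.path.join(proc_root, entry, "cmdline")
        try:
            with open(path, "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # process exited or is not readable
        cmdline = raw.replace(b"\0", b" ").decode(errors="replace")
        for name in names:
            if name in cmdline:
                found[name] = True
    return found


if __name__ == "__main__":
    for name, alive in running_daemons(["cdp-listend", "ncm-cdispd"]).items():
        print(name, "running" if alive else "NOT running")
```

If either daemon is reported as not running, notifications will not be received or the components will not be dispatched.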


