BEgridClient
BEgrid Quattor Client
THIS IS AN OLD PAGE FOR CB4: PLEASE DON'T USE IT !!! FOLLOW THE INSTRUCTIONS ON THE NEW CB5 PAGE: http://quattor.begrid.be/trac/centralised-begrid-v5/wiki/BEgridClientv5
A new version of the BEgrid Quattor Client, version 5 (CB5), is now available. Go to UpdateToCB5 to update from CB4 to CB5. Here is the CB5 Wiki
I. Description
This page describes in detail how to set up a 'BEgrid Quattor Client', which is a server (!) that allows installation and configuration of one or more clusters at a particular site, for integration in BEgrid. The actual configuration of the site, as well as all necessary software packages, is retrieved from the 'Centralised BEgrid (CB) Repository'.
This 'BEgrid Quattor Client' can be installed on a dedicated server, but can also be run as a (Xen) virtual server. One option, for which setup instructions are provided, is to use one machine for NAT and local DNS (for the worker nodes), and have the BEgrid Quattor Client as a virtual server on that same box.
The 'BEgrid Quattor Client' will fulfill the following tasks:
- Pxe-boot server
- Reverse proxy + cache webserver
- Fetch templates + search and replace + build
- machine templates
- ks+pxe profiles
- AII Server: automatically contact all cluster nodes to get new profiles after a build, so that all nodes always have an up-to-date configuration.
I.1. Requirements
- Some disk space for the cache (15GB)
- A certificate from a user that can connect to the CB
- Firewall settings should be pretty tight: this server contains all host certificates ... !
I.2. Current situation
Origin:
- quattor.begrid.be is the official Centralised BEgrid server.
- http://quattor.begrid.be/begrid/Central_BEGrid_Repository
- The repository for BEgrid for all platforms
- http://quattor.begrid.be/begrid/install
- Extra installation files for BEgrid for all platforms
II. Practical
II.1. Managing the BEgrid client
II.1.1 Access for admins
- To get access to the Centralised BEgrid (CB), send your IP(range) to the contact person.
- To get access to the CB-SCDB and to the SWREP-repository, send your BEGRID-cert DN to the contact person.
Use bulanza@helios.iihe.ac.beNOSPAM as contact person or begrid-tech@lists.belnet.beNOSPAM as technical mailing list.
II.1.2. Admin tools
To be installed on your normal pc/laptop.
Eclipse support
SCDB client setup in official guide here.
Specific info for the new Panc v7 here.
Setup steps to checkout the centralised repository:
- go to Window -> Show view -> Other -> SVN -> SVN Repository
- rightclick -> New -> Repository Location
- Url: https://quattor.begrid.be/repos/centralised-begrid-v5/trunk
- Root: https://quattor.begrid.be/repos/centralised-begrid-v5
- Press Finish (no login/passwd needed, only your certificate.)
- If there's an error, make sure that you are allowed to access the repository and that your .subversion/servers file is correctly set.
- rightclick on https://quattor.begrid.be/repos/centralised-begrid-v5/trunk -> Checkout As -> Simple -> Project
- Give it a name: eg centralised-begrid-v5
- Finish
- In Navigator, there should be a folder called centralised-begrid-v5
- This folder contains a file .project. Rightclick -> Team -> add to: svn ignore
- Newer versions of Eclipse do not show hidden files. To make them visible, go to Windows -> Navigation -> Show View Menu -> Filters
- Check the box called .* resources and press OK to confirm
- Create a new folder build (it will contain all locally built XML files, which should never be uploaded). Rightclick -> Team -> add to: svn ignore
Script
- svncheck: python script to help fetch/build etc
Quattor Configuration changes
Private info
Don't add passwords or any other secret information to the repository. Every cluster configuration has one directory called private, which is overwritten on the final build machine (i.e. the client machine) with files that can contain this private information. In the repository, place files with e.g. dummy global variables instead, so that you can build and test the profiles locally.
- e.g.: cfg/sites/begrid/private/passwd.tpl.
- Note that the values assigned to the ROOT_PASSWD and AII_OSINSTALL_ROOTPW variables are MD5 hashes. Use the output of the following command for each of them, preferably with a different password for each (since AII_OSINSTALL_ROOTPW is added to the ks file, and the ks file is served through plain http).
openssl passwd -1
- Other variables are set with plaintext passwords.
New files with dummy private information should be added in cfg/clusters/name_of_cluster/private !!
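The hashing step for ROOT_PASSWD and AII_OSINSTALL_ROOTPW can be scripted. The following is only a sketch: the helper name md5_crypt_hash and the sample passwords are hypothetical, and it assumes an openssl binary that still supports the -1 (MD5 crypt) option.

```shell
#!/bin/sh
# Hypothetical helper: print an MD5-crypt hash (the format produced by
# `openssl passwd -1`) for the password given as $1.
# The password is passed on stdin so it does not show up in the process list.
md5_crypt_hash() {
    printf '%s\n' "$1" | openssl passwd -1 -stdin
}

# Use a different password for each variable, as recommended above.
# (Sample values only; never commit real passwords or their hashes.)
root_hash=$(md5_crypt_hash 'example-root-password')
ks_hash=$(md5_crypt_hash 'example-kickstart-password')

echo "ROOT_PASSWD hash:          $root_hash"
echo "AII_OSINSTALL_ROOTPW hash: $ks_hash"
```

The printed values would then go into the private passwd.tpl on the client machine, not into the dummy file kept in the repository.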
#comment ==== TODO ==== Things that might give problems:
- AII: Probably the best thing to do is to standardise on one aii ks_template and ship it with the SCDB (the ant tasks to create the final ones aren't working yet).
- It's very possible that the rpm checking against the repository file in the current quattor.jar in SCDB is buggy. Might renew it...
III. Installation instructions
III.1. Base install SL5
- Get latest SL5
- Get image from:
wget http://linuxsoft.cern.ch/scientific/50/i386/images/boot.iso
- Burn to CD (check with -scanbus):
cdrecord dev=x,x,x boot.iso
- Boot and install using http installation. Take eg fast CERN mirror:
linuxsoft.cern.ch scientific/50/i386/
- choose server, no firewall (to avoid complications; set it up later!)
- if you are not using the Xen based setup, make sure that /var (either as a separate partition or as part of /) has enough disk space available for the rpm caching. (At least 15GB of free space is needed for that.)
- Complete the install; choose a proper name/network config. This will depend on the way you want to use this server: as a 'BEgrid Client', or as a 'Xen master', on which you'll install a Xen client that is to become the 'BEgrid Client'
III.2. Base SL5 post-installation
- Install the rpmforge-rpm:
rpm -Uvh http://apt.sw.be/redhat/el5/en/i386/RPMS.dag/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
[this is a backup mirror just in case the first one fails]
rpm -Uvh http://wftp.tu-chemnitz.de/pub/linux/dag/redhat/el5/en/i386/RPMS.dag/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
- Change yum default repositories to CERN ones (faster and more reliable connection). Just run this line:
for i in /etc/yum.repos.d/*; do sed -i 's#ftp://ftp.scientificlinux.org/linux#http://linuxsoft.cern.ch#' "$i"; done
- Stop 'nightly yum update':
service yum stop
chkconfig --del yum
- Install ntp (if not yet done)
yum install ntp
chkconfig --level 345 ntpd on
echo "server ntp.belnet.be" >> /etc/ntp.conf
echo "restrict ntp.belnet.be mask 255.255.255.255 nomodify notrap noquery" >> /etc/ntp.conf
service ntpd start
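Appending with echo >> adds a duplicate line every time the commands are rerun. A minimal sketch of an idempotent alternative follows; the helper name append_once is hypothetical, and the demonstration writes to a scratch file rather than to /etc/ntp.conf.

```shell
#!/bin/sh
# Hypothetical helper: append line $2 to file $1 only if that exact line
# is not already present (grep -x: whole-line match, -F: fixed string).
append_once() {
    grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Demonstration on a scratch file; on the real server the target would be
# /etc/ntp.conf.
conf=$(mktemp)
append_once "$conf" "server ntp.belnet.be"
append_once "$conf" "server ntp.belnet.be"   # second call changes nothing
append_once "$conf" "restrict ntp.belnet.be mask 255.255.255.255 nomodify notrap noquery"
cat "$conf"
```

The same helper could be used for the other configuration files that this page extends with echo >>, such as /etc/aii-shellfe.conf.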
Using Xen ?
- If you are running your BEgrid Client on a Xen Virtual Machine, follow the instructions in this link; if you're installing the BEgrid Client itself, continue here ...
III.3. Webservice
We'll be serving profiles, and (through reverse proxying) rpm's for the installation of the local cluster. So, install and configure the Apache Webserver as follows:
- Install httpd for SL5
yum install httpd
- Configuration for the reverse proxy + cache:
- Taken from mod_cache http://httpd.apache.org/docs/2.2/mod/mod_cache.html and mod_proxy http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
- A reverse proxy is the only setup supported by Quattor: your profiles will point to the rpm repository at quattor.begrid.be, but in fact your local BEgrid Client will fetch the rpms, cache them (in theory this is optional), and provide them to the node that is being installed.
- Using a disk cache is preferred to lower the load on the CB and the network (and it should be faster)
- Add following section(s) to /etc/httpd/conf/httpd.conf:
#
# Reverse Proxy (Added for AII)
#
# Comment this line if modules are already loaded in your default httpd.conf
LoadModule proxy_module modules/mod_proxy.so
ProxyRequests Off
<Proxy *>
    Order deny,allow
    Allow from all
</Proxy>
ProxyMaxForwards 15
ProxyReceiveBufferSize 0
ProxyTimeout 300
<Location /begrid/>
    ProxyPass http://quattor.begrid.be/begrid/
    ProxyPassReverse /
</Location>
#
# Disk Cache (Added for AII)
#
# Comment these lines if modules are already loaded in your default httpd.conf
LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so
## Directory to host the cache
CacheRoot /var/www/cache
## Max size of total cache in kb (obsoleted by Apache 2.2, use htcacheclean instead as explained below)
#CacheSize 15000000
CacheEnable disk /begrid
## CacheDirLevels*CacheDirLength must be smaller than 20 !!
## don't set this higher than necessary
## following setting will create 64*64=4096 subdirectories
## for all possible hashes 64^22
CacheDirLevels 2
CacheDirLength 1
## in bytes (1GB, should be enough for openoffice)
CacheMaxFileSize 1000000000
CacheMinFileSize 1
## expire after 100 days
CacheDefaultExpire 8640000
CacheMaxExpire 10000000
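The numbers in the cache comments can be checked with a little arithmetic. This sketch assumes, as the configuration comment does, that every character of a cache directory name is drawn from a 64-character alphabet; the function name cache_leaf_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of leaf directories the disk cache can create
# for a given CacheDirLevels ($1) and CacheDirLength ($2), assuming each
# directory-name character comes from a 64-character alphabet.
cache_leaf_dirs() {
    levels=$1
    length=$2
    chars=$((levels * length))   # hash characters consumed by directory names
    total=1
    i=0
    while [ "$i" -lt "$chars" ]; do
        total=$((total * 64))
        i=$((i + 1))
    done
    echo "$total"
}

# The setting used above (CacheDirLevels 2, CacheDirLength 1): 64*64
cache_leaf_dirs 2 1

# CacheDefaultExpire is given in seconds: 100 days
echo $((100 * 24 * 60 * 60))
```

Raising either value multiplies the directory count by 64 per extra character, which is why the comment advises not setting them higher than necessary.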
- Create the cache directory (unless it was already created, eg when you followed the 'Xen' procedure ...)
mkdir /var/www/cache
chown apache.apache /var/www/cache
- restart httpd and watch the output:
/etc/init.d/httpd restart
- Output 1: Stopping httpd: [FAILED]
This means that httpd was not running and should be added to the default startup processes:
chkconfig --add httpd
chkconfig --level 3 httpd on
- Output 2: [warn] module <module name> is already loaded, skipping
This means that the modules were already loaded in httpd.conf. This warning can be ignored or cleaned up by removing the duplicate LoadModule entries.
- Since Apache 2.2, the CacheSize directive is no longer used. To limit the disk space allocated for caching, you have to use htcacheclean. For that, create the following cron job in /etc/cron.hourly/htcacheclean-cron.sh
#!/bin/sh
htcacheclean -v -n -p/var/www/cache -l15000000K
#comment **TODO: Look at: CacheMaxExpire
III.4. AII (Automated Installation Infrastructure)
- Install the basics. There is now a meta-package for the CB client installation; it installs all AII components plus everything needed for panc and svncheck.
- Add a file /etc/yum.repos.d/cb-v4-sl5.repo with the following content:
[cb-v4]
name=CB server - client repo - SL5
baseurl=http://quattor.begrid.be/begrid/install/apt/RPMS.cb-v4_i386_sl5/
enabled = 1
[quattor]
name=Quattor repo - SL4
#baseurl=http://quattorsw.web.cern.ch/quattorsw/software/quattor/yum/1.3/i386/RPMS.quattor_sl4
baseurl=http://quattor.begrid.be/begrid/install/apt/RPMS.quattor_i386_sl4/
enabled = 1
[rpmforge]
name = Red Hat Enterprise - RPMforge.net - dag
#baseurl = http://apt.sw.be/redhat/el5/en//dag
mirrorlist = http://apt.sw.be/redhat/el5/en/mirrors-rpmforge
#mirrorlist = ///etc/yum.repos.d/mirrors-rpmforge
enabled = 1
protect = 0
- and run:
yum install cb-client
- Create a proper /etc/aii-shellfe.conf
- Either copy /usr/share/doc/aii-1.0.43/eg/aii-shellfe.conf and edit to provide a correct cdburl
- Or create one (assuming hostname -f returns a FQDN!)
cat <<EOF >/etc/aii-shellfe.conf
#
# aii-shellfe.conf, created for BeGrid
#
cdburl = https://$(hostname -f):444/profiles
profile_prefix = profile_
cert_file = /etc/sindes/certs/apache.crt
key_file = /etc/sindes/keys/apache.key
ca_file = /etc/sindes/certs/ca.crt
EOF
cat /etc/aii-shellfe.conf
- If you are not using SINDES put these values instead:
cat <<EOF >/etc/aii-shellfe.conf
#
# aii-shellfe.conf, created for BeGrid
#
cdburl = http://$(hostname -f)/profiles
profile_prefix = profile_
EOF
cat /etc/aii-shellfe.conf
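Both variants of aii-shellfe.conf embed the output of hostname -f in cdburl, so it may be worth checking that output before writing the file. A rough sketch, with a hypothetical helper name and a deliberately simple definition of "looks like a FQDN" (at least one dot, with text on both sides):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 looks like a fully qualified
# domain name, i.e. it contains at least one dot with text on both sides.
looks_like_fqdn() {
    case "$1" in
        .*|*.|*..*) return 1 ;;   # leading, trailing or doubled dot
        *.*)        return 0 ;;   # host.domain[...]
        *)          return 1 ;;   # short host name or empty string
    esac
}

# Fall back to plain hostname (or an empty string) if -f is not available.
name=$(hostname -f 2>/dev/null || hostname 2>/dev/null || echo "")
if looks_like_fqdn "$name"; then
    echo "OK: '$name' can be used in cdburl"
else
    echo "WARNING: '$name' is not a FQDN; check /etc/hosts or DNS first" >&2
fi
```

This is only a syntactic check; it does not verify that the name resolves from the cluster nodes.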
- modify /usr/share/doc/aii-1.0.43/eg/dhcpd.conf and copy it to /etc/dhcpd.conf
Example for IIHE
#
# DHCPD Config for AII
#
# Uncomment this line if ISC DHCP ver. 3
ddns-update-style ad-hoc;
# write here your network name
shared-network iihe.ac.be {
    deny unknown-clients;
    not authoritative;
    # Write here your domain name
    option domain-name "iihe.ac.be";
    # Parameters for the installation via PXE using pxelinux
    filename "pxelinux.0";
    # Uncomment this line if ISC DHCP ver. 2
    # option dhcp-class-identifier "PXEClient";
    # Uncomment this line if ISC DHCP ver. 3
    option vendor-class-identifier "PXEClient";
    option vendor-encapsulated-options 01:04:00:00:00:00:ff;
    # Complete with (at least) the gateway + DNS.
    # Hosts entries will be inserted
    # automatically by AII in this section
    subnet 193.190.246.0 netmask 255.255.255.0 {
        option routers 193.190.246.65;
        option domain-name-servers 193.190.246.229;
    }
    # remove the following subnet if you are not using
    # a private network, otherwise keep it and adapt it
    # to your site
    subnet 192.168.0.0 netmask 255.255.0.0 {
        option routers 192.168.10.100;
        option domain-name-servers 192.168.10.100;
    }
}
- start the DHCP daemon at boot:
chkconfig --add dhcpd
chkconfig --level 345 dhcpd on
- configure syslinux and tftp-server (last one uses hosts.* for acl):
mkdir -p /osinstall/nbp/i386_slc3_308
cd /osinstall/nbp/i386_slc3_308
wget http://linuxsoft.cern.ch/cern/slc308/i386/images/pxeboot/vmlinuz
wget http://linuxsoft.cern.ch/cern/slc308/i386/images/pxeboot/initrd.img
ln -s /osinstall/ks /var/www/html/ks
- in /etc/xinetd.d/tftp modify the following options (aii-server example in /usr/share/doc/aii-1.0.43/eg/tftp.example)
server_args = -s /osinstall/nbp
disable = no
- restart the corresponding service
service xinetd restart
- the default firewall settings of SL5 block tftp traffic (and probably also eg http to port 444 for SINDES).
Either configure the firewall properly or disable iptables altogether:
/etc/init.d/iptables stop
chkconfig iptables off
chkconfig --del iptables
- allow acknowledgment script to do its work:
cp /usr/sbin/aii-installack.cgi /var/www/cgi-bin
chmod o+rx /var/www/cgi-bin/aii-installack.cgi
- get the BEgrid kickstart template from quattor CVS (use the same command to update it!)
wget -O /usr/lib/aii/osinstall/sl_ks_begrid.conf \
    'http://quattor.begrid.be/trac/centralised-begrid-v4/attachment/wiki/BEgridClient/sl_ks_begrid.conf?format=raw'
- if you use the machine profile names with a FQDN, you must do
echo "use_fqdn = 1" >> /etc/aii-shellfe.conf
- add apache to /etc/sudoers (MUST be done for private interfaces (with private fqdn) as well!!):
echo "apache f.q.d.n=(ALL) NOPASSWD: /usr/sbin/aii-shellfe" >> /etc/sudoers
Also comment out the following line in /etc/sudoers:
Defaults requiretty
#comment Add the following to /usr/lib/perl/NCM/Template.pm (before ''$data .= $rep_url;''):
## add spma-proxy support
my $ppath="/software/components/spma/proxy";
if ($cfg->elementExists($ppath) && $cfg->getValue($ppath) eq "yes") {
    $ppath="/software/components/spma/proxyhost";
    if ($cfg->elementExists($ppath)) {
        my $hhost=$cfg->getValue($ppath);
        ## assume http if proxy is used?
        if ($rep_url =~ m/http:\/\/(.*?)\//) {
            $rep_url =~ s/$1/$hhost/;
        } else {
            $self->error("SPMA proxy is set, but no http access protocol to repository: $rep_url?");
        }
    } else {
        $self->error("SPMA proxy is set, but no host defined?");
    }
}
III.5. svncheck
- This tool depends on pysvn. The default tarball contains a version of pysvn built against subversion 1.4.2 i386 that comes with SL50.
- get the client-script tarball
wget http://quattor.begrid.be/begrid/install/cb-v4-client.tar.gz
- extract it somewhere on the machine. e.g. /opt
tar xzfv cb-v4-client.tar.gz
- The Centralised-begrid (/opt/cb) folder has the following structure:
- /opt/cb/keys: this one holds the begrid CA certificate and a valid user .p12 file. (This is used to connect to the SCDB-server.)
- /opt/cb/subversion: some subversion specific parameters. edit the servers file:
- correct full path to key (.p12 file)
- plaintext passwd for the key (it does not prompt for the passwd)
- /opt/cb/tmp: will contain the checkout and build files.
- /opt/cb/private: here you can put private files (templates holding passwords and certificates, such as passwd.tpl and pub_key.tpl)
- svncheck does this by simple copy from this directory into cfg/clusters. So keep that structure.
- remove the template cluster given as example, otherwise runcheck will try to build it later ...
- /opt/cb/private/<clustername-glite-version>/passwd.tpl
- This file contains the passwords that will be used for your site.
- You can pick any password you like.
- (Unless certain nodes are not configured with Quattor; in that case the passwords must match those of the non-Quattor nodes.)
- /opt/cb/private/<clustername-glite-version>/local_users.tpl
- ???
- Not needed for a CE or a WN.
- /opt/cb/private/<clustername-glite-version>/pub_key.tpl
- Contains the SSH key that will be used for remote ssh access to the nodes.
- More info on generating a key can be found here: http://sial.org/howto/openssh/publickey-auth/
- /opt/cb/private/<clustername-glite-version>/<your_ce_fqdn>.tpl
- Use the certificate obtained using
- https://gridra.belnet.be/cgi-bin/pub/pki?cmd=basic_csr
- or http://mon.iihe.ac.be/trac/t2b/wiki/Certificates_and_VOs
- /opt/cb/svncheck: this is the code written by Jean-François Roche (jfroche@jfroche.be):
- in config.conf you can specify most needed parameters.
- svn_repos: point it to the trunk of the centralised-begrid repository. building tags relies on this!
- cluster_regexp: a regexp to build only these clusters (not used ATM)
- modify the after_args option in the script1 to compile with the correct task.
after_args = compile.profiles.iihe-glite-test
- DON'T FORGET to change the email section
- ./runcheck -h for more info
III.6. Optional : Setup access to BEgrid SWREP repository
All software on the nodes is installed from the BEgrid Centralised Repository. Optionally, you can ask for permission and configure your system so that you can add new packages to this repository yourself. This is needed e.g. if you use SINDES.
- send your DN to the contact people and ask for access to the SWREP repository
- install and configure the additional rpms that will give you write access to the central rpm repository
yum install swrep-soap-client cdb-soap-auth-x509
mkdir -p /etc/swrep
- authentication is done using your BEgrid certificate that you will also need for svncheck
- for the ca-file, cert-file and key-file fields you can use the same values as for svncheck. They only need to be in .pem format. For the conversion of the p12 to pem, use:
openssl pkcs12 -nocerts -in /path/to/p12 -out /path/to/key.pem
openssl pkcs12 -nokeys -clcerts -in /path/to/p12 -out /path/to/cert.pem
- make the following configuration file /etc/swrep/swrep-soap-client.conf
server = quattor.begrid.be
use-cert = 1
ca-file = /path/to/ca.crt
cert-file = /path/to/cert.pem
key-file = /path/to/key.pem
- you are now ready to use swrep-soap-client command
- HOWTO
- pull: add rpm from a remote website
swrep-soap-client pull x86_64_dcache_sl4 /dcache/17 http://www.dcache.org/downloads/1.7.0/sl4/x86_64/dcache-client-1.7.0-39.x86_64.rpm
- put: add rpm using a local file
swrep-soap-client put noarch_sindes /sindes /usr/src/redhat/RPMS/noarch/SINDES-ca-certificate-q3-0.1-2.noarch.rpm
- tips
- to avoid the Enter user-name question, do export SWREP_USER=<user>
- when adding lots of rpms, typing the passphrase for the key every time is not very user friendly. Instead, create a temporary key file that has no passphrase and use that unencrypted key to connect. Don't forget to delete it afterwards!
openssl rsa -in /path/to/protected-key.pem -out ~/tmp-key
swrep-soap-client --use-key ~/tmp-key <command>
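Forgetting to delete the unencrypted key is easy, so the create/use/delete cycle can be wrapped in a small function. This is a sketch under assumptions: the function name with_plain_key and the TMP_KEY environment variable are hypothetical, and the key is assumed to be an RSA key as in the openssl rsa command from the tip.

```shell
#!/bin/sh
# Hypothetical wrapper: decrypt the protected key $1 into a temporary file,
# run the remaining arguments as a command with TMP_KEY pointing at that
# file, then delete the temporary key whether or not the command succeeded.
with_plain_key() {
    protected_key=$1
    shift
    tmp_key=$(mktemp) || return 1      # created readable by the owner only
    if openssl rsa -in "$protected_key" -out "$tmp_key"; then
        TMP_KEY=$tmp_key "$@"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp_key"
    return "$status"
}

# Usage sketch (not executed here); openssl asks for the passphrase once:
# with_plain_key /path/to/protected-key.pem \
#     sh -c 'swrep-soap-client --use-key "$TMP_KEY" <command>'
```

The wrapper does not remove the file if the shell itself is killed in the middle of the command, so a manual check afterwards is still sensible.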
III.7. Install SINDES
Follow the instructions here. Notice that this step is now mandatory!
(You might want to read the presentation on SINDES to learn what SINDES is.)
Issues
Download problems
- If you have problems with retrieving files
- during the installation: images, rpms
- when running spma
- rpms that seem to fail with IO errors
- the main problem is probably the not so excellent http caching provided by Apache
- it could also be because there was a network error during the caching
- easiest is to cleanup the cache
- be aware that the run after this will be slower due to the recaching of all data
/etc/init.d/httpd stop
rm -Rf /var/www/cache/*
/etc/init.d/httpd start
Links
- LAL software http://quattor.web.lal.in2p3.fr/packages/