BEgridClient

From Begrid Wiki
Revision as of 09:12, 9 June 2021 by Maintenance script (talk | contribs) (Created page with " PageOutline == BEgrid Quattor Client == '''THIS IS AN OLD PAGE FOR CB4: PLEASE DON'T USE IT !!! FOLLOW INSTRUCTION ON THE NEW CB5 http://quattor.begrid.be/trac/cen...")


BEgrid Quattor Client

THIS IS AN OLD PAGE FOR CB4: PLEASE DON'T USE IT! FOLLOW THE INSTRUCTIONS FOR THE NEW CB5: http://quattor.begrid.be/trac/centralised-begrid-v5/wiki/BEgridClientv5

A new version of the BEgrid Quattor Client, version 5 (CB5), is available. Go to UpdateToCB5 to update from CB4 to CB5. See also the CB5 Wiki.

I. Description

This page describes in detail how to set up a 'BEgrid Quattor Client', which is a server (!) that allows installation and configuration of one or more clusters at a particular site, for integration into BEgrid. The actual configuration of the site, as well as all necessary software packages, is retrieved from the 'Centralised BEgrid (CB) Repository'.

This 'BEgrid Quattor Client' can be installed on a dedicated server, but can also run as a (Xen) virtual server. One option, for which setup instructions are provided, is to use one machine for NAT and local DNS (for the worker nodes), and to run the BEgrid Quattor Client as a virtual server on that same box.

The 'BEgrid Quattor Client' will fulfill the following tasks:

  • PXE boot server
  • Reverse proxy + cache webserver
  • Fetch templates + search and replace + build
    • machine templates
    • ks+pxe profiles
  • AII server: automatically contacts all cluster nodes to get new profiles after a build, so that all nodes always have an up-to-date configuration.

I.1. Requirements

  • Some disk space for the cache (15 GB)
  • A certificate from a user that can connect to the CB
  • Firewall settings should be tight: this server contains all host certificates!

I.2. Current situation

Origin:

  • quattor.begrid.be is the official Centralised BEgrid server.

II. Practical

II.1. Managing the BEgrid client

II.1.1 Access for admins

  • To get access to the Centralised BEgrid (CB), send your IP address (or range) to the contact person.
  • To get access to the CB-SCDB and to the SWREP repository, send your BEgrid certificate DN to the contact person.

Use bulanza@helios.iihe.ac.beNOSPAM as the contact person or begrid-tech@lists.belnet.beNOSPAM as the technical mailing list.

II.1.2. Admin tools

To be installed on your normal pc/laptop.

Eclipse support

SCDB client setup in official guide here.
Specific info for the new Panc v7 here.


Setup steps to checkout the centralised repository:

  1. Go to Window -> Show view -> Other -> SVN -> SVN Repository
  2. Right-click -> New -> Repository Location
    1. Url: https://quattor.begrid.be/repos/centralised-begrid-v5/trunk
    2. Root: https://quattor.begrid.be/repos/centralised-begrid-v5
    3. Press Finish (no login/password needed, only your certificate).
    4. If there's an error, make sure that you are allowed to access the repository and that your .subversion/servers file is set correctly.
  3. Right-click on https://quattor.begrid.be/repos/centralised-begrid-v5/trunk -> Checkout As -> Simple -> Project
    1. Give it a name, e.g. centralised-begrid-v5
    2. Finish
  4. In Navigator, there should now be a folder called centralised-begrid-v5
    1. This folder contains a file .project. Right-click -> Team -> add to: svn ignore
      1. Newer versions of Eclipse do not show hidden files. To be able to access them, go to Windows -> Navigation -> Show View Menu -> Filters
      2. Check the box called .* resources and press OK to confirm
    2. Create a new folder build (it will contain all locally built xml files, which should never be uploaded). Right-click -> Team -> add to: svn ignore

Script

  • svncheck: python script to help fetch/build etc

Quattor Configuration changes

Private info

Don't add passwords or any other secret information to the repository. Every cluster configuration has one directory called private; on the final build machine (i.e. the client machine) its contents are overwritten with files that contain the real private information. In the repository, place files with dummy values instead (e.g. dummy global variables), so that you can build and test the profiles locally.

  • e.g.: cfg/sites/begrid/private/passwd.tpl.
  • Note that the values assigned to the ROOT_PASSWD and AII_OSINSTALL_ROOTPW variables are MD5 hashes. Use the output of the following command for each of them. Preferably choose a different password for each (since AII_OSINSTALL_ROOTPW is added to the ks file, and the ks file is served over plain http).
openssl passwd -1
  • Other variables are set with plaintext passwords.
New files with dummy private information should be added in cfg/clusters/name_of_cluster/private!
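
As a minimal sketch of the hashing step, the two hashes can also be generated non-interactively. The passwords and the md5crypt_hash helper name are placeholders chosen for illustration; the -stdin option makes openssl read the password from standard input instead of prompting.

```shell
# Hypothetical helper: read one password from stdin, print its MD5-crypt hash.
md5crypt_hash() {
    openssl passwd -1 -stdin
}

# Placeholder passwords; use two different real ones on your site.
root_hash=$(printf '%s\n' 'dummy-root-password' | md5crypt_hash)
ks_hash=$(printf '%s\n' 'dummy-kickstart-password' | md5crypt_hash)

echo "ROOT_PASSWD hash:          $root_hash"
echo "AII_OSINSTALL_ROOTPW hash: $ks_hash"
```

MD5-crypt hashes start with $1$ and include a random salt, so the output differs on every run.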


#comment
TODO

Things that might give problems:

  • AII: probably the best thing to do is to standardise on one AII ks_template and ship it with the SCDB (the ant tasks to create the final ones aren't working yet).
  • It is quite possible that the rpm checking against the repository file in the current quattor.jar in SCDB is buggy. It might need to be renewed.

III. Installation instructions

III.1. Base install SL5

  • Get latest SL5
    • Get image from:
 wget http://linuxsoft.cern.ch/scientific/50/i386/images/boot.iso
    • Burn to CD (check with -scanbus):
 cdrecord dev=x,x,x boot.iso  
    • Boot and install using http installation. Take eg fast CERN mirror:
 linuxsoft.cern.ch
 scientific/50/i386/
    • Choose server, no firewall (to avoid complications; set it up later!)
    • If you are not using the Xen-based setup, make sure that /var (either as a separate partition or as part of /) has enough disk space available for the rpm caching (at least 15 GB of free space is needed for that).
    • Complete the install; choose a proper name/network config. This will depend on the way you want to use this server: as a 'BEgrid Client', or as a 'Xen master', on which you'll install a Xen client that is to become the 'BEgrid Client'.

III.2. Base SL5 post-installation

  • Install the rpmforge-rpm:
rpm -Uvh http://apt.sw.be/redhat/el5/en/i386/RPMS.dag/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
[this is a backup mirror just in case the first one fails]
rpm -Uvh http://wftp.tu-chemnitz.de/pub/linux/dag/redhat/el5/en/i386/RPMS.dag/rpmforge-release-0.3.6-1.el5.rf.i386.rpm
  • Change the default yum repositories to the CERN ones (faster and more reliable connection). Just run this line:
for i in /etc/yum.repos.d/*; do sed -i 's#ftp://ftp.scientificlinux.org/linux#http://linuxsoft.cern.ch#' "$i"; done
  • Stop 'nightly yum update':
  service yum stop
  chkconfig --del yum
  • Install ntp (if not yet done)
  yum install ntp
  chkconfig --level 345 ntpd on 
  echo "server ntp.belnet.be" >> /etc/ntp.conf
  echo "restrict ntp.belnet.be mask 255.255.255.255 nomodify notrap noquery" >> /etc/ntp.conf
  service ntpd start
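
Before editing the real repository files, the mirror substitution above can be tried on a scratch copy. The sketch below is only an illustration: the sl.repo file name and its contents are made up, and the scratch directory stands in for /etc/yum.repos.d.

```shell
# Scratch directory standing in for /etc/yum.repos.d (sample content is made up).
scratch=$(mktemp -d)
cat > "$scratch/sl.repo" <<'EOF'
[sl-base]
name=SL 5 base
baseurl=ftp://ftp.scientificlinux.org/linux/scientific/50/i386/SL/
enabled=1
EOF

# Same substitution as for the real files, applied to every file in the scratch directory.
for f in "$scratch"/*; do
    sed -i 's#ftp://ftp.scientificlinux.org/linux#http://linuxsoft.cern.ch#' "$f"
done

# Show the rewritten baseurl line, then clean up.
new_baseurl=$(grep '^baseurl=' "$scratch/sl.repo")
echo "$new_baseurl"
rm -rf "$scratch"
```

If the printed baseurl points at linuxsoft.cern.ch, the same loop can be run against the real directory.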

Using Xen ?

  • If you are running your BEgrid Client on a Xen virtual machine, follow the instructions in this link; if you're installing the BEgrid Client itself, continue here.

III.3. Webservice

We'll be serving profiles and (through reverse proxying) RPMs for the installation of the local cluster. So, install and configure the Apache webserver as follows:

  • Install httpd for SL5
  yum install httpd
  • Configuration for the reverse proxy + cache:
    • Taken from mod_cache http://httpd.apache.org/docs/2.2/mod/mod_cache.html and mod_proxy http://httpd.apache.org/docs/2.2/mod/mod_proxy.html
    • A reverse proxy is the only kind supported by Quattor: your profiles will point to the rpm repository at quattor.begrid.be, but in fact your local BEgrid Client will get the rpms, cache them (in theory this is optional), and provide them to the node that is being installed.
    • Using a disk cache is preferred, to lower the load on the CB and the network (and it should be faster).
    • Add the following section(s) to /etc/httpd/conf/httpd.conf:
#
# Reverse Proxy  (Added for AII)
#
# Comment this line if modules are already loaded in your default httpd.conf
LoadModule proxy_module modules/mod_proxy.so

ProxyRequests Off
<Proxy *>
        Order deny,allow
        Allow from all
</Proxy>

ProxyMaxForwards 15
ProxyReceiveBufferSize 0
ProxyTimeout 300

<Location /begrid/>
        ProxyPass http://quattor.begrid.be/begrid/
        ProxyPassReverse http://quattor.begrid.be/begrid/
</Location>

#
# Disk Cache (Added for AII)
#
# Comment these lines if modules are already loaded in your default httpd.conf
LoadModule cache_module modules/mod_cache.so
LoadModule disk_cache_module modules/mod_disk_cache.so

## Directory to host the cache
CacheRoot /var/www/cache

## Max size of total cache in kb (obsoleted by Apache 2.2, use htcacheclean instead as explained below)
#CacheSize 15000000

CacheEnable disk /begrid

## CacheDirLevels*CacheDirLength must not exceed 20 !!
## don't set this higher than necessary
## the following setting will create 64*64=4096 subdirectories
## for all possible hashes (64^22)
CacheDirLevels 2
CacheDirLength 1

## in bytes (1GB, should be enough for openoffice)
CacheMaxFileSize 1000000000
CacheMinFileSize 1

## expire after 100 days
CacheDefaultExpire 8640000
CacheMaxExpire 10000000
    • Create the cache directory (unless it was already created, e.g. when you followed the 'Xen' procedure):
  mkdir /var/www/cache; chown apache:apache /var/www/cache
    • Restart httpd and watch the output:
  /etc/init.d/httpd restart
    • Output 1: Stopping httpd: [FAILED]

This means that httpd was not running by default and should be added to the default startup processes:

  chkconfig --add httpd
  chkconfig --level 3 httpd on
    • Output 2: [warn] module <module name> is already loaded, skipping

This means that the modules were already loaded in httpd.conf. This warning can be ignored, or cleaned up by removing the duplicate LoadModule entries.

    • Since Apache 2.2, the 'CacheSize' directive is no longer used. To limit the disk space allocated for caching, you have to use htcacheclean instead. For that, create the following cron job in /etc/cron.hourly/htcacheclean-cron.sh and make it executable:
 #!/bin/sh
 
 htcacheclean -v -n -p/var/www/cache -l15000000K
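
The numbers used in the cache configuration can be cross-checked with a little shell arithmetic. This is only a sanity-check sketch: the variable names are made up, and the size conversion assumes that the K suffix of htcacheclean means 1024 bytes.

```shell
# Values copied from the httpd.conf section and the cron job.
default_expire=8640000   # CacheDefaultExpire, in seconds
max_expire=10000000      # CacheMaxExpire, in seconds
limit_k=15000000         # htcacheclean -l value, in K
dir_levels=2             # CacheDirLevels
names_per_level=64       # 64 possible names per level when CacheDirLength is 1

default_days=$((default_expire / 86400))
max_days=$((max_expire / 86400))        # integer division, rounds down
limit_gib=$((limit_k / 1024 / 1024))    # assumes K = 1024 bytes, rounds down

# Number of lowest-level cache directories: names_per_level to the power dir_levels.
leaf_dirs=1
i=0
while [ "$i" -lt "$dir_levels" ]; do
    leaf_dirs=$((leaf_dirs * names_per_level))
    i=$((i + 1))
done

echo "CacheDefaultExpire: $default_days days"
echo "CacheMaxExpire:     $max_days full days"
echo "htcacheclean limit: about $limit_gib GiB"
echo "cache directories:  $leaf_dirs"
```

The result matches the comments in the configuration: a default expiry of 100 days, 4096 cache directories, and a size limit slightly below the 15 GB reserved for the cache.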


#comment
TODO: look at CacheMaxExpire

III.4. AII (Automated Installation Infrastructure)

  • Install the basics. There is now a meta-package for the CB-client installation. It installs all AII components plus everything needed for panc and svncheck.
  • Add a file /etc/yum.repos.d/cb-v4-sl5.repo with the following content:
[cb-v4]
name=CB server - client repo - SL5
baseurl=http://quattor.begrid.be/begrid/install/apt/RPMS.cb-v4_i386_sl5/
enabled = 1

[quattor]
name=Quattor repo - SL4
#baseurl=http://quattorsw.web.cern.ch/quattorsw/software/quattor/yum/1.3/i386/RPMS.quattor_sl4
baseurl=http://quattor.begrid.be/begrid/install/apt/RPMS.quattor_i386_sl4/
enabled = 1

[rpmforge]
name = Red Hat Enterprise  - RPMforge.net - dag
#baseurl = http://apt.sw.be/redhat/el5/en//dag
mirrorlist = http://apt.sw.be/redhat/el5/en/mirrors-rpmforge
#mirrorlist = ///etc/yum.repos.d/mirrors-rpmforge
enabled = 1
protect = 0
  • and run:
  yum install cb-client
  • Create a proper /etc/aii-shellfe.conf
    • Either copy /usr/share/doc/aii-1.0.43/eg/aii-shellfe.conf and edit it to provide a correct cdburl
    • Or create one (assuming hostname -f returns a FQDN!)
cat <<EOF >/etc/aii-shellfe.conf
#
# aii-shellfe.conf, created for BeGrid
#
cdburl = https://$(hostname -f):444/profiles
profile_prefix = profile_
cert_file = /etc/sindes/certs/apache.crt 
key_file = /etc/sindes/keys/apache.key
ca_file = /etc/sindes/certs/ca.crt
EOF
cat /etc/aii-shellfe.conf
    • If you are not using SINDES, use these values instead:
cat <<EOF >/etc/aii-shellfe.conf
#
# aii-shellfe.conf, created for BeGrid
#
cdburl = http://$(hostname -f)/profiles
profile_prefix = profile_
EOF
cat /etc/aii-shellfe.conf
  • modify /usr/share/doc/aii-1.0.43/eg/dhcpd.conf and copy it to /etc/dhcpd.conf

Example for IIHE

#
# DHCPD config for AII
#
# Uncomment this line if ISC DHCP ver. 3
ddns-update-style ad-hoc;
# Write your network name here
shared-network iihe.ac.be {
    deny unknown-clients;
    not authoritative;
    # Write your domain name here
    option domain-name "iihe.ac.be";
    # Parameters for the installation via PXE using pxelinux
    filename                           "pxelinux.0";
    # Uncomment this line if ISC DHCP ver. 2
    # option dhcp-class-identifier       "PXEClient";
    # Uncomment this line if ISC DHCP ver. 3
    option vendor-class-identifier       "PXEClient";
    option vendor-encapsulated-options 01:04:00:00:00:00:ff;
    # Complete with (at least) the gateway + DNS.
    # Host entries will be inserted
    # automatically by AII in this section
    subnet 193.190.246.0 netmask 255.255.255.0 {
        option routers 193.190.246.65;
        option domain-name-servers 193.190.246.229;
    }
    # Remove the following subnet if you are not using
    # a private network; otherwise keep it and adapt it
    # to your site
    subnet 192.168.0.0 netmask 255.255.0.0 {
        option routers 192.168.10.100;
        option domain-name-servers 192.168.10.100;
    }
}
  • start the dhcp daemon at boot:
 chkconfig --add dhcpd
 chkconfig --level 345 dhcpd on
  • configure syslinux and tftp-server (the latter uses the hosts.* files for access control):
  mkdir -p /osinstall/nbp/i386_slc3_308
  cd /osinstall/nbp/i386_slc3_308
  wget http://linuxsoft.cern.ch/cern/slc308/i386/images/pxeboot/vmlinuz
  wget http://linuxsoft.cern.ch/cern/slc308/i386/images/pxeboot/initrd.img
  ln -s /osinstall/ks /var/www/html/ks
  • in /etc/xinetd.d/tftp modify the following options (aii-server example in /usr/share/doc/aii-1.0.43/eg/tftp.example)
  server_args             = -s /osinstall/nbp
  disable                 = no
  • restart the corresponding service
  service xinetd restart
  • the default firewall settings of SL5 block tftp traffic (and probably also e.g. traffic to port 444 for SINDES).
  Either configure the firewall properly or disable iptables altogether:
/etc/init.d/iptables stop
chkconfig iptables off
chkconfig --del iptables 
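
For sites that prefer to keep iptables running, the 'configure the firewall properly' option could look roughly like the sketch below. It assumes that TFTP on UDP port 69, plain HTTP on TCP port 80 and the SINDES port 444 on TCP are the only services that need opening, which may not hold for your site. The run and open_port helpers are hypothetical names; by default the commands are only printed, and they are executed only when APPLY=1 is set (as root).

```shell
APPLY=${APPLY:-0}

# Hypothetical dry-run wrapper: print the command unless APPLY=1.
run() {
    if [ "$APPLY" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Insert an ACCEPT rule for one protocol/port pair at the top of the INPUT chain.
open_port() {
    run iptables -I INPUT -p "$1" --dport "$2" -j ACCEPT
}

open_port udp 69    # tftp (PXE boot files)
open_port tcp 80    # http (profiles, ks files, proxied rpms)
open_port tcp 444   # port used in the SINDES cdburl
```

Rules added this way are lost at reboot unless they are saved, for example with service iptables save.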


  • allow acknowledgment script to do its work:
  cp /usr/sbin/aii-installack.cgi /var/www/cgi-bin
  chmod o+rx /var/www/cgi-bin/aii-installack.cgi 
    • get the BEgrid kickstart template from quattor CVS (use the same command to update it!)
  wget -O /usr/lib/aii/osinstall/sl_ks_begrid.conf \
  'http://quattor.begrid.be/trac/centralised-begrid-v4/attachment/wiki/BEgridClient/sl_ks_begrid.conf?format=raw'
    • if you use machine profile names with a FQDN, you must run
  echo "use_fqdn = 1" >> /etc/aii-shellfe.conf
  • add apache to /etc/sudoers (this MUST be done for private interfaces (with a private FQDN) as well!):
  echo "apache  f.q.d.n=(ALL)     NOPASSWD: /usr/sbin/aii-shellfe" >> /etc/sudoers
  Also comment out the following line in /etc/sudoers:
Defaults    requiretty
#comment
  • add the following to /usr/lib/perl/NCM/Template.pm (before $data .= $rep_url;):

                 ## add spma-proxy support
                 my $ppath="/software/components/spma/proxy";
                 if ($cfg->elementExists($ppath) && $cfg->getValue($ppath) eq "yes") {
                     $ppath="/software/components/spma/proxyhost";
                     if ($cfg->elementExists($ppath)) {
                         my $hhost=$cfg->getValue($ppath);
                         ## assume http if proxy is used?
                         if ($rep_url =~ m/http:\/\/(.*?)\//) {
                             $rep_url =~ s/$1/$hhost/;
                         } else {
                             $self->error("SPMA proxy is set, but no http access protocol to repository: $rep_url?");
                         }
                     } else {
                         $self->error("SPMA proxy is set, but no host defined?");
                     }
                 }

III.5. svncheck

  • This tool depends on pysvn. The default tarball contains a version of pysvn built against the subversion 1.4.2 i386 that comes with SL50.
  • get the client-script tarball
  wget http://quattor.begrid.be/begrid/install/cb-v4-client.tar.gz
  • extract it somewhere on the machine. e.g. /opt
  tar xzfv cb-v4-client.tar.gz
  • The Centralised-begrid (/opt/cb) folder has the following structure:
    • /opt/cb/keys: this one holds the begrid CA certificate and a valid user .p12 file. (This is used to connect to the SCDB-server.)
    • /opt/cb/subversion: some subversion specific parameters. edit the servers file:
      • correct full path to key (.p12 file)
      • plaintext passwd for the key (it does not prompt for the passwd)
    • /opt/cb/tmp: will contain the checkout and build files.
    • /opt/cb/private: here you can put private files (such as the passwords and certificates used in the templates: passwd.tpl, pub_key.tpl).
      • svncheck handles this by a simple copy from this directory into cfg/clusters, so keep that structure.
      • Remove the template cluster given as an example, otherwise runcheck will try to build it later.
      • /opt/cb/private/<clustername-glite-version>/passwd.tpl
          • This file contains the passwords that will be used for your site.
          • You can pick any password you like.
          • (Unless certain nodes are not configured with Quattor; in that case the passwords must match those of the non-Quattor nodes.)
      • /opt/cb/private/<clustername-glite-version>/local_users.tpl
          • ???
          • Not needed for a CE or a WN.
      • /opt/cb/private/<clustername-glite-version>/pub_key.tpl
      • /opt/cb/private/<clustername-glite-version>/<your_ce_fqdn>.tpl
    • /opt/cb/svncheck: this is the code written by Jean-François Roche (jfroche@jfroche.be):
      • in config.conf you can specify most needed parameters.
      • svn_repos: point it to the trunk of the centralised-begrid repository. building tags relies on this!
      • cluster_regexp: a regexp to build only these clusters (not used ATM)
      • modify the after_args option in script1 to compile with the correct task.
  after_args = compile.profiles.iihe-glite-test
      • DON'T FORGET to change the email section
  • ./runcheck -h for more info
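
The 'simple copy' of the private directory into cfg/clusters can be pictured with the following sketch. It is not svncheck's actual code: it only assumes a plain recursive copy that preserves relative paths, uses scratch directories instead of /opt/cb, and the cluster name example-glite-3.1 is made up.

```shell
# Scratch stand-ins for /opt/cb/private and the checkout's cfg/clusters.
work=$(mktemp -d)
private_dir="$work/private"
clusters_dir="$work/checkout/cfg/clusters"
cluster="example-glite-3.1"   # hypothetical cluster name

# Empty placeholder files in the private tree, and an existing cluster directory.
mkdir -p "$private_dir/$cluster" "$clusters_dir/$cluster"
: > "$private_dir/$cluster/passwd.tpl"
: > "$private_dir/$cluster/pub_key.tpl"

# Overlay: copy every file from the private tree to the same relative path.
cp -R "$private_dir/." "$clusters_dir/"

# List what ended up under cfg/clusters, then clean up.
copied=$(cd "$clusters_dir" && find . -type f | sort)
echo "$copied"
rm -rf "$work"
```

Because files are matched by relative path only, a private directory whose layout differs from the cluster layout would not overwrite the dummy files it is meant to replace.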

III.6. Optional : Setup access to BEgrid SWREP repository

All software on the nodes will be installed from the BEgrid Centralised Repository. You can optionally ask for permission and configure your system so that you can add new packages to this repository yourself. This is needed, for example, if you use SINDES.

  • send your DN to the contact people and ask for access to the SWREP repository
  • configure and install additional rpms that will give you write access to the central rpm repository
yum install swrep-soap-client cdb-soap-auth-x509
mkdir -p /etc/swrep
  • authentication is done using your BEgrid certificate that you will also need for svncheck
    • for the ca-file, cert-file and key-file fields you can use the same values as for svncheck. They only need to be in .pem format. To convert the p12 to pem, use:
openssl pkcs12 -nocerts -in /path/to/p12 -out /path/to/key.pem
openssl pkcs12 -nokeys -clcerts -in /path/to/p12 -out /path/to/cert.pem
  • make the following configuration file /etc/swrep/swrep-soap-client.conf
server = quattor.begrid.be
use-cert = 1
ca-file = /path/to/ca.crt
cert-file = /path/to/cert.pem
key-file = /path/to/key.pem
  • you are now ready to use swrep-soap-client command
    • HOWTO
    • pull: add rpm from a remote website
swrep-soap-client pull x86_64_dcache_sl4 /dcache/17 http://www.dcache.org/downloads/1.7.0/sl4/x86_64/dcache-client-1.7.0-39.x86_64.rpm
    • put: add rpm using a local file
swrep-soap-client put noarch_sindes /sindes /usr/src/redhat/RPMS/noarch/SINDES-ca-certificate-q3-0.1-2.noarch.rpm
    • tips
    • to avoid the 'Enter user-name' question, do export SWREP_USER=<user>
    • when adding lots of rpms, typing the passphrase for the key each time is not very user-friendly. In that case, create a temporary key file that has no passphrase and use that unencrypted key to connect. Don't forget to delete it afterwards!
openssl rsa -in /path/to/protected-key.pem -out ~/tmp-key
swrep-soap-client --use-key ~/tmp-key <command>
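
To make sure the unencrypted key really is deleted afterwards, the temporary key can live in a private scratch directory that is removed by a shell trap. The sketch below is self-contained for illustration only: it generates its own throwaway key with the made-up passphrase demo, whereas in real use you would point openssl rsa at your own protected key and type the passphrase when prompted.

```shell
umask 077                    # new files readable by the current user only
workdir=$(mktemp -d)
cleanup() { rm -rf "$workdir"; }
trap cleanup EXIT            # remove the scratch directory when the shell exits

# Throwaway passphrase-protected key, generated only so the sketch runs on its own.
openssl genrsa -aes128 -passout pass:demo -out "$workdir/protected-key.pem" 2048 2>/dev/null

# Strip the passphrase (-passin is used here only to avoid the prompt).
tmp_key="$workdir/tmp-key"
openssl rsa -in "$workdir/protected-key.pem" -passin pass:demo -out "$tmp_key" 2>/dev/null

# Placeholder for the real work, e.g. repeated swrep-soap-client calls with "$tmp_key".
if grep -q 'ENCRYPTED' "$tmp_key"; then
    key_state=encrypted
else
    key_state=unencrypted
fi
echo "temporary key is $key_state"
```

With this pattern the key disappears even if one of the upload commands fails halfway through.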

III.7. Install SINDES

Follow the instructions here. Notice that this step is now mandatory!

(You might want to read the presentation on SINDES to find out what SINDES is.)

Issues

Download problems

  • If you have problems with retrieving files
    • during the installation: images, rpms
    • when running spma
    • rpms that seem to fail with IO errors
  • the main cause is probably the less than excellent http caching provided by Apache
    • it could also be because there was a network error during the caching
  • the easiest fix is to clean up the cache
    • be aware that the next run will be slower because all data has to be cached again
/etc/init.d/httpd stop
rm -Rf /var/www/cache/*
/etc/init.d/httpd start
