5.1 Basic run options

5.1.1 Continuation run

In some instances, REML estimation continuing from the ‘best’ estimates so far is necessary. This is invoked with the -c option.
If specified, WOMBAT will attempt to read its ‘starting’ values from the file BestPoint in the current working directory. N.B. If this is not available or if a READ error occurs, the run proceeds as a ‘new’ run, i.e. using the starting values given in the parameter file instead. The -c option selects estimation using the AI algorithm, unless another procedure is selected explicitly.


Table 5.1: ‘Basic’ run options for WOMBAT
Option

Purpose

-c

Specify a continuation run

-v

Specify verbose screen output

-d

Specify very detailed screen output

-t

Specify terse screen output

Default REML algorithms
--good

Tell that you have good starting values

--bad

Tell that you have bad starting values

Non-estimation runs
--setup

Select a run performing the set-up steps only

--best

Select a run printing out estimates for the currently best point only

--blup

Select a prediction (BLUP) run; direct solution

--solvit

Select a prediction (BLUP) run; solve iteratively

--mmeout

Like --solvit, but write MME out to file.

--simul

Select a run simulating data only

--subset

Write out parameter files for analyses of subsets of traits

--itsum

Iterative summation of partial covariance matrices

--invert

Invert a dense symmetric matrix; pivot on largest diagonal

--invrev

Invert a dense symmetric matrix; reverse pivoting on no. of off-diagonal elements

--inveig

Invert a dense symmetric matrix; use eigen decomposition

--invspa

Invert a sparse symmetric matrix

Miscellaneous
--expiry

Print out the expiry date for the program (to screen)

--limits

Print out the hard-coded program limits (to screen)

--times

Print out various intermediate times for a run

--wide

Write ‘wide’ output files

5.1.2 Level of screen output

The amount of screen output can be regulated by the following options
-t selects terse output
-v selects verbose output, useful to check the parameter file
-d selects detailed output, useful for debugging

5.1.3 Set-up steps

It is good practice, to start each analysis with a run which carries out the set-up steps only, checking that the summary information provided on the model of analysis and data structure in file SumModel.out (see 7.1.2) is as expected, i.e. that WOMBAT is fitting the correct model and has read the data file correctly. This is specified using --setup. This option also invokes verbose screen output.

5.1.4 Quality of starting values

For analyses comprising 18 or less covariance components to be estimated, WOMBAT defaults to the AI algorithm for maximisation. For other analyses, the default is for WOMBAT to begin with up to 3 iterates of the PX-EM algorithm, before switching to an AI algorithm. For reduced rank estimation, every tenth AI iterate is subsequently replaced by a PX-EM step. Two options are provided to modify this according to our perception of the quality of starting values for covariance components available, without the need to consider the ‘advanced’ options below.

If we are confident, that we have good starting values, this might cause an unnecessary computational overhead. Specifying --good reduces the maximum number of (PX-)EM iterates to 1. Similarly, for potentially bad starting values, estimation might converge more reliably if a few more initial (PX-)EM iterates were carried out. Specifying --bad sets this number to 8. If the increase in log likelihood during the initial (PX-)EM iterates is less than 2.0  , WOMBAT will switch to the AI algorithms immediately.

5.1.5 Intermediate results

WOMBAT writes out the currently best values of estimates of covariance components to the file BestPoint whenever the likelihood is increased. The option --best causes WOMBAT to read this file and write out the matrices of covariances and corresponding correlations in more readily legible format to the file BestSoFar.out. WOMBAT with this option can be used in a directory in which an estimation run is currently active, without interfering with any of the files used.

5.1.6 Prediction only

WOMBAT can be used for a simple BLUP run, using the ‘starting’ values given in the parameter files as assumed values for the true covariances. In this mode, no pruning of pedigrees is carried out. If -c is specified in addition, WOMBAT will try to read the values from the file BestPoint instead – this is useful to obtain ‘backsolutions’ after convergence of an estimation run. Solutions can be obtained either directly or iteratively:

5.1.7 Simulation only

Specifying --simul causes WOMBAT to sample values for all random effects fitted for the data and pedigree structure given by the respective files, from a multi-variate normal distribution with a mean of zero and covariance matrix as specified in the parameter file. Again -c can be used to acquire population values from the file BestPoint instead. No fixed effects are simulated, but the overall (raw) mean for each trait found in the data set is added to the respective records. Optionally, --simul can be followed directly (i.e. no spaces) by an integer number n  in the range of 1 to 999. If given, WOMBAT will generate n  simulated data sets (default n = 1  ).

Simulation uses two INTEGER values as seeds to initialise the pseudo-random number generator. These can be specified explicitly in a file RandomSeeds (see 6.4.4.3). Output file(s) have the standard name(s) SimData001.dat, SimData002.dat, ...  , SimDatan   .dat. These have the same layout as the original data file, with the trait values replaced by the simulated records; see 7.2.5.

This option is not available for models involving covariance option GIN (see 4.6.1.2).

5.1.8 Matrix inversion only

WOMBAT can be used as a ‘stand-alone’ program to invert a real, symmetric matrix. Both a ‘full’ inverse and a ‘sparse’ inverse are accommodated. ‘Full’ inversion of a dense matrix is limited to relatively small matrices - use run time option --limit to find the maximum size allowed in WOMBAT.

The matrix is expected to be supplied in a file, with one record per non-zero element in the upper triangle. Each record has to contain three space separated items: row number, column number and the matrix entry. There are several different ‘modes’:

1.
Using  --invert filename selects ‘standard’ generalised matrix inversion with pivoting on the largest diagonal; if the matrix if not of full rank, rows and columns corresponding to pivots with absolute value less than the operational zero are set to 0.
2.
Option --invrev filename is similar, but inversion is carried  out processing rows and columns in ascending order of the number of off-diagonal elements in the original matrix. The rationale for this is that the computational effort required is proportional to the square of the number of off-diagonal elements. Hence, this option tends to provide faster inversion, but is more succeptible to numerical instabilities than --invert.
3.
Option --inveig filename first performs an eigen-value and -vector decomposition of the matrix. If eigenvalues less than a specified value (default: 0 ) are found, these are set to the minimum, and a modified matrix is written out to a file filename.new before obtaining the generalised inverse from the inverse of the diagonal matrix of eigenvalues (ignoring any 0 entries) and the matrix of eigenvectors. For a matrix not of full rank and the default minimum of 0, this yield a generalised inverse of the same rank as option --invert, but the generalised inverse is different. A value different from 0 can be selected by appending it to the option (no spaces).

EXAMPLE:

      wombat -v inveig0.0001 matrix.dat

specifies inversion of the matrix stored in matrix.dat, obtaining a generalised inverse by setting any eigenvalues less than 0.0001  to this value, prior to inversion.

This option should only be used for relatively small matrices.

HINT: Use this feature (with a minimum eigenvalue > 0) to modify an ‘invalid’ matrix of starting values for multivariate analyses.

4.
Option --invspa filename chooses sparse matrix  inversion. This is suitable for the inversion of large, sparse matrices where selected elements of the inverse are sufficient. Calculated and written out are all elements of the inverse corresponding to the non-zero elements in the Cholesky factor of the original matrix; any other elements of the inverse are ignored. The methodology is the same as that used in the Expectation-Maximisation steps in REML estimation. The relevant options to select the ordering strategy etc. (see 5.2.1) are recognised (except for --metisall), but must be given on the command line before --invspa.

The inverse is written out to filename.inv with one row per non-zero element, containing row number, column number and the element of the matrix (space separated).

EXAMPLE:

      wombat -v –-amd –-invspa matrix.dat

specifies sparse matrix inversion of the matrix stored in matrix.dat, using approximate minimum degree ordering and requesting ‘verbose’ output. The output file containing the inverse is matrix.dat.inv.

5.1.9 Analyses of subsets of traits

For multivariate problems involving more than a few traits, it is often desirable to carry out analyses considering subsets of traits and, subsequently, combine the resulting estimates. This may be a preliminary step to a higher-dimensional multivariate analysis to obtain ‘good’ starting values. Alternatively, analyses for different subsets of traits may involve different data sets – selected to maximise the amount of information available to estimate specific covariances – and we may simply want to obtain pooled, possibly weighted estimates of the covariance matrices for all traits which utilise results from all partial analyses and are within the parameter space. WOMBAT provides two run time options to assist with these tasks :

Option --subsetn  is aimed at making preliminary analyses less tedious. It will cause WOMBAT to read the parameter file for a multivariate analysis involving q  traits and write out the parameter files for all q  uni- (n = 1  ) or q(q− 1)∕2  bi-variate (n = 2  ) analyses possible. These are based on the data (and pedigree) file for the ‘full’ multivariate analysis, using the facility for automatic renumbering of traits and subset selection, i.e. no edits of these parameter files are necessary.

N.B.: This option does not carry through information from a SPECIAL block.

On analysis, encountering the syntax for trait renumbering (see 4.6.2) causes WOMBAT to write out an additional file with the estimates for the partial analysis (Standard name EstimSubsetn + ...+ m  .dat with n  to m  the trait numbers in the partial analysis, e.g. EstimSubset2+7.dat; see 7.2.6). In addition, the name of this output file is added to a file called SubSetsList.

HINT: WOMBAT will add a line to SubSetsList on each run – this may cause redundant entries. Inspect & edit if necessary before proceeding to combining estimates !

Option --itsum selects a run to combine estimates from partial analyses, using the ’iterative summing of expanded part matrices’ approach of Mäntysaari [14] (see also Koivula et al. [12]), modified to allow for differential weighing of individual analyses. For this run, a file SubSetsList is assumed to exist and list the names of files containing results from analyses of subsets, and, optionally, the weightings to be applied (see 7.3.7). Pooled covariance matrices are written to a file name PDMatrix.dat as well as a file named PDBestPoint (see 7.2.7).

HINT: Use of --itsum is not limited to combining bi-variate analyses or the use of files with standard names (EstimSubsetn +...+ m  .dat), but all input files must have the form as generated by WOMBAT.

To use --itsum to combine estimates from analyses involving different data sets, be sure to a) number the traits in individual analyses appropriately (i.e. 1,...,q  with q  the total number of traits, not the number of traits in a partial analysis), and b) to use the syntax described in 4.6.2 to renumber traits – this will ‘switch on’ the output of subset results files.

Copy PDBestPoint to BestPoint and run WOMBAT with option --best to obtain a listing with values of correlations and variance ratios for the pooled results.

These options are not available for analyses involving random regression models, correlated random effects or permanent environmental effects fitted as part of the residuals.

5.1.10 Miscellaneous

Option --expiry will print the expiry date for your copy of WOMBAT to the screen.

Option --limits can be used to  find at the upper limits imposed on analyses feasible in WOMBAT, as ‘hard-coded’ in the program. N.B. Often, these are larger than your computing environment (memory available) allows.

Option --times causes WOMBAT to  print out values for the CPU time used in intermediate steps.

Option --wide will generate formatted output  files which are wider than 80 columns.