The Final Wish - To Manipulate Data

This section can be separated into three distinct areas

mathematical functions
workspace operations
data analysis functions

Mathematical functions

As well as providing intrinsic arithmetic operations for different data types Open GENIE provides several basic functions for performing mathematical operations. These are coded generically so that the same function can be applied to any data type which is capable of undergoing the operation. For example, the Sin() function may be used on a single number, an array or a workspace. Normally, the result of the operation is of the same data type as the value operated upon. Where the operand contains several values (e.g. an array), each value is operated on individually. For example we can create an array of real numbers, and square root them

>> my_numbers = Dimensions(10,20) # First create a 10 x 20 array
>> fill my_numbers 1.0 1.0 # Fill the array with some data
>> printn my_numbers
[1.0 2.0 3.0 4.0 5.0 ...] Array(10 20 )
>> printn sqrt(my_numbers) # print the square roots.
[1.0 1.414213 1.732050 2.0 2.236067 ...] Array(10 20 )

The basic mathematical functions are

trigonometric functions - with the angle in radians

Arccos(), Arcsin(), Arctan(),Cos(), Sin(), Tan(). For example,

# print result in degrees
>> y = Arccos(0.5) * 180 / $pi
# take the arccos of all elements in an array
>> a = Dimensions(10) # create a 10 element array
>> fill a 0.5 # Set all elements to 0.5
>> y = Arccos(a)

transcendental functions

Exp(), Ln(), Log(). For example,

# antilog a data array
# which is in logs to base 10
>> data = Exp(w.y*Ln(10.0))

Other functions

Abs(), Sqrt(). For example:

# Calculate the square root of Pi
>> printn Sqrt($PI)
1.77245310234149

Workspace operations

Workspace operations and transformations form the heart of Open GENIE. These allow analysis of the data taking into account its underlying form. For example, the Units() command can be used to convert the units of a time-of-flight (TOF) spectrum. This is only possible if some assumptions are made about the fields which are present in the workspaces containing the experimental data. For the Units() command to work, a workspace also requires fields giving parameters such as the primary flight path and incident angle (a TOF spectrum).

For example for a 2-D TOF Spectrum needs

Workspace Field Description Variable type
X(1-D) array of X values RealArray
Y(2-D) 2-D array of Y values RealArray
E(2-D) 2-D array of errors for Y field RealArray
L1 primary flight path (m) Real
L2(1-D) secondary flight path (m) Real array
Twotheta(1-D) scattering angle (degrees) Real array
Delta hold off in microseconds Real
Emode energy mode Integer 0=inelastic, 1=incident, 2=transmitted
Efixed fixed energy (if applicable) Real
Xlabel Units for X values String
Ylabel Units for Y values String
Ut(1-D) User parameters (this array may be of any length and caters for information not already named in a field) RealArray

 

For more on the necessary fields required for operations then see the Open GENIE reference manual.

Template routines are called whenever an intrinsic operation involving workspaces is specified. In most cases the errors are propagated through the calculation. The template routines on workspaces are:

Unary operations

Trigonometric and transcendental functions. For example

w2=ln(w1) is equivalent to w2=workspace_ln(w1)

and

Workspace_arccos(w) is equivalent to acos(w),

Other unary functions

Workspace_sqrt, sqrt(w); Workspace_abs, |w|; and, Workspace_negated, -w.

Binary Operations

Workspace_add(), w1+w2; Workspace_append (joins one end of one histogram to another), w1&w2, Workspace_divide, w1/w2; Workspace_raised_to, w1^w2; Workspace_subtract, w1-w2; Workspace_modulo, w1|w2; Workspace_multiply, w1*w2.

The characteristics (i.e. final length, dimensionality of arrays) of the left hand workspace define those of the resultant workspace.

Comparison Operations

Workspace_equal, w1=w2;
Workspace_not_equal, w1!=w2.

The definition of the default routines are stored in the file "workspace_user.gcl", which is included in the library directory of the Open GENIE distribution.

Data Analysis Functions

Arguably the most important purpose of Open GENIE is as a tool for scientific data analysis. Not just for preliminary data analysis, but as a framework which provides, or from which can be called, all the tools necessary to perform a complete analysis of the data.

Currently, a large amount of data analysis is performed with a set of disparate FORTRAN programs which have to support their own routines to access data, to drive a user interface and to work with a variety of different graphics packages. What is critical and is very much at the heart of the philosophy of Open GENIE is that the scientist must have control over is how the data is processed.

A scientist may trust the Open GENIE peak fitting routines, alternatively and probably more the case at the moment, the scientists will want to continue to use their own analysis code in a C or FORTRAN subroutine for most things. Open GENIE can be used to provide the user interface, graphics and file access. This mechanism is described fully in the next section of this manual on modules, and these allow the code to be run as if it was a compiled part of Open GENIE. Routines which have been found to be useful in several different areas of analysis are hard coded in C/C++ or FORTRAN for efficiency.

The current routines available are:

Focus()

Focus a range of time of flight spectra from detectors at different scattering angles with given parameters. This command, although written specifically for focusing multiple banks of detectors consecutively can be used for single scanning detectors data. To focus the spectra in d-spacing (with the :D switch), or in momentum transfer (with the :Q switch) the time of flight parameters are required, which can be specified from a file (the default are the values from the raw file). Note: the default is the :D switch where spectra are summed in d-space. For example:

# Focus a time of flight spectrum in D-spacing
# get the TOF parameters from the raw data file
>>w = focus( 1:17, "irs12838.raw" )
# Focus default file, every third period
>> assign 3045
>> refl = focus(3:90@3, _, mydet)

Integrate()

Integrate a 1-D or 2-D spectra. For example:

# print the integrals of two spectra
>> printn integrate(1:2, 38000, 78000, "tfx00345.raw")
INTEGRATE: Integrating between 38000 and 7800
Workspace []
(
error = [2341.4700 1027.1312 ] Array(2 )
sum = [5482585.0 1054968.0 ] Array(2 )
)

The result of the integrate command is returned as a workspace with two fields: the ‘SUM’ field contains either a single value or an array of integrals, and the ‘ERROR’ field contains the propagated error of the result.

Peak()

Provides interactive peak fitting of a workspace allowing selection of the bounds of the peak and the form of fit to use. The available fits for use are:

  1. Gaussian - a Gaussian-peak sitting with a straight-line background,
  2. Gaussian convoluted with an exponential - a Gaussian convoluted with a sharp-edged exponential, sitting on a straight-line background,
  3. Lorentzian - a Lorentzian-peak, sitting on a straight-line background,
  4. Lorentzian convoluted with exponential - a Lorentzian convoluted with a sharp-edged exponential, sitting on a straight-line background,
  5. Voigt - a Voigt convoluted with a sharp-edged exponential, sitting on a straight-line background,
  6. Polynomial - fits an nth degree polynomial,
  7. Voigt convoluted with an exponential - a Voigt convoluted with an exponential.

Once fitted the peak parameters of the fit are displayed and made available as a result, so that they can be used later. For example:

# Fit peaks from a genie intermediate file
>> fit = peak(get(3,"mydata.in3"))
>> printn fit

Note: See the Peakfit() and Peakgen() in the Open GENIE reference manual for more control and detail.

Rebin()

The rebin command performs an interpolation from one histogram into another histogram with the same integrated count, but with modified X-boundaries. Some numerical precision is lost during a re-binning operation and there is usually a degree of peak broadening, so multiple re-binning is to be avoided. The /Lin and /Log qualifiers allow a Linear or Logarithmic rebinning between the specified boundaries to be performed. For linear rebinning the boundaries are specified in steps, for logarithmic bin widths these are calculated such that for any given binning range

Xn+1 - Xn = Step * Xn

For example, given the command

>> w = rebin:log(w, 10.0, 0.1, 15.0)

Here the step is 0.1, and the range of the rebinning is between 10 and 15. The bin boundaries generated will be [10, 11, 12.1, 13.31, 14.641, 15].

A workspace can also be rebinned according to another workspace. For example:

# Read a spectrum and re-bin the same as an
# already loaded vanadium workspace
>> w = s(1)
>> w = rebin(w, vanadium.x)

Units()

This command converts the units of time-of-flight spectra. Using the switches they can be converted respectively to:

/C or /Channel to Channel numbers (one way)
/D   To D-Spacing (A)
/E   To Energy (meV)
/LAM   To Wavelength (Å)
/Q   To Momentum Transfer (Å-1)
/SQ   To (Momentum Transfer)2 (A-2)
/T   To Time of Flight (ms)
/LA1   To primary flight path wavelength
/W   To energy transfer E1-E2 (meV)
/WN   To energy transfer E1-E2 (cm-1)
/TAU   To reciprocal time-of-flight (ms/m)

For example:

# Read in a spectrum whilst converting to Wavelength
>> w = Units:Lam(s(1))

The t.o.f parameters must be set correctly in the input workspace (or set using the Set/Par command) in order to perform the conversion, and if the keyword syntax is used, i.e. Units/C w the workspace will be converted destructively.

The Open GENIE Module Interface

This section will answer the following questions:

What is a module?
When would I want to use modules?
How do I write a module
How do I load and run a module
Where can I find out about other (pre-compiled) modules

What is a module?

A module is a compiled FORTRAN program that can be loaded into, and become part of, a running Open GENIE process. Subroutines in the module, provided they follow a given set of rules, can be called to manipulate arbitrary Open GENIE variables. A module is usually implemented as a shared library or dynamic linked library.

When would I want to use modules?

Modules are used when:

A numerically intensive task must be performed for which GCL would be too slow and for which there is no in-built Open GENIE command
You already have a working FORTRAN program for the task and do not wish to rewrite the application in GCL
You wish to use Open GENIE as an ‘add on’ to another program to provide, for example, the input to the program, or to display the output.

How do I write a module?

A module consists of subroutines only - there is NO main program. These subroutines can be divided into two types: those that perform the actual calculations, and those that pass data to/from Open GENIE and the calculation routines; the latter called "wrappers" as they provide a jacket for the original calculation routines.

1. Write FORTRAN Subroutines to do your tasks

The subroutines can take any parameters you desire. Our example will be based on the subroutine "myfunc.for" in the examples directory:

2. Write a FORTRAN "wrapper" subroutine to interface between Open GENIE and your calculation subroutine

Open GENIE variables are passed to and from the module using a temporary workspace. Data is placed into the input workspace with chosen field names, and then accessed in FORTRAN by using this field name and an access routine appropriate for the data type being sent. For example, the frame count could be stored in the variable "NRAW" and accessed using the module_get_int() function in FORTRAN. Returning data is similar - the various module_put() FORTRAN subroutines allow the construction of a returned workspace containing various data types with named variables. The wrapper subroutine must obey the following rules:

It must be a SUBROUTINE taking just two parameters, called PARS_GET and PARS_PUT. These parameters should be declared EXTERNAL and not be accessed in anyway, except via "module_..." commands.
It must allocate the variables needed by the user’s SUBROUTINE and transfer data to/from them using "module_get_..." and "module_put_.." calls.
As input/output devices are re-assigned by Open Genie, READ(5,*) commands should be avoided and screen output should be handled via the "module_print" and "module_information" commands. Inside the module Open GENIE variables are accessed by workspace field names. For example, in the program below the array X is loaded from the field labelled ‘XVALS’, where it has been placed by a GCL procedure. There is no restriction on how you label a variable, but often you will use the variable name itself. All "module_get_.." routines must specify Pars_Get as the first parameter, and similarly all "module_put..." routines specify Pars_Put (these variables are, in fact, just references to the input/output parameter workspaces of the Module/Execute command.

For a full list of module_.() functions see the FORTRAN module interface in the Open GENIE reference manual.

The wrapper for "myfunc" is here.

3. Make the Module Library

You need to compile the file containing the function and its associated wrapper using the Module/Compile command - this will produce a ".so" file to load into Open GENIE.

>> Module/Compile "myfunc.for" Symbols = "do_myfunc"

This command first runs a FORTRAN compiler on the code, then it produces a shared library from the object code. If the compilation fails for any reason you will see various error messages on the screen and the creation of the shared library will be aborted. On a successful run, you will see a message similar to the following:

** Module now compiled - load with: Module/Load "myfunc.so"

The Symbols parameter must be set to a comma separated list of wrapper subroutines that you wish to call from Open GENIE (via Module/Execute).

4. Load the module into Open GENIE

To load the module, use Module/Load command with the name of the ".so" file created by the Module/Compile. You can also supply a comment that will be displayed when you later type Module/List.

>> Module/List "myfunc.so" "My function module!"

You only need to reload a module if you change the source code and have executed another Module/Compile command.

5. Package function arguments into a workspace

You now need to create a workspace to hold the parameters for the Subroutine you wish to call in the module. The fields in the workspace must have the same names as those you specified in the second argument of your "module_get..." calls above. In our case, we only have one variable and we called it "XVALS", so assuming X is the array we wish to send:

>> X = Dimensions(10) # create a new array
>> Fill X 1.0 1.0 # Fill array with numbers 1.0 to 10.0
>> Pars = Fields() #create an empty workspace for module arguments
>> Pars.Xvals = X # add X to workspace as field "Xvals"

6. Execute the module and get the result

This is achieved by:

>> MY_RESULT = Module:Execute("my_func", Pars)

where "my_func" is the name of the FORTRAN wrapper routine you wrote, and Pars is the packaged arguments workspace created above. The values specified in "module_put.." calls will appear in the returned workspace MY_RESULT; in the above case MY_RESULT will contain one entry called YVALS, so we can type:

>> Printn MY_RESULT.YVALS

and see squared X values!

A Real-Life Sample Module

The following module was created by Spencer Howells at ISIS. The GCL procedure is available here as g2s.gcl and the FORTRAN program that performs the actual calculation is g2s.for.

The C Module Interface

The C module interface is used in almost exactly the same way as the FORTRAN interface, and provides similar routines - in fact most have the same name, but with a prefix of cmodule_ rather than module_. Invoking from the GCL command line is similar, except that an extra qualifier of /C must be specified for COMPILE and EXECUTE operations (it could also be specified on a load for consistency, but is not strictly necessary). For example,

>> Module/Compile/C "myprog.c"
>> Module/Load "myprog.so"
>> VALS=Module:Execute:C("myfunc", PARS)

The major difference between the two is in the arguments - the C module routines can allocate memory for returned array structures using malloc(), unlike in FORTRAN77 where a pre-allocated array must always be passed. To indicate that you wish a routine to allocate the memory for you, set the array pointer to NULL before passing the address of it to the routine. For example:

GenieWorkspace *pars_get, *pars_put; /* passed from GENIE */
const char* name = "twotheta"; /* parameter to get */
fort_real* val_array =NULL; /*tell routine to malloc() memory*/
fort_int len;
/* then get array the array*/ 
cmodule_get_real_array(pars_get, name, &val_array, &len);

/* do some manipulations and return new values */
cmodule_put_real_array(pars_put, name, val_array, len);
free(val_array); /* release storage */

If a pointer is not NULL, the routine will assume it points to valid storage of size "len" - on return, "len" will be set to the number of elements of the array actually set, or 0 on an error. A C module must contain the following header files

#include <genie_cmodule.h>
#include <genie_cmodule_ver.h>

to set up the correct definitions and typedefs.

The MODULE/COMPILE/C Open GENIE command will insert the correct path for picking up these header files, but they are contained in the $GENIE_DIR/library directory if you wish to peruse them

The C module function must be declared as shown in the file genie_cmodule.h, and this also shows how the variables are obtained and set by calling the respective functions, with either pars_get or pars_put and the variable name.

An example of a program to illustrate calling a module from C is here c_example.c, and the corresponding example of a GCL procedure to call the C module is here c_example.gcl.