The Final Wish - To Manipulate Data
This section can be separated into three distinct areas
- mathematical functions
- workspace operations
- data analysis functions
As well as providing intrinsic arithmetic operations for different data types Open GENIE provides several basic functions for performing mathematical operations. These are coded generically so that the same function can be applied to any data type which is capable of undergoing the operation. For example, the Sin() function may be used on a single number, an array or a workspace. Normally, the result of the operation is of the same data type as the value operated upon. Where the operand contains several values (e.g. an array), each value is operated on individually. For example we can create an array of real numbers, and square root them
>> my_numbers = Dimensions(10,20) # First create a 10 x 20 array
>> fill my_numbers 1.0 1.0 # Fill the array with some data
>> printn my_numbers
[1.0 2.0 3.0 4.0 5.0 ...] Array(10 20 )
>> printn sqrt(my_numbers) # print the square roots.
[1.0 1.414213 1.732050 2.0 2.236067 ...] Array(10 20 )
The basic mathematical functions are
Arccos(), Arcsin(), Arctan(),Cos(), Sin(), Tan(). For example,
# print result in degrees
>> y = Arccos(0.5) * 180 / $pi
# take the arccos of all elements in an array
>> a = Dimensions(10) # create a 10 element array
>> fill a 0.5 # Set all elements to 0.5
>> y = Arccos(a)
Exp(), Ln(), Log(). For example,
# antilog a data array
# which is in logs to base 10
>> data = Exp(w.y*Ln(10.0))
Abs(), Sqrt(). For example:
# Calculate the square root of Pi
>> printn Sqrt($PI)
1.77245310234149
Workspace operations and transformations form the heart of Open GENIE. These allow analysis of the data taking into account its underlying form. For example, the Units() command can be used to convert the units of a time-of-flight (TOF) spectrum. This is only possible if some assumptions are made about the fields which are present in the workspaces containing the experimental data. For the Units() command to work, a workspace also requires fields giving parameters such as the primary flight path and incident angle (a TOF spectrum).
For example for a 2-D TOF Spectrum needs
Workspace Field | Description | Variable type |
X(1-D) | array of X values | RealArray |
Y(2-D) | 2-D array of Y values | RealArray |
E(2-D) | 2-D array of errors for Y field | RealArray |
L1 | primary flight path (m) | Real |
L2(1-D) | secondary flight path (m) | Real array |
Twotheta(1-D) | scattering angle (degrees) | Real array |
Delta | hold off in microseconds | Real |
Emode | energy mode | Integer 0=inelastic, 1=incident, 2=transmitted |
Efixed | fixed energy (if applicable) | Real |
Xlabel | Units for X values | String |
Ylabel | Units for Y values | String |
Ut(1-D) | User parameters (this array may be of any length and caters for information not already named in a field) | RealArray |
For more on the necessary fields required for operations then see the Open GENIE reference manual.
Template routines are called whenever an intrinsic operation involving workspaces is specified. In most cases the errors are propagated through the calculation. The template routines on workspaces are:
Trigonometric and transcendental functions. For example
w2=ln(w1) is equivalent to w2=workspace_ln(w1)
and
Workspace_arccos(w) is equivalent to acos(w),
Other unary functions
Workspace_sqrt, sqrt(w); Workspace_abs, |w|; and, Workspace_negated, -w.
Workspace_add(), w1+w2; Workspace_append (joins one end of one histogram to another), w1&w2, Workspace_divide, w1/w2; Workspace_raised_to, w1^w2; Workspace_subtract, w1-w2; Workspace_modulo, w1|w2; Workspace_multiply, w1*w2.
The characteristics (i.e. final length, dimensionality of arrays) of the left hand workspace define those of the resultant workspace.
Workspace_equal, w1=w2;
Workspace_not_equal, w1!=w2.
The definition of the default routines are stored in the file "workspace_user.gcl", which is included in the library directory of the Open GENIE distribution.
Arguably the most important purpose of Open GENIE is as a tool for scientific data analysis. Not just for preliminary data analysis, but as a framework which provides, or from which can be called, all the tools necessary to perform a complete analysis of the data.
Currently, a large amount of data analysis is performed with a set of disparate FORTRAN programs which have to support their own routines to access data, to drive a user interface and to work with a variety of different graphics packages. What is critical and is very much at the heart of the philosophy of Open GENIE is that the scientist must have control over is how the data is processed.
A scientist may trust the Open GENIE peak fitting routines, alternatively and probably more the case at the moment, the scientists will want to continue to use their own analysis code in a C or FORTRAN subroutine for most things. Open GENIE can be used to provide the user interface, graphics and file access. This mechanism is described fully in the next section of this manual on modules, and these allow the code to be run as if it was a compiled part of Open GENIE. Routines which have been found to be useful in several different areas of analysis are hard coded in C/C++ or FORTRAN for efficiency.
The current routines available are:
Focus a range of time of flight spectra from detectors at different scattering angles with given parameters. This command, although written specifically for focusing multiple banks of detectors consecutively can be used for single scanning detectors data. To focus the spectra in d-spacing (with the :D switch), or in momentum transfer (with the :Q switch) the time of flight parameters are required, which can be specified from a file (the default are the values from the raw file). Note: the default is the :D switch where spectra are summed in d-space. For example:
# Focus a time of flight spectrum in D-spacing
# get the TOF parameters from the raw data file
>>w = focus( 1:17, "irs12838.raw" )
# Focus default file, every third period
>> assign 3045
>> refl = focus(3:90@3, _, mydet)
Integrate a 1-D or 2-D spectra. For example:
# print the integrals of two spectra
>> printn integrate(1:2, 38000, 78000, "tfx00345.raw")
INTEGRATE: Integrating between 38000 and 7800
Workspace []
(
error = [2341.4700 1027.1312 ] Array(2 )
sum = [5482585.0 1054968.0 ] Array(2 )
)
The result of the integrate command is returned as a workspace with two fields: the SUM field contains either a single value or an array of integrals, and the ERROR field contains the propagated error of the result.
Provides interactive peak fitting of a workspace allowing selection of the bounds of the peak and the form of fit to use. The available fits for use are:
Once fitted the peak parameters of the fit are displayed and made available as a result, so that they can be used later. For example:
# Fit peaks from a genie intermediate file
>> fit = peak(get(3,"mydata.in3"))
>> printn fit
Note: See the Peakfit() and Peakgen() in the Open GENIE reference manual for more control and detail.
The rebin command performs an interpolation from one histogram into another histogram with the same integrated count, but with modified X-boundaries. Some numerical precision is lost during a re-binning operation and there is usually a degree of peak broadening, so multiple re-binning is to be avoided. The /Lin and /Log qualifiers allow a Linear or Logarithmic rebinning between the specified boundaries to be performed. For linear rebinning the boundaries are specified in steps, for logarithmic bin widths these are calculated such that for any given binning range
Xn+1 - Xn = Step * Xn
For example, given the command
>> w = rebin:log(w, 10.0, 0.1, 15.0)
Here the step is 0.1, and the range of the rebinning is between 10 and 15. The bin boundaries generated will be [10, 11, 12.1, 13.31, 14.641, 15].
A workspace can also be rebinned according to another workspace. For example:
# Read a spectrum and re-bin the same as an
# already loaded vanadium workspace
>> w = s(1)
>> w = rebin(w, vanadium.x)
This command converts the units of time-of-flight spectra. Using the switches they can be converted respectively to:
/C or /Channel to Channel numbers (one way)
/D To D-Spacing (A)
/E To Energy (meV)
/LAM To Wavelength (Å)
/Q To Momentum Transfer (Å-1)
/SQ To (Momentum Transfer)2 (A-2)
/T To Time of Flight (ms)
/LA1 To primary flight path wavelength
/W To energy transfer E1-E2 (meV)
/WN To energy transfer E1-E2 (cm-1)
/TAU To reciprocal time-of-flight (ms/m)
For example:
# Read in a spectrum whilst converting to Wavelength
>> w = Units:Lam(s(1))
The t.o.f parameters must be set correctly in the input workspace (or set using the Set/Par command) in order to perform the conversion, and if the keyword syntax is used, i.e. Units/C w the workspace will be converted destructively.
The Open GENIE Module Interface
This section will answer the following questions:
What is a module?
A module is a compiled FORTRAN program that can be loaded into, and become part of, a running Open GENIE process. Subroutines in the module, provided they follow a given set of rules, can be called to manipulate arbitrary Open GENIE variables. A module is usually implemented as a shared library or dynamic linked library.
When would I want to use modules?
Modules are used when:
How do I write a module?
A module consists of subroutines only - there is NO main program. These subroutines can be divided into two types: those that perform the actual calculations, and those that pass data to/from Open GENIE and the calculation routines; the latter called "wrappers" as they provide a jacket for the original calculation routines.
1. Write FORTRAN Subroutines to do your tasks
The subroutines can take any parameters you desire. Our example will be based on the subroutine "myfunc.for" in the examples directory:
2. Write a FORTRAN "wrapper" subroutine to interface between Open GENIE and your calculation subroutine
Open GENIE variables are passed to and from the module using a temporary workspace. Data is placed into the input workspace with chosen field names, and then accessed in FORTRAN by using this field name and an access routine appropriate for the data type being sent. For example, the frame count could be stored in the variable "NRAW" and accessed using the module_get_int() function in FORTRAN. Returning data is similar - the various module_put() FORTRAN subroutines allow the construction of a returned workspace containing various data types with named variables. The wrapper subroutine must obey the following rules:
For a full list of module_.() functions see the FORTRAN module interface in the Open GENIE reference manual.
The wrapper for "myfunc" is here.
3. Make the Module Library
You need to compile the file containing the function and its associated wrapper using the Module/Compile command - this will produce a ".so" file to load into Open GENIE.
>> Module/Compile "myfunc.for" Symbols = "do_myfunc"
This command first runs a FORTRAN compiler on the code, then it produces a shared library from the object code. If the compilation fails for any reason you will see various error messages on the screen and the creation of the shared library will be aborted. On a successful run, you will see a message similar to the following:
** Module now compiled - load with: Module/Load "myfunc.so"
The Symbols parameter must be set to a comma separated list of wrapper subroutines that you wish to call from Open GENIE (via Module/Execute).
4. Load the module into Open GENIE
To load the module, use Module/Load command with the name of the ".so" file created by the Module/Compile. You can also supply a comment that will be displayed when you later type Module/List.
>> Module/List "myfunc.so" "My function module!"
You only need to reload a module if you change the source code and have executed another Module/Compile command.
5. Package function arguments into a workspace
You now need to create a workspace to hold the parameters for the Subroutine you wish to call in the module. The fields in the workspace must have the same names as those you specified in the second argument of your "module_get..." calls above. In our case, we only have one variable and we called it "XVALS", so assuming X is the array we wish to send:
>> X = Dimensions(10) # create a new array
>> Fill X 1.0 1.0 # Fill array with numbers 1.0 to 10.0
>> Pars = Fields() #create an empty workspace for module arguments
>> Pars.Xvals = X # add X to workspace as field "Xvals"
6. Execute the module and get the result
This is achieved by:
>> MY_RESULT = Module:Execute("my_func", Pars)
where "my_func" is the name of the FORTRAN wrapper routine you wrote, and Pars is the packaged arguments workspace created above. The values specified in "module_put.." calls will appear in the returned workspace MY_RESULT; in the above case MY_RESULT will contain one entry called YVALS, so we can type:
>> Printn MY_RESULT.YVALS
and see squared X values!
A Real-Life Sample Module
The following module was created by Spencer Howells at ISIS. The GCL procedure is available here as g2s.gcl and the FORTRAN program that performs the actual calculation is g2s.for.
The C Module Interface
The C module interface is used in almost exactly the same way as the FORTRAN interface, and provides similar routines - in fact most have the same name, but with a prefix of cmodule_ rather than module_. Invoking from the GCL command line is similar, except that an extra qualifier of /C must be specified for COMPILE and EXECUTE operations (it could also be specified on a load for consistency, but is not strictly necessary). For example,
>> Module/Compile/C "myprog.c"
>> Module/Load "myprog.so"
>> VALS=Module:Execute:C("myfunc", PARS)
The major difference between the two is in the arguments - the C module routines can allocate memory for returned array structures using malloc(), unlike in FORTRAN77 where a pre-allocated array must always be passed. To indicate that you wish a routine to allocate the memory for you, set the array pointer to NULL before passing the address of it to the routine. For example:
GenieWorkspace *pars_get, *pars_put; /* passed from GENIE */
const char* name = "twotheta"; /* parameter to get */
fort_real* val_array =NULL; /*tell routine to malloc() memory*/
fort_int len;
/* then get array the array*/
cmodule_get_real_array(pars_get, name, &val_array, &len);/* do some manipulations and return new values */
cmodule_put_real_array(pars_put, name, val_array, len);
free(val_array); /* release storage */
If a pointer is not NULL, the routine will assume it points to valid storage of size "len" - on return, "len" will be set to the number of elements of the array actually set, or 0 on an error. A C module must contain the following header files
#include <genie_cmodule.h>
#include <genie_cmodule_ver.h>
to set up the correct definitions and typedefs.
The MODULE/COMPILE/C Open GENIE command will insert the correct path for picking up these header files, but they are contained in the $GENIE_DIR/library directory if you wish to peruse them
The C module function must be declared as shown in the file genie_cmodule.h, and this also shows how the variables are obtained and set by calling the respective functions, with either pars_get or pars_put and the variable name.
An example of a program to illustrate calling a module from C is here c_example.c, and the corresponding example of a GCL procedure to call the C module is here c_example.gcl.
The basic commands of Open GENIE plus multiline commands, for example IF - ELSE - ENDIF statements can be combined into gcl procedures. Below is a list of examples, these are arranged, so that the programs can be easily run by Open GENIE test routines, or by someone trying to get an idea of what Open GENIE can do. Most programs have their own directory, with the data required taken from files stored in this area.
If program fails to link, change into the example directory and type
> make links
this will usually find the appropriate data files and set up the appropriate "data" directory link.
It is best that these programs are load and run from the appropriate directory. For example, using the the file dopamine.gcl in the directory /usr/local/genie/examples/simple_graphics. Firstly change to this directory, and then load the program using
>>Load "dopamine.gcl"
and run these programs using the name of the file, in this case
>>dopamine
Before running most of the programs you will have to execute a device/open command.
focus_example.gcl - this example puts the necessary defaults into a workspace, that is required for focusing data from an IRIS raw data file.
integrate_example.gcl - integrates a single spectra and a set of single spectra
rebin_example.gcl - rebins the data to the same range, but different number of time channels, then integrates the spectra. The procedure then rebins to a given X range, to a set of given X values, and to linear steps in a given range and then integrated. The data is then rebinned using logarithmic steps and integrated.
contour_test.gcl - draws a filled colour contour plot, and then overplots two sets of contour lines,
add.gcl - procedure to analyse IRIS diffraction data. Developed by P. Marshall
This directory holds the data used for some of the procedures in the examples section.
dssdemo.gcl - this demonstrates the use of different plot styles, with error bars and line plots of linear and logarithmic scales.
contains various examples of programs concerned with the input and output of data from genie.
ascii_file.gcl - demonstrates the use of Open GENIE routines to read/write ASCII data using the ASCIIFILE command.
ascii_io.gcl - demonstrates the use of Open GENIE routines to output and input data in a fixed easy reading ASCII file format using the Get and Put command
intermediate_file.gcl - gives an example of the use of the Open GENIE routines to manipulate intermediate files.
raw_file.gcl - example of the use of Open GENIE routines to read and manipulate spectra from ISIS raw files
This directory gives examples of wrapping routines - c_example.gcl and fortran_example.gcl - for the calling of a module from within Open GENIE written in gcl, and examples of a such modules for C and FORTRAN- c_example.c and fortran_example.f
In this section examples are given of C, C++, and FORTRAN files, which can be used like the get() function of Open GENIE.
pktest.gcl - a procedure which generates data from a quadratic equation and then fits a cubic equation to the data, with the linear part of the function inputted by the user and then held constant. The data, the fit and the residuals are then plotted.
diffuse.gcl, phon.gcl - examples of publication graphics by Mark Harris. Loads ASCII data from a file creating a graph and plotting observed and theoretical data as markers and lines respectively.
report.gcl - this procedure demonstrates the ability to read data using an external FORTRAN program, and the use of CONTOUR and CELL_ARRAY commands in making a colour contour plot on data taken on MARI.
boxes.gcl - this procedure draws boxes on the plot window with corners chosen by the operator.
dopamine.gcl - this procedure reads in a .PRO file. It then plots two ranges of observed data as points with the calculated data plotted as a line. The difference between the calculated and data plot below. The data file OPT17.PRO was obtained from a Rietveld fit of a structural model of deuterated dopamine hydrobromide to HRPD data.
funcplot.gcl - this plots a function of your own specification by writing a module "on the fly" and then using this module to generate the data points
test_mutitplot.gcl - a procedure which produces a multi-plot from a spectra, and then Examples of multiplots constructs a multiplot from a 2-D array
textcirc.gcl - draws the string "A simple text string" in a circle.
slides.gcl - this procedure is a slide demonstration of Open GENIEs main features.
contour_test.gcl - displays a contour plot of a set of neutron inelastic spectra taken on IRIS.
glad.gcl - draws a colour intensity map of S(Q, w) data taken on MARI, which also allows you to see slices of the intensity variation with constant energy, and momentum.