Data Analysis Functions

Arguably the most important purpose of Open GENIE is as a tool for scientific data analysis. Not just for preliminary data analysis, but as a framework which provides, or from which can be called, all the tools necessary to perform a complete analysis of the data. The end result of an Open GENIE analysis will be numbers and visualizations of those numbers (plots!) suitable for interpretation with words in a final scientific report.

Currently, a large amount of data analysis is performed with a set of disparate FORTRAN programs which have to support their own routines to access data, to drive a user interface and to work with a variety of different graphics packages. Usually, it is not in the interest of the scientists themselves (or of the resulting science) that this load is carried by those who are most expert in the science. What is critical and is very much at the heart of the philosophy of Open GENIE is that the scientist must have control over is how the data is processed.

It may be, that a scientist is willing to trust the Open GENIE peak fitting routines (written by an acknowledged expert in the field), in this case, part of the analysis can be delegated to Open GENIE. Alternatively, and probably more the case at the moment, the scientists will want to continue to use their own analysis code for most things but can leave Open GENIE to provide the user interface, graphics and file access.

How is all this currently achieved by open GENIE?

Firstly, Open GENIE provides a powerful mechanism for allowing a scientist do define a new function which takes input from Open GENIE and returns output to Open GENIE transparently but in the meantime calls a C or FORTRAN subroutine to do the work. This mechanism is described fully in the Module Reference section of this manual and allows the code to run as if it is a compiled part of Open GENIE; it is fast, and any crashes within the code are safely caught by Open GENIE.

Secondly, Open GENIE provides routines which have been found to be useful in several different areas of analysis. These too are hard coded in C/C++ or FORTRAN for efficiency, they fall into two categories:

High level analysis procedures - Routines which have some understanding of the aims of the analysis and the physical meaning of the data.
Low level analysis procedures - More generic routines which can be used for a wide variety of purposes.

The difference between these two categories is blurred, but the distinction we use here can be described as follows:

High level procedures usually take an Open GENIE Workspace containing a fairly complex grouping of experimental parameters and data (See Workspace operations for more details about workspaces). The analysis functions tend to be fairly data specific and may call on several lower level functions. An example is the Focus() command which focuses neutron scattering spectra from a Time-of-Flight instrument.

On the other hand, the low level procedures generally take relatively few parameters and do not take workspaces, an example here is the Peakfit() routine which can be used to fit a peak in any data.

High Level Analysis Procedures

There are not yet many high level data analysis procedures. We have begun to add those which appear to be fairly common amongst groups of neutron scattering users at the ISIS facility and we will start to add these in earnest as the analysis needs become clear. If you have routines that you feel should be included, please contact us at genie@isise.rl.ac.uk and we will look at how best to integrate them for you, if you can supply them coded as a module, even better. The current routines are:

Focus() Focus a range of spectra with given parameters
Integrate() Integrate one or more spectra
Peak() Interactive peak fitting of a workspace, see Graphics commands.
Rebin() Rebin a workspace given the specified binning scheme
Sumspec() Sum spectra in a given range from the specified file
Units() Convert the units of a Time of Flight workspace as specified

Low Level Analysis Procedures

The low level analysis procedures also include may Open GENIE basic operations, for example multiplication of arrays as well as mathematical functions such as taking logs of a data array. All we document here are those which have been written specially for data analysis and don't fit in one of the other groups.

Peakfit() Fits various peaks to data supplied in arrays
Peakgen() Generates peak data from an X-array and peak parameters

High Level Analysis Procedures

Focus()

Focus a range of spectra with given parameters

FOCUS() spectra=Range or IntegerArray [file=String] [detpars=Workspace] [specpars=Workspace] Focus a set of TOF spectra correcting for detector efficiency.
[:D] Convert to D-spacing before summing
[:Q] Convert to momentum transfer before summing
[:T] Sum in time of flight
[:VERBOSE] print out TOF parameters used in conversions

example:

	# Focus a time of flight spectrum in D-spacing
	# get the TOF parameters from the raw data file
	>> w = focus( 3:10, "irs12838.raw" )
	# Focus default file, every third period
	>> assign 3045
	>> refl = focus( 3:90@3, _, mydet )

Note: Focussing is done in D-spacing by default.

Focus

This command provides for focussing groups of neutron time of flight spectra from detectors at different scattering angles. This command, although written specifically for focussing multiple banks of detectors consecutively can be used for single scanning detectors data as long as the data from consecutive runs has been stored as numbered spectra in a data format Open GENIE understands, for example an old GENIE-V2 intermediate file (see Supported File Formats). If the data has been stored in an Open GENIE intermediate file, the parameters can be stored in workspace format, see the example below.

If the detector efficiency parameters and/or T-O-F parameters are available from some other source, the Focus() command can be extended easily to read these parameters using the Alias() command as in the following example.

Example

	# First alias the focus command so we can still get at it
	alias "original_focus" "focus"
	# Now define a new FOCUS procedure which can
	# read the parameters from a convenient file.
	PROCEDURE FOCUS
	    PARAMETERS SPECTRA FILE=String
	    QUALIFIERS /D /Q /T
	    RESULT res
	    LOCAL Tofpars Detpars
	    # read these from a convenient intermediate files, could also use
	    # the Asciifile() procedures to read any ascii file format
	    Tofpars = get("PARS", "parms.in3")
	    Detpars = get("EFF", "detpars.in3")
	    IF Q; res = original_focus:q(spectra, file, detpars, tofpars); ENDIF
	    IF D; res = original_focus:d(spectra, file, detpars, tofpars); ENDIF
	    IF T; res = original_focus:t(spectra, file, detpars, tofpars); ENDIF
	ENDPROCEDURE

The new focus command can be used transparently in place of the original command.

Parameters:

/D, /Q, /T

Specify whether to focus in D-spacing, momentum transfer or by time.

Spectra (Range)

This parameter specifies the spectra to be focussed from the input file as a Range.

Open GENIE supports unique data types called Interval and Range, a Range or Interval can be used to specify a range of indices respectively to access multi-spectra data. For more information on specifying intervals, please see a description of the Interval syntax.

For the most general ability to specify a group of arbitrary spectra an Integer array of spectrum identifiers can be given as the the "Spectra" parameter.

File (String) [ default = default input file ]

The name of the input data file. This parameter will default to the default input file set by the Set/File/Input command if it is not specified.

Detpars (Workspace) [ default = no detector efficiency correction ]

Parameters for correcting for detector efficiency. These are giiven in a workspace with the following field names:

Field name Type Description
PRESSURE Real Gas pressure (atms)
GAS_SIGMA0 Real Gas cross section at lambda 0 (cm-2)
PATHLENGTH Real Gas path length (cm)
lWAVELENGTH Real Characteristic wavelength (cm)
WEIGHT Real Molecular weight of container atoms (g mol-1)
DENSITY Real Density of container (g cm-3)
THICKNESS Real Container wall thickness (cm)
WALL_SIGMA0 Real Wall cross section at LAM0 (cm-2)
TYPE String Detector description, eg "He3 gas"

Specpars (Workspace) [ default = values from raw file ]

Time of flight parameters. These are given in a workspace with the following field names:

Field name Type Description
TWOTHETA Realarray Two theta scattering angles per spectrum (degrees)
EMODE Integer mode 0=inelastic, 1=incident, 2=transmitted
EFIXED Real or Realarray Fixed energy value or values if EMODE=2
L1 Real Primary flight path (m)
L2 Realarray Secondary flight paths (m)

The arrays need to be of size "nspec" where "nspec" is the total number of spectra being focussed.

RESULT = (Restype)

The final focussed spectrum.

Integrate()

Integrate one or more spectra

INTEGRATE() wksp=Workspace [xmin=Real] [xmax=Real] Integrate a workspace (1d or 2d)
INTEGRATE() spectra=Interval or IntegerArray [xmin=Real] [xmax=Real] [file=String] Integrate spectra from a file

example:

	# print the integrals of two spectra
	>> printn integrate(1:2, "tfx00345.raw")
	INTEGRATE: Integrating between 38000 and 78000
	  Workspace []
	  (
	    error = [2341.4700 1027.1312 ] Array(2 )
	    sum = [5482585.0 1054968.0 ] Array(2 )
	  )

Note: The sum is always in total counts

Integrate

The Integrate() command is used to perform integration of one or more spectra. Even f the Y units are normalized the integrate command will give a result in absolute counts.

Parameters:

Wksp (Workspace)

A one or two dimensional workspace to integrate.

Spectra (Range)

This parameter specifies the spectra to be integrated from the input file as a Range.

Open GENIE supports unique data types called Interval and Range, a Range or Interval can be used to specify a range of indices respectively to access multi-spectra data. For more information on specifying intervals, please see a description of the Interval syntax.

For the most general ability to specify a group of arbitrary spectra an Integer array of spectrum identifiers can be given as the the "Spectra" parameter.

Xmin, Xmax (Real) [default = spectra minimum & maximum respectively]

Optional interval over which to integrate. If the interval is not on a bin boundary, the counts are automatically corrected proportionally to the amount of the bin included.

File (String)

Allows specification of a file from which to get the spectra when used with the second form of the command. Otherwise, the input file is taken to be the file set by the Set/File/Input command.

RESULT = (Workspace)

The result of the integrate command is returned as a workspace with two fields (see example above). The "SUM" field contains either a single value or an array of integrals. The "ERROR" field contains the propagated error on the result, similarly, either a single value or an array of values.

Rebin()

Rebin a workspace given the specified binning scheme.

REBIN
[/LIN]
[/LOG]
wksp=Workspace bound1=Real step1=Real bound2=Real [step2=Real] [bound3=Real] [step3=Real] [bound4=Real] Rebin a workspace in linear or logarithmic steps
REBIN wksp=Workspace xmin=Real xmax=Real Rebin over a range
REBIN wksp=Workspace xarray=Realarray Rebin to the given X values.

example:

	# Read a spectrum and re-bin the same as an
	# already loaded vanadium workspace
	>> w = s(1)
	>> w = rebin( w, vanadium.x )

Note: You can combine rebins by concatenating the results with the & operator.

Rebin

The rebin command performs what can loosely be described as an interpolation from one histogram into another histogram with the same integrated count but with modified X-boundaries. Some numerical precision is lost during a re-binning operation and there is usually a degree of peak broadening so multiple re-binning is to be avoided.

The essential element in all the Rebin() command formats is to provide a new set of X-values for the spectra being re-binned. The first format allows ranges of linear or logarithmic steps to be specified explicitly. The second, simply acts to trim or expand a workspace to fit the given range, often useful when a series of spectra with different offsets need combining. The final form ensures the compatibility of two spectra by re-binning one spectrum to the X-values of the other.

Linear and logarithmic re-binning can be provided together by concatenating the different rebin ranges with the "&" operator (an end-on join of two spectra, see workspace operations). For example,

	w = rebin:lin(s(1), 303, 700) & rebin:log(s(1), 700, 0.01, 800)

(note that for an append to work, the ending value of the first range must be identical to the starting value of the next range).

Parameters:

/Lin

Linear rebinning between the boundaries specified in steps given by the "Stepn" parameters.

/Log

Gives logarithmic bin widths. The bin widths are calculated such that for for any given binning range

Xn+1 - Xn = Step * Xn

where Step is the step value chosen for the range. For example, given the command

w = rebin:log(w, 10.0, 0.1, 15.0)

The bin boundaries generated will be [10, 11, 12.1, 13.31, 14.641, 15].

Wksp (Workspace)

The workspace to be re-binned

Boundn, Stepn (Real)

Lower bound and step value (Dn above). There must always be one more bound specified than step value.

Xmin, Xmax (Real)

Interval over which to re-bin. If the interval is not on a bin boundary, the counts are automatically corrected proportionally to the amount of the bin included. This command leaves all complete bin boundaries within the range the same as before. It is used for extending or contracting the X-range of the spectra in a data safe way.

Xarray (Realarray)

An X-array to completely specify the desired bin boundaries. This could be an array from a comparable spectrum or it may be automatically generated by the Fill() command as an arithmetic or geometric progression.

RESULT = (Restype)

The re-binned workspace. If the rebin command is used in the keyword form (ie Rebin w ...), workspace will be destructively re-binned. In the functional form, the input workspace will remain unaltered.

Sumspec()

Sum spectra in a given range from the specified file. (See also Focus/T)

SUMSPEC [spectra=Range or IntegerArray] [file=String] Sum spectra in Time-of-Flight

Note: Obsolete command, use Focus:T()

Sumspec

See the Focus() command for a description of the parameters, this command is functionally equivalent to Focus/T and is only kept for compatibility.

Units()

Convert the units of a Time of Flight workspace as specified

UNITS() wksp=Workspace [xmin=Real] [xmax=Real] Convert workspace units
[/C] or [/Channel] To Channel numbers (one way)
[/D] To D-Spacing (A)
[/E] To Energy (meV)
[/LAM] To Wavelength (A)
[/Q] To Momentum Transfer (A-1)
[/SQ] To (Momentum Transfer)2 (A-2)
[/T] To Time of Flight (us)
[/LA1] To primary flight path wavelength (A)
[/W] To energy transfer E1-E2 (meV)
[/WN] To energy transfer E1-E2 (cm-1)
[/TAU] To reciprocal time-of-flight (us/m)

example:

	# Read in a spectrum whilst converting to Wavelength
	>> w = units:Lam( s(1) )

Note: TOF Parameters must be set correctly in the workspace (or set Set/Par).

Units

The Units() command provides for units conversion of Time-of-Flight spectra. The conversions usually rely on specific parameters being set correctly in the input workspace. These may be read in correctly from the data file, or they may have to be set in the workspace before the Units() command is carried out. To find the parameters see the section on workspace operations.

A good way of ensuring parameters are read correctly is to overload the S() command, see Alias() and the example given for overriding the Focus() command.

Parameters:

Wksp (Workspace)

The Time-of-Flight workspace to convert.

Xmin, Xmax (Real) [default = converted spectra minimum & maximum respectively]

Optional interval to select the section of the workspace to convert (specified in the final units). For energy only, the default is 1000-5000meV.

RESULT = (Workspace)

The converted workspace. If the keyword syntax is used, the workspace will be converted destructively (e.g. Units/C w).

Low Level Analysis Procedures

Peakfit()

Fits various peaks to data supplied in arrays.

PEAKFIT() x=Realarray y=Realarray e=Realarray
[pars=Realarray] [ipropt=Integerarray]
Fits a selction of peaks to supplied data.
[:GAUSS] Gaussian
[:GEXP] Gaussian convolved with exponential
[:LOREN] Lorentzian
[:LEXP] Lorentzian convolved with exponential
[:VOIGT] Voigt
[:VEXP] Voigt convolved with exponential
[:POLY] Polynomial of a specified degree

example:

	# Read some data and fit a peak to it
	# remembering to convert from histogram
	# to point data.
	>> w=s(1)
	>> res = Peakfit:Lexp(centre_bins(w.x), w.y, w.e)

Note: Centre_bins() reduces X-array length by one and picks bin centres.

 

Peakfit

The peakfit command is designed to give full access to a good selection of peak fitting routines. It is designed so that it can be used in conjunction with the Peakgen() command which allows a regeneration of the fitted peak, usually to allow a graphical comparison of the goodness of the fit.

Parameters:

X, Y, E (Realarray)

One dimensional X, Y, and error data arrays of the same length. Note that a lot of data in Open GENIE is in histogram format so it is important to ensure that the X-array is converted before attempting to fit a peak. The Centre_bins() function is provided for this purpose.

Pars (Realarray) [ default = best guess line fit ]

Optional input parameter that allows an initial guess or fixing of one or more of the peak parameters. This operates in conjunction with the corresponding values in "Ipropt". The length of the pars array differs depending on the number of peak parameters required for a particular fit (see below for descriptions of the individual fitting routines).

Ipropt (Integerarray) [ default = no initial estimates ]

Controls the treatment of the corresponding parameter in "Pars". This array consists of a set of integer flags.

0 = no initial estimate
1 = use parameter as a guess
2 = fix parameter value as given

By using these two parameter arrays, it is possible to completely control the operation of the peak fitting routines, However, for simple fitting it is not necessary to supply a "Pars" or "Ipropt" array.

RESULT = (Workspace)

The result of the Peakfit() command is returned as a workspace containing the following fields:

Field name Type Description
IGOOD Integer Status value for goodness of fit:
< -1 Terrible
-1 Maybe OK
>= 0 Good
PARS Realarray Best estimates of the parameters
SIGPAR Realarray 1-sigma error-bars for PARS (may be negative if IGOOD < 0)

Peakfit:Gauss

Estimates the parameters of a Gaussian-peak sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*gaus(X0,SIGMA) ,

where

                           1             [ - (x-X0)**2  ]  
    gaus(X0,SIGMA) = ---------------- exp|--------------| .
                     SIGMA*sqrt(2*pi)    [ 2 * SIGMA**2 ]

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = SIGMA

Peakfit:Gexp

Estimates the parameters of a peak, consisting of a Gaussian convolved with a sharp-edged exponential, sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*gsexp(X0,SIG,TAU) ,

where

                               1           [ - (x-X0)**2 ]  
    gsexp(X0,SIG,TAU) =  -------------- exp|-------------| 
                         SIG*sqrt(2*pi)    [  2 * SIG**2 ]

Convolved with

    {        0           if x/TAU < 0
    {                                  .
    { exp(-x/TAU)/|TAU|  if x/TAU > 0

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = SIG
Pars[6] = TAU

Peakfit:Loren

Estimates the parameters of a Lorentzian-peak, sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*lorentz(X0,GAMMA) ,

where

                                  GAMMA  
    lorentz(X0,GAMMA) = ---------------------------  .
                        pi [ (x-X0)**2 + GAMMA**2 ]

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = GAMMA

Peakfit:Lexp

Estimates the parameters of a peak, consisting of a Lorentzian convolved with a sharp-edged exponential, sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*lzexp(X0,GAM,TAU) ,

where

                                    GAM  
    lzexp(X0,GAM,TAU) =   -------------------------
                          pi [ (x-X0)**2 + GAM**2 ]

Convolved with

    {        0           if x/TAU < 0
    {                                  .
    { exp(-x/TAU)/|TAU|  if x/TAU > 0

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = GAM
Pars[6] = TAU

Peakfit:Voigt

Estimates the parameters of a Voigt-peak sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*voigt(X0,SIG,GAM) ,

where

                               1           [ - (x-X0)**2 ]
    voigt(X0,SIG,GAM) =  -------------- exp|-------------|
                         SIG*sqrt(2*pi)    [  2 * SIG**2 ]

Convolved with

             GAM
     --------------------  .
     pi [ x**2 + GAM**2 ]

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = SIG
Pars[6] = GAM

Peakfit:Vexp

Estimates the parameters of a peak, consisting of a Voigt convolved with a sharp-edged exponential, sitting on a straight-line background, given a pertinent set of data:

    y(x) = Mx + C + A*vtexp(X0,SIG,TAU) ,

where

                               1           [ - (x-X0)**2 ]
    voigt(X0,SIG,GAM) =  -------------- exp|-------------|
                         SIG*sqrt(2*pi)    [  2 * SIG**2 ]

Convolved with

             GAM
     --------------------  .
     pi [ x**2 + GAM**2 ]

Convolved with

    {        0           if x/TAU < 0
    {                                  .
    { exp(-x/TAU)/|TAU|  if x/TAU > 0

Parameters:

Pars (Realarray)

Pars[1] = M
Pars[2] = C
Pars[3] = A
Pars[4] = X0
Pars[5] = SIG
Pars[6] = GAM
Pars[7] = TAU

 

 

Peakfit:Poly

Estimates the polynomial coefficients of a peak by attempting to fit an nth degree polynomial, the degree of the polynomial fit is given by the number of parameters supplied to pars, for this routine "Pars" is not optional.

    y(x) = p1 + p2*x + p3*x^2 + p4*x^3 + ...

For example

    >> pars = dimensions(3)			# create an array
    >> fill pars 0.0				# fill with 0.0
    >> fit = Peakfit:Poly(x, y, e, pars)	# quadratic fit.

Parameters:

Pars (Realarray)

Pars[1] = p1
Pars[2] = p2
Pars[3] = p3
Pars[4] = p4
Pars[5] = p5
Pars[6] = p6

Peakgen()

Generates peak data from an X-array and peak parameters.

PEAKGEN() x=Realarray pars=Realarray Fits a selction of peaks to supplied data.
[:GAUSS] Gaussian
[:GEXP] Gaussian convolved with exponential
[:LOREN] Lorentzian
[:LEXP] Lorentzian convolved with exponential
[:VOIGT] Voigt
[:VEXP] Voigt convolved with exponential
[:POLY] Polynomial

example:

	# Produce data to show goodness of a previous fit
	>> y_fit = Peakgen:Lexp(centre_bins(w.x), res.pars)
	>> y_goodness = w.y - y_fit  # difference for plotting

Note: Centre_bins() reduces X-array length by one and picks bin centres.

 

Peakgen

The peakgen command is designed to be used in conjunction with the Peakfit() command, it allows a regeneration of the fitted peak, usually to allow a graphical comparison of the goodness of the fit.

Note that it is not essential to have done a Peakfit() command for the Peakgen() command to work. By supplying an X-array and suitable parameters, any of the peaks may be generated.

Parameters:

X (Realarray)

One dimensional X array, usually the same X-array which was supplied to the corresponding Peakfit() command.

Pars (Realarray)

The parameters defining the peak. See the Peakfit() command for a description of the individual parameters for each type of peak.

RESULT = (Realarray)

The result of the Peakgen() command is an array of Y-values, the same length as the supplied X-array which if plotted will give the required peak.