## Operators

All common arithmetic operators are supported. Boolean operators are also fully supported. Boolean expressions evaluate to either 1 (true) or 0 (false).

A mark in the Double or String column indicates that the operator can be used with that type of variable. Refer to the grammar section for detailed information about operator precedence.

Description | Function | Double | String |

Power | ^ | ||

Boolean Not | ! | ||

Unary Plus, Unary Minus | +x, -x | ||

Modulus | % | ||

Division | / | ||

Multiplication | * | ||

Addition, Subtraction | +, - | (only +) | |

Less or Equal, More or Equal | <=, >= | ||

Less Than, Greater Than | <, > | ||

Not Equal, Equal | !=, == | ||

Boolean And | && | ||

Boolean Or | || |

## Standard Functions

Each of the following alphanumerical standard functions can be applied to objects of the types indicated.

Description | Function | Double | String |

Sine | sin(x) | ||

Cosine | cos(x) | ||

Tangent | tan(x) | ||

Arc Sine | asin(x) | ||

Arc Cosine | acos(x) | ||

Arc Tangent | atan(x) | ||

Arc Tangent (with 2 parameters) | atan2(y, x) | ||

Hyperbolic Sine | sinh(x) | ||

Hyperbolic Cosine | cosh(x) | ||

Hyperbolic Tangent | tanh(x) | ||

Inverse Hyperbolic Sine | asinh(x) | ||

Inverse Hyperbolic Cosine | acosh(x) | ||

Inverse Hyperbolic Tangent | atanh(x) | ||

Natural Logarithm | ln(x) | ||

Logarithm base 10 | log(x) | ||

Exponential | exp(x) | ||

Absolute Value / Magnitude | abs(x) | ||

Integer Value | int(x) | ||

Rounded Value | round(x, scale) | ||

Random number (between 0 and 1) | rand() | ||

Modulus | mod(x, y) = x % y | ||

Square Root | sqrt(x) | ||

Sum | sum(x, y, z) | ||

If | if(condition, x, y) | ||

String of | str(x) | ||

String length | len(s) | ||

String s1 contains s2 (returns 1 or 0) | contains(s1, s2) | ||

Binomial coefficients | binom(n, i) | integers |

## Biology And Chemistry Functions

In addition to the standard functions, DataWarrior provides a few special functions that are useful in the context of drug discovery and chem- or bioinformatics. These functions may calculate values from alphanumerical data, as the standard functions above do, or they may work on special data types such as chemical structures, reactions, or descriptors. The available special functions are listed and explained below.

Description | Function |

Ligand Efficiency (HTS) | ligeff1(ra, conc in μmol/l, structure) |

Ligand Efficiency (IC50) | ligeff2(ic50 in nmol/l, structure) |

Chemical Similarity (A) | chemsim(descriptor, idcode) |

Chemical Similarity (B) | chemsim(descriptor1, descriptor2) |

Frequency of Occurrence | frequency(s, column-name) |

The **ligeff1()** and **ligeff2()** functions calculate
ligand efficiencies as relative free binding energy in **kcal/mol per non-H atom**. The first
function, **ligeff1()**, requires the remaining activity from an HTS result, while the second,
**ligeff2()**, works on IC50 values. Ligand efficiency values are a much more reasonable
basis for selecting leads from an HTS campaign than remaining activities. Selecting as leads all
compounds whose remaining activity lies below a certain threshold implicitly introduces a strong
bias towards high molecular weight compounds, and ligand efficiencies avoid this bias. During lead
optimization, too, one should compare target affinities based on ligand efficiencies rather than on pure IC50 values:
*"For the purposes of HTS follow-up, we recommend considering optimizing the hits or leads with
the highest ligand efficiencies rather than the most potent..."* (Ref.: A. L. Hopkins et al., *Drug
Disc. Today*, **9** (2004), pp. 430-431).

To give an example: A compound with 30 non-H atoms (400 MW) that binds with a *K*_{d}=10 nM
has a ligand efficiency value of 0.36 kcal/mol per non-H atom. Another compound with 38 non-H atoms (500 MW)
and the same ligand efficiency would have a roughly 100-fold higher activity with *K*_{d}=0.106 nM.
Now assume that an HTS campaign revealed two hit compounds A and B with equal activities of IC_{50}=10
nM, but different molecular weights of 400 and 500, respectively. Based on activities alone, both compounds look
equally attractive. However, introducing a new group with 8 non-H atoms into compound A would make it
match compound B in weight and, provided its ligand efficiency can be maintained, would increase its
activity by a factor of 100. Compound A is therefore by far the more attractive alternative.
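
The numbers in this example can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and it assumes the same constants that are stated for **ligeff1()** (R=1.986 cal/(mol*K), T=300K) and that *K*_{d} is expressed in mol/l before taking the logarithm.

```python
import math

R_KCAL = 1.986e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 300.0   # temperature assumed for this sketch


def ligand_efficiency(kd_molar: float, heavy_atoms: int) -> float:
    """Free binding energy per non-H atom in kcal/mol (hypothetical helper)."""
    delta_g = -R_KCAL * T_KELVIN * math.log(kd_molar)
    return delta_g / heavy_atoms


def kd_for_efficiency(efficiency: float, heavy_atoms: int) -> float:
    """Kd in mol/l of a compound with the given efficiency and atom count."""
    return math.exp(-efficiency * heavy_atoms / (R_KCAL * T_KELVIN))


# Compound with 30 non-H atoms and Kd = 10 nM: a bit above 0.36
le_a = ligand_efficiency(10e-9, 30)

# 38 non-H atoms at the rounded efficiency of 0.36 used in the text:
# a Kd of roughly 0.1 nM, i.e. about two orders of magnitude lower
kd_b = kd_for_efficiency(0.36, 38)

print(le_a, kd_b * 1e9)
```

With the unrounded efficiency instead of 0.36 the second *K*_{d} comes out somewhat lower still, so the "factor of 100" should be read as an order-of-magnitude statement.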

The remaining activity *ra* supplied to the **ligeff1()** function should be roughly between
0 and 100. The second parameter is the assay concentration *conc* of the potential
inhibitor in μmol/l. The third parameter is the molecular *structure*, from which the number of
non-hydrogen atoms is determined automatically. To avoid misinterpretations, one should
understand how the **ligeff1()** function works:

1) *ra* values below 1.0 are set to 1.0; those above 99.0 are set to 99.0.
2) IC50 values are calculated from these range-limited *ra* values as ic50 = conc / (100/ra - 1.0).
3) Assuming that the ic50 values are equivalent to the Kd, the free energy of ligand binding is calculated as dG = -RT * ln(ic50) with R = 1.986 cal/(mol*K) and T = 300 K.
4) The ligand efficiency is then calculated as ligeff = dG / N_{non-hydrogen atoms}.
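
The four steps can be sketched in Python as follows. This is an illustration, not DataWarrior's implementation: the function name is hypothetical, the non-H atom count is passed in directly instead of being derived from a structure, and the conversion of ic50 from μmol/l to mol/l before taking the logarithm is an assumption (it is consistent with the *K*_{d} example, but is not spelled out in the steps).

```python
import math

R_CAL = 1.986      # gas constant in cal/(mol*K)
T_KELVIN = 300.0   # temperature in K


def ligeff1_sketch(ra: float, conc_umol: float, heavy_atoms: int) -> float:
    """Ligand efficiency in kcal/mol per non-H atom from a remaining activity."""
    # Step 1: limit ra to the range 1.0 .. 99.0
    ra = min(max(ra, 1.0), 99.0)
    # Step 2: estimate ic50 (in umol/l) from the range-limited ra
    ic50_umol = conc_umol / (100.0 / ra - 1.0)
    # Assumption: the logarithm is taken of the value in mol/l
    ic50_molar = ic50_umol * 1e-6
    # Step 3: free energy of binding, converted from cal/mol to kcal/mol
    delta_g_kcal = -R_CAL * T_KELVIN * math.log(ic50_molar) / 1000.0
    # Step 4: divide by the number of non-hydrogen atoms
    return delta_g_kcal / heavy_atoms


# 50% remaining activity at 10 umol/l means ic50 = 10 umol/l
print(ligeff1_sketch(50.0, 10.0, 25))
```

Passing the atom count as a plain integer keeps the sketch independent of any cheminformatics library.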

This calculation has two consequences. First, the calculated ic50 values cover about 4 log units.
Second, values at the lower and upper end of this range have the highest uncertainty: the noisier
the screening, the more uncertain the ic50 values and the ligand efficiency values derived from them,
especially at both ends of the scale.
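
A short Python sketch makes both points concrete. It reuses the ic50 formula given for **ligeff1()**; the function name and the assay concentration of 10 μmol/l are chosen only for illustration.

```python
import math


def ic50_from_ra(ra: float, conc: float) -> float:
    """Estimate ic50 (same unit as conc) from a remaining activity in percent."""
    ra = min(max(ra, 1.0), 99.0)
    return conc / (100.0 / ra - 1.0)


conc = 10.0  # assay concentration in umol/l, arbitrary example value

# Width of the reachable ic50 range in log units (ra = 1 versus ra = 99)
span_log_units = math.log10(ic50_from_ra(99.0, conc) / ic50_from_ra(1.0, conc))

# Effect of one percentage point of noise near the edge and in the middle
ratio_edge = ic50_from_ra(99.0, conc) / ic50_from_ra(98.0, conc)
ratio_mid = ic50_from_ra(51.0, conc) / ic50_from_ra(50.0, conc)

print(span_log_units, ratio_edge, ratio_mid)
```

One percentage point of noise roughly doubles the estimated ic50 near the edge of the range, while it changes the estimate by only a few percent around 50% remaining activity.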

If measured IC50 values are available, the second function **ligeff2()** avoids the errors
introduced by estimating IC50 values from remaining activities. Ligand efficiency values from this
function are therefore much more reliable and carry only the error margin of the original IC50 values.

The **chemsim()** function calculates similarities
between two chemical structures or reactions. This function is available in two variations:

**A)** Use syntax A to calculate the similarities of one column's chemistry objects to
one reference compound or reaction. The first parameter of this function defines the kind of similarity
to be calculated. It must be the name of a descriptor column from the popup menu. The second
parameter is the idcode of the reference structure. The following example calculates the
3D-pharmacophore similarity of the compounds in column *Structure* to pyridine (gFx@@eJf`@@@ is
the idcode of pyridine).

Example: chemsim(PP3DMM2_of_Structure,"gFx@@eJf`@@@")

**B)** Alternatively, you may use syntax B to calculate the similarities between two different columns
containing chemistry objects. In this case you may need to calculate chemical descriptors first. Be
aware that both descriptors supplied to the chemsim() function need to be of the same type. This example
calculates the similarities between a *Reactant* and a *Product* column.

Example: chemsim(FragFp_of_Reactant,FragFp_of_Product)
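
To illustrate why both descriptors must be of the same type, here is a minimal Python sketch. It assumes a binary fingerprint descriptor whose similarity is the Tanimoto coefficient; this section does not state how chemsim() computes its result, so the `Descriptor` class, the bit patterns, and the similarity measure are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptor:
    kind: str  # descriptor type name, e.g. "FragFp" (hypothetical container)
    bits: int  # fingerprint bits packed into an integer


def similarity_sketch(a: Descriptor, b: Descriptor) -> float:
    """Tanimoto similarity of two fingerprints of the same descriptor type."""
    if a.kind != b.kind:
        # Bits of different descriptor types have different meanings,
        # so comparing them would give a meaningless number.
        raise ValueError(f"cannot compare {a.kind} with {b.kind}")
    union = bin(a.bits | b.bits).count("1")
    if union == 0:
        return 0.0  # convention chosen for this sketch: empty fingerprints
    common = bin(a.bits & b.bits).count("1")
    return common / union


reactant = Descriptor("FragFp", 0b1110)
product = Descriptor("FragFp", 0b0111)
print(similarity_sketch(reactant, product))  # 2 shared bits of 4 set bits
```

Mixing types, for example a fragment fingerprint with a pharmacophore descriptor, raises an error in this sketch instead of returning a number.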

## Grammar

Operators are ordered from lowest to highest precedence (from top to bottom).