All common arithmetic operators are supported. Boolean operators are also fully supported. Boolean expressions are evaluated to be either 1 or 0 (true or false respectively).
A mark in a type column indicates that the operator can be used with that type of variable. Refer to the grammar section for detailed information about operator precedence.
|Unary Plus, Unary Minus||+x, -x|
|Addition, Subtraction||+, -||(only +)|
|Less or Equal, More or Equal||<=, >=|
|Less Than, Greater Than||<, >|
|Not Equal, Equal||!=, ==|
Each of the following alphanumerical standard functions can be applied to objects of the types indicated.
|Arc Tangent (with 2 parameters)||atan2(y, x)|
|Inverse Hyperbolic Sine||asinh(x)|
|Inverse Hyperbolic Cosine||acosh(x)|
|Inverse Hyperbolic Tangent||atanh(x)|
|Logarithm base 10||log(x)|
|Absolute Value / Magnitude||abs(x)|
|Rounded Value||round(x, scale)|
|Random number (between 0 and 1)||rand()|
|Modulus||mod(x, y) = x % y|
|Sum||sum(x, y, z)|
|If||if(condition, x, y)|
|String s1 contains s2 (returns 1 or 0)||contains(s1, s2)|
|Binomial coefficients||binom(n, i)||integers|
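As a rough illustration of what three of the functions above return, the following Python sketch mimics if(), contains() and binom(). The Python names are hypothetical stand-ins and not part of DataWarrior. The sketch assumes that if() treats any non-zero condition as true, which the table does not state explicitly.

```python
import math

def if_(condition, x, y):
    # Assumption: any non-zero condition counts as true,
    # matching the 1/0 convention for Boolean expressions.
    return x if condition != 0 else y

def contains(s1, s2):
    # Returns 1 if s1 contains s2, otherwise 0.
    return 1 if s2 in s1 else 0

def binom(n, i):
    # Binomial coefficient "n choose i" for integer arguments.
    return math.comb(n, i)

label = if_(contains("benzene", "zen"), "match", "no match")
print(label, binom(5, 2))
```

Because contains() returns 1 or 0, its result can be passed directly as the condition of if().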
Biology And Chemistry Functions
In addition to the standard functions, DataWarrior provides a few special functions that are useful in the context of drug discovery and chem- or bioinformatics. These functions may calculate values from alphanumerical data, as the standard functions above do, or they may work on special data types such as chemical structures, reactions or descriptors. The available special functions are listed and explained below.
|Ligand Efficiency (HTS)||ligeff1(ra, conc in μmol/l, structure)|
|Ligand Efficiency (IC50)||ligeff2(ic50 in nmol/l, structure)|
|Chemical Similarity (A)||chemsim(descriptor, idcode)|
|Chemical Similarity (B)||chemsim(descriptor1, descriptor2)|
|Frequency of Occurrence||frequency(s, column-name)|
The ligeff1() and ligeff2() functions calculate ligand efficiencies as relative free binding energy in kcal/mol per non-H atom. The first function, ligeff1(), requires the remaining activity of an HTS result, whereas the second, ligeff2(), works on IC50 values. Ligand efficiency values are a much more reasonable basis for selecting leads from an HTS campaign than remaining activities: selecting all compounds whose remaining activity is below a certain threshold is implicitly biased towards high molecular weight compounds, and ligand efficiencies avoid this bias. During lead optimization, too, one should compare target affinities based on ligand efficiencies rather than on pure IC50 values.
"For the purposes of HTS follow-up, we recommend considering optimizing the hits or leads with the highest ligand efficiencies rather than the most potent..." (Ref.: A. L. Hopkins et al., Drug Disc. Today, 9 (2004), pp. 430-431).
To give an example: A compound with 30 non-H atoms (MW 400) that binds with Kd=10 nM has a ligand efficiency of 0.36 kcal/mol per non-H atom. Another compound with 38 non-H atoms (MW 500) and the same ligand efficiency would be about 100-fold more active, with Kd=0.106 nM. Now assume that an HTS campaign revealed two hit compounds A and B with equal activities of IC50=10 nM, but different molecular weights of 400 and 500, respectively. Based on activity alone, both compounds look equally attractive. However, synthetically introducing a new group with 8 non-H atoms into compound A would bring it to the weight of compound B and, if its ligand efficiency can be maintained, would increase its activity by a factor of 100. Compound A is therefore by far the more attractive alternative.
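The arithmetic behind this example can be checked with a short Python sketch. It assumes the relation dG = -RT * ln(Kd) with Kd in mol/l and the constants R=1.986 cal/(mol*K) and T=300K that this section gives for ligeff1(). The function names are illustrative and not part of DataWarrior. Under these assumptions the unrounded efficiency of the first compound is close to 0.366, so the quoted 0.36 appears to be a truncated value. Feeding the quoted 0.36 back in gives roughly 0.107 nM for the second compound, in line with the figure quoted in the text.

```python
import math

R = 1.986e-3  # gas constant in kcal/(mol*K)
T = 300.0     # temperature in K

def ligand_efficiency(kd_molar, non_h_atoms):
    """Free binding energy per non-H atom in kcal/mol."""
    delta_g = -R * T * math.log(kd_molar)
    return delta_g / non_h_atoms

def kd_for_efficiency(efficiency, non_h_atoms):
    """Kd in mol/l that yields the given efficiency for the given size."""
    return math.exp(-efficiency * non_h_atoms / (R * T))

# Compound with 30 non-H atoms and Kd = 10 nM
le_30 = ligand_efficiency(10e-9, 30)

# Compound with 38 non-H atoms at the quoted efficiency of 0.36
kd_38 = kd_for_efficiency(0.36, 38)

print(f"efficiency: {le_30:.3f} kcal/mol per non-H atom")
print(f"Kd at 38 atoms: {kd_38 * 1e9:.3f} nM")
print(f"fold change: {10e-9 / kd_38:.0f}")
```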
The remaining activity ra supplied to the ligeff1() function should be a percentage roughly between 0 and 100. The second parameter of this function is the assay concentration conc of the potential inhibitor in μmol/l. The third parameter is the molecular structure, from which the number of non-hydrogen atoms is determined automatically. To avoid misinterpretations, one should understand how the ligeff1() function works:
1) ra values below 1.0 are set to 1.0. Those above 99.0 are set to 99.0.
2) IC50 values are calculated from these range limited ra values as ic50 = conc / (100/ra - 1.0)
3) Assuming that the ic50 values are equivalent to the Kd the free energy of the ligand binding is calculated as dG = -RT * ln(ic50) with R=1.986 cal/(mol*K) and T=300K
4) The ligand efficiency is then calculated as ligeff = dG / N, where N is the number of non-hydrogen atoms.
The consequences of this calculation are as follows: the calculated ic50 values cover about 4 log units (from conc/99 to 99*conc), and the values at the lower and upper ends of this range carry the highest uncertainty. The noisier the screening, the more uncertain the ic50 values and hence the ligand efficiency values, especially those at the lower and upper ends of the scale.
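The four calculation steps listed above can be sketched in Python as follows. This is an illustration of the described procedure, not DataWarrior's actual code. The function names are hypothetical, and the conversion of the estimated ic50 from μmol/l to mol/l before taking the logarithm is an assumption, since the text does not state the unit used in step 3.

```python
import math

R = 1.986e-3  # gas constant in kcal/(mol*K), i.e. 1.986 cal/(mol*K)
T = 300.0     # temperature in K

def estimated_ic50(ra, conc_umol):
    """Steps 1 and 2: clamp ra to [1, 99], then estimate ic50 in umol/l."""
    ra = min(max(ra, 1.0), 99.0)
    return conc_umol / (100.0 / ra - 1.0)

def ligeff1_sketch(ra, conc_umol, non_h_atoms):
    """Steps 3 and 4: free binding energy per non-H atom in kcal/mol."""
    # Assumption: the logarithm is taken of the ic50 expressed in mol/l.
    ic50_molar = estimated_ic50(ra, conc_umol) * 1e-6
    delta_g = -R * T * math.log(ic50_molar)
    return delta_g / non_h_atoms

# Width of the reachable ic50 range in log units, independent of conc
span = math.log10(estimated_ic50(99.0, 10.0) / estimated_ic50(1.0, 10.0))
print(f"ic50 range: {span:.2f} log units")
print(f"ligeff at ra=50, conc=10 umol/l, 30 atoms: {ligeff1_sketch(50.0, 10.0, 30):.3f}")
```

At ra=50 the estimated ic50 equals the assay concentration. At the clamped limits it is conc/99 and 99*conc, which is where the range of about 4 log units comes from.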
If one can use the second function ligeff2() based on measured IC50 values, one avoids the error potential of the calculation of IC50 values from remaining activities. Ligand efficiency values from this function are therefore much more reliable and only contain the error margin of the original IC50 value.
The chemsim() function calculates similarities
between two chemical structures or reactions. This function is available in two variations:
A) Use syntax A to calculate the similarities of one column's chemistry objects to one reference compound or reaction. The first parameter of this function defines the kind of similarity to be calculated. It must be the name of a descriptor column from the popup menu. The second parameter is the idcode of the reference structure. The following example calculates the 3D-pharmacophore similarity of the compounds in column Structure to pyridine (gFx@@eJf`@@@ is the idcode of pyridine).
B) Alternatively, you may use syntax B to calculate the similarities between two different columns containing chemistry objects. In this case you may need to calculate chemical descriptors first. Be aware that the descriptors supplied to the chemsim() function need to be of the same type. This example calculates the similarities between a Reactant and a Product column.
In the operator table, operators are ordered from highest to lowest precedence (from top to bottom).