We calculate a "similarity score" between a pattern and each sequence of the
similarity group from which the pattern was derived, and obtain the
average and the standard deviation of these scores.
We then use the entire database as the other group and calculate the corresponding
values. The statistical significance of the difference between the two groups
can be formulated in terms of the Student t value as follows:

The t value has a very simple graphical meaning: the separation
of the s (score) distributions of the two groups (in chromatography,
one uses an identical expression for calculating peak separation). The
t value is directly applicable for optimizing patterns. If one wants to report
the value in a publication, one has to look up the significance level in
the corresponding Student t table.
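
The comparison of the two groups can be illustrated with a short Python sketch. The exact expression used by the authors is not reproduced in this text, so the sketch assumes the common unequal-variance (Welch) form of the t statistic, t = (m1 - m2) / sqrt(s1^2/n1 + s2^2/n2). The function name `t_value` and the score lists are hypothetical and serve only as an illustration.

```python
import math
import statistics


def t_value(group_scores, database_scores):
    """Separation of two score distributions as a t statistic.

    Assumption: the unequal-variance (Welch) form is used here; the
    source text does not show its own formula. Each list needs at
    least two scores so that a sample standard deviation exists.
    """
    m1 = statistics.mean(group_scores)       # average score, similarity group
    m2 = statistics.mean(database_scores)    # average score, entire database
    s1 = statistics.stdev(group_scores)      # sample standard deviation, group
    s2 = statistics.stdev(database_scores)   # sample standard deviation, database
    n1, n2 = len(group_scores), len(database_scores)
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


# Made-up scores, for illustration only.
group_scores = [8.0, 9.0, 10.0]
database_scores = [1.0, 2.0, 3.0, 2.0]
print(f"t = {t_value(group_scores, database_scores):.2f}")
```

A larger t means the pattern scores its own similarity group well above the database background, so when a pattern is optimized, the variant with the higher t would be preferred under this assumption.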