Metric | Formula | Description | Ref. |
---|---|---|---|
Average Kullback-Leibler (AKL) | \( AKL\left(X,Y\right)=10-\frac{\sum\limits_{b=A}^{T} f_x(b)\times \log \frac{f_x(b)}{f_y(b)}+\sum\limits_{b=A}^{T} f_y(b)\times \log \frac{f_y(b)}{f_x(b)}}{2} \) | X and Y are two aligned columns, one from each of the two matrices being compared. \( f_x(b) \) is the frequency of base b ∈ {A, C, G, T} in column X, and \( f_y(b) \) is the same for column Y. AKL(X, Y) is the similarity score for columns X and Y at one alignment position. | 21 |
Average Log-likelihood Ratio (ALLR) | \( ALLR=\frac{\sum\limits_{b=A}^{T} n_{bX}\times \log \left(\frac{f_{bY}}{p_b}\right)+\sum\limits_{b=A}^{T} n_{bY}\times \log \left(\frac{f_{bX}}{p_b}\right)}{\sum\limits_{b=A}^{T}\left(n_{bX}+n_{bY}\right)} \) | \( n_{bX} \) is the count of base b ∈ {A, C, G, T} in column X, and \( n_{bY} \) is the same for column Y. \( f_b = n_b/N \) is the frequency of base b in a column, where N is the total count of all bases in that column. \( p_b \) is the prior probability of base b. | 24 |
Pearson Correlation Coefficient (PCC) | \( PCC\left(X,Y\right)=\frac{\sum\limits_{b=A}^{T}\left(X_b-\overline{X}\right)\times \left(Y_b-\overline{Y}\right)}{\sqrt{\sum\limits_{b=A}^{T}{\left(X_b-\overline{X}\right)}^2\times \sum\limits_{b=A}^{T}{\left(Y_b-\overline{Y}\right)}^2}} \) | \( X_b \) is the count of base b ∈ {A, C, G, T} in column X, and \( Y_b \) is the same for column Y. \( \overline{X} \) is the mean of the four base counts in column X, and \( \overline{Y} \) is the same for column Y. | 24 |
χ² Distance | \( \chi^2=\sum\limits_{b=A,C,G,T}\frac{{\left(N_{g,i} f_{b,i}-N_{f,i} g_{b,i}\right)}^2}{N_{f,i} N_{g,i}\left(f_{b,i}+g_{b,i}\right)} \) | f and g are the two matrices being compared. \( f_{b,i} \) is the entry for base b at position i in the overlapping part of matrix f, and \( g_{b,i} \) is the corresponding entry in matrix g. \( N_{f,i}=\sum_b f_{b,i} \) and \( N_{g,i}=\sum_b g_{b,i} \). | 16 |
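
The four column scores in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used by any cited tool: the function names (`akl`, `allr`, `pcc`, `chi2`) are hypothetical, the logarithm is taken to be the natural log (the table does not specify a base), the default prior is uniform, and every frequency passed to `akl` and `allr` is assumed to be strictly positive (for example after adding pseudocounts). Each column is a sequence of four values in the order A, C, G, T.

```python
import math

UNIFORM_PRIOR = (0.25, 0.25, 0.25, 0.25)  # assumed prior p_b, order A, C, G, T


def akl(fx, fy):
    """AKL similarity of two frequency columns (natural log assumed).
    Requires strictly positive frequencies in both columns."""
    kl_xy = sum(p * math.log(p / q) for p, q in zip(fx, fy))
    kl_yx = sum(q * math.log(q / p) for p, q in zip(fx, fy))
    return 10 - (kl_xy + kl_yx) / 2


def allr(nx, ny, prior=UNIFORM_PRIOR):
    """ALLR of two count columns; frequencies are f_b = n_b / N.
    Requires every count to be positive so that log(f_b / p_b) is defined."""
    total_x, total_y = sum(nx), sum(ny)
    numerator = 0.0
    for nbx, nby, pb in zip(nx, ny, prior):
        fbx, fby = nbx / total_x, nby / total_y
        numerator += nbx * math.log(fby / pb) + nby * math.log(fbx / pb)
    return numerator / (total_x + total_y)


def pcc(x, y):
    """Pearson correlation of two count columns.
    Undefined (division by zero) if either column is constant."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def chi2(f, g):
    """Chi-square distance between two count columns at one position.
    Bases with zero counts in both columns contribute nothing and are skipped."""
    n_f, n_g = sum(f), sum(g)
    total = 0.0
    for fb, gb in zip(f, g):
        if fb + gb > 0:
            total += (n_g * fb - n_f * gb) ** 2 / (n_f * n_g * (fb + gb))
    return total


# Two example count columns (A, C, G, T) and their frequencies.
col_x = [7, 1, 1, 1]
col_y = [1, 1, 1, 7]
freq_x = [n / sum(col_x) for n in col_x]
freq_y = [n / sum(col_y) for n in col_y]

print("AKL :", akl(freq_x, freq_y))
print("ALLR:", allr(col_x, col_y))
print("PCC :", pcc(col_x, col_y))
print("chi2:", chi2(col_x, col_y))
```

Note the different directions of the scores: AKL, ALLR and PCC are similarities (identical columns give AKL = 10 and PCC = 1), whereas the χ² value is a distance (identical columns give 0).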