
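It basically uses the algorithm of the Unix "comm" command (hence the name) to compute:
$$k(\mathbf{x}, \mathbf{x}') = \Phi_k(\mathbf{x}) \cdot \Phi_k(\mathbf{x}')$$
where $\Phi_k$ maps a sequence $\mathbf{x}$ that consists of letters in $\Sigma$ to a feature vector of size $|\Sigma|^k$. In this feature vector each entry denotes how often the corresponding k-mer appears in $\mathbf{x}$.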
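The "comm"-style computation described above can be sketched as a single linear merge over two sorted k-mer lists. This is a minimal illustration, not the code in CommUlongStringKernel.cpp: the function name `comm_dot` is hypothetical, and the reading of `use_sign` as "count each shared k-mer once" is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Dot product of two k-mer count vectors, given as sorted lists of
// 64-bit k-mer codes (one list entry per occurrence of a k-mer).
// Hypothetical helper for illustration only.
double comm_dot(const std::vector<std::uint64_t>& a,
                const std::vector<std::uint64_t>& b,
                bool use_sign = false)
{
    // The merge below is only correct on sorted input.
    assert(std::is_sorted(a.begin(), a.end()));
    assert(std::is_sorted(b.begin(), b.end()));

    double result = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i] < b[j]) {
            ++i;  // k-mer occurs only in a
        } else if (b[j] < a[i]) {
            ++j;  // k-mer occurs only in b
        } else {
            // Shared k-mer: measure the run length on each side.
            const std::uint64_t sym = a[i];
            std::size_t ca = 0, cb = 0;
            while (i < a.size() && a[i] == sym) { ++i; ++ca; }
            while (j < b.size() && b[j] == sym) { ++j; ++cb; }
            // Assumption: with use_sign, a shared k-mer contributes 1
            // regardless of how often it occurs.
            result += use_sign ? 1.0
                               : static_cast<double>(ca) * static_cast<double>(cb);
        }
    }
    return result;
}
```

Each list is traversed once, so the cost is linear in the combined length of the two lists, independent of the size of the feature space.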
Note that, because each k-mer is packed into a single 64-bit integer, this representation enables spectrum kernels of up to order 8 for 8-bit alphabets (such as binary data) and up to order 32 for 2-bit alphabets such as DNA.
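The order limit follows from the packing arithmetic: 64 bits divided by 2 bits per letter gives 32 letters per word. The sketch below packs every k-mer of a DNA string into one 64-bit word and sorts the result. The letter encoding (A=0, C=1, G=2, T=3) and the names `dna_code` and `sorted_kmers` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical 2-bit encoding of DNA letters.
inline std::uint64_t dna_code(char c)
{
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
    }
    assert(false && "unexpected letter");
    return 0;
}

// Pack each k-mer of seq into a 64-bit word and return the words sorted.
std::vector<std::uint64_t> sorted_kmers(const std::string& seq, unsigned k)
{
    assert(k >= 1 && 2 * k <= 64);  // 2 bits per letter, hence k <= 32
    std::vector<std::uint64_t> out;
    if (seq.size() < k) return out;

    // Mask that keeps the lowest 2*k bits (all 64 bits when k == 32).
    const std::uint64_t mask = (k == 32) ? ~0ULL : ((1ULL << (2 * k)) - 1);
    std::uint64_t word = 0;
    for (std::size_t i = 0; i < seq.size(); ++i) {
        // Shift in the next letter and drop the letter that left the window.
        word = ((word << 2) | dna_code(seq[i])) & mask;
        if (i + 1 >= k) out.push_back(word);
    }
    std::sort(out.begin(), out.end());
    return out;
}
```

With this encoding, the 2-mers of "ACGT" are AC, CG and GT, which pack to 1, 6 and 11.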
For this kernel the linadd speedups are implemented using sorted lists, though there is room for improvement when a whole set of sequences is ADDed.
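The idea behind a linadd speedup on sorted lists can be sketched as follows: the weighted sum of feature vectors is kept as a sorted dictionary of (k-mer, weight) pairs, and a new sequence is scored against it with one merge pass. The `Dictionary` struct and the functions `add_to_dictionary` and `score` are hypothetical names, and normalization is left out; this is not the class's actual `add_to_normal` or `compute_optimized` code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Sorted dictionary: weights[i] is the accumulated weight of kmers[i].
struct Dictionary {
    std::vector<std::uint64_t> kmers;  // sorted, unique
    std::vector<double> weights;
};

// Add weight * Phi(vec) to the dictionary; vec is a sorted k-mer list
// with one entry per occurrence.
void add_to_dictionary(Dictionary& dic,
                       const std::vector<std::uint64_t>& vec,
                       double weight)
{
    assert(std::is_sorted(vec.begin(), vec.end()));
    Dictionary merged;
    std::size_t d = 0, v = 0;
    while (d < dic.kmers.size() || v < vec.size()) {
        if (v == vec.size() ||
            (d < dic.kmers.size() && dic.kmers[d] < vec[v])) {
            // k-mer present only in the old dictionary: copy it over.
            merged.kmers.push_back(dic.kmers[d]);
            merged.weights.push_back(dic.weights[d]);
            ++d;
        } else {
            // Next k-mer comes from vec: add weight once per occurrence.
            const std::uint64_t sym = vec[v];
            double w = 0.0;
            while (v < vec.size() && vec[v] == sym) { w += weight; ++v; }
            if (d < dic.kmers.size() && dic.kmers[d] == sym) {
                w += dic.weights[d];  // fold in the existing weight
                ++d;
            }
            merged.kmers.push_back(sym);
            merged.weights.push_back(w);
        }
    }
    dic = std::move(merged);
}

// Score a sorted k-mer list against the dictionary with one merge pass.
double score(const Dictionary& dic, const std::vector<std::uint64_t>& vec)
{
    assert(std::is_sorted(vec.begin(), vec.end()));
    double result = 0.0;
    std::size_t d = 0, v = 0;
    while (d < dic.kmers.size() && v < vec.size()) {
        if (dic.kmers[d] < vec[v]) ++d;
        else if (vec[v] < dic.kmers[d]) ++v;
        else { result += dic.weights[d]; ++v; }  // one hit per occurrence
    }
    return result;
}
```

Adding sequences one at a time rebuilds the dictionary on every call, which is the kind of cost that adding a whole set of sequences at once could avoid.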
Definition at line 43 of file CommUlongStringKernel.h.
| CCommUlongStringKernel::CCommUlongStringKernel | ( | INT | size = 10, |
| bool | use_sign = false, |
| ENormalizationType | normalization_ = FULL_NORMALIZATION |
| ) |
constructor
| size | cache size | |
| use_sign | if sign shall be used | |
| normalization_ | type of normalization |
Definition at line 16 of file CommUlongStringKernel.cpp.
| CCommUlongStringKernel::CCommUlongStringKernel | ( | CStringFeatures< ULONG > * | l, |
| CStringFeatures< ULONG > * | r, |
| bool | use_sign = false, |
| ENormalizationType | normalization_ = FULL_NORMALIZATION, |
| INT | size = 10 |
| ) |
constructor
| l | features of left-hand side | |
| r | features of right-hand side | |
| use_sign | if sign shall be used | |
| normalization_ | type of normalization | |
| size | cache size |
Definition at line 25 of file CommUlongStringKernel.cpp.
| CCommUlongStringKernel::~CCommUlongStringKernel | ( | ) | [virtual] |
Definition at line 36 of file CommUlongStringKernel.cpp.
initialize kernel
| l | features of left-hand side | |
| r | features of right-hand side |
Reimplemented from CStringKernel< ST >.
Definition at line 74 of file CommUlongStringKernel.cpp.
| void CCommUlongStringKernel::cleanup | ( | ) | [virtual] |
clean up kernel
Reimplemented from CKernel.
Definition at line 136 of file CommUlongStringKernel.cpp.
| bool CCommUlongStringKernel::load_init | ( | FILE * | src | ) | [virtual] |
load kernel init_data
| src | file to load from |
Implements CKernel.
Definition at line 154 of file CommUlongStringKernel.cpp.
| bool CCommUlongStringKernel::save_init | ( | FILE * | dest | ) | [virtual] |
save kernel init_data
| dest | file to save to |
Implements CKernel.
Definition at line 159 of file CommUlongStringKernel.cpp.
| virtual EKernelType CCommUlongStringKernel::get_kernel_type | ( | ) | [virtual] |
return what type of kernel we are
Implements CKernel.
Definition at line 100 of file CommUlongStringKernel.h.
| virtual const CHAR* CCommUlongStringKernel::get_name | ( | ) | [virtual] |
return the kernel's name
Implements CKernel.
Definition at line 106 of file CommUlongStringKernel.h.
initialize optimization
| count | count | |
| IDX | index | |
| weights | weights |
Reimplemented from CKernel.
Definition at line 319 of file CommUlongStringKernel.cpp.
| bool CCommUlongStringKernel::delete_optimization | ( | ) | [virtual] |
delete optimization
Reimplemented from CKernel.
Definition at line 346 of file CommUlongStringKernel.cpp.
compute optimized
| idx | index to compute |
Reimplemented from CKernel.
Definition at line 355 of file CommUlongStringKernel.cpp.
| void CCommUlongStringKernel::merge_dictionaries | ( | INT & | t, |
| INT | j, |
| INT & | k, |
| ULONG * | vec, |
| ULONG * | dic, |
| DREAL * | dic_weights, |
| DREAL | weight, |
| INT | vec_idx, |
| INT | len, |
| ENormalizationType | p_normalization |
| ) |
merge dictionaries
| t | t | |
| j | j | |
| k | k | |
| vec | vector | |
| dic | dictionary | |
| dic_weights | dictionary weights | |
| weight | weight | |
| vec_idx | vector index | |
| len | length | |
| p_normalization | normalization |
Definition at line 143 of file CommUlongStringKernel.h.
add to normal
| idx | where to add | |
| weight | what to add |
Reimplemented from CKernel.
Definition at line 248 of file CommUlongStringKernel.cpp.
| void CCommUlongStringKernel::clear_normal | ( | ) | [virtual] |
| void CCommUlongStringKernel::remove_lhs | ( | ) | [virtual] |
remove lhs from kernel
Reimplemented from CKernel.
Definition at line 41 of file CommUlongStringKernel.cpp.
| void CCommUlongStringKernel::remove_rhs | ( | ) | [virtual] |
remove rhs from kernel
Reimplemented from CKernel.
Definition at line 61 of file CommUlongStringKernel.cpp.
| virtual EFeatureType CCommUlongStringKernel::get_feature_type | ( | ) | [virtual] |
return feature type the kernel can deal with
Reimplemented from CStringKernel< ST >.
Definition at line 189 of file CommUlongStringKernel.h.
get dictionary
| dsize | dictionary size will be stored in here | |
| dict | dictionary will be stored in here | |
| dweights | dictionary weights will be stored in here |
Definition at line 197 of file CommUlongStringKernel.h.
compute kernel function for features a and b; idx_a and idx_b denote the indices of the feature vectors in the corresponding feature objects
| idx_a | index a | |
| idx_b | index b |
Implements CKernel.
Definition at line 164 of file CommUlongStringKernel.cpp.
| DREAL CCommUlongStringKernel::normalize_weight | ( | DREAL | value, |
| INT | seq_num, |
| INT | seq_len, |
| ENormalizationType | p_normalization |
| ) | [protected] |
normalize weight
| value | value | |
| seq_num | sequence number | |
| seq_len | length of sequence | |
| p_normalization | type of normalization |
Definition at line 222 of file CommUlongStringKernel.h.
| virtual EFeatureClass CStringKernel< ST >::get_feature_class | ( | ) | [virtual, inherited] |
return feature class the kernel can deal with
Implements CKernel.
Definition at line 63 of file StringKernel.h.
get kernel matrix
| dst | destination where matrix will be stored | |
| m | dimension m of matrix | |
| n | dimension n of matrix |
Definition at line 79 of file Kernel.cpp.
get kernel matrix real
| m | dimension m of matrix | |
| n | dimension n of matrix | |
| target | the kernel matrix |
Definition at line 216 of file Kernel.cpp.
| SHORTREAL * CKernel::get_kernel_matrix_shortreal | ( | int & | m, |
| int & | n, |
| SHORTREAL * | target |
| ) | [virtual, inherited] |
get kernel matrix shortreal
| m | dimension m of matrix | |
| n | dimension n of matrix | |
| target | target for kernel matrix |
Reimplemented in CCustomKernel.
Definition at line 146 of file Kernel.cpp.
| bool CKernel::load | ( | CHAR * | fname | ) | [inherited] |
load the kernel matrix
| fname | filename to load from |
Definition at line 322 of file Kernel.cpp.
| bool CKernel::save | ( | CHAR * | fname | ) | [inherited] |
save kernel matrix
| fname | filename to save to |
Definition at line 327 of file Kernel.cpp.
| CFeatures* CKernel::get_lhs | ( | ) | [inherited] |
| CFeatures* CKernel::get_rhs | ( | ) | [inherited] |
| INT CKernel::get_num_vec_lhs | ( | ) | [inherited] |
| INT CKernel::get_num_vec_rhs | ( | ) | [inherited] |
| bool CKernel::has_features | ( | ) | [inherited] |
| void CKernel::remove_lhs_and_rhs | ( | ) | [virtual, inherited] |
remove lhs and rhs from kernel
Definition at line 358 of file Kernel.cpp.
| void CKernel::set_cache_size | ( | INT | size | ) | [inherited] |
| int CKernel::get_cache_size | ( | ) | [inherited] |
| void CKernel::list_kernel | ( | ) | [inherited] |
list kernel
Definition at line 389 of file Kernel.cpp.
| bool CKernel::has_property | ( | EKernelProperty | p | ) | [inherited] |
| EOptimizationType CKernel::get_optimization_type | ( | ) | [inherited] |
| virtual void CKernel::set_optimization_type | ( | EOptimizationType | t | ) | [virtual, inherited] |
| bool CKernel::get_is_initialized | ( | ) | [inherited] |
| bool CKernel::init_optimization_svm | ( | CSVM * | svm | ) | [inherited] |
initialize optimization
| svm | svm model |
Definition at line 644 of file Kernel.cpp.
| void CKernel::compute_batch | ( | INT | num_vec, |
| INT * | vec_idx, |
| DREAL * | target, |
| INT | num_suppvec, |
| INT * | IDX, |
| DREAL * | alphas, |
| DREAL | factor = 1.0 |
| ) | [virtual, inherited] |
Computes the output for a batch of examples in an optimized fashion (favorable if the kernel supports it, i.e. has KP_BATCHEVALUATION). The outputs for the examples enumerated in vec_idx are added to the output vector target (of length num_vec), so make sure that target is initialized with zeros. The arguments num_suppvec, IDX and alphas are the number of support vectors, their indices and their weights.
Reimplemented in CCombinedKernel, CWeightedDegreePositionStringKernel, and CWeightedDegreeStringKernel.
Definition at line 568 of file Kernel.cpp.
| DREAL CKernel::get_combined_kernel_weight | ( | ) | [inherited] |
| void CKernel::set_combined_kernel_weight | ( | double | nw | ) | [inherited] |
| INT CKernel::get_num_subkernels | ( | ) | [virtual, inherited] |
get number of subkernels
Reimplemented in CCombinedKernel, CWeightedDegreePositionStringKernel, and CWeightedDegreeStringKernel.
Definition at line 583 of file Kernel.cpp.
| void CKernel::compute_by_subkernel | ( | INT | vector_idx, |
| DREAL * | subkernel_contrib |
| ) | [virtual, inherited] |
compute by subkernel
| vector_idx | index | |
| subkernel_contrib | subkernel contribution |
Reimplemented in CCombinedKernel, CWeightedDegreePositionStringKernel, and CWeightedDegreeStringKernel.
Definition at line 588 of file Kernel.cpp.
get subkernel weights
| num_weights | number of weights will be stored here |
Reimplemented in CCombinedKernel, CWeightedDegreePositionStringKernel, and CWeightedDegreeStringKernel.
Definition at line 593 of file Kernel.cpp.
set subkernel weights
| weights | subkernel weights | |
| num_weights | number of weights |
Reimplemented in CCombinedKernel, CWeightedDegreePositionStringKernel, and CWeightedDegreeStringKernel.
Definition at line 599 of file Kernel.cpp.
| bool CKernel::get_precompute_matrix | ( | ) | [inherited] |
| bool CKernel::get_precompute_subkernel_matrix | ( | ) | [inherited] |
| virtual void CKernel::set_precompute_matrix | ( | bool | flag, |
| bool | subkernel_flag |
| ) | [virtual, inherited] |
set precompute matrix
| flag | flag | |
| subkernel_flag | subkernel flag |
Reimplemented in CCombinedKernel.
| void CKernel::set_property | ( | EKernelProperty | p | ) | [protected, inherited] |
| void CKernel::unset_property | ( | EKernelProperty | p | ) | [protected, inherited] |
| void CKernel::set_is_initialized | ( | bool | p_init | ) | [protected, inherited] |
| void CKernel::do_precompute_matrix | ( | ) | [protected, inherited] |
DREAL* CCommUlongStringKernel::sqrtdiag_lhs [protected] |
sqrt diagonal of left-hand side
Definition at line 253 of file CommUlongStringKernel.h.
DREAL* CCommUlongStringKernel::sqrtdiag_rhs [protected] |
sqrt diagonal of right-hand side
Definition at line 255 of file CommUlongStringKernel.h.
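The two sqrt-diagonal members suggest a cosine-style normalization of kernel values. The sketch below assumes that the stored values are sqrt(k(x, x)) for each example and that full normalization divides the raw kernel value by their product; both points are assumptions, and `normalize_full` is a hypothetical helper rather than a member of this class.

```cpp
#include <cassert>

// Cosine-style normalization: k(a, b) / (sqrt(k(a, a)) * sqrt(k(b, b))).
// sqrtdiag_a and sqrtdiag_b stand for precomputed square roots of the
// kernel diagonal on the left-hand and right-hand side (assumed meaning).
double normalize_full(double k_ab, double sqrtdiag_a, double sqrtdiag_b)
{
    assert(sqrtdiag_a > 0.0 && sqrtdiag_b > 0.0);
    return k_ab / (sqrtdiag_a * sqrtdiag_b);
}
```

Under this assumption an example compared with itself always scores 1, and precomputing the square roots once per feature object avoids recomputing the diagonal on every kernel evaluation.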
bool CCommUlongStringKernel::initialized [protected] |
if kernel is initialized
Definition at line 257 of file CommUlongStringKernel.h.
CDynamicArray<ULONG> CCommUlongStringKernel::dictionary [protected] |
dictionary
Definition at line 260 of file CommUlongStringKernel.h.
CDynamicArray<DREAL> CCommUlongStringKernel::dictionary_weights [protected] |
dictionary weights
Definition at line 262 of file CommUlongStringKernel.h.
bool CCommUlongStringKernel::use_sign [protected] |
if sign shall be used
Definition at line 265 of file CommUlongStringKernel.h.
type of normalization
Definition at line 267 of file CommUlongStringKernel.h.
INT CKernel::cache_size [protected, inherited] |
KERNELCACHE_ELEM* CKernel::kernel_matrix [protected, inherited] |
SHORTREAL* CKernel::precomputed_matrix [protected, inherited] |
bool CKernel::precompute_subkernel_matrix [protected, inherited] |
bool CKernel::precompute_matrix [protected, inherited] |
CFeatures* CKernel::lhs [protected, inherited] |
CFeatures* CKernel::rhs [protected, inherited] |
DREAL CKernel::combined_kernel_weight [protected, inherited] |
bool CKernel::optimization_initialized [protected, inherited] |
EOptimizationType CKernel::opt_type [protected, inherited] |
ULONG CKernel::properties [protected, inherited] |
CParallel CSGObject::parallel [static, inherited] |
Definition at line 105 of file SGObject.h.
CIO CSGObject::io [static, inherited] |
Definition at line 106 of file SGObject.h.
CVersion CSGObject::version [static, inherited] |
Definition at line 107 of file SGObject.h.