Base

Base methods and classes for scikit-mine

class skmine.base.MDLOptimizer

Base interface for all models applying the Minimum Description Length principle.

abstract generate_candidates(*args, **kwargs)

Generate new candidates, to be sent for later evaluation.

Calling this method is analogous to sending a new message under a given encoding scheme, while calling evaluate is analogous to receiving that message and measuring the information gain it provides.

Returns

A set of new candidates

Return type

object or Iterable[object]

abstract evaluate(candidate, *args, **kwargs)

Evaluate the gain of information obtained by accepting the candidate.

Parameters

candidate (object) – A candidate to evaluate

Returns

A tuple whose first two values are the new data size and the new model size that would result from accepting the candidate.

Data size and model size should be returned separately as we encourage usage of (two-part) crude MDL.

Return type

tuple (data_size, model_size, ...)
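The interplay between generate_candidates and evaluate can be illustrated with a toy optimizer. This is a minimal sketch, not skmine code: `MDLOptimizerSketch` is a stand-in abstract class that only mirrors the two documented methods, and `ToyItemOptimizer`, its `optimize` loop, and the bit costs (1 bit for an item in the model, 8 bits otherwise, 8 bits per model entry) are hypothetical choices made for illustration.

```python
from abc import ABC, abstractmethod


class MDLOptimizerSketch(ABC):
    """Stand-in mirroring the documented interface; not the skmine class."""

    @abstractmethod
    def generate_candidates(self, *args, **kwargs):
        ...

    @abstractmethod
    def evaluate(self, candidate, *args, **kwargs):
        ...


class ToyItemOptimizer(MDLOptimizerSketch):
    """Hypothetical optimizer that greedily adds items to a model."""

    def __init__(self, transactions):
        self.transactions = [set(t) for t in transactions]
        self.model = set()  # items currently accepted into the model

    def _sizes(self, model):
        # Assumed toy costs: an item costs 1 bit if it is in the model and
        # 8 bits otherwise; every model entry costs 8 bits.
        data_size = sum(
            1 if item in model else 8
            for t in self.transactions
            for item in t
        )
        model_size = 8 * len(model)
        return data_size, model_size

    def generate_candidates(self):
        # Propose every item not yet in the model, in sorted order.
        items = set().union(*self.transactions)
        return sorted(items - self.model)

    def evaluate(self, candidate):
        # Sizes that would result from accepting the candidate.
        return self._sizes(self.model | {candidate})

    def optimize(self):
        # Accept a candidate only if it shrinks data size + model size.
        best = sum(self._sizes(self.model))
        for candidate in self.generate_candidates():
            data_size, model_size = self.evaluate(candidate)
            if data_size + model_size < best:
                self.model.add(candidate)
                best = data_size + model_size
        return best


optimizer = ToyItemOptimizer([["a", "b"], ["a", "c"], ["a", "b"]])
final_size = optimizer.optimize()
```

Returning the two sizes separately lets the caller decide how to combine them. Here they are simply summed, which is the usual two-part criterion.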

class skmine.base.BaseMiner

Base class for all miners in scikit-mine.

abstract fit(D, y)

Fit method to be implemented.

abstract discover(*args, **kwargs)

discover method to be implemented.

get_params(deep=False)

Get parameters for this estimator.

Returns

params – Parameter names mapped to their values.

Return type

mapping of string to any

set_params(**params)

Set the parameters of this estimator. The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters

**params (dict) – Estimator parameters.

Returns

self – Estimator instance.

Return type

object
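The `<component>__<parameter>` convention can be sketched as follows. This is an illustrative stand-in, not the skmine implementation: `ParamsSketch`, `ToyMiner`, and `ToyWrapper` are hypothetical, and the sketch assumes that parameters are the constructor's keyword arguments, each stored under an attribute of the same name. Unlike a production implementation, it does not validate parameter names.

```python
import inspect


class ParamsSketch:
    """Stand-in showing the get_params / set_params convention."""

    def get_params(self, deep=False):
        # Read parameter names from the constructor signature.
        names = list(inspect.signature(self.__init__).parameters)
        params = {name: getattr(self, name) for name in names}
        if deep:
            # Expose parameters of nested components as <component>__<parameter>.
            for name, value in list(params.items()):
                if hasattr(value, "get_params"):
                    for sub, sub_value in value.get_params(deep=True).items():
                        params[f"{name}__{sub}"] = sub_value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, _, sub = key.partition("__")
            if sub:
                # Delegate to the nested component.
                getattr(self, name).set_params(**{sub: value})
            else:
                setattr(self, name, value)
        return self


class ToyMiner(ParamsSketch):
    def __init__(self, min_supp=2, max_len=3):
        self.min_supp = min_supp
        self.max_len = max_len


class ToyWrapper(ParamsSketch):
    def __init__(self, miner=None, verbose=False):
        self.miner = miner
        self.verbose = verbose


wrapper = ToyWrapper(miner=ToyMiner())
# Update a top-level parameter and a nested one in a single call.
wrapper.set_params(verbose=True, miner__min_supp=5)
```

Because set_params returns the instance itself, calls can be chained, for example `ToyMiner().set_params(min_supp=4).get_params()`.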

class skmine.base.DiscovererMixin

Mixin for all pattern discovery models in scikit-mine

fit_discover(D, y=None, **kwargs)

Fit to data, then extract patterns

Parameters

D ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) –

Returns

patterns discovered by a mining algorithm

Return type

pd.Series
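The fit-then-discover behaviour can be sketched with a stand-in mixin. This is an assumption-laden illustration rather than the library's code: `DiscovererMixinSketch` and `ToyItemCounter` are hypothetical, the sketch assumes that fit returns the fitted instance, and it returns a plain dict where the documented method returns a pd.Series, so that the example needs only the standard library.

```python
from collections import Counter


class DiscovererMixinSketch:
    """Stand-in for the documented mixin; assumes fit returns self."""

    def fit_discover(self, D, y=None, **kwargs):
        # Fit first, then forward the remaining keyword arguments to discover.
        return self.fit(D, y).discover(**kwargs)


class ToyItemCounter(DiscovererMixinSketch):
    """Hypothetical miner reporting items found in at least min_supp rows."""

    def __init__(self, min_supp=2):
        self.min_supp = min_supp

    def fit(self, D, y=None):
        # Count each item once per row.
        self.counts_ = Counter(item for row in D for item in set(row))
        return self

    def discover(self):
        # Keep only the items that meet the support threshold.
        return {
            item: count
            for item, count in sorted(self.counts_.items())
            if count >= self.min_supp
        }


D = [["a", "b"], ["a", "c"], ["a", "b"]]
patterns = ToyItemCounter(min_supp=2).fit_discover(D)
```

Here `patterns` maps each frequent item to its support, so `fit_discover(D)` gives the same result as calling `fit(D)` followed by `discover()`.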