SLIM for market basket analysis
In this example, we train a SLIM model on a transactional database.
SLIM uses the Minimum Description Length (MDL) principle to make pattern mining easier: the resulting patterns form a lossless compression of the original data.
You end up with less data to consider, which makes the analysis much easier.
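To make the compression idea more concrete, here is a minimal sketch of how usage counts can be turned into code lengths. It assumes Shannon-optimal codes, a common choice in MDL-based itemset mining; the usage counts and the `code_lengths` helper are illustrative and are not part of the skmine API.

```python
from math import log2

# Illustrative usage counts: how many times each itemset is used
# to encode the transactions of a small database.
usages = {
    ("bananas", "milk"): 3,
    ("tea",): 3,
    ("cookies",): 2,
    ("butter",): 1,
}

def code_lengths(usages):
    """Return the Shannon-optimal code length (in bits) of each itemset."""
    total = sum(usages.values())
    return {itemset: -log2(u / total) for itemset, u in usages.items()}

lengths = code_lengths(usages)

# Size of the encoded database: each itemset costs its code length
# every time it is used.
encoded_size = sum(usages[itemset] * lengths[itemset] for itemset in usages)

for itemset, bits in lengths.items():
    print(itemset, round(bits, 3))
print("encoded size in bits:", round(encoded_size, 3))
```

Frequently used itemsets get short codes and rarely used ones get long codes, so a set of patterns that captures the regularities in the data leads to a smaller encoded size.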
[2]:
import skmine
print("This tutorial was tested with the following version of skmine :", skmine.__version__)
This tutorial was tested with the following version of skmine : 1.0.0
[2]:
import pandas as pd
from skmine.itemsets import SLIM
SLIM can be used to perform Market Basket Analysis.
Here we define a set of transactions, each containing the items bought in a store.
[3]:
D = [
    ['bananas', 'milk'],
    ['milk', 'bananas', 'cookies'],
    ['cookies', 'butter', 'tea'],
    ['tea'],
    ['milk', 'bananas', 'tea'],
]
D
[3]:
[['bananas', 'milk'],
 ['milk', 'bananas', 'cookies'],
 ['cookies', 'butter', 'tea'],
 ['tea'],
 ['milk', 'bananas', 'tea']]
[4]:
slim = SLIM(pruning=True)
slim.fit_transform(D)
[4]:
| | itemset | usage |
|---|---|---|
| 0 | [bananas, milk] | 3 |
| 1 | [tea] | 3 |
| 2 | [cookies] | 2 |
| 3 | [butter] | 1 |
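The usage column can be checked by hand. The sketch below is a simplified greedy cover written for illustration, not skmine's actual implementation: it walks through the itemsets in the order shown in the table and uses each one that fits entirely in what is left of the transaction. The `cover` helper and the `code_table` list are assumptions made for this example.

```python
from collections import Counter

def cover(transaction, code_table):
    """Greedily cover a transaction with non-overlapping itemsets."""
    remaining = set(transaction)
    used = []
    for itemset in code_table:
        if set(itemset) <= remaining:
            used.append(itemset)
            remaining -= set(itemset)
    return used, remaining

D = [
    ['bananas', 'milk'],
    ['milk', 'bananas', 'cookies'],
    ['cookies', 'butter', 'tea'],
    ['tea'],
    ['milk', 'bananas', 'tea'],
]

# Itemsets copied from the table above, in the same order
code_table = [("bananas", "milk"), ("tea",), ("cookies",), ("butter",)]

usage = Counter()
for transaction in D:
    used, leftover = cover(transaction, code_table)
    assert not leftover  # every item is covered, so no information is lost
    usage.update(used)

print(dict(usage))
```

Because every transaction is fully covered by itemsets from the table, the original data can be rebuilt from the cover, which is what makes the compression lossless.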
What if a new customer comes to the store and buys some items? We add their shopping cart to the data, like so:
[5]:
D.append(['jelly', 'bananas', 'cookies'])
D
[5]:
[['bananas', 'milk'],
 ['milk', 'bananas', 'cookies'],
 ['cookies', 'butter', 'tea'],
 ['tea'],
 ['milk', 'bananas', 'tea'],
 ['jelly', 'bananas', 'cookies']]
Simply retraining SLIM gives us an updated summary of our market baskets.
[6]:
SLIM().fit_transform(D)
[6]:
| | itemset | usage |
|---|---|---|
| 0 | [bananas, milk] | 3 |
| 1 | [bananas, jelly] | 1 |
| 2 | [cookies] | 3 |
| 3 | [tea] | 3 |
| 4 | [butter] | 1 |
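To see what the new shopping cart changed, the two summaries can be compared directly. This is a small plain-Python sketch: the `before` and `after` dictionaries are copied by hand from the two outputs in this tutorial rather than read from the fitted models.

```python
# Usage counts copied from the two summaries in this tutorial
before = {
    ("bananas", "milk"): 3,
    ("tea",): 3,
    ("cookies",): 2,
    ("butter",): 1,
}
after = {
    ("bananas", "milk"): 3,
    ("bananas", "jelly"): 1,
    ("cookies",): 3,
    ("tea",): 3,
    ("butter",): 1,
}

# Keep only the itemsets whose usage differs between the two summaries
changes = {
    itemset: after.get(itemset, 0) - before.get(itemset, 0)
    for itemset in sorted(set(before) | set(after))
    if after.get(itemset, 0) != before.get(itemset, 0)
}

for itemset, delta in changes.items():
    print(itemset, f"{delta:+d}")
```

Only two entries differ: the new itemset [bananas, jelly] appears with a usage of 1, and [cookies] is used one more time, which together account for the added transaction.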