SLIM for market basket analysis
In this example, we train a SLIM model on a transactional database.
SLIM uses the Minimum Description Length (MDL) principle to select a small set of patterns that, taken together, form a lossless compression of the original data.
You end up with far less to inspect than the raw transactions, which makes the result easier to interpret.
[2]:
import skmine
print("This tutorial was tested with the following version of skmine :", skmine.__version__)
This tutorial was tested with the following version of skmine : 1.0.0
[2]:
import pandas as pd
from skmine.itemsets import SLIM
SLIM can be used to perform Market Basket Analysis.
Here we define a set of transactions, each listing the items bought together in a store.
[3]:
D = [
['bananas', 'milk'],
['milk', 'bananas', 'cookies'],
['cookies', 'butter', 'tea'],
['tea'],
['milk', 'bananas', 'tea'],
]
D
[3]:
[['bananas', 'milk'],
['milk', 'bananas', 'cookies'],
['cookies', 'butter', 'tea'],
['tea'],
['milk', 'bananas', 'tea']]
[4]:
slim = SLIM(pruning=True)
slim.fit_transform(D)
[4]:
| itemset | usage |
---|---|---|
0 | [bananas, milk] | 3 |
1 | [tea] | 3 |
2 | [cookies] | 2 |
3 | [butter] | 1 |
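To see where the usage column comes from, here is a minimal sketch that re-derives it by hand. It assumes a greedy, non-overlapping cover that tries the itemsets in the order the table lists them; the `cover` helper is hypothetical and written for illustration only, it is not part of skmine. The sketch also checks that the itemsets used for each transaction reconstruct it exactly, which is what "lossless" means here.

```python
from collections import Counter

# Transactions from the example above.
D = [
    ['bananas', 'milk'],
    ['milk', 'bananas', 'cookies'],
    ['cookies', 'butter', 'tea'],
    ['tea'],
    ['milk', 'bananas', 'tea'],
]

# Itemsets copied from the table above, in the order shown.
code_table = [
    frozenset({'bananas', 'milk'}),
    frozenset({'tea'}),
    frozenset({'cookies'}),
    frozenset({'butter'}),
]


def cover(transaction, code_table):
    """Hypothetical helper: greedily cover a transaction with disjoint itemsets."""
    remaining = set(transaction)
    used = []
    for itemset in code_table:
        # An itemset is used only if all its items are still uncovered.
        if itemset <= remaining:
            used.append(itemset)
            remaining -= itemset
    return used, remaining


usage = Counter()
for transaction in D:
    used, leftover = cover(transaction, code_table)
    # Lossless: nothing is left uncovered, and the union of the
    # itemsets used gives back the original transaction.
    assert not leftover
    assert set().union(*used) == set(transaction)
    usage.update(used)

for itemset in code_table:
    print(sorted(itemset), usage[itemset])
```

Under these assumptions the counts agree with the usage column: `[bananas, milk]` covers three transactions, so the two items never need to be counted separately.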
What if a new customer comes to the store and buys some items? We add their shopping cart to the data, like so:
[5]:
D.append(['jelly', 'bananas', 'cookies'])
D
[5]:
[['bananas', 'milk'],
['milk', 'bananas', 'cookies'],
['cookies', 'butter', 'tea'],
['tea'],
['milk', 'bananas', 'tea'],
['jelly', 'bananas', 'cookies']]
Retraining SLIM on the extended data gives us an updated summary of our market baskets.
[6]:
SLIM().fit_transform(D)
[6]:
| itemset | usage |
---|---|---|
0 | [bananas, milk] | 3 |
1 | [bananas, jelly] | 1 |
2 | [cookies] | 3 |
3 | [tea] | 3 |
4 | [butter] | 1 |
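The compression claim made in the introduction can be illustrated with a rough calculation. The sketch below assumes each itemset gets a code of length -log2(usage / total usage), and compares the number of bits needed to write the six transactions item by item against the number needed with the itemsets in the table above. The `encoded_size` helper is hypothetical, and the calculation covers only the data side of the description length: a full MDL score would also count the cost of storing the code table itself.

```python
from math import log2


def encoded_size(usages):
    """Hypothetical helper: bits needed when each entry's code is -log2(usage / total) long."""
    total = sum(usages.values())
    return sum(u * -log2(u / total) for u in usages.values() if u > 0)


# Baseline: every item is encoded on its own.
# Counts are the item frequencies in the six transactions above.
singletons = {
    'bananas': 4, 'milk': 3, 'cookies': 3,
    'tea': 3, 'butter': 1, 'jelly': 1,
}

# Usage column copied from the table above.
with_patterns = {
    ('bananas', 'milk'): 3,
    ('bananas', 'jelly'): 1,
    ('cookies',): 3,
    ('tea',): 3,
    ('butter',): 1,
}

baseline_bits = encoded_size(singletons)
pattern_bits = encoded_size(with_patterns)
print(f"singletons only: {baseline_bits:.1f} bits")
print(f"with patterns:   {pattern_bits:.1f} bits")
```

With these numbers the pattern-based encoding comes out shorter than the item-by-item one, which is the kind of saving SLIM searches for when it decides whether to keep an itemset.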