SLIM for market basket analysis

In this example, we are going to train a SLIM model on a transactional database.

SLIM uses the Minimum Description Length (MDL) principle to make pattern mining easier: the resulting patterns form a lossless compression of the original data.

You end up with less data to consider, and your life just gets easier.

import skmine

print("This tutorial was tested with the following version of skmine :", skmine.__version__)
This tutorial was tested with the following version of skmine : 1.0.0
import pandas as pd
from skmine.itemsets import SLIM

SLIM can be used to perform Market Basket Analysis.

Here we define a set of transactions, each containing the items bought together in a store.

D = [
    ['bananas', 'milk'],
    ['milk', 'bananas', 'cookies'],
    ['cookies', 'butter', 'tea'],
    ['milk', 'bananas', 'tea'],
]
D
[['bananas', 'milk'],
 ['milk', 'bananas', 'cookies'],
 ['cookies', 'butter', 'tea'],
 ['milk', 'bananas', 'tea']]
slim = SLIM(pruning=True)
slim.fit(D)  # learn the itemsets that best compress the transactions
           itemset  usage
0  [bananas, milk]      3
1            [tea]      3
2        [cookies]      2
3         [butter]      1
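
To get an intuition for why grouping bananas and milk pays off, here is a rough sketch of the compression idea. It is not SLIM's actual procedure: SLIM measures description length in bits and also accounts for the cost of the code table itself, whereas this toy version only counts how many codes are needed to write down every transaction. The helpers `cover` and `total_codes` are hypothetical names introduced for illustration and are not part of skmine.

```python
def cover(transaction, code_table):
    """Greedily cover a transaction with non-overlapping itemsets.

    `code_table` is assumed to be an ordered list of frozensets,
    with the itemsets we prefer to use listed first.
    """
    remaining = set(transaction)
    used = []
    for itemset in code_table:
        if itemset <= remaining:
            used.append(itemset)
            remaining -= itemset
    # Any item left uncovered is written with its own singleton code.
    used.extend(frozenset([item]) for item in sorted(remaining))
    return used


def total_codes(transactions, code_table):
    """Number of codes needed to describe all transactions."""
    return sum(len(cover(t, code_table)) for t in transactions)


transactions = [
    ['bananas', 'milk'],
    ['milk', 'bananas', 'cookies'],
    ['cookies', 'butter', 'tea'],
    ['milk', 'bananas', 'tea'],
]

singletons_only = []                                # every item gets its own code
with_pattern = [frozenset({'bananas', 'milk'})]     # one extra code for the pair

print(total_codes(transactions, singletons_only))   # 11
print(total_codes(transactions, with_pattern))      # 8
```

With the pair treated as a single code, the same four baskets are described with 8 codes instead of 11, and nothing is lost: each basket can be rebuilt exactly from its codes.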

What if a new customer comes to the store and buys some items? We add their shopping cart to the data, like so:

D.append(['jelly', 'bananas', 'cookies'])
D
[['bananas', 'milk'],
 ['milk', 'bananas', 'cookies'],
 ['cookies', 'butter', 'tea'],
 ['milk', 'bananas', 'tea'],
 ['jelly', 'bananas', 'cookies']]
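
One quick way to see that the earlier summary no longer describes the data is to look for items it has never encountered. The snippet below is a plain-Python check, not a skmine feature; `known_items` is a hypothetical name, filled in by hand from the itemsets in the first summary.

```python
# Items that appear in the first summary (copied by hand from the table).
known_items = {'bananas', 'milk', 'tea', 'cookies', 'butter'}

new_cart = ['jelly', 'bananas', 'cookies']

# Items in the new cart that the first summary has no itemset for.
unseen = [item for item in new_cart if item not in known_items]
print(unseen)  # ['jelly']
```

Since jelly has no code in the first summary, the model has to be trained again on the extended data.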

Simply retraining SLIM on the extended data gives us an updated summary of our market baskets.

            itemset  usage
0   [bananas, milk]      3
1  [bananas, jelly]      1
2         [cookies]      3
3             [tea]      3
4          [butter]      1