Machine Learning in Swift – Simple recommendation algorithm for Fitbit

Hi, this post is not a typical Deep Learning approach to a problem. Instead, it shows that simple data analysis is sometimes enough to write a simple yet effective algorithm. We will build a dynamic list of options driven by user behavior. The main idea is to recommend the option that the user is most likely to select at a particular moment. Sounds confusing? Let me explain with an example. If you use apps like Fitbit or Endomondo to track your activity, you probably know how annoying it is to pick the right activity from the list. The two apps take different approaches and, in my opinion, both are far from perfect.

Fitbit allows users to change the order of the activities list. As a result, you will most likely put the activity you use the most at the top. The problem is that the list is constant. If I ride a bike a lot, I will probably put this option first. However, when I go for a run on a day without a bike ride, I need extra taps to find it. The same happens with a gym workout. Endomondo takes another approach. As far as I can tell, it always suggests the most recently selected activity. A slightly better idea, but still not perfect.

If I usually ride a bike after work, this option should be first when I open my activity tracking app on Monday afternoon. If I usually go for a run on Saturday morning, my app should show running as the top option when I open it on Saturday after breakfast. You will probably agree that this would be cool, right?

I started by analyzing my own records to find out whether there are any patterns (I knew there would be). Here are the results from the last 2 weeks:

# bike: Monday  8:35 AM, Monday 3:30 PM, Wednesday 8:16 AM, Wednesday 3:30 PM, Friday 8:27 AM, Friday 3:53 PM, Monday 8:43 AM, Monday 4:12 PM, Wednesday 7:51 AM, Wednesday 3:15 PM, Friday 8:02 AM, Friday 3:30 PM

# running: Tuesday 3:48 PM, Thursday 2:38 PM, Saturday 11:30 AM, Tuesday 4:03 PM, Thursday 3:56 PM, Saturday 12:07 PM

Ok, looks like I have a pretty boring schedule that is easy to predict. I also talked to a couple of friends, and their activities mostly follow a schedule too. That convinced me that a simple algorithm for suggesting the most likely activity is feasible. Let’s reformat the data a bit so it is easier to use in code.

1. Time is the number of minutes since midnight divided by 100, which keeps it on a scale similar to the day of the week. (example: 8:00 AM -> 480 minutes / 100 -> 4.8)

2. Day of week (1-7, where Monday is 1)

3. Activity 0-6, the index in the Fitbit list (run, weights, treadmill, workout, elliptical, bike, interval workout)
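The first two conversions can be sketched in a few lines. This is only an illustration: `encodeTime`, `encodeDay` and `encode(date:calendar:)` are hypothetical helper names, not part of the code later in this post, and the sketch assumes a Gregorian calendar, where `Calendar` numbers weekdays from 1 = Sunday.

```swift
import Foundation

// Hypothetical helper: minutes since midnight divided by 100,
// e.g. 8:35 AM -> 515 minutes -> 5.15.
func encodeTime(hour: Int, minute: Int) -> Float {
    return Float(hour * 60 + minute) / 100
}

// Hypothetical helper: maps the Gregorian Calendar weekday (1 = Sunday ... 7 = Saturday)
// to the numbering used in this post (1 = Monday ... 7 = Sunday).
func encodeDay(calendarWeekday: Int) -> Int {
    return (calendarWeekday + 5) % 7 + 1
}

// Encodes a Date as the (time, day) pair described in the list above.
// Assumption: a Gregorian calendar in whatever time zone the caller configures.
func encode(date: Date,
            calendar: Calendar = Calendar(identifier: .gregorian)) -> (time: Float, day: Int) {
    let parts = calendar.dateComponents([.hour, .minute, .weekday], from: date)
    let time = encodeTime(hour: parts.hour ?? 0, minute: parts.minute ?? 0)
    let day = encodeDay(calendarWeekday: parts.weekday ?? 1)
    return (time, day)
}

// Usage: encode the current moment, e.g. right when the user opens the app.
let current = encode(date: Date())
print(current.time, current.day)
```

The activity index needs no conversion; it is simply the position of the activity in the list.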

After this conversion the data looks like this:

[time, day of week, activity]: [5.15, 1, 5], [9.30, 1, 5], [4.96, 3, 5], [9.30, 3, 5], [5.07, 5, 5], [9.53, 5, 5], [5.23, 1, 5], [9.72, 1, 5], [4.71, 3, 5], [9.15, 3, 5], [4.82, 5, 5], [9.30, 5, 5], [9.48, 2, 0], [8.78, 4, 0], [7.27, 6, 0], [9.63, 2, 0], [9.54, 4, 0], [6.90, 6, 0]

Let’s see that on a plot:

Ok, now it’s easy to see some patterns: blue dots represent bike rides and green triangles represent runs. Now we have to find a good recommendation algorithm. A Deep Neural Network? Honestly, that was one of my first thoughts. But we don’t have enough data, and that choice would cause a lot of problems in this particular case. Random Forest? Sounds better, but it still seems too sophisticated for such a simple case. Machine Learning doesn’t have to stick to the popular solutions, so we can try a different approach. What I noticed is that similar events are close to each other on the plot (same day, similar time). So it makes sense to simply collect the events around the current day and time and check which activity is the most frequent among them. Keep in mind that one cycle (one week) has to be spent gathering data for the algorithm. After those 7 days we have enough data to start suggesting the right option in most cases. Ok, enough talking, let’s dig into the Swift code:

import Foundation

let data : [[NSNumber]] = [ [5.15, 1, 5],
                            [9.30, 1, 5],
                            [4.96, 3, 5],
                            [9.30, 3, 5],
                            [5.07, 5, 5],
                            [9.53, 5, 5],
                            [5.23, 1, 5],
                            [9.72, 1, 5],
                            [4.71, 3, 5],
                            [9.15, 3, 5],
                            [4.82, 5, 5],
                            [9.30, 5, 5],
                            [9.48, 2, 0],
                            [8.78, 4, 0],
                            [7.27, 6, 0],
                            [9.63, 2, 0],
                            [9.54, 4, 0],
                            [6.90, 6, 0] ]

enum EventType: Int {
    case run
    case weights
    case treadmill
    case workout
    case elliptical
    case bike
    case interval
}

struct Event {
    var time: Float
    var day: Int
    var type: EventType
    
    init(params: [NSNumber]) {
        self.time = params[0].floatValue
        self.day = params[1].intValue
        self.type = EventType(rawValue: params[2].intValue)!
    }
}

let events = data.map { (array) -> Event in
    return Event(params: array)
}

func suggest(time: Float, day: Int) -> EventType? {
    
    let timeRange: Float = 1.2 // 1.2 -> 120 mins -> 2h
    // keep only events from the same day that fall within timeRange of the requested time
    let closeEvents = events.filter { event in
        event.day == day && event.time >= time - timeRange && event.time <= time + timeRange
    }
    
    guard closeEvents.count > 0 else {
        return nil // no event type to suggest
    }

    var eventsCount = [Int](repeating: 0, count: 7) //there are 7 event types
    for event in closeEvents {
        eventsCount[event.type.rawValue] += 1 //increase counter for particular event type
    }
    
    // find the index of the most frequent event type (ties go to the lowest index)
    // and use it to init the EventType we return
    return EventType(rawValue: eventsCount.firstIndex(of: eventsCount.max()!)!)
}

suggest(time: 5.08, day: 1) //bike
suggest(time: 10.08, day: 2) //run
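`suggest` returns a single activity, while the goal stated at the beginning is a dynamic list. Below is a minimal sketch of one way to order the whole list using the same counting rule. It is a standalone variant, so `Activity`, `Record` and `rankedActivities` are hypothetical names chosen to avoid clashing with the code above, and the tie-breaking rule (keep the fixed list order) is my assumption.

```swift
import Foundation

// Hypothetical standalone copy of the activity list used in this post.
enum Activity: Int, CaseIterable {
    case run, weights, treadmill, workout, elliptical, bike, interval
}

struct Record {
    let time: Float        // minutes since midnight / 100
    let day: Int           // 1 = Monday ... 7 = Sunday
    let activity: Activity
}

// Returns every activity, ordered by how often it was recorded on the same day
// within timeRange of the given time. Ties keep the fixed list order.
func rankedActivities(records: [Record], time: Float, day: Int,
                      timeRange: Float = 1.2) -> [Activity] {
    var counts = [Activity: Int]()
    for record in records where record.day == day && abs(record.time - time) <= timeRange {
        counts[record.activity, default: 0] += 1
    }
    // Sort by count (descending), breaking ties by position in the fixed list.
    return Activity.allCases.sorted { a, b in
        let (countA, countB) = (counts[a] ?? 0, counts[b] ?? 0)
        return countA != countB ? countA > countB : a.rawValue < b.rawValue
    }
}

// Usage with three records taken from the data above.
let sample = [
    Record(time: 5.15, day: 1, activity: .bike),
    Record(time: 5.23, day: 1, activity: .bike),
    Record(time: 9.48, day: 2, activity: .run),
]
let ranked = rankedActivities(records: sample, time: 5.08, day: 1)
print(ranked.first!) // bike
```

When there is no history around the given moment, every count is zero and the list simply falls back to its fixed order.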

The code is pretty simple and definitely neither secure nor efficient; I am aware of that. The main idea was to write a proof of concept and validate it. I tested the algorithm on a couple of previous weeks of my Fitbit records and it worked perfectly. But I’m aware that my activity schedule is pretty predictable. Anyway, I hope this post convinced you not to always start with a sophisticated solution; sometimes a good solution doesn’t demand tons of computing power.
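For anyone who wants to repeat this kind of check on their own records, here is a rough sketch of a hold-out test: one part of the history feeds the algorithm and the rest is used to score it. `Sample`, `predict` and `accuracy` are hypothetical names, and splitting the data above into the first and second half of each activity's records is my assumption about how the array is ordered, not necessarily how the test was originally run.

```swift
import Foundation

// Hypothetical standalone type for this sketch.
struct Sample {
    let time: Float     // minutes since midnight / 100
    let day: Int        // 1 = Monday ... 7 = Sunday
    let activity: Int   // index in the activity list (0 = run, 5 = bike)
}

// Same voting rule as in the post: the most frequent activity recorded
// on the same day within timeRange of the given time.
func predict(from history: [Sample], time: Float, day: Int, timeRange: Float = 1.2) -> Int? {
    var counts = [Int: Int]()
    for s in history where s.day == day && abs(s.time - time) <= timeRange {
        counts[s.activity, default: 0] += 1
    }
    // Highest count wins; ties go to the lower activity index.
    return counts.max { a, b in a.value != b.value ? a.value < b.value : a.key > b.key }?.key
}

// Fraction of held-out samples whose activity is predicted correctly.
func accuracy(history: [Sample], heldOut: [Sample]) -> Double {
    guard !heldOut.isEmpty else { return 0 }
    let hits = heldOut.filter { predict(from: history, time: $0.time, day: $0.day) == $0.activity }
    return Double(hits.count) / Double(heldOut.count)
}

// First half of each activity's records from the data above.
let firstHalf = [
    Sample(time: 5.15, day: 1, activity: 5), Sample(time: 9.30, day: 1, activity: 5),
    Sample(time: 4.96, day: 3, activity: 5), Sample(time: 9.30, day: 3, activity: 5),
    Sample(time: 5.07, day: 5, activity: 5), Sample(time: 9.53, day: 5, activity: 5),
    Sample(time: 9.48, day: 2, activity: 0), Sample(time: 8.78, day: 4, activity: 0),
    Sample(time: 7.27, day: 6, activity: 0),
]
// Second half, held out for scoring.
let secondHalf = [
    Sample(time: 5.23, day: 1, activity: 5), Sample(time: 9.72, day: 1, activity: 5),
    Sample(time: 4.71, day: 3, activity: 5), Sample(time: 9.15, day: 3, activity: 5),
    Sample(time: 4.82, day: 5, activity: 5), Sample(time: 9.30, day: 5, activity: 5),
    Sample(time: 9.63, day: 2, activity: 0), Sample(time: 9.54, day: 4, activity: 0),
    Sample(time: 6.90, day: 6, activity: 0),
]
print(accuracy(history: firstHalf, heldOut: secondHalf))
```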
