dasthan

Alamut

Archive for the month “September, 2013”

Binary Indexed Trees

Given the frequencies of occurrences of a particular data set, our task is to compute the cumulative frequencies of the data set. Naive Solutions take about O(n) time. But Binary Indexed Trees, which are quite commonly used for computing Cumulative Frequencies efficiently, takes about O(log n).

In case you need a comparison of the asymptotic growth of O(n) and O(log n), take a look at the graph here. This would make huge difference if there are many queries asking for cumulative frequencies.

There is quite a lot of information regarding Binary Indexed Trees out there. Here is my implementation of it.

#ifndef BIT_HPP
#define BIT_HPP
#include<vector>
template<class T>
struct BITNode{
    T freq;
    T cuf;
    BITNode(){
        freq = 0;
        cuf = 0;
    }
};
template<class T>
class BIT{
private:
    typedef std::vector<BITNode<T> > VN;
    void _setElem(int idx,T freq,T cuf){
        elems[idx - 1].freq = freq;
        elems[idx - 1].cuf = cuf;
    }

    BITNode<T> _getElem(unsigned int idx){
        return elems[idx - 1];
    }

    void _propUp(int idx, T change){
        if(change == 0){
            return;
        }

        while(idx <= size){
            BITNode<T> old = _getElem(idx);
            _setElem(idx,old.freq,old.cuf + change);
            idx += (idx & -idx);
        }
    }
public:
    VN elems;
    int size;
    BIT(int size = 0){
        elems.resize(size);
        for(int i = 0; i < size; i++){
            elems[i].freq = 0;
            elems[i].cuf = 0;
        }
         this->size = size;
    }

    void calc(int idx, T val){
        T cuf = val;
        int targetIdx = idx - (idx & -idx);
        int i = idx - 1;
        while(i > targetIdx){
            cuf += _getElem(i).cuf;
            i -= (i & -i);
        }
        _setElem(idx,val,cuf);
    }

    void update(int idx, T val){
        if(idx > size){
            for(int i = size; i < idx; i++){
                BITNode<T> tmp;
                elems.push_back(tmp);
            }
            for(int i = size + 1; i < idx + 1; i++){
                calc(i,0);
            }
            size = idx;
        }
        BITNode<T> old = _getElem(idx);
        T change = val - old.freq;
        _setElem(idx,val,old.cuf);
        _propUp(idx,change);
    }
    T sum(int idx){
        T sum = 0;
        while(idx > 0){
            sum += _getElem(idx).cuf;
            idx -= (idx & -idx);
        }
        return sum;
    }

    void display(){
        for(int i = 0; i < size;i++){
            std::cout << i+1 <<": ("<<elems[i].freq<<", "
            <<elems[i].cuf<<"): "<<sum(i+1)<<std::endl;
        }
    }
};

Latest Updates on this code will be available in my github repo.

A very similar data structure is Segment Tree. An elegant implementation of segment tree can be found here. Segment Trees can also be employed for efficient queries for cumulative frequencies. But BIT is much easier to code.

In case you need to know when to use Segment Trees or Binary Indexed Trees, read Quote of Raziman Thottungal Valapu’s answer to Data Structures: How does one decide when to use a Segment Tree or Fenwick Tree? on Quora.

Post Navigation