Fenwick tree

Fenwick tree
Binary indexed tree
Fenwick tree; Binary indexed tree
	Creation of a binary indexed tree for the array [1, 2, 3, 4, 5] by elementwise insertion
Type	tree
Invented	1989
Invented by	Boris Ryabko
Time complexity in big O notation
Algorithm
Algorithm	Average
Space	O(n)
Search	O(logn)
Insert	O(logn)
Delete	O(n)

A Fenwick tree or binary indexed tree (BIT) is a data structure that can efficiently update elements and calculate prefix sums in a table of numbers.

This structure was proposed by Boris Ryabko in 1989[1] with a further modification published in 1992.[2] It has subsequently become known under the name Fenwick tree after Peter Fenwick, who described this structure in his 1994 article.[3]

When compared with a flat array of numbers, the Fenwick tree achieves a much better balance between two operations: element update and prefix sum calculation. A flat array of $n$ numbers can either store the elements or the prefix sums. In the first case, computing prefix sums requires linear time; in the second case, updating the array elements requires linear time (in both cases, the other operation can be performed in constant time). Fenwick trees allow both operations to be performed in $O(\log n)$ time. This is achieved by representing the numbers as a tree with $n+1$ nodes where the value of each node in the tree is the prefix sum of the array from the index of the parent (inclusive) up to the index of the node (exclusive). The tree itself is implicit and can be stored as an array of $n$ numbers, with the implicit root node omitted from the array. The tree structure allows the operations of element retrieval, element update, prefix sum, and range sum to be performed using only $O(\log n)$ node accesses.

Motivation

Given a table of elements, it is sometimes desirable to calculate the running total of values up to each index according to some associative binary operation (addition on integers being by far the most common). Fenwick trees provide a method to query the running total at any index, in addition to allowing changes to the underlying value table and having all further queries reflect those changes.

Fenwick trees are particularly designed to implement the arithmetic coding algorithm, which maintains counts of each symbol produced and needs to convert those to the cumulative probability of a symbol less than a given symbol. Development of operations it supports were primarily motivated by use in that case.

Using a Fenwick tree it requires only $O(\log n)$ operations to compute any desired cumulative sum, or more generally the sum of any range of values (not necessarily starting at zero). It is also possible to construct extensions of this data structure to quickly compute cumulative sums on $d$ -dimensional arrays in $O(\log ^{d}n)$ time.[4]

Description

A Fenwick tree is most easily understood by considering a one-based array $A[n]$ with $n$ elements. The corresponding Fenwick tree has $n+1$ nodes with an implicit node 0 at the root. Each level $k$ of the tree contains nodes with indices corresponding to sums of $k$ distinct powers of 2 (with $k=0$ representing an empty sum 0). For example, level $k=1$ contains nodes $1=2^{0},2=2^{1},4=2^{2},...$ and level $k=2$ contains nodes $3=2^{1}+2^{0},5=2^{2}+2^{0},6=2^{2}+2^{1},...$ The parent of a given node can be found by clearing the last set bit (LSB) in its index, corresponding to the smallest power of 2 in its sum. For example, the parent of 6 = 110₂ is 4 = 100₂ .

The below diagram shows the structure of a 16-node Fenwick tree, corresponding to a 15-element array A:

Depiction of a 16-node Fenwick tree containing range sums of a 15-node array A

Let $A(i,j]=\{A[k]\}_{k=i+1}^{j}$ . The value of a node at index $i$ corresponds to the range sum of elements in $A(\mathrm {parent} (i),i]$ , that is, the values of A starting after the parent's index up to the current node's index, inclusive. The elements $A(\mathrm {parent} (i),i]$ are considered to be the "range of responsibility" for the current node[3] and consist of $\mathrm {lsb} (i)=(i\ \&\ (-i))$ (where & denotes bitwise AND) elements. Note that the indices in this range do not directly correspond to children of $i$ : for example, the range of responsibility for node 2 is $A(0,2]=\{A[1],A[2]\}$ but node 1 is not a child of node 2. The root node 0 contains the sum of the empty range $A(0,0]=\{\}$ with value 0.

Typically, the Fenwick tree is implemented as an implicit data structure using a flat array analogous to implementations of a binary heap. In this representation, the root node 0 is omitted and the array indices directly correspond to node indices in the tree (using 1-based indexing).

The initial process of building the Fenwick tree over a table of values runs in $O(n)$ time. Other efficient operations include locating the index of a value if all values are positive, or all indices with a given value if all values are non-negative. Also supported is the scaling of all values by a constant factor in $O(n)$ time.

Range query

To find the prefix sum up to any given index (the "range query" operation), add up the values of nodes along the path from the current index to the root of the tree (which is trivially an empty sum, 0). The number of values to add (excluding the implicit root) is equal to the number of 1-bits in the index, and is at most $\lceil \log _{2}n\rceil$ , giving a time complexity of $O(\log n)$ .

For example, say one wishes to find the sum of the first eleven values. Eleven is 1011₂ in binary. This contains three 1-bits, so three node values must be added: those at nodes 11=1011₂, 10=1010₂, and 8=1000₂ (the implicit root node of 0000₂ can be ignored). These contain the sums of A[11], A[9,10], and A[1,8], respectively (where A[i,j] = {A[i], A[i+1], ..., A[j]}) .

Point update

To update the value in a Fenwick tree at index $i$ (corresponding to assigning $A[i]$ a new value) (the "point update" operation):

Compute the delta ${\textstyle \delta =A[i]_{\mathrm {new} }-A[i]_{\mathrm {old} }}$ .
Then, while index $i\leq n$ $i\leq n$ :
- Update the index by adding LSB: $i\leftarrow i+(i\ \&\ (-i))$
- Add $\delta$ to the value at node $i$ .

Intuitively, this can be thought of as updating each node (starting with $i$ and iterating in increasing order) whose range of responsibility includes $A[i]$ .

For example, updating value 11=1011₂ in a 16-element array requires updating 1011₂, (1011₂ + 1₂ = 1100₂), and (1100₂ + 100₂ = 10000₂). These contain the sums of A[11], A[9,12], and A[1,16], respectively. Again, the maximum number of nodes that need to be updated is limited by the number of bits in the size of the array, $\lceil \log _{2}n\rceil$ , and the time complexity is thus $O(\log n)$ .

Building the tree

A naive way to construct the tree would be to initialize the tree values to 0 and perform $n$ point updates, yielding a time complexity of $O(n\log n)$ . However, an alternate approach uses dynamic programming to build the tree up incrementally along increasing indices. Initialization proceeds as follows:

Copy array $A[n]$ into array $T[n]$ containing tree values
For each index i from 1 to n:
- Let $j=i+(i\ \&\ (-i))$ . This is the first node larger than $i$ that contains $A[i]$ in its range of responsibility.
- If $j\leq n$ , update value of node at j: $T[j]\leftarrow T[j]+T[i]$

It can be shown that at the point that index $i$ is reached in the loop, $T[i]$ is already the correct Fenwick value for node $i$ . If $i$ is a leaf node (with only one element in its range of responsibility) this holds trivially. If $i$ is an internal node, the loop ensures that it has been updated with all range sums within its own range, so it will contain the correct Fenwick value. For example, at step $i=4$ , node 4 (with initial value $A[4]$ ) will have been incremented by $T[2]$ (which contains the sum $A[1]+A[2]$ ) and $T[3]$ (which contains the value $A[3]$ ), and thus will contain the range sum $A(0,4]$ of the first 4 elements of the tree.

Since each update happens at most once per index, the resulting algorithm is $O(n)$ time complexity. This can also be done in-place in the original array, eliminating the copy and additional storage for $T[n]$ .

An analogous operation can be performed to convert a Fenwick tree back into the original frequency array $A[n]$ , by iterating backwards from n to 1 and subtracting the value of $j=i+(i\ \&\ (-i))$ from the value of $i$ at each timestep.

A less efficient $O(n)$ algorithm for constructing the tree works in two passes: first convert $A[n]$ to a prefix sum (which takes $O(n)$ time), then iterating backwards from n..1, compute each node's range sum by computing the difference of prefix sums: $\Sigma A(\mathrm {parent} (i),i]=\Sigma A(0,i]-\Sigma A(0,\mathrm {parent} (i)]$ .[5]

Extension to multiple dimensions

A 2D Fenwick tree (2D BIT) can be constructed as a 1D Fenwick tree where each node of the tree contains a 1D Fenwick tree,[4] and stores range sums for submatrices of a 2D matrix A[m,n]. The tree representation remains implicit and the data is stored in an m x n array where position (i, j) in the array corresponds to the jth node of the ith Fenwick tree.

Point update and range query are implemented similarly to the 1D case, but using nested loops (iterating along the column dimension (sub-tree) in the inner loop, and iterating along the row dimension (main tree) in the outer loop).

This idea can be extended to tensors with arbitrary number of dimensions d, with time complexity $O(\log ^{d}n)$ for range-query and point-update (assuming the tensor is size n along each axis).

Implementation

Basic implementation in C

// SIZE should be 1 + a power of 2.
int A[SIZE];

// Least Significant Bit of i having a value of 1
#define LSB(i) ((i) & -(i))

// Returns the sum of the first i elements (indices 0 to i)
// Equivalent to range_sum(0, i)
int prefix_sum(int i) {
	int sum = A[0];
	for (; i != 0; i -= LSB(i))
		sum += A[i];
	return sum;
}

// Add delta to element with index i (zero-based)
void add(int i, int delta) {
	if (i == 0) {
		A[0] += delta;
		return;
	}
	for (; i < SIZE; i+= LSB(i))
		A[i] += delta;
}

Useful functions in C

// Returns the sum of elements from i + 1 to j
// Equivalent to prefix_sum(j) - prefix_sum(i), but slightly faster
int range_sum(int i, int j) {
	int sum = 0;
	for (; j > i; j -= LSB(j))
		sum += A[j];
	for (; i > j; i -= LSB(i))
		sum -= A[i];
	return sum;
}

// Convert A[] in place to Fenwick tree form
void init(void) {
	for (int i = 1; i < SIZE; ++i) {
		int j = i + LSB(i);
		if (j < SIZE)
			A[j] += A[i];
	}
}

// Convert back to array of per-element counts
void fini(void) {
	for (int i = SIZE - 1; i > 0; --i) {
		int j = i + LSB(i);
		if (j < SIZE)
			A[j] -= A[i];
	}
}

// Return a single element's value
int get(int i) {
	return range_sum(i, i + 1);
}

// Set (as opposed to adjust) a single element's value
void set(int i, int value) {
	add(i, value - get(i));
}

// Find the largest i with prefix_sum(i) <= value.
// NOTE: Requires that all values are non-negative!
unsigned int rank_query(int value) {
	int i = 0, j = SIZE - 1;
	// j is a power of 2.

    value -= A[0];
	for (; j > 0;  j >>= 1) {
		if (i + j < SIZE && A[i + j] <= value) {
			value -= A[i + j];
			i += j;
		}
	}
	return i;
}

Implementation in C++

class FenwickTree {
private:
    vector<int> data;

    int getParent(int i) const {
        return i - (i & (-i));
    }

    int getNext(int i) const {
        return i + (i & (-i));
    }

public:
    FenwickTree(int n) : data(n+1, 0) {
    }

    int getSum(int i) const {
        int sum = 0;
        ++i;
        while (i > 0) {
            sum += data[i];
            i = getParent(i);
        }
        return sum;
    }

    void update(int i, int v) {
        ++i;
        while (i < data.size()) {
            data[i] += v;
            i = getNext(i);
        }
    }
};

Applications

Keeping track of positions problem

Assume one have a set of elements, with an integer displacement. Then Array[i] would store relative displacement to previous element, where i is an element number. By taking range sum between i and j, one would get the relative distance between object i and j.

A concrete real example of this would be tags in html file, where one keep track of line numbers of every element when a element is moved. To get a position, one simply take the range sum between two elements.

References

Boris Ryabko (1989). "A fast on-line code" (PDF). Soviet Math. Dokl. 39 (3): 533–537.
Boris Ryabko (1992). "A fast on-line adaptive code" (PDF). IEEE Transactions on Information Theory. 28 (1): 1400–1404.
Peter M. Fenwick (1994). "A new data structure for cumulative frequency tables". Software: Practice and Experience. 24 (3): 327–336. CiteSeerX 10.1.1.14.8917. doi:10.1002/spe.4380240306. S2CID 7519761.
Pushkar Mishra (2013). "A New Algorithm for Updating and Querying Sub-arrays of Multidimensional Arrays". arXiv:1311.6093. doi:10.13140/RG.2.1.2394.2485. S2CID 1561192. {{cite journal}}: Cite journal requires |journal= (help)
"Fenwick tree: initialization in O(N)". Codeforces. Retrieved 2023-01-23.

External links

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.

[1] Boris Ryabko (1989). "A fast on-line code" (PDF). Soviet Math. Dokl. 39 (3): 533–537.

[2] Boris Ryabko (1992). "A fast on-line adaptive code" (PDF). IEEE Transactions on Information Theory. 28 (1): 1400–1404.

[:0-3] Peter M. Fenwick (1994). "A new data structure for cumulative frequency tables". Software: Practice and Experience. 24 (3): 327–336. CiteSeerX 10.1.1.14.8917. doi:10.1002/spe.4380240306. S2CID 7519761.

[:1-4] Pushkar Mishra (2013). "A New Algorithm for Updating and Querying Sub-arrays of Multidimensional Arrays". arXiv:1311.6093. doi:10.13140/RG.2.1.2394.2485. S2CID 1561192. {{cite journal}}: Cite journal requires |journal= (help)

[5] "Fenwick tree: initialization in O(N)". Codeforces. Retrieved 2023-01-23.

Tree data structures
Search trees (dynamic sets/associative arrays)	2–3 2–3–4 AA (a,b) AVL B B+ B* B^x (Optimal) Binary search Dancing HTree Interval Order statistic (Left-leaning) Red–black Scapegoat Splay T Treap UB Weight-balanced
Heaps	Binary Binomial Brodal Fibonacci Leftist Pairing Skew van Emde Boas Weak
Tries	Ctrie C-trie (compressed ADT) Hash Radix Suffix Ternary search X-fast Y-fast
Spatial data partitioning trees	Ball BK BSP Cartesian Hilbert R k-d (implicit k-d) M Metric MVP Octree PH Priority R Quad R R+ R* Segment VP X
Other trees	Cover Exponential Fenwick Finger Fractal tree index Fusion Hash calendar iDistance K-ary Left-child right-sibling Link/cut Log-structured merge Merkle PQ Range SPQR Top