Maximum Number Of Keys In B Tree Of Height K

B-tree is a tree data structure that keeps data always in the sorted order. The time complexity for the operations like searching, insertions, deletions, and sequential access is logarithmic amortized time. In other way, B-tree is a generalization of a binary search tree in which more than two paths may diverge from a single node and all leaf nodes must be at the same level. B-trees are most commonly used in databases and filesystems. B Trees are balanced search tree which are designed to work on secondary storage devices like magnetic disks, hard disks etc. As branchimg factor of Btress can be very large so they can store large amount of data easily. Some algorithms are being implemented to copy the fixed number of pages from disk in to the primary memory like RAM and also write down on to the disks those pages which have been altered. Generally branching factor in between 50 and 2000 are being used to store a large B Tree on disks. If a database contains n key values then worst case running time of searching an un indexed and unsorted key value will be of order of n that is O(n) but if the same data is indexed with a B-tree, the same search operation will run in O(log n) which is always asymptotic less than O(n). To perform a search for a single key on a set of one million keys (1,000,000), As it is well known fact that a linear search will require at most 1,000,000 comparisons but If the same data is indexed with a B-tree of minimum degree (10), then only 114 comparisons will be required in the worst case. So indexing large amounts of data can significantly improve search performance. Although other balanced tree structures can be used, a B-tree also optimizes costly disk accesses that are of concern when dealing with large data sets and operation of searching, inserting, deleting etc. is being done rapidly. In this paper, the mathematical derivation of the maximum number of keys in B Tree of height k and minimum degree m is described, tested and found to be absolutely correct.

1.    Introduction: As B Trees[1,2,7,8,9] are  balanced search trees which are designed to store large amount of data efficiently. The performance of B Tree is not   being computed by dynamic set operations but also by how many discs access are being performed. The number of times of disc access is directly proportional to the height of B Tree. As it is a well known fact that B Tree is a rooted tree which must have the following properties:-
 The fields in every node y contains n[y] which is total number of keys currently stored in node y and n[y] must be stored in a particular node in increasing order .

Each internal node y has n[y]+1 pointers but pointer fields of leaf nodes are undefined because leaf node does not have any child. Also all the leaf nodes should have same level in B Tree. For every B Tree a positive integer m>=2  exist which is known as minimum degree of B Tree and minimum number of keys that a node can store except the root node=m-1 because root node may have minimum 1 key. The maximum   number  of keys that can be stored in a particular node of B Tree of minimum degree  =2m-1. If B Tree has minimum degree 2 then minimum number of keys that a node can store except the root node=m-1=2-1=1. The maximum   number  of keys that can be stored in a particular node of B Tree of minimum degree 2  =2m-1=2*2-1=3 . As it is well known fact that in most of the applications of B Tree data handled at a particular instant of time is very large and that data don’t fit into main memory at once. So some algorithms copy the selected pages from the disk in the main memory and if there is some changes on the page then some algorithms write back the changed page on the disk. The design of B tree algorithm is being done in such a way that only fixed number of pages exist in the main memory at a particular instant of the time.  
B Tree with the minimum degree 2 and with 4 numbers of nodes and with height 1 is shown below-

2.    Related Work: In this paper, a generalized mathematical formula is being derived  for maximum number of keys that can be stored in B Tree of height k and with the minimum degreem[2,3,4]. As m is the minimum degree of B Tree so the maximum number of keys in a particular node=2m -1. So root node can also have maximum 2m -1 keys.

The  maximum number of nodes   at 0th level of B Tree=1
The  maximum number of  nodes at 1st  level of B Tree= 2m
The  maximum number of  nodes at 2nd   level of B Tree= (2m)2
The  maximum number of  nodes at 3rd   level of B Tree= (2m)3

Similarly, The  maximum number of  nodes at 4th   level of B Tree= (2m)4. As the height of B Tree is k  so The  maximum number of  nodes at kth    level of B Tree= (2m)k.As any node in B Tree can have maximum 2m-1 key. So maximum number of keys at a particular level can be calculated can be calculated by multiplying maximum number of  nodes at a particular level with 2m-1.
So The  maximum number of keys at 0th level of B Tree=2m-1
The  maximum number of  keys at 1st  level of B Tree= 2m(2m-1).
The  maximum number of  keys at 2nd   level of B Tree= (2m)2(2m-1)
The  maximum number of  keys at 3rd   level of B Tree= (2m)3(2m-1)
The  maximum number of  keys at 4th   level of B Tree= (2m)4(2m-1)
Similarly, The  maximum number of  keys at kth    level of B Tree= (2m)k(2m-1)
So the maximum number of keys in B Tree can be easily calculated by algebraic sum of the maximum number of keys at a particular level.
If n is the total number of keys in  B Tree of height k and with the minimum degree m.
So
n<=   maximum number of keys at 0th level +  maximum number of  keys at 1st  level + maximum number of  keys at 2nd   level + maximum number of  keys at 3rd   level  + _ _  _ _ _ __ _ _ _ _ _ _ __ _ _ _ _ __ _ _ __ _ _ _ _ __ _ _+  maximum number of  keys at kth level.
By putting values on R.H.S. We  get
n<=  (2m-1) + 2m(2m-1) +(2m)2(2m-1) + (2m)3(2m-1) + (2m)4(2m-1) +_  _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ +(2m)k(2m-1)
which is a geometrical progression
n<=(2m)k+1 -1
which is the final expression for n in terms of m and k.
The above expression can be written as in logarithmic form as
(log(n+1) /log(2m)) -1 <=k
if k=0 , then B Tree has only root node , by putting k=0 in the final derivation
n<=2m-1
which is always true.
Similarly if we put k=1
We get n<=(2m)2 -1 which is also true.

3.    Conclusion: A mathematical derivation is being done to calculated the maximum number of keys that can be stored in  B Tree of height k and with minimum degree m. The final derivation is also tested by putting different values of k and found to be absolutely correct.


4.    References
1.    E. Horowitz and S. Sahni, (1982) "Fundamental of Computer Algorithms", Computer Science.
2.    T. Cormen, C. Leiserson and R. Rivest, (1991)"Introduction to Algorithms", Published by MIT.
3.    A.V. Aho, J.E. Hopcroft and J.D. Ullman, (1974)"The Design and Analysis of Computer Algorithms".
4.    A.V. Aho, J.E. Hopcroft and J.D. Ullman, (1984) "Data Structures and Algorithms", Addison-Wesley.
5.    S. Baase, "Computer Algorithms: Introduction to Design and Analysis"(1988), Addison-Wesley.
6.    L. Banachowski, A. Kreczmar and W. Rytter, (1991) "Analysis of Algorithms and Data Structures",Addison-Wesley.
7.    R. Baeza-Yates, "Teaching Algorithms"(1995), IV Iberoamerican Congress on Computer Science Education, Canela, Brazil, July.
8.    J. Nievergelt and K. Hinrichs, (1993) "Algorithms and Data Structures with Applications to Graphics and Geometry", Prentice-Hall.
9.    G- Brassard, and P. Bratley, (1988) "Algorithmics: Theory and Practice", Prentice-Hall.
10.    E.M. Reingold, J. Nievergelt and N. Deo, (1977) "Combinatorial Algorithms: Theory and Practice",Prentice-Hall.
11.    H. Will, (1986) "Algorithms and Complexity", Prentice-Hall.
12.    M. Crochemore and W. Rytter, (1994) "Text Algorithms", Oxford Press.
13.    J.S. Vitter and P. Flajolet, (1990) "Average-case analysis of Algorithms and Data Structures", Handbook of Theoretical Computer Science (volume A), Elsevier and MIT Press, pp 431-524.
14.    F.P. Preparata and M.I. Shamos. (1985) "Computational Geometry: An Introduction", Springer, Verlag.