This paper presents the complexity analysis and empirical results of a distributed selection algorithm that exploits the statistical properties of the data file. The objective of the algorithm is to minimize the number of communication messages required for the whole selection process. The algorithm selects the yth smallest key from a very large file that is physically distributed over many sites (stations). The file is so large that transferring all the data to a single node is neither feasible nor efficient, since no node has sufficient memory for internal sorting. The selection work is shared by all participating sites, and the algorithm also ensures load balancing. The complexity of the algorithm is O(log log N/P) for a network with P stations and a file with N records.
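To make the problem setting concrete, the following is a minimal sketch of distributed selection, simulated in a single process. It is not the algorithm analyzed in this paper: it uses a generic pivot-and-count scheme (weighted median of local medians) and does not use the statistical properties of the data, so it should not be expected to reach the stated message complexity. The function name `distributed_select`, the representation of each site as a locally sorted Python list, and the way messages are counted (one per site reply, ignoring coordinator broadcasts) are all assumptions made for illustration.

```python
import bisect


def distributed_select(sites, y):
    """Return (key, messages): the y-th smallest key (1-indexed) over all
    sites, plus a count of simulated site-to-coordinator replies.

    Hypothetical illustration only. `sites` is a list of locally sorted
    lists, one per station; no site ever ships its keys to another node.
    """
    total = sum(len(s) for s in sites)
    if not 1 <= y <= total:
        raise ValueError("y must be between 1 and the total number of keys")

    # Each site keeps an active window [lo, hi) of keys that may still be
    # the answer.
    lo = [0] * len(sites)
    hi = [len(s) for s in sites]
    messages = 0

    while True:
        # Round 1: every site with a non-empty window reports the median
        # of its window together with the window size.
        candidates = []
        for i, s in enumerate(sites):
            n = hi[i] - lo[i]
            if n > 0:
                candidates.append((s[lo[i] + n // 2], n))
                messages += 1

        # The coordinator takes the weighted median of the local medians
        # as the pivot.
        candidates.sort()
        half = sum(n for _, n in candidates) / 2
        acc = 0
        for key, n in candidates:
            acc += n
            if acc >= half:
                pivot = key
                break

        # Round 2: every site reports how many of its keys are < pivot
        # and how many are <= pivot (counted over its whole local file).
        less = 0
        less_or_equal = 0
        cuts = []
        for s in sites:
            a = bisect.bisect_left(s, pivot)
            b = bisect.bisect_right(s, pivot)
            less += a
            less_or_equal += b
            cuts.append((a, b))
            messages += 1

        if less < y <= less_or_equal:
            return pivot, messages

        if y <= less:
            # The answer is smaller than the pivot: drop keys >= pivot.
            for i, (a, _) in enumerate(cuts):
                hi[i] = min(hi[i], a)
        else:
            # The answer is larger than the pivot: drop keys <= pivot.
            for i, (_, b) in enumerate(cuts):
                lo[i] = max(lo[i], b)


# Example: three stations holding the keys 1..10 between them.
sites = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
key, messages = distributed_select(sites, 4)  # key is 4, the 4th smallest
```

Only medians, window sizes, and counts cross the simulated network, which is the property that matters when no single node can hold the whole file. The number of rounds in this sketch depends on how quickly the pivot shrinks the active windows.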
|Title of host publication|International Series on Advances in High Performance Computing|
|Publisher|Computational Mechanics Publ, Ashurst, United Kingdom|
|Number of pages|9|
|Publication status|Published - 1 Jan 1997|