Abstract
This paper presents an efficient distributed multiple selection algorithm designed to select multiple keys simultaneously from different data sets which are distributed to many computers in a peer-to-peer system. The communication time is usually much longer than the computation time and is thus a major criterion for measuring the performance of a distributed algorithm. The objective of this algorithm is to reduce the number of communication messages. The algorithm makes use of statistical knowledge and results in a smaller communication overhead compared with existing algorithms. In this algorithm, each computer will select keys as pivots (candidates for the answers) according to statistical knowledge and transmit them to other computers in the system. Each computer will compare each pivot with key values in its local file and respond by transmitting ranks to the originating computer. The originating computer will calculate the global ranks and verify whether the pivots are the answers. Each computer will broadcast once sequentially in each round. This broadcasting process will be repeated until all answers are found. Complexity analyses are presented to provide theoretical upper bounds of this algorithm. The correctness of the theoretical predication is verified by experiments with 40 computers connected using Web technologies.
Original language | English |
---|---|
Pages (from-to) | 234-248 |
Number of pages | 15 |
Journal | Journal of Systems and Software |
Volume | 78 |
Issue number | 3 |
DOIs | |
Publication status | Published - 1 Dec 2005 |
Keywords
- Multiple selection algorithm
- complexity analysis
- distributed selection
- intranet
- peer-to-peer systems
- web technologies