Parallel Data Mining Unification

Data collections are growing enormously and need to be processed properly to yield new knowledge and information, while
computing infrastructure is already available at many levels, from multi-core machines to grid computing. Together, these trends call for new
approaches to performing data mining on distributed and parallel systems.

Current research in this area mostly treats the problem from an engineering point of view: how to distribute a particular data
mining process efficiently on a parallel computer, how to optimize communication and coordination among subprocesses, and so on. Most
parallelization solutions fit only a particular case on a particular computing infrastructure.

A unifying formalism for parallel data mining algorithms will form a generic parallelization model that is applicable to all data mining
algorithms. Ultimately, this should lead to a framework that allows data mining algorithms to be distributed efficiently without forcing each
individual implementation to reconsider issues such as communication and load balancing on different computing infrastructures.