
Parallel Data Mining

Overview

Achievement: SWS (weekly semester hours): 2, Credits: 4

Course type: Seminar

Language: English

Introduction Slides

 

Content

Data mining allows for the extraction of knowledge from huge amounts of data. Today, data collections grow rapidly in both dimensionality and the number of transactions. Mining this vast and growing amount of data requires more and more time and resources. Three approaches are commonly used to address this problem: faster hardware, optimizing or tuning the mining algorithms, and parallelizing the mining process.

Parallel Data Mining identifies parts of a serial data mining process that do not depend on each other and can therefore be executed at the same time. These sub-processes are then run concurrently on different machines, cores, or threads.

The seminar focuses on parallelization methods for data mining algorithms, ranging from data partitioning and candidate/search space partitioning to a combination of both. It covers different ways of parallelizing these algorithms and discusses their advantages, disadvantages, and potential use cases.
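To make the two partitioning strategies named above more concrete, here is a minimal sketch that counts the support of candidate itemsets in both ways. It is not taken from the seminar material: the toy transactions, the function names, and the use of a thread pool are all assumptions chosen for illustration. With data partitioning, each worker counts all candidates on its own slice of the transactions and the partial counts are summed. With candidate partitioning, each worker scans all transactions but only for its own slice of the candidates.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def count_candidates(transactions, candidates):
    """Serial kernel: count how many transactions contain each candidate itemset."""
    counts = Counter()
    for transaction in transactions:
        for candidate in candidates:
            if candidate <= transaction:  # subset test
                counts[candidate] += 1
    return counts


def split(items, n_parts):
    """Split a list into n_parts slices in round-robin fashion."""
    return [items[i::n_parts] for i in range(n_parts)]


def merge(partials):
    """Sum the partial counts returned by the workers."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def support_by_data_partitioning(transactions, candidates, n_workers=2):
    """Each worker gets a slice of the data and all candidates."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return merge(pool.map(lambda part: count_candidates(part, candidates),
                              split(transactions, n_workers)))


def support_by_candidate_partitioning(transactions, candidates, n_workers=2):
    """Each worker gets all the data and a slice of the candidates."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return merge(pool.map(lambda part: count_candidates(transactions, part),
                              split(candidates, n_workers)))


# All 2-itemsets over the items that occur in the toy data.
ITEMS = sorted(set().union(*TRANSACTIONS))
CANDIDATES = [frozenset(pair) for pair in combinations(ITEMS, 2)]

if __name__ == "__main__":
    print(support_by_data_partitioning(TRANSACTIONS, CANDIDATES))
    print(support_by_candidate_partitioning(TRANSACTIONS, CANDIDATES))
```

Both functions return the same counts as the serial kernel. The trade-off they illustrate is that data partitioning needs a merge step over partial counts, while candidate partitioning requires every worker to read the full data set. Threads are used here only to keep the sketch short; in CPython they give little speedup for CPU-bound counting, so a process pool or separate machines would be the more realistic choice.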

Literature

See the Introduction Slides

Course criteria

1. Seminar Presentation:

– Check with your advisor 1-2 weeks before the presentation

– 30 minutes, including questions and discussion

2. Summary

– 8-10 pages in LNCS style (LaTeX), due by 30/09/2010

The required LaTeX layout can be found here.

 

Prerequisites

Basic knowledge of data mining