A comprehensive overview of useful data mining algorithm

The book Programming Collective Intelligence from Toby Segaran is a practical useful guide through the most common data mining, more exact classification, algorithms. The book covers the traditional algorithms as decision trees, naive Bayesian classifier, neural networks, clustering and not so common ones as support-vector machines  and non-negative matrix factorization. The last one was new for me. The first paper seems not to have surfaced before 2000 so it is relatively new technique and has shown very good results in the case of feature extraction of large numerical spaces. Also optimizing functions like simulated annealing and genetic algorithm are mentioned. Interestingly, there is also a small example for genetic programming.
All algorithms are explained with examples and small programs in Python. The only thing I’ve missed is a mathematical representation in addition to the explanation itself, but at least all utility functions are explained more formal in the appendix.


No comments yet

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: