A comprehensive overview of useful data mining algorithm
The book Programming Collective Intelligence from Toby Segaran is a practical useful guide through the most common data mining, more exact classification, algorithms. The book covers the traditional algorithms as decision trees, naive Bayesian classifier, neural networks, clustering and not so common ones as support-vector machines and non-negative matrix factorization. The last one was new for me. The first paper seems not to have surfaced before 2000 so it is relatively new technique and has shown very good results in the case of feature extraction of large numerical spaces. Also optimizing functions like simulated annealing and genetic algorithm are mentioned. Interestingly, there is also a small example for genetic programming.
All algorithms are explained with examples and small programs in Python. The only thing I’ve missed is a mathematical representation in addition to the explanation itself, but at least all utility functions are explained more formal in the appendix.
No comments yet
Leave a reply
