SIGMA: Large Scale Machine Learning Toolkit
From the website:
The goal of this project is to provide a set of parallel machine learning functions that meet the requirements of research work and applications, typically those with large-scale data and features. The toolkit includes, but is not limited to, classification, clustering, ranking, and statistical analysis, and runs them in parallel on hundreds of machines and thousands of CPU cores. We also provide an SDK for researchers and developers to invent their own algorithms and add them to the toolkit.
Algorithms in the toolkit:
- Parallel Classification
  - Logistic Regression
  - Boosting
  - SVM
    - PSVM
    - PPegasos
  - Neural Network
- Parallel Ranking
  - LambdaRank
  - RankBoost
- Parallel Clustering
  - Kmeans
  - Random Walk
- Parallel Regression
  - Linear Regression
  - Regression Tree
- Others
  - Parallel-Regularized-SVD
  - Parallel-LDA
- Optimization Library
  - OWL-QN
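
To give a feel for what running such an algorithm "in parallel" can mean, here is a minimal sketch of one data-parallel Kmeans iteration. It is not SIGMA's code or API: the function names `partial_stats` and `kmeans_step` are hypothetical, the sharding scheme is an assumption, and a thread pool merely stands in for separate machines. Each worker computes per-cluster sums and counts on its own shard, and the partial results are then combined into new centroids.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_stats(shard, centroids):
    """Map step (hypothetical helper): per-cluster sums and counts for one shard."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in shard:
        # Assign the point to its nearest centroid (squared Euclidean distance).
        nearest = min(
            range(len(centroids)),
            key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]
    return sums, counts


def kmeans_step(shards, centroids):
    """One data-parallel iteration: map over shards, then reduce to new centroids."""
    # Threads stand in for remote workers here; this is an illustrative assumption.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda s: partial_stats(s, centroids), shards))

    new_centroids = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in results)
        if total == 0:
            # No points were assigned to this cluster: keep its old centroid.
            new_centroids.append(list(old))
            continue
        new_centroids.append(
            [sum(sums[k][d] for sums, _ in results) / total for d in range(len(old))]
        )
    return new_centroids


# Toy data, already partitioned into two shards (one per worker).
shards = [
    [(0.0, 0.0), (0.0, 2.0)],
    [(10.0, 10.0), (10.0, 12.0)],
]
centroids = [(1.0, 1.0), (9.0, 9.0)]
print(kmeans_step(shards, centroids))  # [[0.0, 1.0], [10.0, 11.0]]
```

Only the small per-cluster sums and counts travel between workers, not the data points themselves, which is why this pattern is commonly used when the data set is too large for one machine.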