Best Python modules for machine learning, data mining, natural language processing, network analysis, and web scraping
This list is my summary of the Quora question "What are the best Python 2.7 modules for data mining?"
Basics:
- numpy - numerical library, numpy.scipy.org/
- scipy - Advanced math, signal processing, optimization, statistics, www.scipy.org/
- matplotlib - Python plotting, matplotlib.org
- MDP, a collection of supervised and unsupervised learning algorithms, pypi.python.org/pypi/MDP/2.4
- mlpy, Machine Learning Python, mlpy.sourceforge.net
- NetworkX, for graph analysis, networkx.lanl.gov/
- Orange, Data Mining Fruitful & Fun, biolab.si
- pandas, Python Data Analysis Library, pandas.pydata.org
- pybrain, pybrain.org
- scikit-learn - classic machine learning algorithms; provides simple and efficient solutions to learning problems, scikit-learn.org/stable/
- NLTK, Natural Language Toolkit, nltk.org
- Scrapy, an open source web scraping framework for Python, scrapy.org
- urllib/urllib2 - standard-library modules for fetching URLs
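
To give a feel for the web scraping end of the list, here is a minimal sketch of pulling links out of a page using only the standard library. Some assumptions: the list targets Python 2.7, but this sketch uses the Python 3 names (urllib2 was folded into urllib.request and urllib.parse in Python 3); the LinkCollector class, the extract_links helper, and the sample HTML are hypothetical and made up for illustration; and the page is a hard-coded string, so nothing is actually fetched.

```python
# Sketch only: Python 3 standard library, no network access.
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Hypothetical helper: return every link found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page content. In practice it would likely come from
# urllib.request.urlopen(url).read() (urllib2.urlopen(url) on Python 2.7).
SAMPLE_HTML = '<p><a href="/docs">Docs</a> <a href="http://example.org/x">X</a></p>'
links = extract_links(SAMPLE_HTML, "http://example.com/index.html")
print(links)
```

For anything beyond a quick script (crawling many pages, following links, throttling requests), a framework such as Scrapy is probably the better fit.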