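Mlxtend is an open-source Python project that provides tools and extensions for everyday data science tasks. The project is hosted on GitHub: https://github.com/rasbt/mlxtend
As the User Guide section of the project documentation shows, mlxtend provides the following main categories of tool modules: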
classifier
- Adaline
- EnsembleVoteClassifier
- LogisticRegression
- NeuralNetMLP
- Perceptron
regressor
- LinearRegression
regression_utils
- plot linear regression
feature_selection
- SequentialFeatureSelector
evaluate
- Confusion Matrix
- Plot decision regions
- Plot learning curves
- Scoring
preprocessing
- DenseTransformer
- MeanCenterer
- Minmax scaling
- Shuffle arrays unison
- Standardize
data
- AutoMPG data
- Boston housing data
- Iris data
- Mnist data
- Load mnist
- Wine data
file_io
- Find filegroups
- Find files
general plotting
- Category scatter
- Enrichment plot
- Stacked barplot
math
- Num combinations
- Num permutations
text
- Generalize names
- Generalize names duplcheck
- Tokenizer
utils
- Counter
general concepts
- Activation functions
- Gradient optimization
- Linear gradient derivative
- Regularization linear
Each of the modules above comes with examples, API documentation, and source code, so they are easy to look up.
Here is the example from the project home page:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import itertools
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from mlxtend.classifier import EnsembleVoteClassifier
from mlxtend.data import iris_data
# Note: newer mlxtend releases expose this function as mlxtend.plotting.plot_decision_regions
from mlxtend.evaluate import plot_decision_regions
# Initializing Classifiers
clf1 = LogisticRegression(random_state=0)
clf2 = RandomForestClassifier(random_state=0)
clf3 = SVC(random_state=0, probability=True)
eclf = EnsembleVoteClassifier(clfs=[clf1, clf2, clf3], weights=[2, 1, 1], voting='soft')
# Loading some example data
X, y = iris_data()
X = X[:,[0, 2]]
# Plotting Decision Regions
gs = gridspec.GridSpec(2, 2)
fig = plt.figure(figsize=(10, 8))
for clf, lab, grd in zip([clf1, clf2, clf3, eclf],
                         ['Logistic Regression', 'Random Forest', 'SVM', 'Ensemble'],
                         itertools.product([0, 1], repeat=2)):
    # Fit each classifier and draw its decision regions in its own grid cell
    clf.fit(X, y)
    ax = plt.subplot(gs[grd[0], grd[1]])
    fig = plot_decision_regions(X=X, y=y, clf=clf, legend=2)
    plt.title(lab)
plt.show()
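
The example above builds the ensemble with voting='soft' and weights=[2, 1, 1]. As a rough illustration of what weighted soft voting means, here is a minimal pure-Python sketch. It is not mlxtend's implementation: the soft_vote helper and the probability values are hypothetical and made up for illustration. The idea is to average each classifier's predicted class probabilities using the weights, then pick the class with the highest average.

```python
def soft_vote(probas, weights):
    """Weighted soft voting (illustrative helper, not part of mlxtend).

    probas:  one list of class probabilities per classifier
    weights: one weight per classifier
    Returns (index of the winning class, weighted average probabilities).
    """
    total = sum(weights)
    n_classes = len(probas[0])
    # Weighted average probability for each class across all classifiers
    avg = [
        sum(w * p[c] for w, p in zip(weights, probas)) / total
        for c in range(n_classes)
    ]
    # The predicted class is the one with the largest averaged probability
    winner = max(range(n_classes), key=avg.__getitem__)
    return winner, avg


# Hypothetical predicted probabilities for one sample and three classes
probas = [
    [0.2, 0.5, 0.3],  # stands in for clf1 (weight 2)
    [0.6, 0.3, 0.1],  # stands in for clf2 (weight 1)
    [0.3, 0.4, 0.3],  # stands in for clf3 (weight 1)
]
winner, avg = soft_vote(probas, weights=[2, 1, 1])
print(winner, avg)
```

With these made-up numbers, the second classifier alone would choose class 0, but the first classifier's doubled weight pulls the averaged vote to class 1.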