skpp module
- class skpp.ProjectionPursuitClassifier(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', pairwise_loss_matrix=None, eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)
Bases: BaseEstimator, ClassifierMixin
Perform classification with projection pursuit.
- Parameters:
pairwise_loss_matrix (array-like of shape (n_classes, n_classes), default=None) – A matrix L whose entry L[c,k]=l_ck is the weight of the penalty for predicting class k when the true class is c. If unspecified, all penalties are considered to have the same importance.
See also
ProjectionPursuitRegressor, for definitions of the other parameters.
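To make the L[c,k] convention concrete, here is a minimal NumPy sketch of a loss matrix for a hypothetical 3-class problem. It only illustrates how entries are indexed (true class in the row, predicted class in the column); it does not reproduce how the classifier consumes the matrix during fitting. The zero diagonal and the specific weights are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 3-class loss matrix: rows index the true class c,
# columns index the predicted class k.
# Assumption: correct predictions (the diagonal) carry no penalty.
L = np.array([
    [0.0, 1.0, 5.0],  # true class 0: predicting 2 is 5x worse than predicting 1
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

y_true = np.array([0, 0, 1, 2])
y_pred = np.array([0, 2, 1, 1])

# Look up L[c, k] for every (true, predicted) pair.
penalties = L[y_true, y_pred]
print(penalties)        # [0. 5. 0. 1.]
print(penalties.sum())  # 6.0
```

A matrix built this way could then be passed as pairwise_loss_matrix to the constructor.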
- fit(X, Y)
Train the model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The training input samples.
Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.
- Returns:
self – A trained model.
- Return type:
ProjectionPursuitClassifier
- predict(X)
Use the fitted estimator to make predictions on new data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input samples.
- Returns:
Y – The result of passing X through the evaluation function, taking the argmax of the output, and mapping it back to a class.
- Return type:
array of shape (n_samples,)
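The argmax-and-map-back step described for predict can be sketched as follows. The score matrix and the class labels are made up for illustration, and the layout of one score column per class is an assumption rather than something this page specifies.

```python
import numpy as np

# Hypothetical labels, in the column order used by the score matrix.
classes = np.array(["cat", "dog", "fish"])

# Hypothetical per-class scores for three samples, shape (n_samples, n_classes).
scores = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# Pick the highest-scoring column per row, then map indices back to labels.
winners = np.argmax(scores, axis=1)
labels = classes[winners]
print(labels)  # ['dog' 'cat' 'fish']
```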
- class skpp.ProjectionPursuitRegressor(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', out_dim_weights='inverse-variance', eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)
Bases: BaseEstimator, TransformerMixin, RegressorMixin, MultiOutputMixin
This class implements the projection pursuit regression (PPR) algorithm as detailed in math.pdf.
- Parameters:
r (int, default=10) – The number of terms in the underlying additive model. The input will be put through r projections, r functions of those projections, and then multiplication by r output vectors to determine output.
fit_type ({'polyfit', 'spline'}, default='polyfit') – The kind of function to fit at each stage.
degree (int, default=3) – The degree of polynomials or spline-sections used as the univariate approximator between projection and weighted residual targets.
opt_level ({'high', 'medium', 'low'}, default='high') – ‘low’ opt_level will disable backfitting. ‘medium’ backfits previous 2D functional fits only (not projections). ‘high’ backfits everything.
example_weights (string or array-like of dimension (n_samples,), default='uniform') – The relative importances given to training examples when calculating loss and solving for parameters.
out_dim_weights (string or array-like, default='inverse-variance') – The relative importances given to output dimensions when calculating the weighted residual (output of the univariate functions f_j). If all dimensions are of the same importance, but outputs are of different scales, then using the inverse variance is a good choice. Possible values:
‘inverse-variance’: divide outputs by their variances.
‘uniform’: use a vector of ones as the weights.
array: provide a custom vector of weights of dimension (n_outputs,).
eps_stage (float, default=0.0001) – The mean squared difference between the predictions of the PPR at subsequent iterations of a “stage” (fitting an f, beta pair) must reach below this epsilon in order for the stage to be considered converged.
eps_backfit (float, default=0.01) – The mean squared difference between the predictions of the PPR at subsequent iterations of a “backfit” must reach below this epsilon in order for backfitting to be considered converged.
stage_maxiter (int, default=100) – If a stage does not converge within this many iterations, end the loop and move on. This is useful for divergent cases.
backfit_maxiter (int, default=10) – If a backfit does not converge within this many iterations, end the loop and move on. Smaller values may be preferred here since backfit iterations are expensive.
random_state (int, numpy.random.RandomState, default=None) – An optional object with which to seed randomness.
show_plots (boolean, default=False) – Whether to produce plots of projections versus residual variance throughout the training process.
plot_epoch (int, default=50) – If plots are displayed, show them every plot_epoch iterations of the stage-fitting process.
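Two of the parameters above lend themselves to a short illustration: the 'inverse-variance' output weighting and the eps/maxiter stopping rule. The sketch below is a plain NumPy rendering of those descriptions, not code taken from the library. The helper names (inverse_variance_weights, run_until_converged) are hypothetical, the use of population variance is an assumption, and the toy update stands in for a real stage or backfit iteration.

```python
import numpy as np

def inverse_variance_weights(Y):
    """One weight per output dimension: 1 / variance of that column.
    Assumption: population variance (ddof=0)."""
    Y = np.asarray(Y, dtype=float)
    return 1.0 / Y.var(axis=0)

def run_until_converged(step, predictions, eps, maxiter):
    """Apply `step` repeatedly until the mean squared change in predictions
    drops below eps, or until maxiter (>= 1) iterations have run.
    Returns (final_predictions, iterations_used)."""
    for iteration in range(1, maxiter + 1):
        updated = step(predictions)
        change = np.mean((updated - predictions) ** 2)
        predictions = updated
        if change < eps:
            break
    return predictions, iteration

# Outputs on very different scales get very different weights.
Y = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
weights = inverse_variance_weights(Y)
print(weights)  # approximately [1.5, 0.015]

# Toy update (not a PPR step): move halfway toward a fixed target each time.
target = np.ones(4)
halve_gap = lambda current: current + 0.5 * (target - current)
final, iterations = run_until_converged(halve_gap, np.zeros(4), eps=1e-4, maxiter=100)
print(iterations)  # 7
```

The cap matters for updates that never settle: with maxiter reached, the loop simply returns the latest predictions, which matches the "end the loop and move on" behavior described for stage_maxiter and backfit_maxiter.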
- fit(X, Y)
Train the model.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The training input samples.
Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.
- Returns:
self – A trained model.
- Return type:
ProjectionPursuitRegressor
- predict(X)
Use the fitted estimator to make predictions on new data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input samples.
- Returns:
Y – The result of passing X through the evaluation function.
- Return type:
array of shape (n_samples,) or (n_samples, n_outputs)
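As a rough picture of the evaluation function, the description of r (r projections, r univariate functions, r output vectors) suggests an additive model of the following shape. This is a sketch under stated assumptions: the function name evaluate_ppr and its argument layout are hypothetical, the univariate functions are taken to be polynomials as in fit_type='polyfit', and nothing here is read from the fitted estimator's attributes.

```python
import numpy as np

def evaluate_ppr(X, alphas, poly_coefs, betas):
    """Evaluate sum_j f_j(X @ alpha_j) * beta_j.

    X:          (n_samples, n_features) inputs
    alphas:     (n_features, r) projection vectors, one per column
    poly_coefs: r coefficient arrays, highest power first (np.polyval order)
    betas:      (r, n_outputs) output vectors, one per row
    """
    projections = X @ alphas                                  # (n_samples, r)
    Y_hat = np.zeros((X.shape[0], betas.shape[1]))
    for j in range(alphas.shape[1]):
        f_j = np.polyval(poly_coefs[j], projections[:, j])    # (n_samples,)
        Y_hat += np.outer(f_j, betas[j])                      # (n_samples, n_outputs)
    return Y_hat

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
alphas = rng.normal(size=(3, 2))
betas = rng.normal(size=(2, 2))
# Term 0 cubes its projection; term 1 passes its projection through unchanged.
poly_coefs = [np.array([1.0, 0.0, 0.0, 0.0]), np.array([1.0, 0.0])]

Y_hat = evaluate_ppr(X, alphas, poly_coefs, betas)
print(Y_hat.shape)  # (5, 2)
```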
- transform(X)
Find the projections of X through all alpha vectors in the PPR.
\(A\) is a p x r matrix with projection vectors in each column, and \(X\) is an n x p matrix with examples in each row, so the matrix product \(XA\) holds the projections.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input samples.
- Returns:
Projections – An array whose jth column is the projection of \(X\) through \(\alpha_j\). Here r is the hyperparameter given to the constructor, the number of terms in the additive model.
- Return type:
an array of shape (n_samples, r)
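The shape bookkeeping for transform can be checked with a small NumPy sketch. The matrices are random stand-ins, and normalizing the projection vectors to unit length is an assumption made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r = 6, 4, 3

X = rng.normal(size=(n, p))      # examples in rows
A = rng.normal(size=(p, r))      # projection vectors in columns
A /= np.linalg.norm(A, axis=0)   # assumption: unit-length projection vectors

projections = X @ A              # matrix product, shape (n, r)
print(projections.shape)         # (6, 3)

# Column j is the projection of every example onto alpha_j.
j = 1
print(np.allclose(projections[:, j], X @ A[:, j]))  # True
```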