skpp module

class skpp.ProjectionPursuitClassifier(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', pairwise_loss_matrix=None, eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.ClassifierMixin

Perform classification with projection pursuit.

Parameters

pairwise_loss_matrix (array-like of shape (n_classes, n_classes), default=None) – The loss matrix L, whose entry L[c,k]=l_ck is the penalty weight for predicting class k when the true class is c. If unspecified, all misclassifications are penalized equally.
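
As a minimal sketch of the indexing convention only, the NumPy snippet below builds a small loss matrix and looks up the penalty weight for a few (true, predicted) pairs. The three-class setup, the specific weights, and the zero diagonal are assumptions made for illustration; how the classifier uses the matrix internally is not shown here.

```python
import numpy as np

# Hypothetical 3-class problem. Entry L[c, k] is the penalty weight for
# predicting class k when the true class is c. The zero diagonal (no
# penalty for a correct prediction) is an assumption of this sketch.
L = np.array([
    [0.0, 1.0, 5.0],  # true class 0: mistaking it for class 2 is costly
    [1.0, 0.0, 1.0],  # true class 1
    [1.0, 1.0, 0.0],  # true class 2
])

# Illustrative class indices, not output from a fitted model.
y_true = np.array([0, 0, 1, 2])
y_pred = np.array([0, 2, 1, 1])

# Row index = true class, column index = predicted class.
penalties = L[y_true, y_pred]
total_penalty = penalties.sum()
```

A matrix of this shape is what would be passed as `pairwise_loss_matrix=L` to the constructor.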

See also

ProjectionPursuitRegressor – for definitions of the other parameters.

fit(X, Y)[source]

Train the model.

Parameters
  • X (array-like of shape (n_samples, n_features)) – The training input samples.

  • Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.

Returns

self – A trained model.

Return type

ProjectionPursuitClassifier

predict(X)[source]

Use the fitted estimator to make predictions on new data.

Parameters

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns

Y – The predicted class labels, obtained by passing X through the evaluation function, taking the argmax of the output for each sample, and mapping that index back to a class.

Return type

array of shape (n_samples,)
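
The argmax-and-map-back step can be sketched in plain NumPy. The score layout (one column per class, ordered like the `classes` array), the label names, and the score values are all assumptions made for illustration, not the library's internal representation.

```python
import numpy as np

# Assumed layout: one row per sample, one column of scores per class,
# with columns ordered like `classes`.
classes = np.array(["cat", "dog", "fish"])
scores = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# Index of the highest-scoring column for each sample...
best_column = np.argmax(scores, axis=1)
# ...mapped back to the corresponding class label.
labels = classes[best_column]
```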

class skpp.ProjectionPursuitRegressor(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', out_dim_weights='inverse-variance', eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)[source]

Bases: sklearn.base.BaseEstimator, sklearn.base.TransformerMixin, sklearn.base.RegressorMixin, sklearn.base.MultiOutputMixin

This class implements the projection pursuit regression (PPR) algorithm as detailed in math.pdf.

Parameters
  • r (int, default=10) – The number of terms in the underlying additive model. The input will be put through r projections, r functions of those projections, and then multiplication by r output vectors to determine output.

  • fit_type ({'polyfit', 'spline'}, default='polyfit') – The kind of function to fit at each stage.

  • degree (int, default=3) – The degree of polynomials or spline sections used as the univariate approximator between projection and weighted residual targets.

  • opt_level ({'high', 'medium', 'low'}, default='high') – ‘low’ opt_level will disable backfitting. ‘medium’ backfits previous 2D functional fits only (not projections). ‘high’ backfits everything.

  • example_weights (string or array-like of dimension (n_samples,), default='uniform') – The relative importances given to training examples when calculating loss and solving for parameters.

  • out_dim_weights (string or array-like, default='inverse-variance') – The relative importances given to output dimensions when calculating the weighted residual (the target that the univariate functions f_j are fit to). If all dimensions are of the same importance, but outputs are of different scales, then using the inverse variance is a good choice. Possible values: ‘inverse-variance’ (divide outputs by their variances), ‘uniform’ (use a vector of ones as the weights), or an array (a custom vector of weights of shape (n_outputs,)).

  • eps_stage (float, default=0.0001) – The mean squared difference between the predictions of the PPR at successive iterations of a “stage” (fitting an f, beta pair) must fall below this epsilon for the stage to be considered converged.

  • eps_backfit (float, default=0.01) – The mean squared difference between the predictions of the PPR at successive iterations of a “backfit” must fall below this epsilon for backfitting to be considered converged.

  • stage_maxiter (int, default=100) – If a stage does not converge within this many iterations, end the loop and move on. This is useful for divergent cases.

  • backfit_maxiter (int, default=10) – If a backfit does not converge within this many iterations, end the loop and move on. Smaller values may be preferred here since backfit iterations are expensive.

  • random_state (int or numpy.random.RandomState, default=None) – An optional seed or random number generator with which to seed randomness.

  • show_plots (boolean, default=False) – Whether to produce plots of projections versus residual variance throughout the training process.

  • plot_epoch (int, default=50) – If plots are displayed, show them every plot_epoch iterations of the stage-fitting process.
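
To make the roles of r and out_dim_weights more concrete, here is a rough NumPy sketch of the additive model described above: each of the r terms projects the input, applies a univariate polynomial, and scales an output vector. It also shows one plausible reading of the ‘inverse-variance’ weights. The helper names `ppr_forward` and `inverse_variance_weights`, the argument layout, and the use of the population variance are assumptions for illustration; this is not the library's implementation.

```python
import numpy as np

def ppr_forward(X, alphas, poly_coeffs, betas):
    """Evaluate Y_hat = sum_j f_j(X @ alpha_j) * beta_j (hypothetical helper).

    alphas:      (n_features, r), one projection vector per column
    poly_coeffs: r coefficient sequences, highest degree first (np.polyval)
    betas:       (n_outputs, r), one output vector per column
    """
    Y_hat = np.zeros((X.shape[0], betas.shape[0]))
    for j in range(alphas.shape[1]):
        projection = X @ alphas[:, j]                  # shape (n_samples,)
        f_j = np.polyval(poly_coeffs[j], projection)   # univariate function
        Y_hat += np.outer(f_j, betas[:, j])            # scale output vector
    return Y_hat

def inverse_variance_weights(Y):
    """One weight per output column: 1 / variance (assumes population
    variance; a constant column would divide by zero)."""
    return 1.0 / np.var(Y, axis=0)

# Tiny example with r = 2 terms, 2 features, and 1 output.
X = np.array([[1.0, 2.0], [3.0, 4.0]])
alphas = np.eye(2)                        # project onto each feature
poly_coeffs = [[1.0, 0.0], [2.0, 0.0]]    # f_0(z) = z, f_1(z) = 2z
betas = np.array([[1.0, 1.0]])
Y_hat = ppr_forward(X, alphas, poly_coeffs, betas)

# Two outputs on very different scales get very different weights.
weights = inverse_variance_weights(np.array([[1.0, 10.0], [3.0, 30.0]]))
```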

fit(X, Y)[source]

Train the model.

Parameters
  • X (array-like of shape (n_samples, n_features)) – The training input samples.

  • Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.

Returns

self – A trained model.

Return type

ProjectionPursuitRegressor

predict(X)[source]

Use the fitted estimator to make predictions on new data.

Parameters

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns

Y – The result of passing X through the evaluation function.

Return type

array of shape (n_samples,) or (n_samples, n_outputs)

transform(X)[source]

Find the projections of X through all alpha vectors in the PPR.

\(A\) is a p x r matrix with one projection vector in each column, and \(X\) is an n x p matrix with one example in each row, so the matrix product \(XA\) is an n x r matrix that stores the projections.

Parameters

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns

Projections – An array whose jth column is the projection of \(X\) through \(\alpha_j\). Here r is the hyperparameter given to the constructor, i.e. the number of terms in the additive model.

Return type

array of shape (n_samples, r)
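
The shape bookkeeping for transform can be checked with a small NumPy sketch. The matrices below are made-up values, and `A` stands in for the fitted projection vectors; how the estimator stores them is not assumed here.

```python
import numpy as np

# n = 2 examples (rows), p = 3 features (columns).
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])

# p = 3 rows, r = 2 projection vectors (columns); illustrative values only.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Matrix product: column j holds the projection of X through A[:, j].
projections = X @ A
```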