skpp module

class skpp.ProjectionPursuitClassifier(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', pairwise_loss_matrix=None, eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)

Bases: BaseEstimator, ClassifierMixin

Perform classification with projection pursuit.

Parameters:

pairwise_loss_matrix (array-like of shape (n_classes, n_classes), default=None) – The matrix L has entries L[c, k] = l_ck giving the penalty for predicting class k when the true class is c. If unspecified, all misclassifications are penalized equally.
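As an illustration, a matrix with this layout can be built with NumPy. The sketch below is not taken from the skpp source: the class count, the penalty values, and the variable names are assumptions chosen for the example. It only shows how L[c, k] is indexed (row = true class, column = predicted class), not how the classifier uses the matrix internally.

```python
import numpy as np

n_classes = 3

# Start from uniform penalties: 1 for every misclassification, 0 on the diagonal.
L = np.ones((n_classes, n_classes)) - np.eye(n_classes)

# Hypothetical choice: predicting class 0 when the truth is class 2
# is five times as costly as any other mistake.
L[2, 0] = 5.0

# Look up the penalty of each (true, predicted) pair.
y_true = np.array([0, 2, 2, 1])
y_pred = np.array([0, 0, 2, 2])
penalties = L[y_true, y_pred]
total = penalties.sum()
```

Note that the matrix need not be symmetric: here L[2, 0] differs from L[0, 2].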

See also

ProjectionPursuitRegressor, for definitions of the other parameters.

fit(X, Y)

Train the model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – The training input samples.

  • Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.

Returns:

self – A trained model.

Return type:

ProjectionPursuitClassifier

predict(X)

Use the fitted estimator to make predictions on new data.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns:

Y – The result of passing X through the evaluation function, taking the argmax of the output, and mapping it back to a class.

Return type:

array of shape (n_samples,)
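The argmax-and-map step described above can be sketched as follows. This is a minimal illustration, not the library's code: the score values, the class labels, and the names scores and classes are assumptions, and the sketch presumes the evaluation function yields one score per class for each sample.

```python
import numpy as np

# Assumed class labels, in the order of the score columns.
classes = np.array(["cat", "dog", "fish"])

# Assumed per-class scores for three samples (one row per sample).
scores = np.array([
    [0.10, 0.70, 0.20],
    [0.90, 0.05, 0.05],
    [0.20, 0.30, 0.50],
])

# Take the highest-scoring column per row, then map the index to a label.
predicted = classes[np.argmax(scores, axis=1)]
```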

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → ProjectionPursuitClassifier

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

class skpp.ProjectionPursuitRegressor(r=10, fit_type='polyfit', degree=3, opt_level='high', example_weights='uniform', out_dim_weights='inverse-variance', eps_stage=0.0001, eps_backfit=0.01, stage_maxiter=100, backfit_maxiter=10, random_state=None, show_plots=False, plot_epoch=50)

Bases: BaseEstimator, TransformerMixin, RegressorMixin, MultiOutputMixin

This class implements the PPR algorithm as detailed in math.pdf.

Parameters:
  • r (int, default=10) – The number of terms in the underlying additive model. The input will be put through r projections, r functions of those projections, and then multiplication by r output vectors to determine output.

  • fit_type ({'polyfit', 'spline'}, default='polyfit') – The kind of function to fit at each stage.

  • degree (int, default=3) – The degree of polynomials or spline-sections used as the univariate approximator between projection and weighted residual targets.

  • opt_level ({'high', 'medium', 'low'}, default='high') – ‘low’ opt_level will disable backfitting. ‘medium’ backfits previous 2D functional fits only (not projections). ‘high’ backfits everything.

  • example_weights (string or array-like of dimension (n_samples,), default='uniform') – The relative importances given to training examples when calculating loss and solving for parameters.

  • out_dim_weights (string or array-like, default='inverse-variance') – The relative importance given to each output dimension when calculating the weighted residual (output of the univariate functions f_j). If all dimensions are equally important but the outputs are on different scales, the inverse variance is a good choice. Possible values: ‘inverse-variance’ (divide outputs by their variances), ‘uniform’ (use a vector of ones as the weights), or an array of custom weights of shape (n_outputs,).

  • eps_stage (float, default=0.0001) – The mean squared difference between the predictions of the PPR at subsequent iterations of a “stage” (fitting an f, beta pair) must reach below this epsilon in order for the stage to be considered converged.

  • eps_backfit (float, default=0.01) – The mean squared difference between the predictions of the PPR at subsequent iterations of a “backfit” must reach below this epsilon in order for backfitting to be considered converged.

  • stage_maxiter (int, default=100) – If a stage does not converge within this many iterations, end the loop and move on. This is useful for divergent cases.

  • backfit_maxiter (int, default=10) – If a backfit does not converge within this many iterations, end the loop and move on. Smaller values may be preferred here since backfit iterations are expensive.

  • random_state (int or numpy.random.RandomState, default=None) – An optional object with which to seed randomness.

  • show_plots (boolean, default=False) – Whether to produce plots of projections versus residual variance throughout the training process.

  • plot_epoch (int, default=50) – If plots are displayed, show them every plot_epoch iterations of the stage-fitting process.
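The stopping rule shared by eps_stage/stage_maxiter and eps_backfit/backfit_maxiter can be sketched as a generic loop. This is an assumed reading of the parameter descriptions above, not the library's implementation: run_until_converged, step, and the toy update are hypothetical names introduced for illustration.

```python
import numpy as np

def run_until_converged(step, predictions, eps=1e-4, maxiter=100):
    """Apply `step` until the mean squared change in predictions drops below eps.

    Returns (predictions, iterations_used, converged).
    """
    for iteration in range(1, maxiter + 1):
        new_predictions = step(predictions)
        # Mean squared difference between successive predictions.
        change = np.mean((new_predictions - predictions) ** 2)
        predictions = new_predictions
        if change < eps:
            return predictions, iteration, True
    # Cap reached without convergence: give up and move on.
    return predictions, maxiter, False

# Toy update (hypothetical): move halfway toward a fixed target each iteration.
target = np.array([1.0, 2.0, 3.0])
halfway = lambda p: p + 0.5 * (target - p)

preds, n_iter, converged = run_until_converged(halfway, np.zeros(3))
```

The iteration cap is what protects against updates that never settle; with such an update the loop simply returns after maxiter iterations with converged set to False.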

fit(X, Y)

Train the model.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – The training input samples.

  • Y (array-like, shape (n_samples,) or (n_samples, n_outputs)) – The target values.

Returns:

self – A trained model.

Return type:

ProjectionPursuitRegressor

predict(X)

Use the fitted estimator to make predictions on new data.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns:

Y – The result of passing X through the evaluation function.

Return type:

array of shape (n_samples,) or (n_samples, n_outputs)
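Based on the description of r (r projections, r univariate functions, r output vectors), the evaluation function presumably has the additive form sum over j of f_j(X alpha_j) beta_j. The sketch below illustrates that form with random cubic polynomials standing in for the fitted f_j. All array names and values are assumptions, and any centering or scaling the library applies is omitted; math.pdf remains the authoritative definition.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, r, d = 5, 4, 3, 2  # samples, features, terms, outputs

X = rng.normal(size=(n, p))
alphas = rng.normal(size=(p, r))  # one projection vector per column
betas = rng.normal(size=(d, r))   # one output vector per column
coeffs = rng.normal(size=(r, 4))  # cubic coefficients per term, highest degree first

# Step 1: project the inputs onto each direction.
projections = X @ alphas  # shape (n, r)

# Step 2: apply term j's univariate polynomial to column j.
activations = np.column_stack(
    [np.polyval(coeffs[j], projections[:, j]) for j in range(r)]
)  # shape (n, r)

# Step 3: combine the r terms through the output vectors.
Y = activations @ betas.T  # shape (n, d)
```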

set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') → ProjectionPursuitRegressor

Request metadata passed to the score method.

Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). See the User Guide for how the routing mechanism works.

The options for each parameter are:

  • True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.

  • False: metadata is not requested and the meta-estimator will not pass it to score.

  • None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.

  • str: metadata should be passed to the meta-estimator with this given alias instead of the original name.

The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.

Added in version 1.3.

Note

This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.

Parameters:

sample_weight (str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED) – Metadata routing for sample_weight parameter in score.

Returns:

self – The updated object.

Return type:

object

transform(X)

Find the projections of X through all alpha vectors in the PPR.

\(A\) is a p x r matrix with one projection vector in each column, and \(X\) is an n x p matrix with one example in each row, so the matrix product \(XA\) is an n x r matrix holding the projections.

Parameters:

X (array-like of shape (n_samples, n_features)) – The input samples.

Returns:

Projections – The projected samples, where r is the hyperparameter given to the constructor (the number of terms in the additive model) and the jth column is the projection of \(X\) through \(\alpha_j\).

Return type:

array of shape (n_samples, r)
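The shape bookkeeping can be checked with a small NumPy example. The matrices below are made-up values for illustration; in the fitted estimator the columns of A would be the learned alpha vectors.

```python
import numpy as np

# n = 3 examples (rows), p = 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# p = 2 rows, r = 3 projection vectors (columns).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, -1.0]])

# Column j of P is the projection of every example onto the jth vector.
P = X @ A  # shape (n, r) = (3, 3)
```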