Ensemble
- class sklearn_ensemble_cv.ensemble.Ensemble(**kwargs)
The Ensemble class is built on top of sklearn.ensemble.BaggingRegressor. It adds methods for computing ECV estimates.
- Attributes:
- estimators_samples_: The subset of drawn samples for each base estimator.
Methods
- compute_cgcv_estimate(X_train, Y_train[, M, ...]): Computes the corrected GCV estimate for the given input data using the provided BaggingRegressor model.
- compute_ecv_estimate(X_train, Y_train[, ...]): Computes the ECV estimate for the given input data using the provided BaggingRegressor model.
- compute_gcv_estimate(X_train, Y_train[, M, ...]): Computes the naive GCV estimate for the given input data using the provided BaggingRegressor model.
- fit(X, y, *[, sample_weight]): Build a Bagging ensemble of estimators from the training set (X, y).
- get_metadata_routing(): Get metadata routing of this object.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Predict regression target for X.
- predict_individual(X[, M, n_jobs, verbose]): Predicts the target values for the given input data using the provided BaggingRegressor model.
- score(X, y[, sample_weight]): Return the coefficient of determination of the prediction.
- set_fit_request(*[, sample_weight]): Request metadata passed to the fit method.
- set_params(**params): Set the parameters of this estimator.
- set_score_request(*[, sample_weight]): Request metadata passed to the score method.
- compute_risk
- extrapolate
- compute_cgcv_estimate(X_train, Y_train, M=None, type='full', return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the corrected GCV (CGCV) estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, ] The target values of the input data.
- type : str, optional
The type of CGCV estimate to compute. Can be either 'full' (using full observations) or 'ovlp' (using overlapping observations).
- return_df : bool, optional
If True, returns the CGCV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_gcv : np.ndarray or pandas.DataFrame
[M_test, ] The CGCV estimate for each ensemble size in M_test.
- compute_ecv_estimate(X_train, Y_train, M_test=None, M0=None, return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the ECV estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, …] The target values of the input data.
- M_test : int or np.ndarray
The maximum ensemble size of the ECV estimate.
- M0 : int, optional
The number of estimators to use for the OOB estimate. If None, M0 is set to the number of estimators in the BaggingRegressor model.
- return_df : bool, optional
If True, returns the ECV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_ecv : np.ndarray or pandas.DataFrame
[M_test, ] The ECV estimate for each ensemble size in M_test.
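The extrapolation idea behind ECV can be sketched without the library. The block below is a minimal illustration, assuming that the squared-error risk of a bagged ensemble of size M has the form risk(M) = risk_inf + (risk_1 - risk_inf) / M, so that risk estimates at sizes 1 and 2 determine the whole curve. That functional form, the helper name extrapolate_risk, and the numeric inputs are assumptions made for illustration; they are not taken from this page and are not the library's implementation.

```python
def extrapolate_risk(risk_1, risk_2, M_values):
    """Extrapolate squared-error risk to each ensemble size in M_values.

    Assumes risk(M) = risk_inf + (risk_1 - risk_inf) / M, which gives
    risk_inf = 2 * risk_2 - risk_1 from the estimates at M = 1 and M = 2.
    """
    risk_inf = 2.0 * risk_2 - risk_1
    return [risk_inf + (risk_1 - risk_inf) / M for M in M_values]


# Hypothetical out-of-bag risk estimates for ensemble sizes 1 and 2.
risk_1, risk_2 = 1.0, 0.75

# Extrapolate to every ensemble size from 1 up to M_test.
M_test = 4
risk_ecv = extrapolate_risk(risk_1, risk_2, range(1, M_test + 1))
```

Under this assumption the curve reproduces the two inputs at M = 1 and M = 2 and decreases toward risk_inf as M grows, which is why a small number of fitted estimators can be enough to estimate the risk at larger ensemble sizes.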
- compute_gcv_estimate(X_train, Y_train, M=None, type='full', return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the naive GCV estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, ] The target values of the input data.
- type : str, optional
The type of GCV estimate to compute. Can be either 'full' (the naive GCV using full observations) or 'union' (the naive GCV using training observations).
- return_df : bool, optional
If True, returns the GCV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_gcv : np.ndarray or pandas.DataFrame
[M_test, ] The GCV estimate for each ensemble size in M_test.
- predict_individual(X: ndarray, M: int = -1, n_jobs: int = -1, verbose: bool = 0) -> ndarray
Predicts the target values for the given input data using the provided BaggingRegressor model.
- Parameters:
- regr : BaggingRegressor
The BaggingRegressor model to use for prediction.
- X : np.ndarray
[n, p] The input data to predict target values for.
- Returns:
- Y_hat : np.ndarray
[n, M] The predicted target values of all $M$ estimators for the input data.
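The [n, M] layout of the return value can be illustrated with plain scikit-learn. This is a minimal sketch, assuming each column holds the prediction of one fitted base estimator. The helper predict_individual_sketch is hypothetical and is not the library's implementation; it ignores the M, n_jobs, and verbose arguments, and the data are synthetic.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor
from sklearn.tree import DecisionTreeRegressor


def predict_individual_sketch(regr, X):
    """Stack the predictions of every fitted base estimator into [n, M]."""
    columns = [
        # Each base estimator only sees the feature subset it was fitted on.
        est.predict(X[:, feats])
        for est, feats in zip(regr.estimators_, regr.estimators_features_)
    ]
    return np.column_stack(columns)


# Synthetic regression data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
y = X[:, 0] + 0.1 * rng.normal(size=60)

regr = BaggingRegressor(
    DecisionTreeRegressor(max_depth=3),
    n_estimators=5,
    max_samples=0.5,
    random_state=0,
).fit(X, y)

Y_hat = predict_individual_sketch(regr, X)  # shape (60, 5)
```

Averaging Y_hat over its columns recovers regr.predict(X), since a bagging regressor predicts with the mean of its base estimators.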
- set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> Ensemble
Request metadata passed to the fit method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in fit.
- Returns:
- self : object
The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> Ensemble
Request metadata passed to the score method.
Note that this method is only relevant if enable_metadata_routing=True (see sklearn.set_config()). Please see the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
Note
This method is only relevant if this estimator is used as a sub-estimator of a meta-estimator, e.g. used inside a Pipeline. Otherwise it has no effect.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for the sample_weight parameter in score.
- Returns:
- self : object
The updated object.