Ensemble
- class sklearn_ensemble_cv.ensemble.Ensemble(**kwargs)
The Ensemble class is built on top of sklearn.ensemble.BaggingRegressor and adds methods for computing ECV estimates.
- Attributes:
- estimators_samples_
The subset of drawn samples for each base estimator.
Methods
compute_cgcv_estimate(X_train, Y_train[, M, ...]): Computes the corrected GCV estimate for the given input data using the provided BaggingRegressor model.
compute_ecv_estimate(X_train, Y_train[, ...]): Computes the ECV estimate for the given input data using the provided BaggingRegressor model.
compute_gcv_estimate(X_train, Y_train[, M, ...]): Computes the naive GCV estimate for the given input data using the provided BaggingRegressor model.
compute_risk(X, Y[, M_test, return_df, avg, ...]): Computes the risk estimate for the given input data using the provided BaggingRegressor model.
extrapolate(risk[, M_test]): Extrapolates the risk estimate for the given ensemble size using the provided BaggingRegressor model.
fit(X, y[, sample_weight]): Build a Bagging ensemble of estimators from the training set (X, y).
get_metadata_routing(): Get metadata routing of this object.
get_params([deep]): Get parameters for this estimator.
predict(X, **params): Predict regression target for X.
predict_individual(X[, M, n_jobs, verbose]): Predicts the target values for the given input data using the provided BaggingRegressor model.
score(X, y[, sample_weight]): Return coefficient of determination on test data.
set_fit_request(*[, sample_weight]): Configure whether metadata should be requested to be passed to the fit method.
set_params(**params): Set the parameters of this estimator.
set_score_request(*[, sample_weight]): Configure whether metadata should be requested to be passed to the score method.
- compute_cgcv_estimate(X_train, Y_train, M=None, type='full', return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the corrected GCV estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, ] The target values of the input data.
- type : str, optional
The type of CGCV estimate to compute. Can be either 'full' (using full observations) or 'ovlp' (using overlapping observations).
- return_df : bool, optional
If True, returns the CGCV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_gcv : np.ndarray or pandas.DataFrame
[M_test, ] The CGCV estimate for each ensemble size in M_test.
- compute_ecv_estimate(X_train, Y_train, M_test=None, M0=None, return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the ECV estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, …] The target values of the input data.
- M_test : int or np.ndarray
The maximum ensemble size of the ECV estimate.
- M0 : int, optional
The number of estimators to use for the OOB estimate. If None, M0 is set to the number of estimators in the BaggingRegressor model.
- return_df : bool, optional
If True, returns the ECV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_ecv : np.ndarray or pandas.DataFrame
[M_test, ] The ECV estimate for each ensemble size in M_test.
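The out-of-bag (OOB) idea behind the M0 parameter can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the library's implementation: the base learner is ordinary least squares, subsamples are drawn without replacement, the loss is squared error, and the helper name oob_risks is hypothetical. Each estimator is scored on the points it did not see, and each pair of estimators on the points neither of them saw.

```python
from itertools import combinations

import numpy as np


def oob_risks(X, y, n_estimators=10, subsample_size=None, seed=0):
    """Out-of-bag squared-error risk for ensembles of size 1 and 2 (illustrative)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = subsample_size or n // 2
    in_bag = np.zeros((n, n_estimators), dtype=bool)
    preds = np.empty((n, n_estimators))
    for m in range(n_estimators):
        # Assumption: subsampling without replacement, least-squares base learner.
        idx = rng.choice(n, size=k, replace=False)
        in_bag[idx, m] = True
        coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        preds[:, m] = X @ coef

    # Ensemble size 1: score each estimator on its own out-of-bag points.
    risk1 = np.mean([
        np.mean((preds[~in_bag[:, m], m] - y[~in_bag[:, m]]) ** 2)
        for m in range(n_estimators)
    ])

    # Ensemble size 2: score each pair on the points out-of-bag for both.
    pair_risks = []
    for a, b in combinations(range(n_estimators), 2):
        oob = ~in_bag[:, a] & ~in_bag[:, b]
        if oob.any():
            avg = 0.5 * (preds[oob, a] + preds[oob, b])
            pair_risks.append(np.mean((avg - y[oob]) ** 2))
    risk2 = np.mean(pair_risks)
    return risk1, risk2


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.ones(5) + rng.normal(size=200)
risk1, risk2 = oob_risks(X, y)
```

Because no estimator in a scored group has seen the evaluation points, these averages behave like held-out risk estimates without setting aside a separate validation set.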
- compute_gcv_estimate(X_train, Y_train, M=None, type='full', return_df=False, n_jobs=-1, verbose=0, **kwargs_est)
Computes the naive GCV estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X_train : np.ndarray
[n, p] The input covariates.
- Y_train : np.ndarray
[n, ] The target values of the input data.
- type : str, optional
The type of GCV estimate to compute. Can be either 'full' (the naive GCV using full observations) or 'union' (the naive GCV using training observations).
- return_df : bool, optional
If True, returns the GCV estimate as a pandas.DataFrame object.
- n_jobs : int, optional
The number of jobs to run in parallel. If -1, all CPUs are used.
- kwargs_est : dict
Additional keyword arguments for the risk estimate.
- Returns:
- risk_gcv : np.ndarray or pandas.DataFrame
[M_test, ] The GCV estimate for each ensemble size in M_test.
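For orientation, the classical GCV criterion for a single linear smoother can be written in a few lines of NumPy. This sketch assumes a ridge base learner with penalty n * lam and uses a hypothetical helper name, ridge_gcv; it does not reproduce the ensemble versions ('full' or 'union') or the correction applied by compute_cgcv_estimate.

```python
import numpy as np


def ridge_gcv(X, y, lam):
    """Classical GCV for ridge regression: mean squared residual / (1 - tr(S)/n)^2."""
    n, p = X.shape
    # Smoother matrix S maps the targets to fitted values: y_hat = S @ y.
    S = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    resid = y - S @ y
    dof = np.trace(S)  # effective degrees of freedom of the smoother
    return np.mean(resid ** 2) / (1.0 - dof / n) ** 2


rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
y = X @ rng.normal(size=10) + rng.normal(size=100)
gcv = ridge_gcv(X, y, lam=0.1)
```

The denominator inflates the training error according to the smoother's degrees of freedom, so the GCV value is never smaller than the training mean squared error when lam is positive.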
- compute_risk(X, Y, M_test=None, return_df=False, avg=True, n_jobs=-1, verbose=0, **kwargs_est)
Computes the risk estimate for the given input data using the provided BaggingRegressor model.
- Parameters:
- X : np.ndarray
[n, p] The input covariates.
- Y : np.ndarray
[n, …] The target values of the input data.
- M_test : int, optional
The ensemble size of the risk estimate.
- return_df : bool, optional
If True, returns the risk estimate as a pandas.DataFrame object.
- Returns:
- risk : np.ndarray or pandas.DataFrame
[M_test, ] The risk estimate for each ensemble size in M_test.
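A risk value for every ensemble size can be obtained from a matrix of individual predictions by averaging the first m columns for m = 1, ..., M. The sketch below assumes squared-error loss and a hypothetical helper name, risk_by_ensemble_size; the library may use a different metric or aggregation (for example through kwargs_est or avg).

```python
import numpy as np


def risk_by_ensemble_size(Y_hat, y):
    """Squared-error risk of the average of the first m estimators, for m = 1..M."""
    M = Y_hat.shape[1]
    # Running mean over columns: column m-1 holds the size-m ensemble prediction.
    running_mean = np.cumsum(Y_hat, axis=1) / np.arange(1, M + 1)  # [n, M]
    return np.mean((running_mean - y[:, None]) ** 2, axis=0)       # [M, ]


y = np.array([1.0, 2.0, 3.0])
Y_hat = np.array([[0.0, 2.0],
                  [2.0, 2.0],
                  [4.0, 2.0]])
risk = risk_by_ensemble_size(Y_hat, y)
# risk[0] is 2/3 (first estimator alone); risk[1] is 0 (the two-estimator average hits y exactly)
```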
- extrapolate(risk, M_test=None)
Extrapolates the risk estimate for the given ensemble size using the provided BaggingRegressor model.
- Parameters:
- risk : np.ndarray
[M0, ] The risk estimate for the ensemble sizes in M0.
- M_test : int or np.ndarray
The ensemble size to extrapolate the risk estimate to.
- Returns:
- risk_ecv : np.ndarray
[M_test, ] The extrapolated risk estimate for each ensemble size in M_test.
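One way to see how such an extrapolation can work: under squared-error loss, the risk of an average of M exchangeable estimators has the form R_M = a + b / M, so the risks at M = 1 and M = 2 determine the whole curve. The sketch below solves for a and b from those two values. That the library's extrapolate uses exactly this form is an assumption, and extrapolate_risk is a hypothetical name.

```python
import numpy as np


def extrapolate_risk(risk_1, risk_2, M_test):
    """Extend risks at ensemble sizes 1 and 2 to sizes 1..M_test via R_M = a + b / M."""
    M = np.arange(1, M_test + 1)
    a = 2.0 * risk_2 - risk_1      # limiting risk as M grows
    b = 2.0 * (risk_1 - risk_2)    # part of the risk that shrinks like 1 / M
    return a + b / M


curve = extrapolate_risk(risk_1=2.0, risk_2=1.5, M_test=4)
# curve is [2.0, 1.5, 1.3333..., 1.25]
```

The first two entries reproduce the inputs, which is a quick sanity check that the two-point fit is consistent.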
- predict_individual(X: ndarray, M: int = -1, n_jobs: int = -1, verbose: bool = 0) -> ndarray
Predicts the target values for the given input data using the provided BaggingRegressor model.
- Parameters:
- X : np.ndarray
[n, p] The input data to predict target values for.
- Returns:
- Y_hat : np.ndarray
[n, M] The predicted target values of all $M$ estimators for the input data.
- set_fit_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> Ensemble
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in fit.
- Returns:
- self : object
The updated object.
- set_score_request(*, sample_weight: bool | None | str = '$UNCHANGED$') -> Ensemble
Configure whether metadata should be requested to be passed to the score method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to score if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to score.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
New in version 1.3.
- Parameters:
- sample_weight : str, True, False, or None, default=sklearn.utils.metadata_routing.UNCHANGED
Metadata routing for sample_weight parameter in score.
- Returns:
- self : object
The updated object.