rfwtools.dim_reduction.pca.do_pca_reduction
- rfwtools.dim_reduction.pca.do_pca_reduction(feature_df, metadata_cols, n_components=3, standardize=True, **kwargs)
Performs PCA on a subset of the columns of feature_df and keeps the specified metadata columns in the results.
- Parameters:
feature_df (DataFrame) – DataFrame containing example information and feature data.
metadata_cols (List[str]) – The column names of feature_df that contain the metadata of the events (labels, etc.). All columns not listed in metadata_cols are used in the PCA.
n_components (int) – The number of principal components to return.
standardize (bool) – Whether to standardize the features, i.e. (x - mean) / stddev.
kwargs – Additional keyword arguments to be passed to sklearn.decomposition.PCA.
- Return type:
Union[Tuple[Union[DataFrame,Series],None],Tuple[Union[DataFrame,Series],PCA]]
- Returns:
A tuple of a DataFrame and the PCA model object after fit_transform has been called. The DataFrame contains the PCA output (pc1, pc2, …, pcN) and the specified metadata_cols. If n_components > len(feature_df), the pc columns will contain no data. If no PCA object can be fit, None is returned in place of the PCA object.
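The behavior described above can be approximated with pandas and scikit-learn directly. The following is a minimal sketch, not rfwtools' own implementation: the helper name pca_reduction_sketch, the use of StandardScaler for the standardization step, and the toy column names (label, f1, f2, f3) are all assumptions made for illustration.

```python
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def pca_reduction_sketch(feature_df, metadata_cols, n_components=3,
                         standardize=True, **kwargs):
    """Hypothetical stand-in for the documented behavior (not rfwtools code)."""
    # Every column not named in metadata_cols is treated as a feature.
    feature_cols = [c for c in feature_df.columns if c not in metadata_cols]
    x = feature_df[feature_cols].to_numpy(dtype=float)

    if standardize:
        # (x - mean) / stddev, computed per feature column.
        x = StandardScaler().fit_transform(x)

    # Extra keyword arguments are forwarded to sklearn.decomposition.PCA.
    pca = PCA(n_components=n_components, **kwargs)
    components = pca.fit_transform(x)

    # Name the outputs pc1, pc2, ..., pcN and keep the metadata beside them.
    pc_names = [f"pc{i + 1}" for i in range(n_components)]
    pc_df = pd.DataFrame(components, columns=pc_names, index=feature_df.index)
    result = pd.concat([feature_df[metadata_cols], pc_df], axis=1)
    return result, pca


# Toy data; the column names are made up for this example.
df = pd.DataFrame({
    "label": ["a", "b", "a", "b", "a"],
    "f1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "f2": [2.0, 1.0, 4.0, 3.0, 6.0],
    "f3": [0.5, 0.1, 0.9, 0.3, 0.7],
})
reduced, model = pca_reduction_sketch(df, metadata_cols=["label"], n_components=2)
print(reduced.columns.tolist())  # ['label', 'pc1', 'pc2']
```

The sketch does not reproduce the fallback cases documented above (empty pc columns, or None in place of the model); it simply lets scikit-learn raise if the requested number of components cannot be fit.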