`msmu.tl.pca`

Perform Principal Component Analysis (PCA) on the specified modality of the MuData object.

References

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct), 2825-2830.

Maćkiewicz, A., & Ratajczak, W. (1993). Principal components analysis (PCA). Computers & Geosciences, 19(3), 303-342.

Parameters:

Name	Type	Description	Default
`mdata`	`MuData`	MuData object containing the data.	required
`modality`	`str`	The modality to perform PCA on.	required
`layer`	`str \| None`	Layer to use for PCA. If None, the default layer (.X) will be used.	`None`
`n_components`	`int \| None`	Number of components to keep. If `n_components` is not set, all components are kept: `n_components == min(n_samples, n_features)`. If `n_components == 'mle'` and `svd_solver == 'full'`, Minka's MLE is used to guess the dimension; `n_components == 'mle'` also causes `svd_solver == 'auto'` to be interpreted as `svd_solver == 'full'`. If `0 < n_components < 1` and `svd_solver == 'full'`, the number of components is chosen so that the explained variance is greater than the fraction specified by `n_components`. If `svd_solver == 'arpack'`, the number of components must be strictly less than the minimum of `n_features` and `n_samples`, so the `None` case results in `n_components == min(n_samples, n_features) - 1`.	`None`
`svd_solver`	`Literal['auto', 'full', 'arpack', 'randomized']`	"auto": the solver is selected by a default policy based on `X.shape` and `n_components`. If the input data has fewer than 1000 features and more than 10 times as many samples as features, the "covariance_eigh" solver is used. Otherwise, if the input data is larger than 500x500 and the number of components to extract is lower than 80% of the smallest dimension of the data, the more efficient "randomized" method is selected. Otherwise the exact "full" SVD is computed and optionally truncated afterwards. "full": run exact full SVD by calling the standard LAPACK solver via `scipy.linalg.svd` and select the components by postprocessing. "arpack": run SVD truncated to `n_components` by calling the ARPACK solver via `scipy.sparse.linalg.svds`; this strictly requires `0 < n_components < min(X.shape)`. "randomized": run randomized SVD by the method of Halko et al.	`'auto'`
`random_state`	`int \| None`	Used when the 'arpack' or 'randomized' solvers are used. Pass an int for reproducible results across multiple function calls.	`0`
`key_added`	`str`	Base key used for PCA outputs. Results are stored in `.obsm[key_added]` (component scores), `.varm[key_added]` (loadings), and `.uns[key_added]` (explained variance metadata). Defaults to "X_pca".	`'X_pca'`
`**kwargs`	`Any`	Additional keyword arguments passed to the PCA constructor.	`{}`
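
The `n_components` and `svd_solver` rules in the table can be hard to follow in prose, so the following pure-Python sketch restates them as three small functions. It is an illustration of the documented behaviour only, not the code that `msmu.tl.pca` runs. The function names (`default_n_components`, `components_for_variance`, `resolve_auto_solver`) are hypothetical. Reading "larger than 500x500" as "the largest dimension exceeds 500" is an assumption, and `resolve_auto_solver` assumes an integer `n_components`.

```python
from itertools import accumulate


def default_n_components(n_samples: int, n_features: int, svd_solver: str) -> int:
    """Number of components kept when n_components is None."""
    limit = min(n_samples, n_features)
    # ARPACK needs strictly fewer components than the smallest dimension.
    return limit - 1 if svd_solver == "arpack" else limit


def components_for_variance(explained_variance_ratio, threshold: float) -> int:
    """Smallest number of leading components whose cumulative explained
    variance ratio is strictly greater than `threshold`."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    ratios = list(explained_variance_ratio)
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative > threshold:
            return k
    # Threshold never exceeded: fall back to keeping every component.
    return len(ratios)


def resolve_auto_solver(n_samples: int, n_features: int, n_components: int) -> str:
    """Solver picked by the documented 'auto' policy (integer n_components)."""
    # Tall, narrow data: few features and many more samples than features.
    if n_features < 1000 and n_samples > 10 * n_features:
        return "covariance_eigh"
    # Assumption: "larger than 500x500" means the largest dimension exceeds 500.
    smallest = min(n_samples, n_features)
    if max(n_samples, n_features) > 500 and n_components < 0.8 * smallest:
        return "randomized"
    return "full"


if __name__ == "__main__":
    # 20 samples x 5000 features, a shape chosen purely for illustration.
    print(default_n_components(20, 5000, "full"))
    print(default_n_components(20, 5000, "arpack"))
    print(components_for_variance([0.5, 0.25, 0.125, 0.125], 0.7))
    print(resolve_auto_solver(20, 5000, 10))
```

With few samples and many features, the first branch of the "auto" policy does not apply, so the choice falls between "randomized" and "full" depending on how many components are requested.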

Returns:

Type	Description
`MuData`	Updated MuData object with PCA results.