Principal Component Analysis

PCA Fundamentals

  • Dimensionality reduction technique that transforms features into new uncorrelated components.
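  • Principal components are linear combinations of original features that capture maximum variance. For n features, the k-th component is PC_k = w_k1*f1 + w_k2*f2 + ... + w_kn*fn, where w_k1..w_kn are that component's weights.
  • The first component captures the most variance; each later component captures the most remaining variance while staying orthogonal to the earlier ones.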
  • PCA can be used for visualization, noise reduction, and as a preprocessing step for other algorithms.
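
To make the points above concrete, here is a minimal NumPy sketch on synthetic data. The two features, the random seed, and all variable names are assumptions chosen for illustration. It builds two correlated features, forms the principal components as linear combinations of them, and computes the covariance of the resulting components.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two correlated features.
rng = np.random.default_rng(0)
f1 = rng.normal(size=500)
f2 = 0.8 * f1 + 0.3 * rng.normal(size=500)
X = np.column_stack([f1, f2])
X = X - X.mean(axis=0)  # center each feature

# Eigen-decomposition of the covariance matrix gives the component weights.
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # reorder so the largest comes first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Column k of `scores` is PC_k = w_k1*f1 + w_k2*f2, with weights eigvecs[:, k].
scores = X @ eigvecs

# Covariance of the components: off-diagonal entries are ~0 (uncorrelated),
# and the diagonal holds the component variances in descending order.
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))
```

The diagonal of score_cov matches the sorted eigenvalues, which is why the eigenvalues can be read as the variance captured by each component.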

PCA Algorithm Steps

  • Standardize each feature to have mean=0 and variance=1, so that features on larger scales do not dominate the components.
  • Calculate the covariance matrix of the features.
  • Compute eigenvalues and eigenvectors of the covariance matrix.
  • Sort eigenvalues in descending order and select the k eigenvectors that correspond to the k largest eigenvalues.
  • Project original data onto the selected eigenvectors to get reduced dimensionality data.
  • Choosing Number of Components: Select the smallest k whose components together explain a desired percentage of the total variance (e.g. 90%).
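
The steps above can be sketched in NumPy as follows. This is one possible implementation under stated assumptions, not a reference one: the function name pca_reduce, the variance_threshold parameter, and the handling of constant features are all hypothetical choices made for illustration.

```python
import numpy as np

def pca_reduce(X, variance_threshold=0.90):
    """Hypothetical helper: reduce X (samples x features) with PCA.

    Returns the projected data, the explained-variance ratio of every
    component, and the number of components k that was kept.
    """
    X = np.asarray(X, dtype=float)

    # Step 1: standardize each feature to mean 0 and variance 1.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # assumption: leave constant features unscaled
    Z = (X - mean) / std

    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(Z, rowvar=False, ddof=0)

    # Step 3: eigenvalues and eigenvectors (eigh suits symmetric matrices).
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 4: sort in descending order of eigenvalue.
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Choose the smallest k whose cumulative explained variance
    # reaches the threshold.
    explained = eigvals / eigvals.sum()
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    k = min(k, len(eigvals))  # guard against floating-point round-off

    # Step 5: project the standardized data onto the top k eigenvectors.
    Z_reduced = Z @ eigvecs[:, :k]
    return Z_reduced, explained, k


# Usage on synthetic data: the third feature is almost a copy of the first,
# so the data has roughly two independent directions of variation.
rng = np.random.default_rng(1)
a = rng.normal(size=200)
b = rng.normal(size=200)
X_demo = np.column_stack([a, b, a + 0.01 * rng.normal(size=200)])
Z_demo, explained_demo, k_demo = pca_reduce(X_demo, variance_threshold=0.90)
print(k_demo, Z_demo.shape)
```

In practice a library routine is usually preferable to hand-rolled eigen-decomposition; the sketch is only meant to mirror the listed steps one by one.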

PCA Concept Visually

[Figure: PCA Concept. Transforming original features into principal components that capture maximum variance.]