Linear Algebra
- Key Concepts
- Language of Data
- Determinants
- Systems of Linear Equations
- Eigenvalues and Eigenvectors
- Vector Spaces and Subspaces
Concept | Definition | Key Operations | Applications |
---|---|---|---|
Vectors and Scalars | Vectors are ordered arrays of numbers representing points in space; scalars are single numbers | Addition, scalar multiplication, dot product, cross product | Data points, feature vectors, gradients in optimization |
Matrices | Rectangular arrays of numbers representing linear transformations | Addition, multiplication, transpose, inverse, determinant | Dataset representation, transformation matrices, covariance matrices |
Systems of Linear Equations | Sets of equations with multiple variables and linear relationships | Gaussian elimination, LU decomposition, matrix inversion | Solving regression problems, network flow analysis, optimization |
Vector Spaces and Subspaces | Collections of vectors closed under addition and scalar multiplication | Basis, dimension, span, linear independence, null space | Understanding data structure, dimensionality reduction, feature spaces |
Eigenvalues and Eigenvectors | Eigenvectors are directions unchanged by linear transformations; eigenvalues are scaling factors | Eigen-decomposition, spectral theorem, power iteration | Principal Component Analysis (PCA), stability analysis, quantum mechanics |
Determinants | Scalar value indicating matrix properties like invertibility and volume scaling | Formula computation, properties, Cramer's rule | Testing matrix invertibility, computing areas and volumes |
Norms | Measures of vector/matrix magnitude or distance | L1 norm (Manhattan), L2 norm (Euclidean), Frobenius norm | Regularization in ML, distance metrics, error measurement |
Singular Value Decomposition (SVD) | Matrix factorization $\mathbf{A} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^T$, where $\mathbf{U}$ and $\mathbf{V}$ are orthogonal and $\boldsymbol{\Sigma}$ is diagonal | Full SVD, reduced SVD, applications in data analysis | Dimensionality reduction, recommender systems, image compression
Orthogonality | Vectors/matrices with dot product zero, representing perpendicular directions | Orthogonal vectors, orthogonal matrices, orthonormal bases | Coordinate system transformation, Gram-Schmidt process, QR decomposition |
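To connect these concepts to computation, here is a minimal NumPy sketch (the matrix `X` and its values are arbitrary illustrative choices, not taken from the tables) showing vector and matrix norms, the reduced SVD, and an orthogonality check:

```python
import numpy as np

# A small data matrix: 4 observations (rows) x 3 features (columns); values are arbitrary
X = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0],
              [2.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])

v = X[0]                                    # one observation as a feature vector
print(np.linalg.norm(v, 1))                 # L1 (Manhattan) norm
print(np.linalg.norm(v, 2))                 # L2 (Euclidean) norm
print(np.linalg.norm(X, 'fro'))             # Frobenius norm of the whole matrix

# Reduced SVD: X = U @ diag(s) @ Vt, with orthonormal columns in U and orthonormal rows in Vt
U, s, Vt = np.linalg.svd(X, full_matrices=False)
print(np.allclose(X, U @ np.diag(s) @ Vt))  # the factorization reconstructs X
print(np.allclose(Vt @ Vt.T, np.eye(3)))    # orthogonality: rows of Vt are orthonormal
```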
Scalars, Vectors, and Matrices
Aspect | Scalars | Vectors | Matrices |
---|---|---|---|
Definition | A single numerical value | An ordered list of numbers (1D array) | A rectangular array of numbers (2D array) |
Notation | Lowercase italic letters: $a, b, c$ | Lowercase bold letters: $\mathbf{v}$, or $\vec{v}$ with an arrow | Uppercase bold letters: $\mathbf{A}, \mathbf{B}$
Dimension | 0D (just one value) | 1D with $n$ components (an $n$-dimensional vector) | 2D with $m \times n$ elements ($m$ rows and $n$ columns)
Characteristics | Magnitude only, no direction | Magnitude and direction (length + orientation) | Collection of numbers arranged in rows and columns |
Examples in Data Analysis | A single measurement or model constant, e.g. a learning rate or a regularization parameter | A feature vector $(x_1, x_2, \dots, x_n)$ describing one observation | A dataset (rows = observations, columns = features) or a covariance matrix
Basic Operations | Multiplication with vectors/matrices (scaling) | Addition, scalar multiplication, dot product, cross product | Addition, multiplication, transpose, inverse, determinant
Key Uses in Data Analysis | Represent individual values, model parameters, hyperparameters | Represent data points (rows) or features (columns), weights in models, distance/similarity measures | Represent datasets, transformations, statistical measures (covariance, correlations), ML computations |
Geometric Meaning | A point on a number line | A directed arrow (length + direction) in space | A transformation of space, mapping vectors to new vectors |
Relevance | Simple descriptive stats or model constants | Data representation, projections, learning algorithms | Dataset storage, transformations, machine learning models, PCA, regression, deep learning |
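A small NumPy sketch (values chosen only for illustration) makes the scalar/vector/matrix distinction and the basic operations above concrete:

```python
import numpy as np

a = 2.5                                   # scalar: a single value (0D)
v = np.array([1.0, 3.0, -2.0])            # vector: ordered list of numbers (1D)
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])            # matrix: 2 x 3 rectangular array (2D)

print(a * v)                # scalar * vector: scales each component
print(v @ v)                # dot product of a vector with itself (squared L2 norm)
print(A @ v)                # matrix * vector: linear transformation of v, result has length 2
print(A.shape, v.shape)     # (2, 3) and (3,): the dimensions described above
```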
Matrices
Matrix Type | Definition / Characteristics | Notation | Key Properties | Relevance in Data Analysis |
---|---|---|---|---|
Identity | Square matrix with 1s on the main diagonal, 0s elsewhere. Acts like scalar 1 in multiplication | $\mathbf{I}$ or $\mathbf{I}_n$ | $\mathbf{A}\mathbf{I} = \mathbf{I}\mathbf{A} = \mathbf{A}$. Leaves vectors/matrices unchanged | "Do nothing" transformation. Defines inverses. Used in regularization (e.g., Ridge Regression)
Zero | All entries are 0. Can be any dimension. Acts like scalar 0 in addition | $\mathbf{0}$ | $\mathbf{A} + \mathbf{0} = \mathbf{A}$. Multiplying with a zero matrix yields a zero matrix (if dimensions match) | Represents baseline/no effect. Used for error analysis (perfect fit = zero error). Useful for padding matrices
Diagonal | Square matrix with nonzero values only on the main diagonal | $\mathbf{D} = \mathrm{diag}(d_1, \dots, d_n)$ | Multiplication simplifies to scaling rows/columns. Easily invertible if diagonal entries are nonzero. Eigenvalues are the diagonal entries | Used for scaling features. PCA eigenvalues appear in diagonal form. Indicates independence/uncorrelated features. Weighted regression methods
Symmetric | Square matrix equal to its transpose | $\mathbf{A} = \mathbf{A}^T$ | All eigenvalues are real. Always diagonalizable. Eigenvectors for distinct eigenvalues are orthogonal | Covariance and correlation matrices. Similarity and kernel matrices in machine learning (e.g., SVMs, clustering)
Inverse | For square matrix $\mathbf{A}$, the inverse $\mathbf{A}^{-1}$ satisfies $\mathbf{A}\mathbf{A}^{-1} = \mathbf{A}^{-1}\mathbf{A} = \mathbf{I}$ | $\mathbf{A}^{-1}$ | Exists only if $\det(\mathbf{A}) \neq 0$. Provides the unique solution to linear equations | Critical for regression, solving systems ($\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$), Kalman filters, and precision matrices
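The special matrix types above can be constructed and their key properties verified directly; the following NumPy sketch uses arbitrary example matrices:

```python
import numpy as np

I = np.eye(3)                              # identity matrix
Z = np.zeros((3, 3))                       # zero matrix
D = np.diag([2.0, 0.5, 1.0])               # diagonal matrix: scales each coordinate
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])                  # symmetric: S equals its transpose

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)                   # inverse exists because det(A) = 10 != 0

print(np.allclose(I @ D, D))               # identity leaves matrices unchanged
print(np.allclose(A + Z[:2, :2], A))       # zero matrix acts like scalar 0 in addition
print(np.allclose(S, S.T))                 # symmetry check
print(np.linalg.eigvalsh(S))               # symmetric matrices have real eigenvalues
print(np.allclose(A @ A_inv, np.eye(2)))   # A times its inverse gives the identity
```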
Determinants
Aspect | Key Points | Examples
---|---|---|
Definition | Determinant is a scalar value computed from a square matrix | Denoted as $\det(\mathbf{A})$ or $|\mathbf{A}|$. Only defined for square matrices
Conceptual Meaning | Measures how a matrix transformation scales area/volume and whether the matrix is invertible | If $\det(\mathbf{A}) = 0$, the matrix is singular and its columns are dependent
Calculation (2×2) | Formula: for $\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, $\det(\mathbf{A}) = ad - bc$ | For $\begin{pmatrix} 2 & 1 \\ 3 & 4 \end{pmatrix}$, determinant $= 2 \cdot 4 - 1 \cdot 3 = 5$
Calculation (3×3) | Methods: Sarrus' Rule (only for 3×3) or cofactor expansion | Example: expand along a row or column, summing each entry times its signed cofactor
Key Properties | $\det(\mathbf{AB}) = \det(\mathbf{A})\det(\mathbf{B})$; $\det(\mathbf{A}^T) = \det(\mathbf{A})$; $\det(\mathbf{A}^{-1}) = 1/\det(\mathbf{A})$; swapping two rows flips the sign; the determinant of a triangular matrix is the product of its diagonal entries | Useful for simplifying computation and understanding structural properties
Invertibility & Solving Systems | $\det(\mathbf{A}) \neq 0$ → inverse exists | In regression, if $\det(\mathbf{X}^T\mathbf{X}) \approx 0$: indicates multicollinearity. Small determinants → numerical instability
Linear Independence & Rank | Zero determinant → linear dependence; matrix rank < dimension | Helps detect redundant features in datasets |
Geometric Meaning | Absolute determinant = scaling factor of area/volume. Sign indicates orientation flip/reflection | If $\det(\mathbf{A}) = 0$: space collapses to a lower dimension (loss of information)
PCA Relevance | Covariance matrix determinant = product of eigenvalues. Zero determinant means some features perfectly correlated | Links determinants to dimensionality reduction and variance in PCA |
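A short NumPy sketch (example matrices are illustrative) ties together the computational and geometric views of the determinant from this table:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [3.0, 4.0]])
print(np.linalg.det(A))            # 2*4 - 1*3 = 5: nonzero, so A is invertible

B = np.array([[1.0, 2.0],
              [2.0, 4.0]])          # second row is twice the first (linearly dependent)
print(np.linalg.det(B))            # ~0: B is singular, its rows/columns are dependent

# Geometric view: |det(A)| is the factor by which A scales area;
# A maps the unit square to a parallelogram of area |det(A)| = 5.
print(abs(np.linalg.det(A)))
```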
Systems of Linear Equations
Aspect | Description | Example | Use Cases
---|---|---|---|
Definition | Set of linear equations with common variables; solutions satisfy all equations simultaneously | $x + 2y = 5$, $3x - y = 1$ | Models constraints, parameter estimation, optimization
Possible Solutions | Exactly one solution, no solution, or infinitely many solutions | Two lines intersecting vs. parallel vs. coincident | Identifies whether models are solvable or if redundancy exists
Matrix Form | Compact representation using coefficient matrix $\mathbf{A}$, variable vector $\mathbf{x}$, and constant vector $\mathbf{b}$ | $\mathbf{A}\mathbf{x} = \mathbf{b}$ | Enables computation with software; foundation for regression, optimization, and network analysis
Gaussian Elimination | Algorithmic row operations to reduce system to row echelon form | Stepwise elimination of variables | Basis for computational solvers; reveals rank, independence, consistency |
Matrix Inversion | Direct solution if $\mathbf{A}$ is square and invertible: $\mathbf{x} = \mathbf{A}^{-1}\mathbf{b}$ | Least squares regression formula $\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$ | Theoretical insight, regression coefficients, but unstable for large/ill-conditioned systems
Applications | Used for regression, optimization, networks, constraint solving | Linear programming, PCA foundation, traffic/circuit analysis | Critical across data science, machine learning, and operations research |
Numerical Considerations | Stability issues can arise for nearly singular systems | A small change in $\mathbf{b}$ produces a large change in $\mathbf{x}$ | Helps diagnose multicollinearity and instability in models
Software Tools | Computational libraries perform solving using efficient methods | Python (`numpy.linalg.solve`), R (`solve()`) | Automates the arithmetic, but conceptual understanding is required for interpretation
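Since the table already points to `numpy.linalg.solve`, here is a minimal sketch (coefficients are illustrative) comparing the direct solver, the inverse-based formula, and least squares for an over-determined system:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)              # preferred: elimination-based, numerically stable
print(x)                               # solution of A x = b, here [2., 3.]

x_inv = np.linalg.inv(A) @ b           # same answer via the inverse; less stable for ill-conditioned A
print(np.allclose(x, x_inv))

# Over-determined system (more equations than unknowns): least squares
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])              # design matrix with an intercept column
y = np.array([1.0, 2.0, 2.5])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimizes ||X beta - y||_2
print(beta)                            # fitted intercept and slope
```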
Eigenvalues and Eigenvectors
Aspect | Eigenvalues ($\lambda$) | Eigenvectors ($\mathbf{v}$)
---|---|---|
Definition | Scalar factors that indicate how much a corresponding eigenvector is stretched or shrunk by a transformation | Non-zero vectors that maintain their direction under a linear transformation, only scaled by their eigenvalue |
Eigen-equation | Appears as $\mathbf{A}\mathbf{v} = \lambda\mathbf{v}$, solved from $\det(\mathbf{A} - \lambda\mathbf{I}) = 0$ | Obtained by solving $(\mathbf{A} - \lambda\mathbf{I})\mathbf{v} = \mathbf{0}$ for each eigenvalue
Conceptual Meaning | Represents the magnitude of the scaling effect of the transformation in a given direction | Represents the directions (axes) along which the transformation acts by pure stretching or shrinking without rotation |
Numerical Example | For $\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$, eigenvalues are $\lambda_1 = 3$, $\lambda_2 = 1$ | For the same matrix: the eigenvector for $\lambda_1 = 3$ is $(1, 1)^T$; for $\lambda_2 = 1$ it is $(1, -1)^T$
Role in PCA | Indicate how much variance each principal component explains (larger eigenvalues = higher variance captured) | Define the principal components themselves, i.e., the new axes along which data varies most |
Data Analysis Impact | Rank importance of directions by variance magnitude, guiding dimensionality reduction | Provide new coordinate system for data that simplifies interpretation and visualization |
Other Applications | Indicate stability in dynamic systems; spectral analysis (graph connectivity, community detection) | Show invariant directions in system dynamics; essential in PCA and SVD for feature extraction & data representation |
Uniqueness | Numerical values are unique (though multiplicity may occur) | Not unique - any scalar multiple of an eigenvector is also an eigenvector (commonly normalized to unit length) |
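The sketch below verifies the eigen-equation on the same 2×2 matrix used in the numerical example above, then applies the idea to a tiny PCA; the synthetic data and its mixing matrix are arbitrary illustrative choices:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)          # columns of eigvecs are the eigenvectors
print(eigvals)                               # 3 and 1 (order is not guaranteed)
v = eigvecs[:, 0]
print(np.allclose(A @ v, eigvals[0] * v))    # verifies A v = lambda v

# PCA in miniature: eigen-decomposition of a covariance matrix
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2)) @ np.array([[2.0, 0.0],
                                             [1.2, 0.5]])   # correlated 2D data (illustrative)
cov = np.cov(data, rowvar=False)             # symmetric covariance matrix
lams, vecs = np.linalg.eigh(cov)             # eigh exploits symmetry, returns ascending eigenvalues
order = np.argsort(lams)[::-1]               # largest variance first
print(lams[order] / lams.sum())              # fraction of variance per principal component
components = vecs[:, order]                  # principal components = eigenvectors of the covariance
```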
Vector Spaces vs Subspaces
Concept | Vector Space | Subspace |
---|---|---|
Definition | A set of vectors closed under addition and scalar multiplication, following specific axioms | A subset of a vector space that itself satisfies all the vector space axioms |
Required Properties | Closure under addition and scalar multiplication, existence of zero vector, additive inverse, associativity, commutativity, distributivity | Contains the zero vector, closed under addition, closed under scalar multiplication |
Examples | $\mathbb{R}^2$, $\mathbb{R}^3$, $\mathbb{R}^n$ | Line through the origin in $\mathbb{R}^2$, plane through the origin in $\mathbb{R}^3$, trivial subspace $\{\mathbf{0}\}$
Geometric Meaning | The full "space" where vectors (data points) live, can be high-dimensional | A smaller "region" inside a larger vector space, such as a line or plane within that space |
Relevance to Data Analysis | Represents entire data feature space, geometric context for similarity, projections, and transformations | Supports dimensionality reduction (PCA), feature combinations, and efficient data representation |
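As a concrete picture of a subspace, the sketch below (direction and data point chosen arbitrarily) projects a point of $\mathbb{R}^3$ onto a line through the origin, the simplest instance of the dimensionality reduction mentioned above:

```python
import numpy as np

# A 1-D subspace of R^3: all scalar multiples of direction d (a line through the origin)
d = np.array([1.0, 2.0, 2.0])
d = d / np.linalg.norm(d)                  # unit-length basis vector for the subspace

x = np.array([3.0, 1.0, 4.0])              # a data point in the full space R^3
x_proj = (x @ d) * d                       # orthogonal projection of x onto the line
print(x_proj)                              # closest point to x that lies in the subspace
print(np.linalg.norm(x - x_proj))          # information lost by reducing to one dimension

# Closure: sums and scalar multiples of points on the line stay on the line
p, q = 2.0 * d, -3.0 * d
print(np.allclose(np.cross(p + q, d), 0))  # p + q is still a multiple of d
```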
Concepts
Concept | Meaning | Use Cases |
---|---|---|
Span | All linear combinations of a set of vectors | Defines the full feature space reachable from given features |
Linear Independence | No vector is redundant; none can be expressed as a combination of others | Identifies redundancy (multicollinearity) and supports dimensionality reduction |
Basis | Minimal set of linearly independent vectors that span the whole space | Provides an optimal coordinate system (e.g., PCA basis) |
Dimension | Number of independent directions (size of a basis) | Indicates data complexity and relates to curse of dimensionality |
Null Space | Vectors mapped to zero under a transformation | Reveals redundancy or loss of information; linked to invertibility and multicollinearity |
Column Space | All linear combinations of matrix columns (reachable outputs) | Defines prediction/output space in regression or linear models |
Row Space | All linear combinations of matrix rows | Provides insight into feature relationships; dimension equals matrix rank |
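Span, rank, column space, and null space can all be inspected numerically; the sketch below uses a matrix deliberately constructed so that its third column is the sum of the first two:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 0.0, 1.0]])
# Column 3 = column 1 + column 2, so the columns are linearly dependent.

print(np.linalg.matrix_rank(A))           # 2: dimension of the column space (= dimension of the row space)

# Null space basis from the SVD: right singular vectors for (near-)zero singular values
U, s, Vt = np.linalg.svd(A)
null_basis = Vt[s < 1e-10 * s.max()]      # each row is a null space basis vector
print(null_basis)                         # proportional to [1, 1, -1]
print(np.allclose(A @ null_basis.T, 0))   # these directions are mapped to zero

# A basis for the column space: the first two (independent) columns of A
basis = A[:, :2]
print(np.linalg.matrix_rank(basis))       # 2: they span the same column space as A
```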