ML Concepts

Machine Learning

  • Branch of AI in which a computer learns patterns from data instead of following explicitly programmed rules.
  • Instead of writing rules, we provide examples and let the model find the patterns.
  • Example
    1. Rule based: If age < 18, give a discount.
    2. ML based: Here are 1000 examples of customers with their age and the discount given. Predict the discount for a new customer from their age.
  • Types of ML
    1. Supervised Learning
    2. Unsupervised Learning
    3. Reinforcement Learning
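
The rule-based versus ML-based discount example above can be sketched in a few lines. This is a minimal illustration, not a real pricing system: the eight (age, discount) pairs are invented, and the "model" is just a single age threshold chosen to agree with as many examples as possible.

```python
# Hypothetical toy data: (age, discount_given) pairs, invented for illustration.
examples = [(12, True), (15, True), (17, True), (16, True),
            (18, False), (21, False), (35, False), (50, False)]

def rule_based_discount(age):
    """Hand-written rule: the programmer chooses the threshold."""
    return age < 18

def learn_age_threshold(data):
    """Pick the threshold t for which 'age < t' agrees with the most examples."""
    candidates = sorted({age for age, _ in data})
    candidates.append(candidates[-1] + 1)  # allows "everyone gets a discount"

    def correct(t):
        return sum((age < t) == label for age, label in data)

    return max(candidates, key=correct)

threshold = learn_age_threshold(examples)

def ml_based_discount(age):
    """Learned rule: the threshold comes from the data, not from the programmer."""
    return age < threshold

print(threshold, ml_based_discount(10), ml_based_discount(40))
```

On this toy data the learned threshold matches the hand-written rule, but nobody typed the number 18 into the learned version; it was recovered from the examples.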

Supervised Learning

  • Model learns from labeled data to predict outcomes.
  • We provide features (X) and the known output (y) to the model.
  • Model finds patterns to predict y from X.
  • Example
    1. Predicting house prices using features like size, location, number of rooms.
    2. Classifying emails as spam or not spam using email content, metadata.
  • Common Algorithms
    1. K-Nearest Neighbors (KNN)
    2. Support Vector Machines (SVM)
    3. Linear Regression
    4. Logistic Regression
    5. Decision Trees

Regression Concept

  • Predicting a continuous output variable based on input features.
  • Output is a real number, e.g., price, temperature, etc.
Regression Example:

  Academic Qualification   Experience Years   Company     Position        Salary
  Bachelors                5                  Google      Developer       100000
  Masters                  8                  Microsoft   Data Engineer   150000
  Masters                  2                  Google      Developer       80000
  Bachelors                1                  Google      Developer       ?????

Regression task: Predicting Salary based on features
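
To make the regression task concrete, here is a minimal sketch that fits a straight line to the three known rows using ordinary least squares. It is a deliberate simplification: only Experience Years is used as a feature (qualification, company, and position are ignored), and three rows are far too few for a trustworthy model.

```python
# Rows from the regression example, reduced to one numeric feature (a simplification).
experience_years = [5, 8, 2]
salary = [100000, 150000, 80000]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(experience_years, salary)

# Predict the unknown salary for the row with 1 year of experience.
predicted_salary = slope * 1 + intercept
print(round(predicted_salary))
```

The output is a real number (roughly 63,333 under these assumptions), which is what makes this a regression problem rather than a classification problem.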

Classification Concept

  • Predicting a categorical output variable based on input features.
  • Output is a class label, e.g., spam/not spam, disease/no disease.
Classification Example:

  Age   Blood Pressure   Cholesterol Level   Diabetes
  25    120              200                 No
  45    140              250                 Yes
  30    130              220                 No
  50    150              300                 ?????

Classification task: Predicting Diabetes based on health metrics
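
The classification task can be sketched with a 1-nearest-neighbor classifier, the simplest form of KNN: the new patient gets the label of the most similar known patient. This is an illustrative sketch only. It uses raw, unscaled values, so cholesterol dominates the distance, and three rows are not enough for any medical conclusion.

```python
import math

# Rows from the classification example: (age, blood pressure, cholesterol) -> label.
X_train = [(25, 120, 200), (45, 140, 250), (30, 130, 220)]
y_train = ["No", "Yes", "No"]

def predict_nearest(query, X, y):
    """Return the label of the training row closest to `query` (1-nearest neighbor)."""
    distances = [math.dist(query, row) for row in X]
    nearest_index = distances.index(min(distances))
    return y[nearest_index]

# The row with the unknown label.
prediction = predict_nearest((50, 150, 300), X_train, y_train)
print(prediction)
```

The output is one of a fixed set of class labels ("Yes" or "No"), not a number on a continuous scale.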

Unsupervised Learning

  • Model learns from unlabeled data to find hidden patterns.
  • We provide only features (X) to the model, with no output (y).
  • Model finds structure, clusters, or associations in data.
  • Example: Grouping customers into segments based on purchasing behavior without predefined labels.
  • Common Algorithms
    1. K-means Clustering
    2. Principal Component Analysis (PCA)
    3. Hierarchical Clustering
    4. Association Rule Learning
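
The customer segmentation example can be sketched with a small k-means implementation. The customer data below is hypothetical, and the initialization (taking the first k points as starting centroids) is a naive choice made to keep the sketch deterministic; real implementations usually pick starting centroids more carefully.

```python
import math

# Hypothetical customers: (monthly spend, visits per month). No labels are given.
customers = [(20, 1), (25, 2), (22, 1), (200, 10), (210, 12), (190, 11)]

def k_means(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = list(points[:k])  # naive initialization: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids

segments, centers = k_means(customers, k=2)
print(segments, centers)
```

The algorithm is never told which customers are "low spend" or "high spend"; the two groups emerge from the structure of the features alone.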

Reinforcement Learning

  • Model learns to make decisions by interacting with an environment.
  • Model receives feedback in the form of rewards or penalties.
  • Goal is to learn a policy that maximizes cumulative reward.
  • Examples
    1. Training a robot to navigate a maze by rewarding for reaching exit & penalizing for collisions.
    2. Teaching an AI agent to play a game by rewarding for winning & penalizing for losing.
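
The maze example can be sketched with tabular Q-learning, one common reinforcement learning algorithm. Everything here is an assumption made for illustration: the "maze" is a five-cell corridor, the reward values (+10 for the exit, -1 for hitting the wall) are invented, and the learning rate, discount factor, and exploration rate are arbitrary.

```python
import random

N_STATES = 5          # corridor cells 0..4; the exit is cell 4
LEFT, RIGHT = 0, 1
EXIT = N_STATES - 1

def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    if action == LEFT:
        if state == 0:
            return state, -1.0, False   # bumped into the wall: penalty
        return state - 1, 0.0, False
    next_state = state + 1
    if next_state == EXIT:
        return next_state, 10.0, True   # reached the exit: reward
    return next_state, 0.0, False

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(100):        # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.choice([LEFT, RIGHT])   # explore
            else:
                best = max(q[state])                 # exploit, random tie-break
                ties = [a for a in (LEFT, RIGHT) if q[state][a] == best]
                action = rng.choice(ties)
            next_state, reward, done = step(state, action)
            # Move the estimate toward reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q

q_table = train()
policy = ["right" if values[RIGHT] > values[LEFT] else "left"
          for values in q_table[:EXIT]]
print(policy)
```

The learned policy is read off the table by taking the higher-valued action in each cell. The agent is never told that "right" is correct; it only sees rewards and penalties.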

ML Terms to Know

  • Feature: Input (independent) variable that the model uses to predict the output.
  • Label: Output (response) variable that we want the model to predict.
  • Model: Mathematical representation of the relationship between features and label.
  • Training: Process of generating a mathematical model from data.
  • Evaluation: Process of assessing the performance of a trained model.
  • Prediction: Output from trained model when given new input data.
  • Underfitting (high bias): Model is too simple to capture the underlying patterns in the data, leading to poor performance on both training and test data.
  • Overfitting (high variance): Model is too complex and captures noise in the training data, leading to good performance on training data but poor generalization to unseen data.
  • Bias-Variance Tradeoff: Balance between underfitting and overfitting to achieve optimal model performance on unseen data.
[Figure: Bias-Variance Tradeoff, showing underfitting, balanced, and overfitting fits]
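
Underfitting and overfitting can be observed by comparing training error with test error. The sketch below uses synthetic data (y = 2x plus noise, invented for illustration) and three hypothetical models: a constant that is too simple, a straight line that matches the true structure, and a "memorizer" that copies the nearest training point, noise included.

```python
import random

rng = random.Random(0)

def make_data(n):
    """Synthetic data: y = 2x plus Gaussian noise (invented for illustration)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

x_train, y_train = make_data(300)
x_test, y_test = make_data(300)

def fit_constant(xs, ys):
    """Too simple: always predicts the training mean (underfits)."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Matches the true structure: least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return lambda x: slope * x + (my - slope * mx)

def fit_memorizer(xs, ys):
    """Too flexible: returns the y of the nearest training x, noise included."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of `model` on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

results = {}
for name, fit in [("constant", fit_constant), ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    print(name, "train:", round(results[name][0], 2), "test:", round(results[name][1], 2))
```

The constant model has high error on both sets (underfitting). The memorizer has zero training error but a worse test error than the line (overfitting). The line sits in between on training error and does best on unseen data.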

Steps for Solving ML Problems

  • Define the problem and determine the type of machine learning task (e.g., regression or classification).
  • Collect data and clean it: handle duplicates, missing values, outliers, etc.
  • Apply feature scaling and encode categorical features.
  • Split data into training and testing sets.
  • Choose appropriate model and train it on training data.
  • Evaluate the model's performance.
  • Deploy the model and monitor its performance.
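
The middle steps above (cleaning, encoding, scaling, splitting, training, evaluating) can be sketched end to end on a tiny dataset. The records, the salary bands, and the choice of a 1-nearest-neighbor model are all hypothetical. The sketch follows the order listed above; in practice, scaling statistics are often computed from the training split only so that no information from the test set leaks into training. Defining the problem and deployment are outside the scope of the sketch.

```python
import math
import random

# Hypothetical raw records: (experience_years, degree, salary_band).
raw = [
    (1, "Bachelors", "low"), (2, "Masters", "low"), (2, "Masters", "low"),  # duplicate
    (3, "Bachelors", "low"), (None, "Bachelors", "low"),                     # missing value
    (7, "Masters", "high"), (8, "Masters", "high"), (9, "Bachelors", "high"),
    (10, "Masters", "high"), (6, "Bachelors", "high"),
]

# Clean: drop exact duplicates, then fill missing experience with the mean.
rows = list(dict.fromkeys(raw))
known = [r[0] for r in rows if r[0] is not None]
mean_exp = sum(known) / len(known)
rows = [(mean_exp if exp is None else exp, deg, band) for exp, deg, band in rows]

# Encode the categorical degree as 0/1 and min-max scale experience to [0, 1].
exps = [r[0] for r in rows]
lo, hi = min(exps), max(exps)
X = [((exp - lo) / (hi - lo), 1.0 if deg == "Masters" else 0.0)
     for exp, deg, _ in rows]
y = [band for _, _, band in rows]

# Split: shuffle the row indices, then cut at roughly 70% train / 30% test.
indices = list(range(len(X)))
random.Random(0).shuffle(indices)
cut = int(0.7 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

# Train: a 1-nearest-neighbor model simply stores the training rows.
def predict(features):
    """Return the label of the closest training row."""
    nearest = min(train_idx, key=lambda i: math.dist(features, X[i]))
    return y[nearest]

# Evaluate: accuracy on the held-out test rows only.
correct = sum(predict(X[i]) == y[i] for i in test_idx)
accuracy = correct / len(test_idx)
print("test accuracy:", accuracy)
```

With only a handful of rows, the accuracy figure is not meaningful in itself; the point is the shape of the workflow, in particular that evaluation uses rows the model never saw during training.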