Introduction to AI

Subhendu Ghosh

2025-07-26

What is Artificial Intelligence?

AI (Artificial Intelligence) is a branch of computer science that focuses on creating systems or machines that can perform tasks that normally require human intelligence.

These tasks include:

  • Learning – improving performance based on data (machine learning)
  • Reasoning – solving problems and making decisions
  • Perception – understanding images, sounds, and other sensory inputs
  • Natural Language Processing – understanding and generating human language
  • Planning and Problem-Solving – deciding actions to achieve a goal

Applications of AI

  • Healthcare – disease diagnosis, drug discovery
  • Transportation – self-driving cars, traffic management
  • E-commerce – product recommendations, chatbots
  • Finance – fraud detection, stock predictions
  • Manufacturing – robots, predictive maintenance
  • Education – AI tutors, personalized learning
  • Entertainment – Netflix/YouTube recommendations, gaming AI

Machine Learning

ML (Machine Learning) is a subset of Artificial Intelligence (AI) that focuses on building systems that can learn from data and improve their performance without being explicitly programmed.

Applications of Machine Learning

  • Netflix – recommending movies and shows
  • Gmail – detecting spam emails
  • Google Maps – predicting traffic and best routes
  • Credit cards – detecting fraudulent transactions

Quick Recap on Linear Regression

Linear Models

General Form

\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \epsilon_i \]

Where:
- \(Y_i\) – dependent variable
- \(X_{ij}\) – predictor variables
- \(\beta_0\) – intercept
- \(\beta_j\) – regression coefficients
- \(\epsilon_i\) – error term

Matrix Form

\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]

Where:
- \(\mathbf{Y}\) : \(n \times 1\) response vector
- \(\mathbf{X}\) : \(n \times (p+1)\) design matrix
- \(\boldsymbol{\beta}\) : \((p+1) \times 1\) coefficient vector
- \(\boldsymbol{\epsilon}\) : \(n \times 1\) error vector
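
As a small symbolic illustration (the sizes \(n = 3\) and \(p = 1\) are chosen here for the example, not taken from the slides), the matrix form expands to:

\[ \begin{bmatrix} Y_1 \\ Y_2 \\ Y_3 \end{bmatrix} = \begin{bmatrix} 1 & X_{11} \\ 1 & X_{21} \\ 1 & X_{31} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \epsilon_3 \end{bmatrix} \]

The leading column of ones in \(\mathbf{X}\) multiplies the intercept \(\beta_0\), which is why the design matrix has \(p+1\) columns rather than \(p\).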

Assumptions of Linear Models

  • Linearity – The relationship between predictors (\(X\)) and response (\(Y\)) is linear.
  • Independence – Observations are independent of each other.
  • Homoscedasticity – The variance of errors is constant for all values of \(X\).
  • Normality of Errors – The errors \(\epsilon_i\) are normally distributed (needed mainly for inference, such as tests and confidence intervals).
  • No Multicollinearity – Predictors should not be highly correlated with each other.

Least Squares Estimation (Multiple Linear Model)

Model:
\[ Y = X\beta + \epsilon \]

Goal: Minimize
\[ SSE = (Y - X\beta)^\top (Y - X\beta) \]

Normal Equations:
\[ X^\top X \hat{\beta} = X^\top Y \]

Solution (assuming \(X^\top X\) is invertible):
\[ \hat{\beta} = (X^\top X)^{-1} X^\top Y \]

Predictions & Residuals:
\[ \hat{Y} = X\hat{\beta}, \quad e = Y - \hat{Y} \]
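
The estimation steps above can be sketched in plain Python. This is a minimal illustration, not a production routine: the function names and the toy data are assumptions made for the example, and the normal equations are solved by Gauss-Jordan elimination rather than by forming the inverse explicitly.

```python
# Minimal sketch of least squares via the normal equations X^T X beta = X^T Y.
# All names and the toy data below are illustrative assumptions.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def solve(a, b):
    """Solve a z = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (check for collinear predictors)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_least_squares(x_rows, y):
    # Prepend a column of ones so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in x_rows]
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(a * b for a, b in zip(col, y)) for col in Xt]
    return solve(XtX, XtY)

def predict(x_rows, beta):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], row)) for row in x_rows]

# Toy data generated from y = 1 + 2x with no noise.
x_rows = [[1], [2], [3], [4]]
y = [3, 5, 7, 9]
beta_hat = fit_least_squares(x_rows, y)
y_hat = predict(x_rows, beta_hat)
residuals = [yi - yh for yi, yh in zip(y, y_hat)]
print("beta_hat:", beta_hat)
print("residuals:", residuals)
```

Because the toy data lie exactly on a line, the estimated coefficients should be close to 1 and 2 and the residuals close to zero, up to floating-point rounding.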

Types of Machine Learning

  • Supervised
    • Regression
    • Classification
  • Unsupervised
  • Semi-Supervised

Supervised Learning

Supervised Learning is a type of machine learning where a model is trained on a labeled dataset (input data with correct outputs) so that it can predict outcomes for new, unseen data.

Regression

  • Regression is a supervised learning technique used to predict continuous numeric values.
  • It learns the relationship between input variables (features) and an output variable (target) to estimate a quantity.
  • Examples: Predicting house prices, stock market trends, temperature forecasting.

Classification

  • Classification is a supervised learning technique used to categorize data into discrete classes or labels.
  • The model learns from labeled examples to assign new data to one of the predefined categories.
  • Examples: Email spam detection, handwriting recognition, disease diagnosis (yes/no).

Unsupervised Learning

  • Unsupervised Learning is a type of machine learning where the model is trained on unlabeled data and tries to find hidden patterns or structures.

  • One common approach is clustering, where the algorithm groups similar data points so that points in the same cluster are more similar to each other than to those in other clusters.

  • Examples: Customer segmentation, grouping news articles, market basket analysis.

Semi-Supervised Learning

  • Semi-Supervised Learning is a machine learning approach that uses both labeled and unlabeled data for training.

  • Typically, a small portion of labeled data is combined with a large amount of unlabeled data.

  • It helps when labeling data is costly or time-consuming, but unlabeled data is abundant.

  • Examples:

    • Web content classification (few labeled pages, many unlabeled)
    • Medical image analysis (few expert-labeled scans)
    • Speech recognition with limited transcripts
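
One simple way to combine the two kinds of data is self-training (pseudo-labeling). The slides do not name a specific method, so this is an assumed illustration: a nearest-centroid classifier on one-dimensional toy data labels only the unlabeled points it is confident about, then refits. The names `self_train` and `min_margin` are hypothetical.

```python
# Sketch of self-training with a nearest-centroid classifier on 1-D data.
# Assumes at least two classes in the labeled set; names and data are illustrative.

def centroids(labeled):
    sums, counts = {}, {}
    for x, label in labeled:
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def self_train(labeled, unlabeled, min_margin=2.0):
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        cents = centroids(labeled)
        confident, uncertain = [], []
        for x in remaining:
            ranked = sorted(cents, key=lambda label: abs(x - cents[label]))
            # Margin: how much closer the best centroid is than the runner-up.
            margin = abs(x - cents[ranked[1]]) - abs(x - cents[ranked[0]])
            if margin >= min_margin:
                confident.append((x, ranked[0]))
            else:
                uncertain.append(x)
        if not confident:
            break  # nothing new was labeled, so stop
        labeled.extend(confident)
        remaining = uncertain
    return labeled, remaining

few_labeled = [(1.0, "a"), (9.0, "b")]
many_unlabeled = [2.0, 3.0, 7.5, 8.0, 5.2]
final_labeled, still_unlabeled = self_train(few_labeled, many_unlabeled)
print(final_labeled)
print(still_unlabeled)
```

Points near one of the two groups receive a pseudo-label, while the ambiguous point in the middle stays unlabeled, which is the cautious behavior a confidence threshold is meant to give.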

Tools for Supervised Learning

Supervised learning uses various algorithms to learn from labeled data. The model families below cover regression, classification, or both.

1. Linear Models

  • Linear Regression – Predicts continuous values.
  • Logistic Regression – For binary or multi-class classification.
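
To make the classification case concrete, here is a minimal sketch of binary logistic regression with one feature, fitted by batch gradient descent on the log loss. The data (hours studied versus pass/fail), the learning rate, and the function names are assumptions for illustration only.

```python
import math

# Sketch: one-feature logistic regression trained by gradient descent.
# Data, hyperparameters, and names are illustrative assumptions.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradient of the mean log loss: mean of (predicted probability - label) * input.
        errors = [sigmoid(b0 + b1 * x) - y for x, y in zip(xs, ys)]
        b0 -= lr * sum(errors) / n
        b1 -= lr * sum(e * x for e, x in zip(errors, xs)) / n
    return b0, b1

hours = [1, 2, 3, 4, 5, 6]
passed = [0, 0, 0, 1, 1, 1]

# Centering the feature helps gradient descent converge.
mean_hours = sum(hours) / len(hours)
centered = [h - mean_hours for h in hours]
b0, b1 = fit_logistic(centered, passed)

def predict_proba(h):
    return sigmoid(b0 + b1 * (h - mean_hours))

def predict_label(h):
    return 1 if predict_proba(h) >= 0.5 else 0

print([predict_label(h) for h in hours])
```

Unlike linear regression, the output is a probability between 0 and 1, which is then thresholded (here at 0.5) to obtain a class label.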

2. Tree-Based Models

  • Decision Trees – Used for both regression and classification.
  • Random Forest – Ensemble of decision trees for better accuracy.
  • Gradient Boosting (XGBoost, LightGBM, CatBoost) – Powerful boosting algorithms.

3. Support Vector Machines (SVM)

  • Works well for classification (binary/multi-class).
  • Adapted for regression as Support Vector Regression (SVR).

4. k-Nearest Neighbors (k-NN)

  • Classifies a point by majority vote among its k nearest neighbors.
  • Can also perform regression by averaging the neighbors' target values.
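
Both uses of k-NN fit in a few lines. The sketch below assumes Euclidean distance and toy data; the function names are illustrative.

```python
from collections import Counter

# Sketch of k-NN for classification (majority vote) and regression (average).
# Training data are (feature_tuple, target) pairs; all names are illustrative.

def nearest(train, query, k):
    """Return the k training pairs closest to query (squared Euclidean distance)."""
    def dist(pair):
        return sum((a - b) ** 2 for a, b in zip(pair[0], query))
    return sorted(train, key=dist)[:k]

def knn_classify(train, query, k=3):
    votes = Counter(label for _, label in nearest(train, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train, query, k=3):
    neighbors = nearest(train, query, k)
    return sum(target for _, target in neighbors) / len(neighbors)

colored_points = [
    ((1, 1), "red"), ((1, 2), "red"), ((2, 1), "red"),
    ((6, 6), "blue"), ((7, 6), "blue"), ((6, 7), "blue"),
]
print(knn_classify(colored_points, (2, 2)))
print(knn_classify(colored_points, (6, 5)))

priced_points = [((1,), 10.0), ((2,), 20.0), ((3,), 30.0), ((10,), 100.0)]
print(knn_regress(priced_points, (2.2,)))
```

There is no training step: the method stores the data and does all of its work at prediction time, so the choice of k and of the distance measure largely determines its behavior.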

5. Neural Networks (Deep Learning)

  • Handle both classification and regression (used in image recognition, NLP, etc.).

Tools for Unsupervised Learning

Unsupervised learning works on unlabeled data to find patterns or structures.
The main approaches are clustering and dimensionality reduction; association rule learning and autoencoders are also common.

1. Clustering Algorithms

  • K-Means Clustering – Partitions data into k clusters based on similarity.
  • Hierarchical Clustering – Creates a tree-like structure of clusters.
  • DBSCAN – Detects clusters of arbitrary shape based on density.
  • Gaussian Mixture Models (GMM) – Probabilistic model for soft clustering.
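
As an illustration of the first method in this list, here is a minimal sketch of k-means (Lloyd's algorithm). It assumes a naive initialization that uses the first k points as starting centers; practical implementations usually use a better initialization. Names and data are illustrative.

```python
# Sketch of k-means: alternate between assigning points to the nearest center
# and moving each center to the mean of its assigned points.

def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100):
    centers = [points[i] for i in range(k)]  # naive init: first k points (assumption)
    assignments = None
    for _ in range(iterations):
        new_assignments = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the algorithm has converged
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, assignments

points = [(1, 1), (1.5, 2), (1, 1.5), (8, 8), (9, 9), (8, 9)]
centers, assignments = kmeans(points, k=2)
print(centers)
print(assignments)
```

On this toy data the three points near the origin end up in one cluster and the three points near (8, 9) in the other, even though both starting centers were in the first group.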

2. Dimensionality Reduction

  • Principal Component Analysis (PCA) – Reduces dimensionality by projecting data onto the directions of greatest variance.
  • t-SNE & UMAP – Non-linear techniques for visualization in low dimensions.
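
For two-dimensional data, PCA can be worked out with a closed-form eigendecomposition of the 2x2 covariance matrix. The sketch below is restricted to that special case and uses assumed toy data; it reduces each point to a single score along the first principal component.

```python
import math

# Sketch of PCA for 2-D data: find the direction of greatest variance and
# project the centered data onto it. Assumes the points are not all identical.

def pca_2d(points):
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix in closed form.
    half_trace = (a + c) / 2
    radius = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    lam1, lam2 = half_trace + radius, half_trace - radius
    # Eigenvector belonging to the larger eigenvalue.
    if abs(b) > 1e-12:
        direction = (b, lam1 - a)
    else:
        direction = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(direction[0], direction[1])
    direction = (direction[0] / norm, direction[1] / norm)
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    explained = lam1 / (lam1 + lam2)
    return direction, scores, explained

# Toy data lying close to the line y = 2x.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
direction, scores, explained = pca_2d(points)
print("first component:", direction)
print("1-D scores:", scores)
print("share of variance kept:", explained)
```

Since the points lie almost on a line, one component keeps nearly all of the variance, so replacing each 2-D point with its 1-D score loses very little information.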

3. Association Rule Learning

  • Apriori / FP-Growth – Finds relationships between items (market basket analysis).
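
Association rules are usually scored by support and confidence. The sketch below computes those two measures and lists frequent item pairs by brute force; it is not the Apriori or FP-Growth algorithm itself, which add pruning to scale to large datasets. The transactions and function names are illustrative assumptions.

```python
from itertools import combinations

# Sketch: support and confidence for market basket data.
# Each transaction is a set of items; data and names are illustrative.

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def frequent_pairs(transactions, min_support):
    all_items = sorted(set().union(*transactions))
    return [
        pair for pair in combinations(all_items, 2)
        if support(transactions, pair) >= min_support
    ]

baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))
print(confidence(baskets, {"bread"}, {"milk"}))
print(frequent_pairs(baskets, min_support=0.5))
```

Here bread and milk appear together in half of the baskets, and two thirds of the baskets containing bread also contain milk, which is the confidence of the rule bread -> milk.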

4. Autoencoders (Neural Networks)

  • Used for unsupervised feature learning and anomaly detection.

Generative AI

Definition

Generative AI is a branch of Artificial Intelligence that focuses on creating new content—such as text, images, audio, video, or code—based on patterns learned from existing data.

It uses generative models (like GPT, DALL·E, Stable Diffusion) to produce outputs that resemble human-created data.

How It Works

  1. The model is trained on large datasets.
  2. It learns the patterns and structure of the data.
  3. It then generates new data similar to the training examples.
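
The three steps above can be seen in miniature with a word-level bigram (Markov chain) model. This is an assumed toy stand-in, far simpler than the neural models named earlier, but it follows the same train, learn patterns, generate loop. The corpus and function names are illustrative.

```python
import random
from collections import defaultdict

# Sketch: a bigram text generator. It records which word follows which in the
# training text, then samples new text from those observed transitions.

def train_bigram(text):
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table

def generate(table, start, length, seed=0):
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    word, out = start, [start]
    for _ in range(length - 1):
        options = table.get(word)
        if not options:
            break  # the current word was never followed by anything in training
        word = rng.choice(options)
        out.append(word)
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
table = train_bigram(corpus)
print(generate(table, start="the", length=8, seed=1))
```

Every word pair in the generated sentence was seen in the training text, yet the sentence as a whole may be new, which is the basic idea behind generating data that resembles the training examples.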

Branches of Generative AI

Generative AI has several branches based on the type of content it generates:

1. Text Generation

  • Creates human-like text, stories, essays, or code.
  • Examples: GPT (ChatGPT), Gemini, Claude.

2. Image Generation

  • Produces realistic or artistic images from text prompts.
  • Examples: DALL·E, Stable Diffusion, Midjourney.

3. Audio & Music Generation

  • Generates speech, sound effects, or music.
  • Examples: MusicLM, VALL-E, ElevenLabs.

4. Video Generation

  • Creates videos from text descriptions or image sequences.
  • Examples: Runway Gen-2, Sora by OpenAI.

5. Code Generation

  • Writes or completes code in various programming languages.
  • Examples: GitHub Copilot, Code Llama.

6. 3D Model & Animation Generation

  • Creates 3D objects or animated characters.
  • Examples: Point-E, DreamFusion.

7. Synthetic Data Generation

  • Produces artificial datasets for training ML models when real data is scarce.