Introduction to AI

Subhendu Ghosh

2025-07-26

What is Artificial Intelligence?

AI (Artificial Intelligence) is a branch of computer science that focuses on creating systems or machines that can perform tasks that normally require human intelligence.

These tasks include:

Learning – improving performance based on data (machine learning)
Reasoning – solving problems and making decisions
Perception – understanding images, sounds, and other sensory inputs
Natural Language Processing – understanding and generating human language
Planning and Problem-Solving – deciding actions to achieve a goal

Applications of AI

Healthcare – disease diagnosis, drug discovery
Transportation – self-driving cars, traffic management
E-commerce – product recommendations, chatbots
Finance – fraud detection, stock predictions
Manufacturing – robots, predictive maintenance
Education – AI tutors, personalized learning
Entertainment – Netflix/YouTube recommendations, gaming AI

Machine Learning

ML (Machine Learning) is a subset of Artificial Intelligence (AI) that focuses on building systems that can learn from data and improve their performance without being explicitly programmed.

Applications of Machine Learning

Netflix – recommending movies and shows
Gmail – detecting spam emails
Google Maps – predicting traffic and best routes
Credit card issuers – detecting fraudulent transactions

Quick Recap on Linear Regression

Linear Models

General Form

\[ Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \dots + \beta_p X_{ip} + \epsilon_i \]

Where:
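- \(Y_i\) – dependent variable (response) for observation \(i\)
- \(X_{ij}\) – value of predictor \(j\) for observation \(i\), with \(j = 1, \dots, p\)
- \(\beta_0\) – intercept
- \(\beta_j\) – regression coefficient of predictor \(j\)
- \(\epsilon_i\) – error term for observation \(i\)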

Matrix Form

\[ \mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon} \]

Where:
- \(\mathbf{Y}\) : \(n \times 1\) response vector
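- \(\mathbf{X}\) : \(n \times (p+1)\) design matrix (a leading column of ones for the intercept, then the \(p\) predictors)
- \(\boldsymbol{\beta}\) : \((p+1) \times 1\) coefficient vector
- \(\boldsymbol{\epsilon}\) : \(n \times 1\) error vector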

Assumptions of Linear Models

Linearity – The relationship between predictors (\(X\)) and response (\(Y\)) is linear.
Independence – Observations are independent of each other.
Homoscedasticity – The variance of errors is constant for all values of \(X\).
Normality of Errors – Residuals (errors) are normally distributed (important for inference).
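No Multicollinearity – Predictors should not be highly correlated with each other; perfectly collinear predictors make \(X^\top X\) singular, so the coefficients are not uniquely determined.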

Least Squares Estimation (Multiple Linear Model)

Model:
\[ Y = X\beta + \epsilon \]

Goal: Minimize
\[ SSE = (Y - X\beta)^\top (Y - X\beta) \]

Normal Equations:
\[ X^\top X \hat{\beta} = X^\top Y \]
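
Where the normal equations come from (a standard derivation, added here as a sketch): expand the SSE and set its gradient with respect to \(\beta\) to zero.

\[ SSE = Y^\top Y - 2\beta^\top X^\top Y + \beta^\top X^\top X \beta \]

\[ \frac{\partial SSE}{\partial \beta} = -2X^\top Y + 2X^\top X\beta = 0 \quad\Longrightarrow\quad X^\top X \hat{\beta} = X^\top Y \]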

Solution (when \(X^\top X\) is invertible, i.e. \(X\) has full column rank):
\[ \hat{\beta} = (X^\top X)^{-1} X^\top Y \]

Predictions & Residuals:
\[ \hat{Y} = X\hat{\beta}, \quad e = Y - \hat{Y} \]
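
A minimal sketch of the estimator in plain Python, assuming a tiny invented data set generated exactly from \(Y = 1 + 2X_1 + 3X_2\). The helper names (`transpose`, `matmul`, `solve`, `fit_ols`, `predict`) are illustrative, not from any library. It solves the normal equations by Gaussian elimination instead of forming the inverse explicitly; in practice a numerical library would normally do this step.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    aug = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular (perfectly collinear predictors?)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_ols(features, y):
    """Return the design matrix X and beta_hat solving X^T X beta = X^T Y."""
    X = [[1.0] + list(row) for row in features]  # column of ones = intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    XtY = [sum(xi * yi for xi, yi in zip(col, y)) for col in Xt]
    return X, solve(XtX, XtY)


def predict(X, beta):
    return [sum(b * x for b, x in zip(beta, row)) for row in X]


# Invented toy data: y = 1 + 2*x1 + 3*x2 with no noise.
features = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in features]

X, beta_hat = fit_ols(features, y)
y_hat = predict(X, beta_hat)
residuals = [yi - yh for yi, yh in zip(y, y_hat)]
print(beta_hat, residuals)
```

Because the toy data contain no noise, the estimated coefficients should match the generating ones up to rounding and the residuals should be essentially zero.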

Types of Machine Learning

graph TD
    A[Machine Learning] --> B[Supervised]
    A --> C[Unsupervised]
    A --> D[Semi-Supervised]

    B --> B1[Regression]
    B --> B2[Classification]

Supervised Learning

Supervised Learning is a type of machine learning where a model is trained on a labeled dataset (input data with correct outputs) so that it can predict outcomes for new, unseen data.

Regression

Regression is a supervised learning technique used to predict continuous numeric values.
It learns the relationship between input variables (features) and an output variable (target) to estimate a quantity.
Examples: Predicting house prices, forecasting stock prices, forecasting temperature.

Classification

Classification is a supervised learning technique used to categorize data into discrete classes or labels.
The model learns from labeled examples to assign new data to one of the predefined categories.
Examples: Email spam detection, handwriting recognition, disease diagnosis (yes/no).

Unsupervised Learning

Unsupervised Learning is a type of machine learning where the model is trained on unlabeled data and tries to find hidden patterns or structures.
One common approach is clustering, where the algorithm groups similar data points so that points in the same cluster are more similar to each other than to those in other clusters.
Examples: Customer segmentation, grouping news articles, market basket analysis.

Semi-Supervised Learning

Semi-Supervised Learning is a machine learning approach that uses both labeled and unlabeled data for training.
Typically, a small portion of labeled data is combined with a large amount of unlabeled data.
It helps when labeling data is costly or time-consuming, but unlabeled data is abundant.
Examples:
- Web content classification (few labeled pages, many unlabeled)
- Medical image analysis (few expert-labeled scans)
- Speech recognition with limited transcripts

Tools for Supervised Learning

Supervised learning uses various algorithms to learn from labeled data. Most of the families below can be used for regression, classification, or both.

1. Linear Models

Linear Regression – Predicts continuous values.
Logistic Regression – For binary or multi-class classification.
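
A hedged sketch of binary logistic regression with one feature, fitted by plain gradient descent on the log-loss. The data (hours studied vs. pass/fail), the learning rate, and the function names are all invented for illustration; real libraries use more robust optimizers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, epochs=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad0 = grad1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss w.r.t. the linear score
            grad0 += error
            grad1 += error * x
        b0 -= learning_rate * grad0 / n
        b1 -= learning_rate * grad1 / n
    return b0, b1


# Invented toy data: hours studied and whether the exam was passed.
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(hours, passed)
p_low = sigmoid(b0 + b1 * 1.0)   # estimated pass probability after 1 hour
p_high = sigmoid(b0 + b1 * 3.5)  # estimated pass probability after 3.5 hours
print(b0, b1, p_low, p_high)
```

The output is a probability, which is turned into a class label by thresholding (commonly at 0.5).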

2. Tree-Based Models

Decision Trees – Used for both regression and classification.
Random Forest – Ensemble of decision trees for better accuracy.
Gradient Boosting (XGBoost, LightGBM, CatBoost) – Powerful boosting algorithms.

3. Support Vector Machines (SVM)

Works well for classification (binary/multi-class).
Adapted for regression as Support Vector Regression (SVR).

4. k-Nearest Neighbors (k-NN)

Classifies a new point by majority vote among its k nearest neighbors in the training data.
Can also perform regression by averaging neighbors.
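
Both uses of k-NN can be sketched in a few lines of standard-library Python. Euclidean distance, `k=3`, and the toy points are assumptions made for the example; the function names are illustrative.

```python
import math
from collections import Counter


def nearest(train, query, k):
    """Return the k (features, target) pairs closest to query (Euclidean distance)."""
    return sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]


def knn_classify(train, query, k=3):
    labels = [label for _, label in nearest(train, query, k)]
    # Majority vote; ties go to the label seen first among the neighbors.
    return Counter(labels).most_common(1)[0][0]


def knn_regress(train, query, k=3):
    values = [value for _, value in nearest(train, query, k)]
    return sum(values) / len(values)  # average of the neighbors' targets


# Invented toy data.
labeled = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
           ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]
numeric = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]

predicted_label = knn_classify(labeled, (1.1, 1.0), k=3)
predicted_value = knn_regress(numeric, (2.1,), k=3)
print(predicted_label, predicted_value)
```

There is no training step: all the work happens at prediction time, which is why k-NN is often called a lazy learner.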

5. Neural Networks (Deep Learning)

Handle both classification and regression (used in image recognition, NLP, etc.).

Tools for Unsupervised Learning

Unsupervised learning works on unlabeled data to find patterns or structures.
The main approaches are clustering and dimensionality reduction.

1. Clustering Algorithms

K-Means Clustering – Partitions data into k clusters based on similarity.
Hierarchical Clustering – Creates a tree-like structure of clusters.
DBSCAN – Detects clusters of arbitrary shape based on density.
Gaussian Mixture Models (GMM) – Probabilistic model for soft clustering.
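
To make the K-Means entry concrete, here is a minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) in standard-library Python. The points, the random initialization from the data, and the fixed seed are assumptions for the example.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Return (centroids, assignment) after alternating assignment and update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # nothing changed, so the algorithm has converged
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return centroids, assignment


# Invented toy data: two well-separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment)
```

The cluster numbers themselves are arbitrary; only which points end up together is meaningful. On less clearly separated data the result can depend on the initialization.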

2. Dimensionality Reduction

Principal Component Analysis (PCA) – Projects data onto a few orthogonal directions that retain as much of the variance as possible.
t-SNE & UMAP – Non-linear techniques for visualization in low dimensions.
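
A rough sketch of the idea behind PCA: find the direction of largest variance and project the centered data onto it. This version uses power iteration on the sample covariance matrix and recovers only the first component; the data and function name are invented, and library implementations typically use an SVD instead.

```python
import math


def first_principal_component(data, iterations=200):
    """Return (direction, explained_variance_ratio, scores) for the top component."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration; assumes the start vector is not orthogonal to the answer.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
    total_variance = sum(cov[a][a] for a in range(d))
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]  # 1-D projection
    return v, variance / total_variance, scores


# Invented toy data lying close to the line y = 2x.
data = [(2.0, 4.1), (3.0, 5.9), (4.0, 8.2), (5.0, 9.8), (6.0, 12.1)]
direction, explained_ratio, scores = first_principal_component(data)
print(direction, explained_ratio)
```

Here almost all of the variance lies along one direction, so the two original columns can be replaced by the single column `scores` with little loss.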

3. Association Rule Learning

Apriori / FP-Growth – Finds relationships between items (market basket analysis).
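
Association rules are usually scored by support and confidence. The sketch below computes only these two measures on invented baskets; it is not the Apriori or FP-Growth search itself, which adds an efficient way of enumerating frequent itemsets.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Invented toy baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

bread_milk_support = support(baskets, {"bread", "milk"})
bread_to_milk_confidence = confidence(baskets, {"bread"}, {"milk"})
print(bread_milk_support, bread_to_milk_confidence)
```

Bread and milk appear together in 2 of 4 baskets (support 0.5), and 2 of the 3 baskets with bread also contain milk (confidence about 0.67).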

4. Autoencoders (Neural Networks)

Used for unsupervised feature learning and anomaly detection.

Generative AI

Definition

Generative AI is a branch of Artificial Intelligence that focuses on creating new content (such as text, images, audio, video, or code) based on patterns learned from existing data.

It uses generative models (like GPT, DALL·E, Stable Diffusion) to produce outputs that resemble human-created data.

How It Works

The model is trained on large datasets.
It learns the patterns and structure of the data.
It then generates new data similar to the training examples.
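
The three steps above can be illustrated with a deliberately tiny stand-in: a word-level bigram (Markov chain) text generator. This is an assumption-laden toy, far simpler than the neural models named in this section, but it follows the same train, learn patterns, generate loop. The corpus and function names are invented.

```python
import random
from collections import defaultdict


def train_bigram_model(text):
    """Learn which words follow which: the 'patterns' of this toy model."""
    words = text.split()
    transitions = defaultdict(list)
    for current, following in zip(words, words[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, seed=0):
    """Generate up to `length` words by repeatedly sampling a follower of the last word."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        followers = transitions.get(word)
        if not followers:
            break  # the last word was never followed by anything in the corpus
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Step 1: training data (invented). Step 2: learn patterns. Step 3: generate.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigram_model(corpus)
sample = generate(model, start="the", length=8)
print(sample)
```

Every adjacent word pair in the output was seen in the corpus, yet the sentence as a whole may be new, which is the sense in which the output is "similar to the training examples".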

Branches of Generative AI

Generative AI has several branches based on the type of content it generates:

1. Text Generation

Creates human-like text, stories, essays, or code.
Examples: GPT (ChatGPT), Gemini, Claude.

2. Image Generation

Produces realistic or artistic images from text prompts.
Examples: DALL·E, Stable Diffusion, Midjourney.

3. Audio & Music Generation

Generates speech, sound effects, or music.
Examples: MusicLM, VALL-E, ElevenLabs.

4. Video Generation

Creates videos from text descriptions or image sequences.
Examples: Runway Gen-2, Sora by OpenAI.

5. Code Generation

Writes or completes code in various programming languages.
Examples: GitHub Copilot, Code Llama.

6. 3D Model & Animation Generation

Creates 3D objects or animated characters.
Examples: Point-E, DreamFusion.

7. Synthetic Data Generation

Produces artificial datasets for training ML models when real data is scarce.