GE143 Science and Technology for Life

Session 3: Introduction to the course

Dr R Batzinger

Payap University

2023-08-15

1 Approaches to Computing

  1. Programming

    • Sequential command processing
    • Functional programming
    • Object-oriented programming
  2. AI and Machine Learning Approaches

    • Regression
    • Decision tree analysis
    • Genetic algorithms
    • Markov states and generative approaches
    • Neural nets with back propagation
    • Self-organizing maps

2 Basic programming

Wizard vs Ogre

2.1 Basic building blocks

  • Variables: assignable memory \(\pi = 3.1415926\)
  • Operators: modify and combine variables \(+, -, \times, \div, \sqrt{\ \ }\)
  • Logic functions: compare variables \(\lt, \le, =, \ne, \ge, \gt\)
  • Functions: process the input to calculate a corresponding output \(y=\sin(x)\)
  • Input stream / Output stream: receiving and sending information to the world

2.2 Process controls

  • Repeated process
x = 0
while (x < 5) {
  try(x)
  x += 1
}

x = 0
until (x >= 5) {
  try(x)
  x += 1
}
  • Conditional processing
if(x == 5) {
   print "smiling"
} else {
   print "frown"
}

unless (x == 5) {
   print "frown"
} else {
   print "smiling"
}

2.3 Actions

The programmer must …

  • specify the actions and nature of every actor,
  • determine which conditions must be tested and the appropriate response to each,
  • arrange the output.

Examples

2.3.1 Basic Actors of this Program

Three sprite objects on a backdrop

2.3.2 Actions of the Wizard

2.3.3 Actions of the Bolt of Lightning

2.3.4 Actions of the Ogre

2.3.5 Actions of the Wizard

2.4 Comparison of different programming modes

  • Sequential Programming
   a = 3
   b = 4
   c = a * a + b * b
   c = sqrt(c)
   print c
  • Functional Programming
   print(sqrt(sum(sq(3),sq(4))))
  • Object-oriented Programming
   t = RightTriangle.new(3,4)
   t.sideC.print

Calculating the hypotenuse of a right triangle

\[c = \sqrt{a^2 + b^2}\]
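The three styles above can also be written as runnable Python. This is a minimal sketch rather than a translation of the slides' pseudo-language: the helper `sq` and the class layout of `RightTriangle` (with a `side_c` method) are illustrative choices.

```python
import math

# Sequential style: one statement after another, updating variables.
a = 3
b = 4
c = a * a + b * b
c = math.sqrt(c)
print(c)


# Functional style: compose small functions; nothing is reassigned.
def sq(x):
    return x * x


print(math.sqrt(sum([sq(3), sq(4)])))


# Object-oriented style: the triangle object holds its own data and behaviour.
class RightTriangle:
    def __init__(self, side_a, side_b):
        self.side_a = side_a
        self.side_b = side_b

    def side_c(self):
        """Length of the hypotenuse."""
        return math.hypot(self.side_a, self.side_b)


t = RightTriangle(3, 4)
print(t.side_c())
```

All three versions print 5.0; they differ only in how the calculation is organised.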

2.5 Contrast to AI development

Programming

  • Requires a programmer who understands the problem domain and the processing language
  • The algorithm is implemented as a set of actions in code written by the programmer
  • Programming logic and choices are hard coded
  • Changing trends and procedures will require revision of the code
  • Test cases are used to demonstrate that the code is correct.

AI

  • Requires a data scientist to develop the dataset
  • The program logic is gleaned from the data
  • Behavior changes with changes in the data
  • Outcome is verified against the data set
  • Machine learning can be achieved from either graded (labelled) data or raw data.

3 Regression

Statistical Machine Learning

3.1 Regression analysis and modelling

Year   Average miles per gallon
1940   14.9
1950   13.6
1960   13.1
1970   13.5
1980   15.5
1986   18.3

3.2 Analysis

3.3 Regression Summary


Call:
lm(formula = y ~ x + x2)

Residuals:
      1       2       3       4       5       6 
-0.2099  0.2796  0.2360 -0.2404 -0.4499  0.3846 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) 15.1098901  0.4071654  37.110  4.3e-05 ***
x           -0.2455934  0.0418513  -5.868  0.00987 ** 
x2           0.0066648  0.0008655   7.700  0.00455 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.442 on 3 degrees of freedom
Multiple R-squared:  0.9688,    Adjusted R-squared:  0.948 
F-statistic: 46.54 on 2 and 3 DF,  p-value: 0.005518
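The summary above can be reproduced without R. The sketch below fits the same quadratic by ordinary least squares in plain Python. It assumes that `x` is the number of years since 1940 and that `x2` is its square; the slides do not state this, but the assumption matches the reported residuals. The function name `fit_quadratic` is illustrative.

```python
# Fuel-economy data from the table in section 3.1.
years = [1940, 1950, 1960, 1970, 1980, 1986]
mpg = [14.9, 13.6, 13.1, 13.5, 15.5, 18.3]

# Assumption: x counts years since 1940 and x2 is its square.
x = [year - 1940 for year in years]


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x^2 via the normal equations."""
    # Each row of the design matrix is [1, x, x^2].
    rows = [[1.0, v, v * v] for v in xs]
    # Build the augmented system (X^T X | X^T y).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [p - factor * q for p, q in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


b0, b1, b2 = fit_quadratic(x, mpg)
print(f"mpg = {b0:.4f} + ({b1:.4f}) x + ({b2:.6f}) x^2")
```

The printed coefficients should land close to the `Estimate` column of the summary (about 15.11, -0.2456 and 0.006665).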

3.4 Regression results

Coefficients   Estimate    Std. Error   t value   Prob
(Intercept)    15.103654   0.536028     28.177    9.81e-05
x              -0.224609   0.055097     -4.077    0.0266
x2              0.006171   0.001139      5.416    0.0123

\[\text{mpg} = 15.1 - 0.225\,t + 0.00617\,t^2\]

where \(t\) is the number of years since 1940.

3.5 Forecasting


Call:
arima(x = log10(AirPassengers), order = c(0, 1, 1), seasonal = list(order = c(0, 
    1, 1), period = 12))

Coefficients:
          ma1     sma1
      -0.4018  -0.5569
s.e.   0.0896   0.0731

sigma^2 estimated as 0.0002543:  log likelihood = 353.96,  aic = -701.92

Call:
arima(x = log10(AirPassengers), order = c(0, 1, 1), seasonal = list(order = c(0, 
    1, 1), period = 12), method = "CSS")

Coefficients:
          ma1     sma1
      -0.3772  -0.5724
s.e.   0.0883   0.0704

sigma^2 estimated as 0.0002619:  part log likelihood = 354.32

Call:
arima(x = window(log10(AirPassengers), start = 1954), order = c(0, 1, 1), seasonal = list(order = c(0, 
    1, 1), period = 12))

Coefficients:
          ma1     sma1
      -0.4797  -0.4460
s.e.   0.1000   0.1514

sigma^2 estimated as 0.0001603:  log likelihood = 208.02,  aic = -410.04

3.6 Plot of the data

3.7 Decomposition of a data trend

3.8 Multivariate Regression - judging wine quality

\[\pmatrix{+ \frac{fixed}{acidity} & + \frac{volatile}{acidity}\\ + \frac{citric}{acid} & + \frac{residual}{sugar}\\ + {\small chlorides} & + {\small alcohol}\\ + \frac{free\ sulfur}{dioxide} & + \frac{total\ sulfur}{dioxide}\\ + {\small density} & + {\small sulphates}\\ + {\small pH}\\}\Rightarrow \frac{quality}{score}\]

3.9 Regression analysis on 12 variables


Call:
lm(formula = quality ~ fixedAcidity + volatileAcidity + citricAcid + 
    residualSugar + chlorides + freeSulfurDioxide + totalSulfurDioxide + 
    density + pH + sulphates + alcohol, data = red)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.68911 -0.36652 -0.04699  0.45202  2.02498 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         2.197e+01  2.119e+01   1.036   0.3002    
fixedAcidity        2.499e-02  2.595e-02   0.963   0.3357    
volatileAcidity    -1.084e+00  1.211e-01  -8.948  < 2e-16 ***
citricAcid         -1.826e-01  1.472e-01  -1.240   0.2150    
residualSugar       1.633e-02  1.500e-02   1.089   0.2765    
chlorides          -1.874e+00  4.193e-01  -4.470 8.37e-06 ***
freeSulfurDioxide   4.361e-03  2.171e-03   2.009   0.0447 *  
totalSulfurDioxide -3.265e-03  7.287e-04  -4.480 8.00e-06 ***
density            -1.788e+01  2.163e+01  -0.827   0.4086    
pH                 -4.137e-01  1.916e-01  -2.159   0.0310 *  
sulphates           9.163e-01  1.143e-01   8.014 2.13e-15 ***
alcohol             2.762e-01  2.648e-02  10.429  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.648 on 1587 degrees of freedom
Multiple R-squared:  0.3606,    Adjusted R-squared:  0.3561 
F-statistic: 81.35 on 11 and 1587 DF,  p-value: < 2.2e-16

3.10 Refined analysis


Call:
lm(formula = quality ~ volatileAcidity + chlorides + freeSulfurDioxide + 
    totalSulfurDioxide + pH + sulphates + alcohol, data = red)

Residuals:
     Min       1Q   Median       3Q      Max 
-2.68918 -0.36757 -0.04653  0.46081  2.02954 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         4.4300987  0.4029168  10.995  < 2e-16 ***
volatileAcidity    -1.0127527  0.1008429 -10.043  < 2e-16 ***
chlorides          -2.0178138  0.3975417  -5.076 4.31e-07 ***
freeSulfurDioxide   0.0050774  0.0021255   2.389    0.017 *  
totalSulfurDioxide -0.0034822  0.0006868  -5.070 4.43e-07 ***
pH                 -0.4826614  0.1175581  -4.106 4.23e-05 ***
sulphates           0.8826651  0.1099084   8.031 1.86e-15 ***
alcohol             0.2893028  0.0167958  17.225  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.6477 on 1591 degrees of freedom
Multiple R-squared:  0.3595,    Adjusted R-squared:  0.3567 
F-statistic: 127.6 on 7 and 1591 DF,  p-value: < 2.2e-16
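The fitted model is simply a weighted sum, so a quality score can be predicted by hand or in a few lines of code. The sketch below copies the coefficients from the refined summary above. The `sample_wine` measurements are hypothetical values chosen for illustration, and `predict_quality` is an illustrative name.

```python
# Coefficients copied from the refined regression summary.
INTERCEPT = 4.4300987
COEFFICIENTS = {
    "volatileAcidity": -1.0127527,
    "chlorides": -2.0178138,
    "freeSulfurDioxide": 0.0050774,
    "totalSulfurDioxide": -0.0034822,
    "pH": -0.4826614,
    "sulphates": 0.8826651,
    "alcohol": 0.2893028,
}


def predict_quality(wine):
    """Linear prediction: intercept plus coefficient * measurement per variable."""
    return INTERCEPT + sum(COEFFICIENTS[name] * wine[name] for name in COEFFICIENTS)


# Hypothetical wine, invented for illustration only.
sample_wine = {
    "volatileAcidity": 0.70,
    "chlorides": 0.076,
    "freeSulfurDioxide": 11,
    "totalSulfurDioxide": 34,
    "pH": 3.51,
    "sulphates": 0.56,
    "alcohol": 9.4,
}

print(round(predict_quality(sample_wine), 2))
```

For this sample the predicted score is about 5.02. Note how the negative coefficients (volatile acidity, chlorides, pH) pull the score down while alcohol and sulphates push it up.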

3.11 Accuracy

Comparison of Recorded to Predicted Quality Score

\[\small\matrix{Rec/Pre & 4.0 & 4.5 & 5.0 & 5.5 & 6.0 & 6.5 & 7.0 & 7.5 & Total\\ \hline 3 & & 2 & 7 & 1 & & & & & 10 \\ 4 & & 3 & 21 & 22 & 6 & 1 & & & 53 \\ 5 & 1 & 6 & 273 & 305 & 88 & 7 & & 1 & 681 \\ 6 & & 2 & 72 & 257 & 221 & 81 & 5 & & 638 \\ 7 & & & 1 & 20 & 87 & 84 & 7 & & 199 \\ 8 & & & & & 6 & 11 & 1 & & 18 \\ \hline Total & 1 & 13 & 374 & 605 & 408 & 184 & 13 & 1 & 1599 \\}\]

4 Genetic algorithms

Process based on evolutionary biology

  1. Selection of best candidates
  2. Combinations of best solutions
  3. Crossover
  4. Mutation
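The four steps above can be sketched on a toy problem. The example below evolves a string of bits toward all 1s; this objective, the population size, the mutation rate and every function name are assumptions made for illustration, not details from the slides.

```python
import random

GENOME_LENGTH = 20      # bits per candidate (assumed toy size)
POPULATION_SIZE = 30
GENERATIONS = 40
MUTATION_RATE = 0.02


def fitness(genome):
    """Toy objective: count the 1 bits (best possible score is GENOME_LENGTH)."""
    return sum(genome)


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: head of one parent joined to the tail of the other."""
    point = rng.randint(1, GENOME_LENGTH - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(genome, rng):
    """Flip each bit with a small probability."""
    return [1 - bit if rng.random() < MUTATION_RATE else bit for bit in genome]


def evolve(seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    history = []
    for _ in range(GENERATIONS):
        # 1. Selection: keep the better half of the population.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:POPULATION_SIZE // 2]
        # 2-4. Combination, crossover and mutation produce the children.
        children = []
        while len(parents) + len(children) < POPULATION_SIZE:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        # Parents survive unchanged, so the best score never gets worse.
        population = parents + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because many candidates are kept at once, different parts of the search space are explored in parallel instead of following a single path.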

4.1 Combination and crossover

4.1.1 Genetic mutation

4.2 Searching for solutions in complicated domain space

  • Most searches are basically hill-climbing exercises to find the maximum value
  • The search follows an ascending path until a peak is found
  • However, multiple peaks and valleys can hide the true solution
  • Genetic algorithms can search multiple areas simultaneously to discover the answer faster.

4.3 A multipeak problem

4.3.1 Mastermind

Mastermind Game
  • The goal is to identify the sequence of 4 colors.
  • There are 8 possible colors.
  • A red score means a color is correct but in the wrong column.
  • A green score means a color is correct and in the right column.
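The scoring rules above can be written as a short function. This is a sketch in which colors are encoded as the numbers 0 to 7; that encoding and the name `score_guess` are assumptions for illustration.

```python
from collections import Counter


def score_guess(secret, guess):
    """Return (green, red) pegs for a guess against the secret sequence."""
    # Green: right color in the right column.
    green = sum(s == g for s, g in zip(secret, guess))
    # Colors shared by both sequences, ignoring position.
    shared = sum((Counter(secret) & Counter(guess)).values())
    # Red: right color, wrong column.
    red = shared - green
    return green, red


# Colors are encoded as the numbers 0-7 (eight possible colors).
secret = [1, 2, 3, 4]
print(score_guess(secret, [1, 3, 5, 2]))  # (1, 2)
print(score_guess(secret, [1, 2, 3, 4]))  # (4, 0)
```

A genetic algorithm for Mastermind could use such a score as its fitness measure, preferring guesses that agree with the scores received so far.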

5 Markov Chains

Measure the average transition rate over a year

From       To Urban   To Suburban
Urban      0.95       0.05
Suburban   0.03       0.97

\[\Updownarrow\]

City dwellers

5% move
95% stay

 

Rural / Suburban dwellers

3% move
97% stay

5.1 Markov Chains

\[\begin{matrix} & & Population & Transition & & Year end\\ Year & & Urban : Rural & matrix & & Population \\ 1: & & \left[\begin{matrix}130000 & 50000\\\end{matrix}\right]& \left[\begin{matrix} 0.95 & 0.05\\ 0.07 & 0.93\\ \end{matrix}\right] & \longrightarrow & \left[\begin{matrix}127000 & 53000\\\end{matrix}\right]\\ 2: & & \left[\begin{matrix}127000& 53000\\\end{matrix}\right]& \left[\begin{matrix} 0.95 & 0.05\\ 0.07 & 0.93\\ \end{matrix}\right] & \longrightarrow & \left[\begin{matrix}124360 & 55640\\\end{matrix}\right]\\ 3: & & \left[\begin{matrix}124360& 55640\\\end{matrix}\right]& \left[\begin{matrix} 0.95 & 0.05\\ 0.07 & 0.93\\ \end{matrix}\right] & \longrightarrow & \left[\begin{matrix}122037 & 57963\\\end{matrix}\right]\\ 4: & &\left[\begin{matrix} 122037 & 57963\\\end{matrix}\right]& \left[\begin{matrix} 0.95 & 0.05\\ 0.07 & 0.93\\ \end{matrix}\right] & \longrightarrow & \left[\begin{matrix}119992& 60008\\\end{matrix}\right]\\ & & & \dots &\\ 20:& &\left[\begin{matrix}105000& 75000\\\end{matrix}\right]& \left[\begin{matrix} 0.95 & 0.05\\ 0.07 & 0.93\\ \end{matrix}\right] & \longrightarrow & \left[\begin{matrix}105000& 75000\\\end{matrix}\right]\\ \end{matrix}\]
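The year-by-year figures above come from multiplying the population vector by the transition matrix once per year. The sketch below repeats that calculation using the matrix from the worked example (0.95/0.05 and 0.07/0.93); the function name `step` is illustrative.

```python
# Transition probabilities taken from the worked example.
TRANSITION = [
    [0.95, 0.05],  # urban -> urban, urban -> rural
    [0.07, 0.93],  # rural -> urban, rural -> rural
]


def step(population):
    """Multiply the row vector [urban, rural] by the transition matrix."""
    return [
        sum(population[i] * TRANSITION[i][j] for i in range(2))
        for j in range(2)
    ]


population = [130000, 50000]
for year in range(1, 5):
    population = step(population)
    print(year, [round(p) for p in population])

# Repeating the step many times approaches the steady state.
steady = [130000, 50000]
for _ in range(200):
    steady = step(steady)
print([round(p) for p in steady])
```

The first four lines reproduce the year-end populations shown above, and the last line prints the steady state of 105000 urban and 75000 rural residents.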

6 University students

Transition Matrix

\[\small\matrix{ From/to & Freshmen & Sophomore & Junior & Senior & Drop & Graduate\\ \hline Freshmen & 0.10 & 0.70 & & & 0.20 & \\ Sophomore & & 0.05 & 0.90 & & 0.05 & \\ Junior & & & 0.07 & 0.80 & 0.03 & 0.10 \\ Senior & & & & 0.05 & 0.02 & 0.93 \\ Drop & & & & & 1.00 &\\ Graduate & & & & & & 1.00 \\ }\]

7 Calculated Student Retention

\[\matrix{Year & Freshmen & Sophomore & Junior & Senior & Dropout & Graduate \\ \hline 1 & 200 & & & & & \\ 2 & 20 & 140 & & & 40 & \\ 3 & 2 & 21 & 126 & & 51 & \\ 4 & & 2 & 28 & 101 & 56 & 13 \\ 5 & & & 4 & 27 & 59 & 109\\ 6 & & & 1 & 5 & 60 & 135 \\ 7 & & & & 1& 60 & 139 \\ 8 & & & & & 60 & 140 \\ }\]
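The retention table above can be regenerated by applying the student transition matrix to a starting cohort of 200 freshmen. The sketch below takes the Senior to Graduate rate as 0.93, the value for which the Senior row sums to 1; the name `next_year` is illustrative.

```python
STATES = ["Freshmen", "Sophomore", "Junior", "Senior", "Dropout", "Graduate"]

# Rows: from-state; columns: to-state (same order as STATES).
# Assumption: Senior -> Graduate is 0.93 so that the Senior row sums to 1.
TRANSITION = [
    [0.10, 0.70, 0.00, 0.00, 0.20, 0.00],
    [0.00, 0.05, 0.90, 0.00, 0.05, 0.00],
    [0.00, 0.00, 0.07, 0.80, 0.03, 0.10],
    [0.00, 0.00, 0.00, 0.05, 0.02, 0.93],
    [0.00, 0.00, 0.00, 0.00, 1.00, 0.00],
    [0.00, 0.00, 0.00, 0.00, 0.00, 1.00],
]


def next_year(cohort):
    """Move every student according to the transition probabilities."""
    size = len(STATES)
    return [
        sum(cohort[i] * TRANSITION[i][j] for i in range(size))
        for j in range(size)
    ]


table = [[200, 0, 0, 0, 0, 0]]
for _ in range(7):
    table.append(next_year(table[-1]))

for year, row in enumerate(table, start=1):
    print(year, [round(n) for n in row])
```

Fractions of students are carried forward between years and rounded only for display, so the printed rows should closely match the table.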

8 Neural network

\(\small\left[\begin{matrix}\circ\bullet\bullet\bullet\circ &\circ\circ\bullet\circ\circ & \bullet\bullet\bullet\bullet\circ& \bullet\bullet\bullet\bullet\circ &\circ\circ\bullet\circ\bullet &\bullet\bullet\bullet\bullet\bullet\\ \bullet\circ\circ\circ\bullet &\circ\bullet\bullet\circ\circ & \circ\circ\circ\circ\bullet & \circ\circ\circ\circ\bullet &\circ\bullet\circ\circ\bullet &\bullet\circ\circ\circ\circ \\ \bullet\circ\circ\circ\bullet &\circ\circ\bullet\circ\circ &\circ\bullet\bullet\bullet\circ & \circ\bullet\bullet\bullet\circ &\bullet\bullet\bullet\bullet\bullet &\bullet\bullet\bullet\bullet\circ\\ \bullet\circ\circ\circ\bullet &\circ\circ\bullet\circ\circ & \bullet\circ\circ\circ\circ & \circ\circ\circ\circ\bullet &\circ\circ\circ\circ\bullet &\circ\circ\circ\circ\bullet\\ \circ\bullet\bullet\bullet\circ &\circ\bullet\bullet\bullet\circ&\bullet\bullet\bullet\bullet\bullet& \bullet\bullet\bullet\bullet\circ &\circ\circ\circ\circ\bullet &\bullet\bullet\bullet\bullet\circ\\ & & & & &\\ \circ\circ\circ\bullet\circ &\bullet\bullet\bullet\bullet\bullet & \circ\bullet\bullet\bullet\circ& \circ\bullet\bullet\bullet\circ &\circ\circ\circ\circ\circ& \circ\circ\circ\circ\circ\\ \circ\circ\bullet\circ\circ & \circ\circ\circ\circ\bullet &\bullet\circ\circ\circ\bullet & \bullet\circ\circ\circ\bullet&\circ\circ\circ\circ\circ & \circ\circ\circ\circ\circ \\ \circ\bullet\bullet\bullet\circ & \circ\circ\circ\bullet\circ &\circ\bullet\bullet\bullet\circ& \circ\bullet\bullet\bullet\bullet &\bullet\bullet\bullet\bullet\bullet& \circ\circ\circ\circ\circ\\ \bullet\circ\circ\circ\bullet & \circ\circ\bullet\circ\circ &\bullet\circ\circ\circ\bullet& \circ\circ\circ\circ\bullet & \circ\circ\circ\circ\circ & \circ\bullet\bullet\circ\circ\\ \circ\bullet\bullet\bullet\circ & \circ\circ\bullet\circ\circ & \circ\bullet\bullet\bullet\circ &\circ\circ\circ\circ\bullet &\circ\circ\circ\circ\circ & \circ\bullet\bullet\circ\circ\\ \end{matrix}\right]\Longrightarrow\left[\begin{matrix}0\\ 1\\ 
2\\ 3\\ 4\\ 5\\ 6\\ 7\\ 8\\ 9\\ -\\ .\\ \end{matrix}\right]\)

8.1 Layers

8.2 Digital Neuron

8.3 NeuralNet

8.4 Network Learning

9 Self-organizing map

9.1 SOM Examples

9.2 SOM - wines

$unit.classif
 [1]  8  1 12  6 11 11  2  6  2  8 18  3 22 19 18 23  3 22 20  5  5 14  5 10 15
[26]  5 10

$distances
 [1]  4.169619  1.009442  4.037950  3.485934  3.341263  1.689652  3.793879
 [8]  1.803102  1.266725  2.386014  5.661455 16.737000  2.295422  5.931455
[15] 18.710348  3.596056 25.163618  2.125708  3.504451  5.475336  6.467870
[22]  6.905745  3.308457  2.783191  2.699065  5.583573  4.590570

$whatmap
[1] 1

$user.weights
[1] 1

9.3 SOM Wines

9.4 Usefulness of SOM

  • Self-learning algorithm
  • Clusters data into useful groups
  • Compares new data to the nearest groups
  • Useful for an initial analysis of unprocessed data
  • Attempts to group items by similarity

10 Spell checking sample

Eye halve a spelling checker It came with my pea sea. It plainly marks four my revue miss steaks eye kin knot sea. Eye strike a quay and type a word and weight for it to say Weather eye yam wrong oar write. It shows me strait a weigh as soon as a mist ache is maid. It nose bee fore two long and eye can put the error rite. Its rare lea ever wrong. Eye have run this poem threw it, I am shore your pleased to no. Its letter perfect awl the way. My checker told me sew.

10.1 Grammarly corrected

Eye halve a spelling checker It came with my pea sea. It marks four my revue miss steaks eye kin knot sea. Eye strike a quay and type a word and weight for it to say Weather eye yam wrong oar write. It shows me straight a weight as soon as a mist ache is made. It nose bee fore two long and eye can put the error rite. It’s rare lea ever wrong. Eye have run this poem threw it, I shore your pleased to no. Its letter is perfect awl the way. My checker told me to sew.

10.2 MS Word corrected

Eyes halve a spelling checker It came with my pea sea. It plainly marks four my revue misses steaks eye kin knot sea. Eye strike a quay and type a word and weight for it to say Weather eye yam wrong or write. It shows me straight a weigh as soon as a mist ache is a maid. It noses bee fore two long and eye can put the error rite. It’s rare lea ever wrong. Eye have run this poem threw it, I shore your pleased to no. Its letter is perfect all the way. My checker told me to sew.

10.3 Manual proof-copy

I have a spelling checker It came with my PC. It plainly marks for my review mistakes I cannot see. I strike a key and type a word and wait for it to say whether I am wrong or right. It shows me straight away as soon as a mistake is made. It knows before too long and I can put the error right. It’s rarely ever wrong. I have run this poem threw it, I am sure you’re pleased to know it’s letter perfect all the way. My checker told me so.

10.4 Bing Chat correction

I have a spelling checker. It came with my PC. It plainly marks for my review mistakes I cannot see. I strike a key and type a word and wait for it to say whether I am wrong or right. It shows me straight away as soon as a mistake is made. It knows before too long and I can put the error right. It’s rarely ever wrong. I have run this poem through it, I am sure you’re pleased to know. It’s letter perfect all the way. My checker told me so.

10.5 Bing Chat Rewrite

My PC has a spelling checker that marks mistakes I can’t see. I type a word and press a key to check if it’s right or wrong. It shows me right away when I make an error. It knows before long and I can fix it easily. It hardly ever fails. I ran this poem through it and it said it’s perfect. That’s what my checker told me.

10.6 Results

Method                   Changes   Corrections
Grammarly free edition   10        8
MS Word 360              13        10
Bing Chat                38        38
Manual copy editing      38        38

11 ChatGPT

  • Launched on November 30, 2022, by OpenAI
  • Result of over 100 years of natural language processing
  • Uses associations to understand prompts and summarize text
  • Uses a generative model to format replies
  • Can match content, format and tone

11.1 How ChatGPT works

  • Uses an ultra-large language model
  • Summarizes and interprets natural language prompts
  • Retrieves relevant and related information
  • Formats reports modelled after published documents
  • Attempts to reply based on the most likely text to follow
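The idea of replying with the most likely text to follow can be illustrated with a toy word-pair model. This is only a sketch of the principle: a real large language model uses a neural network trained on vast amounts of text rather than a lookup table, and the training sentence and the name `most_likely_next` are made up for this example.

```python
from collections import Counter, defaultdict

# Made-up training text for illustration.
training_text = "the cat sat on the mat and the cat ate the fish"

# Count which word follows each word.
followers = defaultdict(Counter)
words = training_text.split()
for current_word, next_word in zip(words, words[1:]):
    followers[current_word][next_word] += 1


def most_likely_next(word):
    """Return the word seen most often after `word`, or None if unseen."""
    if word not in followers:
        return None
    return followers[word].most_common(1)[0][0]


print(most_likely_next("the"))  # cat
print(most_likely_next("sat"))  # on
```

In the training text "the" is followed by "cat" twice and by "mat" and "fish" once each, so "cat" is the most likely continuation.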

11.2 Time to a million users

11.3 Spin off AI Tech

  • Dall-E 2: applies generative models to pixels and images
  • Noonoouri: applies generative models to sound and video to create music videos
  • Krisp: removes background noise from audio tracks
  • Decktopus: AI-created presentations
  • Eleven Labs: voice-over
  • 10Web: builds websites
  • REimagine Home: interior design

11.4 ChatGPT 3.5 vs 4.0 exam results

Exam                                  ChatGPT 4                  ChatGPT 3.5
Uniform Bar Exam                      298 / 400 (~90th)          213 / 400 (~10th)
LSAT                                  163 (~88th)                149 (~40th)
SAT Evidence-Based Reading & Writing  710 / 800 (~93rd)          670 / 800 (~87th)
SAT Math                              700 / 800 (~89th)          590 / 800 (~70th)
GRE Quantitative                      163 / 170 (~80th)          147 / 170 (~25th)
GRE Verbal                            169 / 170 (~99th)          154 / 170 (~63rd)
GRE Writing                           4 / 6 (~54th)              4 / 6 (~54th)
USABO Semifinal Exam 2020             87 / 150 (99th - 100th)    43 / 150 (31st - 33rd)
USNCO Local Section Exam 2022         36 / 60                    24 / 60
AP Art History                        5 (86th - 100th)           5 (86th - 100th)
AP Biology                            5 (85th - 100th)           4 (62nd - 85th)
AP Calculus BC                        4 (43rd - 59th)            1 (0th - 7th)
AP Chemistry                          4 (71st - 88th)            2 (22nd - 46th)
AP English Language and Composition   2 (14th - 44th)            2 (14th - 44th)
AP Environmental Science              5 (91st - 100th)           5 (91st - 100th)
AP Macroeconomics                     5 (84th - 100th)           2 (33rd - 48th)
AP Microeconomics                     5 (82nd - 100th)           4 (60th - 82nd)
AP Physics                            4 (66th - 84th)            3 (30th - 66th)
AP Psychology                         5 (83rd - 100th)           5 (83rd - 100th)
AP Statistics                         5 (85th - 100th)           3 (40th - 63rd)
AP US Government                      5 (88th - 100th)           4 (77th - 88th)
AP US History                         5 (89th - 100th)           4 (74th - 89th)
AP World History                      4 (65th - 87th)            4 (65th - 87th)

11.5 Areas of risks

  • Hallucinations
  • Harmful content
  • Harms of representation, allocation, and quality of service
  • Disinformation and influence operations
  • Proliferation of conventional and unconventional weapons
  • Privacy
  • Cybersecurity
  • Potential for risky emergent behaviors
  • Interactions with other systems
  • Economic impacts
  • Acceleration
  • Overreliance

12 Facial recognition

Dog-muffin challenge

12.1 State of the art

Current uses of facial recognition

  • Attendance tracking
  • Personal marketing
  • Banking id, fraud detection
  • Public security
  • Door locks

12.2 Detection avoidance

  • Textured masks to defeat face recognition
  • Printed clothing

13 Current limitations of AI

  • Common Sense Reasoning: contextual understanding and intuitive judgment are lacking, which makes understanding and applying common sense difficult.

  • Creativity and Originality: producing output based on existing data and patterns is easy, but creativity and originality remain a laborious process of trial and error.

  • Emotional Intelligence: comprehension and expression of human emotions are ineffective. True emotional understanding and empathy are still beyond AI capabilities.

  • Abstract and Symbolic Thinking: well-defined rules and concrete data come easily, but AI has little capability for handling metaphors, analogies, and abstract reasoning.

  • Physical Dexterity: physical dexterity and manipulation are still limited compared to human physical abilities.

  • Ethical and Moral Reasoning: decisions generally do not consider ethical or moral implications.

  • Adaptability and Generalization: new situations or new applications of knowledge require retraining before the AI can perform in the new context.

13.1 The alignment problem

13.2 Deep fakes

A farmer too lazy to plant in the spring has nothing to harvest in the fall.

Proverbs 20:4