Regression and model training

This introduction is based on the work of Luis Guillermo Diaz Monrroy.

According to the dictionary of the Royal Spanish Academy (RAE), the meanings of the word "model" include the following: an archetype or point of reference to be imitated or reproduced.

If the data available for a study are a random sample from a population, the central interest is to characterize the population based on the information contained in the sample data; that is, to infer the properties of the population from the observations in the sample. The conceptual device that makes this generalization possible is the model that governs the population, that is, the probabilistic model. The architecture, construction and properties of a probabilistic model are developed through the theory and calculus of probabilities.

Probabilistic models, or probability distribution models, can be considered from various points of view. One is the number of associated random variables: we have univariate models and multivariate models, depending on whether the object is a random variable or a random vector (or, in general, a random matrix), respectively. Another classification is according to the values that the random variables take: they can be discrete (taking a finite or countably infinite number of values), continuous (taking values in an interval of the number line) or mixed. Some of the discrete probabilistic models are: discrete uniform, Bernoulli, binomial, Poisson, geometric, negative binomial and hypergeometric. Those of continuous type include: uniform, normal, gamma, beta and Weibull, among others.
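
The distinction between discrete and continuous models can be illustrated with a minimal sketch that uses only the Python standard library. The function names `draw_binomial` and `draw_normal`, the seed and the parameter values are assumptions chosen for illustration; they do not come from the source text.

```python
import random
from collections import Counter

rng = random.Random(0)  # fixed seed (an arbitrary choice) for reproducibility


def draw_binomial(n_trials, p):
    """Discrete model: number of successes in n_trials Bernoulli(p) trials."""
    return sum(1 for _ in range(n_trials) if rng.random() < p)


def draw_normal(mu, sigma):
    """Continuous model: one draw from a normal distribution."""
    return rng.gauss(mu, sigma)


# Illustrative parameters: binomial(10, 0.3) and standard normal.
discrete_sample = [draw_binomial(10, 0.3) for _ in range(1000)]
continuous_sample = [draw_normal(0.0, 1.0) for _ in range(1000)]

# A discrete variable takes values in a finite or countable set,
# so the same values are observed many times.
print(sorted(Counter(discrete_sample).items()))

# A continuous variable takes values in an interval of the number line,
# so exact repeats are practically absent.
print(len(set(continuous_sample)))
```

The binomial sample can only contain integers between 0 and 10, while the normal sample is spread over real values; this is the practical meaning of the discrete/continuous classification described above.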

A continuous multivariate probabilistic model is the multivariate normal, while a discrete one is the multinomial distribution.
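
As a rough sketch of these two multivariate models, the following code draws one multinomial vector and a sample from a bivariate normal (the simplest multivariate normal case). The helper names `draw_multinomial` and `draw_bivariate_normal`, the probabilities and the correlation `rho` are hypothetical and chosen only for illustration.

```python
import math
import random


def draw_multinomial(n, probs, rng):
    """One draw from a multinomial(n, probs): a vector of counts per category."""
    categories = list(range(len(probs)))
    outcomes = rng.choices(categories, weights=probs, k=n)
    counts = [0] * len(probs)
    for c in outcomes:
        counts[c] += 1
    return counts


def draw_bivariate_normal(rho, rng):
    """One draw from a standard bivariate normal with correlation rho.

    Built from two independent standard normals z1, z2:
    x = z1, y = rho * z1 + sqrt(1 - rho^2) * z2.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)  # arbitrary seed for reproducibility

# Discrete random vector: the counts always add up to n.
counts = draw_multinomial(100, [0.2, 0.5, 0.3], rng)
print(counts, sum(counts))

# Continuous random vector: the sample correlation should be close to rho.
pairs = [draw_bivariate_normal(0.6, rng) for _ in range(5000)]
print(round(sample_correlation(pairs), 3))
```

In both cases each observation is a vector rather than a single number, which is what makes the model multivariate.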

Thus, within statistical analysis, one of the assumptions that must be checked is that the data have been generated by the particular probabilistic model assumed. For this purpose, exploratory and descriptive graphical tools are available, such as histograms, box plots (box-and-whisker plots), stem-and-leaf plots and quantile-quantile plots (Q-Q plots). In addition, the calculation of some statistics, such as the mean, median, variance, coefficient of variation, skewness, kurtosis and correlation coefficient, may or may not support the probabilistic assumption about the data. Through some statistics of a confirmatory nature, it can be shown whether or not to reject the assumption that a particular probabilistic model generates the data. Pena (2015)
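
The descriptive statistics listed above, together with one confirmatory statistic, can be sketched with the Python standard library. The Jarque-Bera statistic is used here only as one illustrative choice of normality check; the source does not name a specific test. The function names `describe` and `jarque_bera`, the seed and the simulated data are assumptions made for this example.

```python
import math
import random
import statistics


def describe(data):
    """Descriptive statistics often used to judge a distributional assumption."""
    n = len(data)
    mean = statistics.fmean(data)
    var = statistics.variance(data)  # sample variance (n - 1 in the denominator)
    # Central moments with n in the denominator, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return {
        "mean": mean,
        "median": statistics.median(data),
        "variance": var,
        "cv": math.sqrt(var) / mean if mean != 0 else float("nan"),
        "skewness": m3 / m2 ** 1.5,   # 0 for a symmetric distribution
        "kurtosis": m4 / m2 ** 2,     # 3 for a normal distribution
    }


def jarque_bera(data):
    """Jarque-Bera statistic and its asymptotic p-value under normality.

    Under the normal model the statistic is approximately chi-square with
    2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    stats = describe(data)
    n = len(data)
    jb = n / 6.0 * (stats["skewness"] ** 2 + (stats["kurtosis"] - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)


rng = random.Random(42)  # arbitrary seed for reproducibility
normal_data = [rng.gauss(10.0, 2.0) for _ in range(500)]   # simulated, normal
skewed_data = [rng.expovariate(1.0) for _ in range(500)]   # simulated, not normal

for name, data in (("normal", normal_data), ("exponential", skewed_data)):
    jb, p_value = jarque_bera(data)
    print(name, describe(data), "JB:", jb, "p-value:", p_value)
```

A small p-value is evidence against the assumption that a normal model generated the data; a large p-value means only that the assumption is not rejected, not that it has been proven.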
