Introduction

Objectives

Main objective

Develop a machine learning model that estimates the probability of passing a digital score.

Details

Scope

The project scope is defined by the following guidelines:

  1. Number of affected customers
  2. Visit patterns
  3. Information collected
  4. Other

Population

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry’s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

EDA

Data

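Table: sales

| ID | Prod | Quant | Val | Insp |
|----|------|------:|----:|------|
| v1 | p1 | 182 | 1665 | unkn |
| v2 | p1 | 3072 | 8780 | unkn |
| v3 | p1 | 20393 | 76990 | unkn |
| v4 | p1 | 112 | 1100 | unkn |
| v3 | p1 | 6164 | 20260 | unkn |
| v5 | p2 | 104 | 1155 | unkn |
| v6 | p2 | 350 | 5680 | unkn |
| v7 | p2 | 200 | 4010 | unkn |
| v8 | p2 | 233 | 2855 | unkn |
| v9 | p2 | 118 | 1175 | unkn |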

Distributions

  • The average sales revenue (Val) is 14617.07
  • The number of sales whose quantity exceeds the average quantity is 28271
  • The average revenue of fraudulent sales is 93200.02

Data summary

| | |
|---|---|
| Name | sales |
| Number of rows | 401146 |
| Number of columns | 5 |
| Column type frequency: | |
| factor | 3 |
| numeric | 2 |
| Group variables | None |

Variable type: factor

| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---:|---:|---|---:|---|
| ID | 0 | 1 | FALSE | 6016 | v43: 10159, v54: 6017, v42: 3902, v16: 3016 |
| Prod | 0 | 1 | FALSE | 4548 | p11: 3923, p37: 1824, p14: 1720, p19: 1702 |
| Insp | 0 | 1 | FALSE | 3 | unk: 385414, ok: 14462, fra: 1270 |

Variable type: numeric

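| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---|
| Quant | 13842 | 0.97 | 8442.00 | 918351.03 | 100 | 107 | 168 | 738 | 473883883 | ▇▁▁▁▁ |
| Val | 1182 | 1.00 | 14617.07 | 69712.59 | 1005 | 1345 | 2675 | 8680 | 4642955 | ▇▁▁▁▁ |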

Charts

G1

G2

Results

Column

Chart A

Column

Chart B

Chart C

---
title: "Data analysis"
output: 
  flexdashboard::flex_dashboard:
    theme: cosmo 
    orientation: columns
    vertical_layout: fill
    logo: https://cdn-icons-png.flaticon.com/32/25/25231.png
    storyboard: true
    social: menu
    source: embed
---


# Introduction {data-icon="fa fa-battery-1"}

## Objectives

### Main objective

Develop a machine learning model that estimates the probability of passing a digital score.

![](https://docs.microsoft.com/es-ES/azure/architecture/data-science-process/media/overview/tdsp-lifecycle2.png)

## Details

### Scope

The project scope is defined by the following guidelines:

1. Number of affected customers
2. Visit patterns
3. Information collected
4. Other

### Population

Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

![](https://docs.microsoft.com/es-ES/azure/architecture/data-science-process/media/overview/tdsp-dir-structure.png)

# EDA {data-icon="fa fa-bug" .storyboard}

### Data

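```{r}
# Load the `sales` dataset that ships with the DMwR2 package
library(DMwR2)
data(sales, package = "DMwR2")
library(data.table)
library(knitr)
# Disabled alternative source: read the first 50 rows of a remote CSV
# df <- fread('https://storage.googleapis.com/rmarkdowntaller/01Borrar/BD_COBERTURA%20EXTRAORDINARIA.csv', encoding = 'UTF-8', nrows = 50)

# Show the first 10 rows as a table
knitr::kable(sales[1:10, ], caption = 'sales')
```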

### Distributions

* The average sales revenue (`Val`) is `r format(mean(sales$Val, na.rm = TRUE), scientific = FALSE)`
* The number of sales whose quantity exceeds the average quantity is `r sum(sales$Quant > mean(sales$Quant, na.rm = TRUE), na.rm = TRUE)`
* The average revenue of fraudulent sales is `r format(mean(sales$Val[sales$Insp == "fraud"], na.rm = TRUE), scientific = FALSE)`

```{r}
library(skimr)
skim(sales)
```
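
The three inline figures in this frame come down to a few base R calls. A minimal sketch of those calculations follows, assuming a small made-up data frame `toy` (hypothetical values, not the DMwR2 data) with the same `Quant`, `Val` and `Insp` columns as `sales`:

```r
# Toy stand-in for `sales`; the values are invented for illustration only
toy <- data.frame(
  Quant = c(100, 200, NA, 700),
  Val   = c(1000, 3000, 5000, NA),
  Insp  = factor(c("ok", "fraud", "unkn", "fraud"))
)

# Average sale value, ignoring missing values
mean_val <- mean(toy$Val, na.rm = TRUE)

# Number of rows whose quantity is above the average quantity
mean_quant <- mean(toy$Quant, na.rm = TRUE)
n_above <- sum(toy$Quant > mean_quant, na.rm = TRUE)

# Average value of the rows flagged as fraud
mean_fraud <- mean(toy$Val[toy$Insp == "fraud"], na.rm = TRUE)
```

`na.rm = TRUE` matters in every call: the summary above reports missing values in both `Quant` and `Val`, and without it `mean()` and `sum()` would return `NA`.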



### Charts

#### G1
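```{r echo=FALSE, fig.height=4, fig.width=8, message=FALSE, warning=FALSE}
library(tidyverse)
# Bar chart of the number of sales per inspection status
fig1 <- ggplot(data = sales, aes(x = Insp)) +
  geom_bar()
fig1  # print the static chart; assigning it alone renders nothing
```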

#### G2
```{r fig.height=4, fig.width=8}
library(plotly)
# Interactive version of the bar chart built in the G1 chunk
ggplotly(fig1)
```


# Results

```{r setup, include=FALSE}
library(flexdashboard)
```

Column {data-width=650}
-----------------------------------------------------------------------

### Chart A

```{r}
library(leaflet)

m <- leaflet() %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addMarkers(lng=174.768, lat=-36.852, popup="The birthplace of R")
m
```

Column {data-width=350}
-----------------------------------------------------------------------

### Chart B

```{r}

```

### Chart C

```{r}

```