title: “EDA_Phyton” author: “Andrea B” date: “2023-11-16” output: pdf_document: pdf_print: paged —

¿Cuál es la Distribución de la Generación?

A continuación se realizará el conteo del número de pokemon que se encuentran en cada generación.

Pokemon.value_counts('Type 1')

## Type 1
## Water       112
## Normal       98
## Grass        70
## Bug          69
## Psychic      57
## Fire         52
## Electric     44
## Rock         44
## Ghost        32
## Ground       32
## Dragon       32
## Dark         31
## Poison       28
## Fighting     27
## Steel        27
## Ice          24
## Fairy        17
## Flying        4
## Name: count, dtype: int64

Pokemon.describe()

##                 #      Total          HP  ...     Sp. Def       Speed  Generation
## count  800.000000  800.00000  800.000000  ...  800.000000  800.000000   800.00000
## mean   362.813750  435.10250   69.258750  ...   71.902500   68.277500     3.32375
## std    208.343798  119.96304   25.534669  ...   27.828916   29.060474     1.66129
## min      1.000000  180.00000    1.000000  ...   20.000000    5.000000     1.00000
## 25%    184.750000  330.00000   50.000000  ...   50.000000   45.000000     2.00000
## 50%    364.500000  450.00000   65.000000  ...   70.000000   65.000000     3.00000
## 75%    539.250000  515.00000   80.000000  ...   90.000000   90.000000     5.00000
## max    721.000000  780.00000  255.000000  ...  230.000000  180.000000     6.00000
## 
## [8 rows x 9 columns]

Pokemon["Total"].plot(kind='hist',figsize=(10,8))

¿En que generación es mas probable encontrar un pokemon no lengendario con un ataque alto?.

Dentro del analisis realizado se relaciona el codigo a continuacion del conteo de los pokemon legendarios arrojando los siguientes datos:

Pokemon["Attack"].isin(["True", "False"])

## 0      False
## 1      False
## 2      False
## 3      False
## 4      False
##        ...  
## 795    False
## 796    False
## 797    False
## 798    False
## 799    False
## Name: Attack, Length: 800, dtype: bool

Pokemon.info()

## <class 'pandas.core.frame.DataFrame'>
## RangeIndex: 800 entries, 0 to 799
## Data columns (total 13 columns):
##  #   Column      Non-Null Count  Dtype 
## ---  ------      --------------  ----- 
##  0   #           800 non-null    int64 
##  1   Name        800 non-null    object
##  2   Type 1      800 non-null    object
##  3   Type 2      414 non-null    object
##  4   Total       800 non-null    int64 
##  5   HP          800 non-null    int64 
##  6   Attack      800 non-null    int64 
##  7   Defense     800 non-null    int64 
##  8   Sp. Atk     800 non-null    int64 
##  9   Sp. Def     800 non-null    int64 
##  10  Speed       800 non-null    int64 
##  11  Generation  800 non-null    int64 
##  12  Legendary   800 non-null    bool  
## dtypes: bool(1), int64(9), object(3)
## memory usage: 75.9+ KB

Pokemon["Attack"].sum()

## 63201

Pokemon.loc[Pokemon["Generation"]].shape

## (800, 13)

Pokemon Legendario y No Legendario

Dentro del analisis de datos de Pokemon vamos a realizar la comparación del tipo más común de pokemon legendario.

Pokemon Legendario

Pokemon.loc[Pokemon["Legendary"],"Type 1"].value_counts()

## Type 1
## Psychic     14
## Dragon      12
## Fire         5
## Electric     4
## Water        4
## Rock         4
## Steel        4
## Ground       4
## Grass        3
## Ice          2
## Normal       2
## Ghost        2
## Dark         2
## Flying       2
## Fairy        1
## Name: count, dtype: int64

Pokemon.loc[Pokemon["Legendary"],"Type 1"].value_counts().plot(kind="bar")

Pokemon No Legendario

ordinary = Pokemon[Pokemon["Legendary"] == False].reset_index(drop=True)
print(ordinary.shape)

## (735, 13)

ordinary.head()

##    #                   Name Type 1  ... Speed  Generation  Legendary
## 0  1              Bulbasaur  Grass  ...    45           1      False
## 1  2                Ivysaur  Grass  ...    60           1      False
## 2  3               Venusaur  Grass  ...    80           1      False
## 3  3  VenusaurMega Venusaur  Grass  ...    80           1      False
## 4  4             Charmander   Fire  ...    65           1      False
## 
## [5 rows x 13 columns]

ordinary = Pokemon[Pokemon["Legendary"] == False].value_counts()

ordinary = Pokemon[Pokemon["Legendary"] == False].value_counts().plot(kind="bar")

Analisis Final de Pokemon

Durante el desarrollo de este proyecto se pudo visualizar como estan cada uno catalogados cada uno de los pokemons a continuación se muestra un grafico de acuerdo a su categoria.

Pokemon["Type 1"].value_counts().plot(kind="pie",autopct="%1.1f%%",cmap='tab20c',figsize=(10,8))

Su distribución total.

Pokemon.head()

##    #                   Name Type 1  ... Speed  Generation  Legendary
## 0  1              Bulbasaur  Grass  ...    45           1      False
## 1  2                Ivysaur  Grass  ...    60           1      False
## 2  3               Venusaur  Grass  ...    80           1      False
## 3  3  VenusaurMega Venusaur  Grass  ...    80           1      False
## 4  4             Charmander   Fire  ...    65           1      False
## 
## [5 rows x 13 columns]

Pokemon["Total"].plot(kind='box',vert=False,figsize=(10,5))

Distribución de pokemon legendarios.

Pokemon["Legendary"].value_counts().plot(kind="pie",autopct="%1.1f%%",cmap="Set3",figsize=(10,8))

Podemos visualizar cual es el pokemon mas poderoso de las 3 primeras generaciones, del tipo agua.

Pokemon["Generation"].value_counts(sort=False).plot(kind="bar")

(
    (Pokemon["Generation"]==1)|
    (Pokemon["Generation"]==2) |
    (Pokemon["Generation"]==3)
).sum()

## 432

Pokemon.loc[
    (Pokemon["Type 1"]=="Water")&
    Pokemon["Generation"].isin([1,2,3])
].sort_values(by="Total",ascending=False).head()

##        #                     Name Type 1  ... Speed  Generation  Legendary
## 422  382      KyogrePrimal Kyogre  Water  ...    90           3       True
## 421  382                   Kyogre  Water  ...    90           3       True
## 141  130    GyaradosMega Gyarados  Water  ...    81           1      False
## 283  260    SwampertMega Swampert  Water  ...    70           3      False
## 12     9  BlastoiseMega Blastoise  Water  ...    78           1      False
## 
## [5 rows x 13 columns]

El Pokémon legendario ultrapoderoso se muestra a continuación.

import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(14, 7))
sns.scatterplot(data=Pokemon, x="Defense", y="Attack", hue='Legendary', ax=ax)
ax.annotate(
    "Who's this guy?", xy=(140, 150), xytext=(160, 150), color='red',
    arrowprops=dict(arrowstyle="->", color='red')
)

sns.set_palette("husl", 8)
ax = sns.distplot(Pokemon['Attack'])

## <string>:1: UserWarning: 
## 
## `distplot` is a deprecated function and will be removed in seaborn v0.14.0.
## 
## Please adapt your code to use either `displot` (a figure-level function with
## similar flexibility) or `histplot` (an axes-level function for histograms).
## 
## For a guide to updating your code to use the new functions, please see
## https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
## 
## C:\Users\Andrea\AppData\Local\Programs\Python\PYTHON~1\Lib\site-packages\seaborn\_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
##   if pd.api.types.is_categorical_dtype(vector):
## C:\Users\Andrea\AppData\Local\Programs\Python\PYTHON~1\Lib\site-packages\seaborn\_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
##   with pd.option_context('mode.use_inf_as_na', True):

ax.set_title("Pokemon Attack Distribution", fontdict={'fontsize': 16})

sns.set_palette("husl", 8)
ax = sns.distplot(Pokemon['Defense'])

## <string>:1: UserWarning: 
## 
## `distplot` is a deprecated function and will be removed in seaborn v0.14.0.
## 
## Please adapt your code to use either `displot` (a figure-level function with
## similar flexibility) or `histplot` (an axes-level function for histograms).
## 
## For a guide to updating your code to use the new functions, please see
## https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751
## 
## C:\Users\Andrea\AppData\Local\Programs\Python\PYTHON~1\Lib\site-packages\seaborn\_oldcore.py:1498: FutureWarning: is_categorical_dtype is deprecated and will be removed in a future version. Use isinstance(dtype, CategoricalDtype) instead
##   if pd.api.types.is_categorical_dtype(vector):
## C:\Users\Andrea\AppData\Local\Programs\Python\PYTHON~1\Lib\site-packages\seaborn\_oldcore.py:1119: FutureWarning: use_inf_as_na option is deprecated and will be removed in a future version. Convert inf values to NaN before operating instead.
##   with pd.option_context('mode.use_inf_as_na', True):

ax.set_title("Pokemon Defense Distribution", fontdict={'fontsize': 16})

mean_attack_generation = Pokemon.groupby("Generation")["Attack"].mean().sort_values()
print(mean_attack_generation)

## Generation
## 2    72.028302
## 6    75.804878
## 1    76.638554
## 3    81.625000
## 5    82.066667
## 4    82.867769
## Name: Attack, dtype: float64

mean_defense_generation = Pokemon.groupby("Generation")["Defense"].mean().sort_values()
print(mean_defense_generation)

## Generation
## 1    70.861446
## 5    72.327273
## 2    73.386792
## 3    74.100000
## 6    76.682927
## 4    78.132231
## Name: Defense, dtype: float64

Pokémon con Estadísticas

¿Cuál es la Distribución de la Generación?

¿En que generación es mas probable encontrar un pokemon no lengendario con un ataque alto?.

Pokemon Legendario y No Legendario

Analisis Final de Pokemon