Fixing Errors When Importing csv Tables into pgadmin4¶

Create a table in pgadmin4¶

At first, create a table in pgadmin4 that matches columns of the table in my local directory that I want to import to. Right-click on the table in pgadmin4, then click 'Import/Export data' to copy data into pgadmin4. If two tables not matching columns on names, order, format that will cause errors.

The way that reduces errors is to copy names of columns via python function 'dataset.columns' as follow:

In [ ]:
import pandas as pd

df = pd.read_csv("/Users/nnthieu/Downloads/synthea_sample_data_csv_latest/supplies.csv")
df.shape
print(df.columns)
print(df.info())
Index(['DATE', 'PATIENT', 'ENCOUNTER', 'CODE', 'DESCRIPTION', 'QUANTITY'], dtype='object')
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1376 entries, 0 to 1375
Data columns (total 6 columns):
 #   Column       Non-Null Count  Dtype 
---  ------       --------------  ----- 
 0   DATE         1376 non-null   object
 1   PATIENT      1376 non-null   object
 2   ENCOUNTER    1376 non-null   object
 3   CODE         1376 non-null   int64 
 4   DESCRIPTION  1376 non-null   object
 5   QUANTITY     1376 non-null   int64 
dtypes: int64(2), object(4)
memory usage: 64.6+ KB
None

'DATE', 'PATIENT', 'ENCOUNTER', 'CODE', 'DESCRIPTION', 'QUANTITY'

Copy this results I will have the names of columns of the table 'supplies' exactly in the names, order.

Also see the info above to know the data type of columns. Then I create a table using SQL codes as follow:

CREATE TABLE supplies (

DATE DATE

,PATIENT VARCHAR(100)

,ENCOUNTER VARCHAR(100)

,CODE VARCHAR(100)

,DESCRIPTION VARCHAR(100)

,QUANTITY INT );

These codes will create a table in pgadmin4 with columns and their data type as above.

Then right-click on the name of table in pgadmin4 to import table from local directory into.

Potential errors¶

Errors if not matching names, oder of columns¶

'DATE', 'PATIENT', 'ENCOUNTER', 'CODE', 'DESCRIPTION', 'QUANTITY'

This way makes the names, order correct.

Errors if not matching format of data types.¶

Errors may happen for DATE, INT, VARCHAR and long enough for VARCHAR.

3 types of DATE format:

DATE for '2024-05-15'

TIMESTAMP for '2024-05-15 05:20:34'

TIMESTAMPTZ for '2024-05-15T05:20:34Z'

Check the .csv file before in order to set data type for the table correctly. If the data type not matching exactly that will cause errors.

PATIENT VARCHAR(100) is long enough

DESCRIPTION VARCHAR(100) may not long enough, so set

DESCRIPTION VARCHAR(500) is long enough

QUANTIY VARCHAR(100) will cause error, it should be

QUANTITY INT

If errors happen, check the 'error message' to know what cause errors, at what columns. Then fix the errors exactly.

If in .csv file, the DATE column has a value of '20240415' that will cause error, so I have to correct this value before importing the table into pgadmin4 in order to avoid errors.

.