Spring 2025
For these examples, I’ll use some New Zealand fiscal data:
url_str <- "https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2023-financial-year-provisional/Download-data/annual-enterprise-survey-2023-financial-year-provisional.csv"
enterprise_data <- read.table(url_str, header=TRUE, sep=',')  # Load the data off the Internet
head(enterprise_data, 3)
##   Year Industry_aggregation_NZSIOC Industry_code_NZSIOC Industry_name_NZSIOC
## 1 2023                     Level 1                99999       All industries
## 2 2023                     Level 1                99999       All industries
## 3 2023                     Level 1                99999       All industries
##                Units Variable_code
## 1 Dollars (millions)           H01
## 2 Dollars (millions)           H04
## 3 Dollars (millions)           H05
##                                     Variable_name     Variable_category  Value
## 1                                    Total income Financial performance 930995
## 2 Sales, government funding, grants and subsidies Financial performance 821630
## 3               Interest, dividends and donations Financial performance  84354
##   Industry_code_ANZSIC06
## 1 ANZSIC06 divisions A-S (excluding classes K6330, L6711, O7552, O760, O771, O772, S9540, S9601, S9602, and S9603)
## 2 ANZSIC06 divisions A-S (excluding classes K6330, L6711, O7552, O760, O771, O772, S9540, S9601, S9602, and S9603)
## 3 ANZSIC06 divisions A-S (excluding classes K6330, L6711, O7552, O760, O771, O772, S9540, S9601, S9602, and S9603)
write Functions
You can write out data frames as text-readable files using any delimiter (separator) you wish:
write.table(enterprise_data, file="enterprise.csv", sep=",")
system("cat enterprise.csv | head -n 4")  # just show the top 4 lines of that file to confirm
save(enterprise_data, file="enterprise.Rdata")  # Save it as a binary object
rm(enterprise_data)                             # Remove it from the environment
load(file="enterprise.Rdata")                   # Restore it to the environment
To write Excel files, you will need to install the external writexl library:
library(writexl)
write_xlsx(enterprise_data, "enterprise-data.xlsx")
Install the haven external library to perform other exports:
SAS: write_xpt(enterprise_data, "enterprise-data.xpt")
SPSS: write_sav(enterprise_data, "enterprise-data.sav")
Stata: write_dta(enterprise_data, "enterprise-data.dta")
More information: https://haven.tidyverse.org/
For the Python examples, I’ll use the same New Zealand fiscal data:
import pandas as pd
import os

url_str = "https://www.stats.govt.nz/assets/Uploads/Annual-enterprise-survey/Annual-enterprise-survey-2023-financial-year-provisional/Download-data/annual-enterprise-survey-2023-financial-year-provisional.csv"
enterprise_data = pd.read_csv(url_str, sep=',')  # Load the data off the Internet
enterprise_data.head(3)
##    Year  ...                             Industry_code_ANZSIC06
## 0  2023  ...  ANZSIC06 divisions A-S (excluding classes K633...
## 1  2023  ...  ANZSIC06 divisions A-S (excluding classes K633...
## 2  2023  ...  ANZSIC06 divisions A-S (excluding classes K633...
##
## [3 rows x 10 columns]
to_csv DataFrame Method
You can write out data frames as text-readable files using any delimiter (separator) you wish:
enterprise_data.to_csv("enterprise.csv", sep=",")
os.system("cat enterprise.csv | head -n 4")  # just show the top 4 lines of that file to confirm
## 0
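To illustrate the "any delimiter" point, here is a minimal sketch that writes a small frame as tab-separated text and reads it back. The three sample rows are made up for illustration (loosely modeled on the survey columns), and the choice of tab as the separator and of an in-memory buffer instead of a file are assumptions. `index=False` keeps the row labels from being written as an extra column.

```python
import io
import pandas as pd

# Hypothetical sample rows, loosely modeled on the survey columns
sample = pd.DataFrame({
    "Year": [2023, 2023, 2023],
    "Variable_code": ["H01", "H04", "H05"],
    "Value": [930995, 821630, 84354],
})

# Write tab-separated text into an in-memory buffer instead of a file
buffer = io.StringIO()
sample.to_csv(buffer, sep="\t", index=False)
text = buffer.getvalue()
print(text)

# Read it back, using the same separator that was used to write it
buffer.seek(0)
restored = pd.read_csv(buffer, sep="\t")
print(restored)
```

Whatever separator is used for writing has to be passed again when reading, otherwise each line is parsed as a single column.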
enterprise_data.to_pickle('enterprise.pkl')         # Save it as a binary object
enterprise_data = None                              # Remove it from the environment
enterprise_data = pd.read_pickle('enterprise.pkl')  # Restore it to the environment
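One way to convince yourself that the binary round trip loses nothing is to compare the restored frame with the original. The sketch below assumes a small made-up frame and a temporary directory (both hypothetical, chosen so the example does not touch your working directory).

```python
import os
import tempfile
import pandas as pd

# Hypothetical sample rows, loosely modeled on the survey columns
sample = pd.DataFrame({
    "Variable_code": ["H01", "H04"],
    "Value": [930995, 821630],
})

# Save to a pickle file in a temporary directory, then load it again
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.pkl")
    sample.to_pickle(path)
    restored = pd.read_pickle(path)

# equals() compares both the values and the column dtypes
same = restored.equals(sample)
print(same)
```

Pickle files can execute code when loaded, so only read pickles that come from a source you trust.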
To write Excel files, you will need to install the external openpyxl library:
enterprise_data.to_excel('enterprise-data.xlsx', index=False)
Install whatever external libraries are needed to perform other exports:
Stata: enterprise_data.to_stata("enterprise-data.dta")
SAS: Requires the saspy library
SPSS: Requires the pyreadstat library
More information: https://pandas.pydata.org/docs/reference/io.html
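As a quick check on the Stata export, the following sketch writes a small numeric frame with `to_stata` and reads it back with `pd.read_stata`. The sample values and the temporary file location are assumptions for illustration. `write_index=False` is used so the row labels are not stored as an extra variable.

```python
import os
import tempfile
import pandas as pd

# Hypothetical numeric sample, loosely modeled on the survey columns
sample = pd.DataFrame({
    "Year": [2023, 2023],
    "Value": [930995.0, 821630.0],
})

# Write a .dta file into a temporary directory, then read it back
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.dta")
    sample.to_stata(path, write_index=False)
    restored = pd.read_stata(path)

print(restored)
```

Stata has its own storage types, so the dtypes of the restored frame may not match the original exactly. Comparing values rather than dtypes is the safer check.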