Late submission is not allowed.

Questions:

1. Load the packages tidyquant, tidyverse and timetk. Import the data file from Tronclass: tej_day_price_2024_20250630.txt. Try the functions read_csv(), read_tsv() and read_delim() to import the data. Show the imported results using one of three functions: glimpse(), head() or str(). (Note: whenever you are asked to show results and the data are very large, you can use head() or glimpse().)

## Rows: 337,347
## Columns: 12
## $ CO_ID                 <chr> "1101 TCC", "1102 ACC", "1103 CHC", "1104 UCC", …
## $ Date                  <dbl> 20240102, 20240102, 20240102, 20240102, 20240102…
## $ `TSE ID`              <dbl> 1101, 1102, 1103, 1104, 1108, 1109, 1110, 1201, …
## $ `TSE Sector`          <chr> "01", "01", "01", "01", "01", "01", "01", "02", …
## $ `English Short Name`  <chr> "TCC", "ACC", "CHC", "UCC", "Lucky Cement", "HSI…
## $ `Open(NTD)`           <dbl> 32.5373, 37.2642, 17.7825, 26.0628, 14.1679, 16.…
## $ `High(NTD)`           <dbl> 32.5373, 37.4442, 17.7825, 26.1505, 14.1679, 16.…
## $ `Low(NTD)`            <dbl> 32.3038, 36.9492, 17.5953, 25.9750, 14.0343, 16.…
## $ `Close(NTD)`          <dbl> 32.3972, 37.0392, 17.6421, 26.0628, 14.0788, 16.…
## $ `Volume(1000S)`       <dbl> 14937, 6223, 171, 260, 442, 228, 57, 126, 48, 18…
## $ `Amount(NTD1000)`     <dbl> 518751, 256522, 3240, 7736, 6992, 4159, 1075, 24…
## $ `Market Cap.(NTD MN)` <dbl> 262026, 145941, 14896, 19995, 6395, 6209, 10754,…
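
To make the difference between the three import functions concrete, here is a minimal sketch on a toy file. The tab delimiter and the temporary file are assumptions for illustration; the real delimiter of tej_day_price_2024_20250630.txt should be checked by opening the file.

```r
library(readr)
library(dplyr)

# Toy stand-in for the real file; the tab delimiter is an assumption.
tmp <- tempfile(fileext = ".txt")
writeLines(c("CO_ID\tDate\tClose(NTD)",
             "1101 TCC\t20240102\t32.3972",
             "1102 ACC\t20240102\t37.0392"), tmp)

# read_tsv() assumes tabs; read_delim() takes the delimiter explicitly.
via_tsv   <- read_tsv(tmp, show_col_types = FALSE)
via_delim <- read_delim(tmp, delim = "\t", show_col_types = FALSE)

# read_csv() splits on commas, so a tab-delimited file ends up in one column.
via_csv   <- read_csv(tmp, show_col_types = FALSE)

glimpse(via_tsv)
ncol(via_csv)
```

If the column count after import is 1 instead of 12, the delimiter passed to the reader probably does not match the file.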

2. Rename columns 2, 3, 5, 9 and 12 to “date”, “id”, “name”, “price” and “cap”, keeping only these five columns. Show your results.

## Rows: 337,347
## Columns: 5
## $ id    <dbl> 1101, 1102, 1103, 1104, 1108, 1109, 1110, 1201, 1203, 1210, 1213…
## $ name  <chr> "TCC", "ACC", "CHC", "UCC", "Lucky Cement", "HSINGTA", "Tuna Cem…
## $ date  <dbl> 20240102, 20240102, 20240102, 20240102, 20240102, 20240102, 2024…
## $ price <dbl> 32.3972, 37.0392, 17.6421, 26.0628, 14.0788, 16.1807, 18.3336, 1…
## $ cap   <dbl> 262026, 145941, 14896, 19995, 6395, 6209, 10754, 9640, 13992, 52…
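
One possible way to rename and keep columns by position is select() with new_name = position pairs. The sketch below rebuilds the first two rows shown in the Question 1 output as a toy tibble; the object names raw and renamed are hypothetical.

```r
library(dplyr)

# First two rows from the glimpse() output above, typed in by hand.
raw <- tibble(
  CO_ID                 = c("1101 TCC", "1102 ACC"),
  Date                  = c(20240102, 20240102),
  `TSE ID`              = c(1101, 1102),
  `TSE Sector`          = c("01", "01"),
  `English Short Name`  = c("TCC", "ACC"),
  `Open(NTD)`           = c(32.5373, 37.2642),
  `High(NTD)`           = c(32.5373, 37.4442),
  `Low(NTD)`            = c(32.3038, 36.9492),
  `Close(NTD)`          = c(32.3972, 37.0392),
  `Volume(1000S)`       = c(14937, 6223),
  `Amount(NTD1000)`     = c(518751, 256522),
  `Market Cap.(NTD MN)` = c(262026, 145941)
)

# select() keeps only the listed columns and renames them in one step.
renamed <- raw %>%
  select(id = 3, name = 5, date = 2, price = 9, cap = 12)

glimpse(renamed)
```

Selecting by position is fragile if the file layout changes; selecting by the original column names would be a more defensive alternative.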

3. Select the columns id, date and price, then convert id to text and date to Date format. Also reshape the data from long to wide and show your results (Hint: you can use the dcast() or spread() function).
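
A minimal sketch of the type conversion and the long-to-wide reshape, using spread() as in the hint. The prices are made-up toy values, and the names long and wide are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy long-format data; dates are stored as numbers like in the raw file.
long <- tibble(
  id    = c(1101, 1102, 1101, 1102),
  date  = c(20240102, 20240102, 20240103, 20240103),
  price = c(10, 20, 11, 21)
)

wide <- long %>%
  mutate(
    id   = as.character(id),                                  # numeric -> text
    date = as.Date(as.character(date), format = "%Y%m%d")     # 20240102 -> Date
  ) %>%
  spread(key = id, value = price)   # one column per stock id

wide
```

spread() is superseded in tidyr; pivot_wider(names_from = id, values_from = price) should give the same layout.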


4. Show the stock ids with NA values and compute the number of NA values for each stock.
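
One way to count missing values per stock in wide data is to sum is.na() over every price column and keep the non-zero counts. This is a sketch on toy data; wide and na_counts are hypothetical names.

```r
library(dplyr)
library(tidyr)

# Toy wide-format data: one date column, one column per stock id.
wide <- tibble(
  date   = as.Date("2024-01-02") + 0:2,
  `1101` = c(10, NA, 12),
  `1102` = c(20, 21, 22),
  `1103` = c(NA, NA, 32)
)

na_counts <- wide %>%
  select(-date) %>%
  summarise(across(everything(), ~ sum(is.na(.x)))) %>%   # NA count per column
  pivot_longer(everything(), names_to = "id", values_to = "n_na") %>%
  filter(n_na > 0)                                        # stocks with any NA

na_counts
```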


5. Replace NA values with the closest available stock prices (Hint: you can use na.locf()).
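
A short sketch of na.locf() from the zoo package, which carries the last observed value forward. It assumes wide data like the toy frame below; note that with na.rm = FALSE, NAs at the start of a series stay NA because there is no earlier price to carry forward.

```r
library(zoo)

# Toy wide-format data with gaps in the middle and at the start.
wide <- data.frame(
  date   = as.Date("2024-01-02") + 0:3,
  `1101` = c(10, NA, 12, NA),
  `1103` = c(NA, NA, 32, 33),
  check.names = FALSE
)

filled <- wide
# Apply last-observation-carried-forward to every price column.
filled[-1] <- lapply(wide[-1], na.locf, na.rm = FALSE)

filled
```

Whether leading NAs should also be filled backward (for example with fromLast = TRUE) depends on how "closest available" is interpreted; this sketch only fills forward.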


6. Delete the stocks that contain NA prices in your data from Question 4. Show the updated number of rows and columns in your filtered data.

## [1] 358 937
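
Dropping every column that still contains an NA can be sketched with select(where(...)). The toy frame and the names filled and filtered are hypothetical. In the expected output above, 937 columns would be consistent with one date column plus the 936 stocks mentioned in the summary.

```r
library(dplyr)

# Toy wide-format data: stock 1103 still has NAs, stock 1101 does not.
filled <- tibble(
  date   = as.Date("2024-01-02") + 0:3,
  `1101` = c(10, 10, 12, 12),
  `1103` = c(NA, NA, 32, 33)
)

# Keep only columns without any NA (the date column has none, so it stays).
filtered <- filled %>%
  select(where(~ !anyNA(.x)))

dim(filtered)
```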

7. Convert the data from Question 6 into time series data (xts) (Hint: you can use tk_xts()). Then calculate daily rates of return (Hint: use Return.calculate() and compute discrete returns). Delete the first row and show the first five days of returns for the first five stocks.

##                    1101         1102         1103         1104         1108
## 2024-01-03 -0.014408653 -0.012149290 -0.007958236 -0.006733735 -0.003160781
## 2024-01-04  0.000000000  0.012298711  0.000000000 -0.013558772  0.003170803
## 2024-01-05  0.004384536 -0.001214929  0.002674026  0.010306896  0.000000000
## 2024-01-08 -0.002909225  0.004865628  0.002666895 -0.003399291  0.006328664
## 2024-01-09 -0.005841680 -0.007263102 -0.002659801 -0.013655209 -0.025155457
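
A discrete (simple) return is the price today divided by the price yesterday, minus one. The sketch below applies tk_xts() and Return.calculate() to toy prices; the values and the names prices, prices_xts and daily_ret are hypothetical.

```r
library(tibble)
library(timetk)
library(PerformanceAnalytics)

# Toy wide-format prices with round numbers so the returns are easy to check.
prices <- tibble(
  date   = as.Date(c("2024-01-02", "2024-01-03", "2024-01-04")),
  `1101` = c(100, 110, 99),
  `1102` = c(50, 50, 55)
)

# Convert to xts: every column except date becomes data, date becomes the index.
prices_xts <- tk_xts(prices, select = -date, date_var = date)

# Discrete returns; the first row is NA (no previous price), so drop it.
daily_ret <- Return.calculate(prices_xts, method = "discrete")[-1, ]

daily_ret
```

On the real data, something like daily_ret[1:5, 1:5] would show the first five days for the first five stocks.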

8. Compute monthly returns. Delete the first row and show the first five months of returns for the first five stocks (Hint: you can use to.period() or to.monthly()).

##                    1101         1102         1103        1104         1108
## 2024-02-29  0.007852356  0.008727188 -0.008404066 0.015384975  0.012860762
## 2024-03-29  0.014194342  0.008546035  0.000000000 0.003176291 -0.003157880
## 2024-04-30 -0.009273852  0.002293315 -0.018668262 0.000000000 -0.014530578
## 2024-05-31  0.004564721 -0.010715712  0.061451680 0.021021905  0.008931255
## 2024-06-28 -0.001460420  0.000000000  0.000000000 0.005641527  0.002990431
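
One possible approach to monthly returns is to keep the last price of each month with to.period() and then compute discrete returns on those month-end prices. This is a sketch on toy data, not necessarily the intended solution; the names prices_xts, month_end and monthly_ret are hypothetical.

```r
library(xts)
library(PerformanceAnalytics)

# Toy daily prices covering three month ends.
dates <- as.Date(c("2024-01-30", "2024-01-31", "2024-02-28",
                   "2024-02-29", "2024-03-28", "2024-03-29"))
prices_xts <- xts(
  cbind(`1101` = c(99, 100, 105, 110, 120, 121),
        `1102` = c(50, 50, 49, 50, 54, 55)),
  order.by = dates
)

# OHLC = FALSE keeps the last observation of each month for every column
# instead of building open/high/low/close bars.
month_end <- to.period(prices_xts, period = "months", OHLC = FALSE)

# Month-over-month discrete returns; drop the first (NA) row.
monthly_ret <- Return.calculate(month_end, method = "discrete")[-1, ]

monthly_ret
```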

9. Find the 20 largest firms by market cap at the year-end of 2024 and 2025 (the last available trading day in each year). Show the results (Hint: you can use select(), filter(), group_by(), arrange(), slice(), ungroup()).
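
The hinted verbs can be combined roughly as follows. This sketch treats "year end" as the last date present in each calendar year (an assumption, since the data stop on June 30, 2025) and picks the top 2 instead of the top 20 to keep the toy data small; caps and top are hypothetical names.

```r
library(dplyr)

# Toy long-format data: three stocks, two dates per year, made-up caps.
caps <- tibble(
  id   = rep(c("1101", "1102", "1103"), times = 4),
  date = rep(as.Date(c("2024-12-30", "2024-12-31",
                       "2025-06-27", "2025-06-30")), each = 3),
  cap  = c(100, 200, 300,
           110, 210,  90,
             1,   2,   3,
           500,  50, 400)
)

top <- caps %>%
  mutate(year = as.integer(format(date, "%Y"))) %>%
  group_by(year) %>%
  filter(date == max(date)) %>%               # last available day of each year
  arrange(desc(cap), .by_group = TRUE) %>%    # largest cap first within a year
  slice(1:2) %>%                              # use 1:20 on the real data
  ungroup()

top
```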


Summary

This analysis demonstrates:

  1. Data Import: Successfully loaded 337,347 rows of Taiwan stock market data
  2. Data Cleaning: Handled missing values and restructured data
  3. Data Transformation: Converted between long/wide formats and to time series
  4. Financial Calculations: Computed daily and monthly returns
  5. Market Analysis: Identified top companies by market capitalization

Dataset Coverage:

  • Time Period: January 2, 2024 to June 30, 2025
  • Stocks: 936 companies (after removing 10 with excessive NAs)
  • Trading Days: 358 days