Statistics
Machine learning
Data science
Artificial intelligence
Data mining
Pattern recognition
Deep learning
Update on 2025-09-25
| Player | Years | Hits | Salary |
|---|---|---|---|
| -Alan Ashby | 14 | 81 | 475.0 |
| -Alvin Davis | 3 | 130 | 480.0 |
| -Andre Dawson | 11 | 141 | 500.0 |
| -Andres Galarraga | 2 | 87 | 91.5 |
| -Alfredo Griffin | 11 | 169 | 750.0 |
| -Al Newman | 2 | 37 | 70.0 |
| -Argenis Salazar | 3 | 73 | 100.0 |
| -Andres Thomas | 2 | 81 | 75.0 |
Model setup
\[ \mathbf y = F_m(\mathbf X; \boldsymbol \beta) \rightarrow \mathbf X \boldsymbol \beta. \]
Training
\[ \hat{\boldsymbol \beta} = \underset{\boldsymbol \beta}{\rm argmin}\ L(\mathbf X_{\rm train}, \mathbf y_{\rm train}, \boldsymbol \beta) \rightarrow \underset{\boldsymbol \beta}{\rm argmin}\ \| \mathbf y_{\rm train} - \mathbf X_{\rm train} \boldsymbol \beta \|_2^2. \]
Prediction
\[ \hat{\mathbf y} = F_m(\mathbf X_{\rm test}; \hat{\boldsymbol \beta}) \rightarrow \mathbf X_{\rm test} \hat{\boldsymbol \beta}. \]
Evaluation
\[ \mathrm{MSE\ (mean\ squared\ error)} = \frac{1}{n_{\rm test}} \sum_{i=1}^{n_{\rm test}} (\hat y_i - y_{{\rm test},i})^2, \\ \mathrm{ACC\ (accuracy)} = \frac{\text{correctly predicted samples}}{\text{total samples}}. \]
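The four steps above can be sketched in NumPy for the linear least-squares case on the right-hand side of each arrow. This is only an illustrative sketch: the synthetic data, the coefficient vector `beta_true`, the noise level and the 150/50 split are assumptions made for the example, not part of the original material.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Synthetic data (assumed for illustration): y = X beta + noise
n, p = 200, 3
X = rng.standard_normal((n, p))
beta_true = np.array([1.5, -2.0, 0.5])  # hypothetical "true" coefficients
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Split into training and test sets
X_train, X_test = X[:150], X[150:]
y_train, y_test = y[:150], y[150:]

# Training: minimize ||y_train - X_train beta||_2^2
beta_hat, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

# Prediction: y_hat = X_test beta_hat
y_hat = X_test @ beta_hat

# Evaluation: mean squared error on the test set
mse = np.mean((y_hat - y_test) ** 2)
print("beta_hat:", beta_hat)
print("test MSE:", mse)
```

`np.linalg.lstsq` is used here rather than forming the normal equations explicitly; either route minimizes the same squared-error loss.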
Python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import sklearn
R
library(data.table)
library(dplyr)
library(ggplot2)
library(mlr3)
## Platform: macOS-15.7-arm64-arm-64bit
## Platform: Apple M4 Pro
Python
import platform
print("Python version:", platform.python_version())
## Python version: 3.12.10
R
print(R.version$version.string)
## [1] "R version 4.5.1 (2025-06-13)"
Python
import time
a = np.random.standard_normal([5000, 5000])
start = time.time()
b = a @ a.T
end = time.time()
time_ela = end - start
print(time_ela)
## 0.19492793083190918
R
a <- matrix(rnorm(5000 * 5000), 5000)
system.time(b <- tcrossprod(a))
##    user  system elapsed
##   0.350   0.014   0.204
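A single timed run like the ones above can be noisy. A minimal sketch of a repeat-and-take-the-minimum timer follows. The helper name `best_time`, the repeat count and the smaller 500 x 500 matrix are assumptions chosen to keep the example quick, not something taken from the original benchmark.

```python
import time
import numpy as np

def best_time(fn, repeats=3):
    """Return the shortest wall-clock time in seconds over several calls of fn()."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()  # monotonic, high-resolution timer
        fn()
        times.append(time.perf_counter() - start)
    return min(times)

# Smaller than the 5000 x 5000 case so the sketch runs in well under a second
a = np.random.standard_normal((500, 500))
elapsed = best_time(lambda: a @ a.T)
print(f"best of 3: {elapsed:.6f} s")
```

Taking the minimum reduces the influence of one-off effects such as cache warm-up or background load, at the cost of reporting an optimistic figure.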
Python
import torch
a = torch.randn([5000, 5000], device = torch.device('mps'))
t1 = time.time()
b_t = a @ a.T
# Caveat: MPS kernels are launched asynchronously; without torch.mps.synchronize() before reading the clock, this mostly measures dispatch time.
print('torch @:', time.time() - t1)
## torch @: 0.0041158199310302734
import mlx.core as mx
a = mx.random.normal([5000, 5000], stream = mx.gpu)
t1 = time.time()
b_m = a @ a.T
mx.eval(b_m)
print('mlx @:', time.time() - t1)
## mlx @: 0.06186795234680176
Python
start = time.time()
inv = np.linalg.inv(b)
end = time.time()
time_ela = end - start
print(time_ela)
## 0.797569990158081
R
system.time(inv <- solve(b))
##    user  system elapsed
##   1.655   0.041   1.039
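Timing says nothing about whether the computed inverse is accurate. One possible sanity check, sketched below, is to confirm that the product of the matrix and its inverse is close to the identity. The 200 x 400 shape is an assumption made here so that `b` is small and well conditioned; the benchmark above uses a square 5000 x 5000 matrix, for which the residual can be noticeably larger.

```python
import numpy as np

rng = np.random.default_rng(0)

# Rectangular factor so that b = a a^T is symmetric and well conditioned (assumed setup)
a = rng.standard_normal((200, 400))
b = a @ a.T

inv = np.linalg.inv(b)

# Largest absolute deviation of b @ inv from the identity matrix
resid = np.max(np.abs(b @ inv - np.eye(200)))
print("max |b @ inv - I|:", resid)
```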
Python
start = time.time()
uv = np.linalg.eigh(b)
end = time.time()
time_ela = end - start
print(time_ela)
## 7.239617109298706
R
system.time(uv <- eigen(b))
##    user  system elapsed
##  10.713   0.128   9.472
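For a symmetric matrix, `np.linalg.eigh` returns eigenvalues in ascending order together with orthonormal eigenvectors in the columns of the second output. The sketch below checks both properties and rebuilds the matrix from its decomposition. The matrix size and random seed are assumptions chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small symmetric matrix (assumed size; the benchmark above uses 5000 x 5000)
a = rng.standard_normal((200, 400))
b = a @ a.T

# w: eigenvalues in ascending order; U: eigenvectors as columns
w, U = np.linalg.eigh(b)

# Reconstruct b = U diag(w) U^T; scaling the columns of U by w avoids building diag(w)
recon = (U * w) @ U.T
recon_err = np.max(np.abs(recon - b))

# Eigenvectors of a symmetric matrix should be orthonormal: U^T U = I
orth_err = np.max(np.abs(U.T @ U - np.eye(200)))

print("reconstruction error:", recon_err)
print("orthogonality error:", orth_err)
```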
Python
from nycflights13 import flights
pd.set_option('display.max_columns', 20)
print(flights.iloc[0:7, 0:10])
##    year  month  day  dep_time  sched_dep_time  dep_delay  arr_time  \
## 0  2013      1    1     517.0             515        2.0     830.0
## 1  2013      1    1     533.0             529        4.0     850.0
## 2  2013      1    1     542.0             540        2.0     923.0
## 3  2013      1    1     544.0             545       -1.0    1004.0
## 4  2013      1    1     554.0             600       -6.0     812.0
## 5  2013      1    1     554.0             558       -4.0     740.0
## 6  2013      1    1     555.0             600       -5.0     913.0
##
##    sched_arr_time  arr_delay carrier
## 0             819       11.0      UA
## 1             830       20.0      UA
## 2             850       33.0      AA
## 3            1022      -18.0      B6
## 4             837      -25.0      DL
## 5             728       12.0      UA
## 6             854       19.0      B6
R
kable(head(py$flights[, 1 : 10], 7), align = "c")
| year | month | day | dep_time | sched_dep_time | dep_delay | arr_time | sched_arr_time | arr_delay | carrier |
|---|---|---|---|---|---|---|---|---|---|
| 2013 | 1 | 1 | 517 | 515 | 2 | 830 | 819 | 11 | UA |
| 2013 | 1 | 1 | 533 | 529 | 4 | 850 | 830 | 20 | UA |
| 2013 | 1 | 1 | 542 | 540 | 2 | 923 | 850 | 33 | AA |
| 2013 | 1 | 1 | 544 | 545 | -1 | 1004 | 1022 | -18 | B6 |
| 2013 | 1 | 1 | 554 | 600 | -6 | 812 | 837 | -25 | DL |
| 2013 | 1 | 1 | 554 | 558 | -4 | 740 | 728 | 12 | UA |
| 2013 | 1 | 1 | 555 | 600 | -5 | 913 | 854 | 19 | B6 |