How has the transistor count (in millions) of CPUs and GPUs changed over time, and how does this reflect Moore’s Law?
Code
# Packages used in this chunk
library(tidyverse)  # read_csv(), dplyr verbs
library(lubridate)  # month(), year()
library(tsibble)    # yearmonth()
library(plotly)     # plot_ly(), layout()

# Importing the dataset
chip_data <- read_csv("/Users/parsakeyvani/Desktop/Adv Data viz/spring-2024-lab5-plotly-keyvanip/data/chip_dataset.csv")

# Data cleaning and manipulation to answer the first question
q1_data <- chip_data %>%
  select(`Release Date`, Type, `Transistors (million)`) %>%
  mutate(`Release Date` = as.Date(`Release Date`, "%m/%d/%y")) %>%
  # Keep December releases before 2021
  filter(month(`Release Date`) == 12 & year(`Release Date`) < 2021) %>%
  mutate(Date = yearmonth(`Release Date`)) %>%
  group_by(Date, Type) %>%
  summarise(Transistors = round(mean(as.integer(`Transistors (million)`))),
            .groups = "drop") %>%
  # Compute growth within each Type in date order, so that lag() refers to
  # the previous date of the same Type rather than to the other Type
  group_by(Type) %>%
  arrange(Date, .by_group = TRUE) %>%
  mutate(Growth = round(Transistors / lag(Transistors), 2)) %>%
  ungroup()

# Create the Plotly plot
plot_ly(data = q1_data, x = ~as.Date(Date), y = ~Transistors,
        type = 'scatter', mode = 'lines+markers',
        color = ~Type, hoverinfo = 'text',
        text = ~paste('Date:', Date,
                      "<br>Growth (if val = 2, matches Moore's Law):", Growth)) %>%
  layout(title = 'Transistors Growth Over Time by Type',
         xaxis = list(title = 'Date'),
         yaxis = list(title = 'Transistors (Million)'))
Figure 1: This graph illustrates Moore’s Law: the number of transistors on an integrated circuit has doubled approximately every two years. The trend can be checked interactively by hovering over a data point, which shows its date and a ‘Growth’ value, the ratio of that point’s average transistor count to the previous point’s for the same chip type.
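To relate a hover value to the two-year doubling claim, the ‘Growth’ ratio can be converted into an implied doubling time. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it assumes the time gap between two consecutive data points is known (for December-only data with no missing years, that gap would be one year).

```python
import math


def doubling_time_years(growth_ratio: float, years_between: float = 1.0) -> float:
    """Implied doubling time, given a growth ratio observed over `years_between` years.

    Hypothetical helper: assumes exponential growth between the two points.
    """
    if growth_ratio <= 1.0:
        raise ValueError("growth_ratio must be > 1 for a finite doubling time")
    return years_between * math.log(2) / math.log(growth_ratio)


def moores_law_ratio(years_between: float, doubling_period: float = 2.0) -> float:
    """Growth ratio expected over `years_between` years if counts double every `doubling_period` years."""
    return 2.0 ** (years_between / doubling_period)


# A ratio of 2 between points two years apart is a two-year doubling time.
two_year_gap = doubling_time_years(2.0, years_between=2.0)

# For points one year apart, two-year doubling corresponds to a ratio of sqrt(2), about 1.41.
one_year_ratio = moores_law_ratio(1.0)
```

Under these assumptions, a Growth value of 2 matches a two-year doubling only when the two points are two years apart; for yearly points the matching ratio is about 1.41.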
Plot 2
Question
How does the base clock speed (Freq in GHz) correlate with the transistor count for CPUs and GPUs, considering the data available?
Code
# Additional package used in this chunk
library(tidymodels)  # linear_reg(), set_engine(), set_mode(), fit()

# Data cleaning and manipulation to answer the second question
q2_data <- chip_data %>%
  select(`Release Date`, Type, `Transistors (million)`, `Freq (GHz)`) %>%
  mutate(`Release Date` = as.Date(`Release Date`, "%m/%d/%y")) %>%
  filter(month(`Release Date`) == 12 & year(`Release Date`) < 2021) %>%
  mutate(Date = yearmonth(`Release Date`)) %>%
  group_by(Date, Type) %>%
  # Note: as.integer() truncates any fractional part before averaging
  summarise(Transistors = round(mean(as.integer(`Transistors (million)`))),
            Freq = round(mean(as.integer(`Freq (GHz)`))),
            .groups = "drop") %>%
  drop_na()

# Separate data by Type and fit a linear model for each
models <- q2_data %>%
  split(.$Type) %>%
  map(~ linear_reg() %>%
        set_engine("lm") %>%
        set_mode("regression") %>%
        fit(Transistors ~ Freq, data = .x))

# Prepare a sequence of Freq values for predictions for each Type
freq_range <- range(q2_data$Freq)
x_range <- seq(from = freq_range[1], to = freq_range[2], length.out = 100)

# Predict using the models for each Type and prepare for plotting
predictions <- map2(models, names(models), ~{
  new_data <- tibble(Freq = x_range)
  predicted <- predict(.x, new_data) %>%
    bind_cols(new_data, .) %>%
    mutate(Type = .y)
}) %>%
  bind_rows()

predictions <- predictions %>%
  mutate(Type = ifelse(Type == "CPU", "Regression Fit CPU", "Regression Fit GPU")) %>%
  mutate(Line_Color = ifelse(Type == "Regression Fit CPU", "blue", "grey"))

# Create the initial scatter plot
fig <- plot_ly(data = q2_data, x = ~Freq, y = ~Transistors,
               type = 'scatter', mode = 'markers',
               color = ~Type, colors = c("CPU" = "blue", "GPU" = "grey"),
               alpha = 0.65) %>%
  layout(title = 'Clock Speed (Freq in GHz) Correlation with the Transistors \nCount for CPUs and GPUs',
         xaxis = list(title = 'Freq (GHz)'),
         yaxis = list(title = 'Transistors (Million)'))

# Add the regression lines for each Type with specified colors
fig <- fig %>%
  add_lines(data = predictions, x = ~Freq, y = ~.pred,
            line = list(color = ~Line_Color), name = ~Type)

fig
Figure 2: This plot shows the correlation between base clock speed and transistor count for CPUs and GPUs. Both chip types exhibit a positive correlation between Freq and the number of transistors; however, the fitted regression line is steeper for GPUs, meaning their transistor count rises more per unit of clock speed.
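The comparison of the two regression lines can be made concrete by reading off the fitted slope for each chip type. The following is a minimal sketch in Python rather than the tidymodels workflow used for the figure: `fit_slopes` is a hypothetical helper, it uses an ordinary least-squares line from `numpy.polyfit`, and the numbers in `toy` are made up for illustration and are not taken from the chip dataset.

```python
import numpy as np
import pandas as pd


def fit_slopes(df, x="Freq", y="Transistors", group="Type"):
    """Fit y = slope * x + intercept separately for each group (hypothetical helper)."""
    fits = {}
    for name, part in df.groupby(group):
        # polyfit with deg=1 returns [slope, intercept] of the least-squares line
        slope, intercept = np.polyfit(part[x], part[y], deg=1)
        fits[name] = (float(slope), float(intercept))
    return fits


# Made-up values, chosen only so that the two groups have different slopes
toy = pd.DataFrame({
    "Type": ["CPU"] * 4 + ["GPU"] * 4,
    "Freq": [1.0, 2.0, 3.0, 4.0] * 2,
    "Transistors": [100, 200, 300, 400, 100, 400, 700, 1000],
})

fits = fit_slopes(toy)
for chip_type, (slope, intercept) in fits.items():
    print(f"{chip_type}: slope={slope:.1f}, intercept={intercept:.1f}")
```

A larger slope for one type is what a "steeper" line in the figure corresponds to; it says nothing by itself about how tightly the points follow the line.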
How do CPUs and GPUs differ in terms of die size and process size for the same time period?
Code
import numpy as np
import pandas as pd
import plotly.express as px

# Moving the cleaned data from R to Python
q3_data_py = r.q3_data
q3_data_py = pd.DataFrame(q3_data_py)

# Creating the line plot
fig = px.line(q3_data_py,
              x='Date',
              y='die_size',
              color='Type',
              facet_col="Type",
              title='Die Size Over Time (quarterly) by Type',
              labels={'die_size': 'Die Size', 'Date': 'Date'},
              template='plotly_white',
              markers=True)  # Adding markers to points on the line for clarity

fig.show()
Figure 3: This graph tracks the quarterly changes in die size for both CPUs and GPUs over two decades. The plot reveals fluctuations and a general upward trend in the physical size of the processor dies, which reflects the technological advancements and manufacturing capabilities in semiconductor fabrication during the observed period. Die sizes vary considerably, with GPUs generally exhibiting larger dies than CPUs, which reflects their different roles and performance requirements in computing.
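The construction of `q3_data` is not shown in this section, so the following is only a sketch of how a quarterly series like the one plotted could be built in pandas. The helper name `quarterly_mean`, the choice of the mean as the aggregate, and the toy rows are all assumptions; only the column names `Release Date`, `Type`, `Date`, and `die_size` are taken from the code above.

```python
import pandas as pd


def quarterly_mean(df, date_col="Release Date", value_col="die_size", group_col="Type"):
    """Average `value_col` per calendar quarter and group (hypothetical helper)."""
    out = df.copy()
    # Map each release date to the first day of its calendar quarter
    out["Date"] = pd.to_datetime(out[date_col]).dt.to_period("Q").dt.start_time
    return out.groupby(["Date", group_col], as_index=False)[value_col].mean()


# Made-up rows for illustration only
toy = pd.DataFrame({
    "Release Date": ["2020-01-15", "2020-02-20", "2020-04-01", "2020-01-10"],
    "Type": ["CPU", "CPU", "CPU", "GPU"],
    "die_size": [100.0, 200.0, 300.0, 500.0],
})

quarterly = quarterly_mean(toy)
print(quarterly)
```

The two CPU rows from the first quarter collapse into a single averaged point, which is why a quarterly line can look smoother than the raw releases.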
Plot 2
Question
Which vendor has shown the most significant improvements in chip technology (specifically process size) over the years and what are the trends?
Code
# Moving the cleaned data from R to Python
q4_data_py = r.q4_data
q4_data_py = pd.DataFrame(q4_data_py)

# Convert the Date column to datetime type for better plotting
q4_data_py['Date'] = pd.to_datetime(q4_data_py['Date'])

# Create the line plot
fig = px.line(q4_data_py,
              x='Date',
              y='Process_size',
              color='Vendor',
              facet_col='Vendor',
              facet_col_wrap=2,
              title='Process Size Over Time by Vendor',
              labels={'Process_size': 'Process Size (nm)', 'Date': 'Date'},
              template='plotly_white',
              hover_data=['Date'],
              markers=True)

fig.show()
Figure 4: This graph displays the progressive reduction in semiconductor process size (measured in nanometers) for four major vendors: ATI, NVIDIA, Intel, and AMD. Each line represents a vendor’s trajectory in shrinking the process size of its chips, which is a key indicator of technological advancement in semiconductor manufacturing. The trend lines show that all vendors have reduced their process sizes over the years, with trajectories of varying steepness. NVIDIA reaches the smallest process size as of the latest date, at 5 nm. Intel and AMD are close behind, both at 6 nm. ATI has the largest process size, at 40 nm, but this is mainly because our data for ATI stops in 2012, so the comparison is not like-for-like: we have more recent data for the other three vendors.
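Because the vendors are observed over different time spans, it can help to tabulate each vendor's first and last observation side by side before comparing end points. The following is a minimal sketch with a hypothetical helper, `vendor_summary`; the column names `Date`, `Vendor`, and `Process_size` match the plotting code above, while the vendor names and values in `toy` are made up for illustration.

```python
import pandas as pd


def vendor_summary(df):
    """First and last observed process size per vendor (hypothetical helper)."""
    ordered = df.sort_values("Date")
    grouped = ordered.groupby("Vendor")
    summary = pd.DataFrame({
        "first_date": grouped["Date"].first(),
        "last_date": grouped["Date"].last(),
        "first_nm": grouped["Process_size"].first(),
        "last_nm": grouped["Process_size"].last(),
    })
    # How many times smaller the last process size is than the first
    summary["shrink_factor"] = summary["first_nm"] / summary["last_nm"]
    return summary


# Made-up values: VendorA's data ends years earlier than VendorB's
toy = pd.DataFrame({
    "Date": pd.to_datetime(["2005-12-01", "2012-12-01", "2005-12-01", "2020-12-01"]),
    "Vendor": ["VendorA", "VendorA", "VendorB", "VendorB"],
    "Process_size": [90.0, 40.0, 90.0, 5.0],
})

summary = vendor_summary(toy)
print(summary)
```

Showing `last_date` next to `last_nm` makes it visible when a vendor's apparently large process size is a consequence of its data ending earlier, rather than of slower progress.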