class: right, middle, inverse, title-slide

.title[
# Education in an Age of Generative AI
]
.subtitle[
##
]
.author[
### Saurabh Khanna & Sebastiaan den Broeder
TLC-FMG AI Team
]
.date[
### Feb 11, 2025
]

---
class: middle

<style type="text/css">
.gray { color: lightgray; }
</style>

# Agenda

- Where we are
- Risks
- Innovation

---
class: middle, center, inverse

# Where we are

---
class: middle

# Smart AIs are now omnipresent

- End of 2022: ChatGPT launched
- End of 2023: GPT-4 publicly available
- End of 2024<sup><span style="color: #10e313;">1</span></sup>:
  + 🇺🇸: OpenAI’s GPT-4o, Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5, <span style="color: #10e313;">Llama 3.2 from Meta</span>, xAI’s Grok 2, Amazon’s Nova
  + 🇨🇳: <span style="color: #10e313;">Alibaba’s Qwen, DeepSeek’s R1, and 01.ai’s Yi</span>
  + 🇫🇷: <span style="color: #10e313;">Mistral</span>

.footnote[<span style="color: #10e313;"><sup>1</sup> Open-weight models</span>]

---
class: middle

# How (text) models think

<img src="chat1.jpg" width="75%" style="display: block; margin: auto;" />

---
class: middle

# How (text) models think

<img src="chat3.png" width="75%" style="display: block; margin: auto;" />

---
class: middle

# Smarter AIs are emerging

![](o3.jpeg)

---
class: middle

# In 2025

- Diminishing returns in text improvements `\(\to\)` already quite good
- Massive improvements in image and video generation

![](2025.png)

---

<video autoplay muted loop style="position:absolute; top:0; left:0; width:100%; height:100%;">
<source src="sora.mp4" type="video/mp4">
Your browser might not support the HTML video tag.
</video>

---
class: middle, center, inverse

# Where we are _as educators_

---
class: middle

# AI can easily ace most traditional assessments

<img src="exam1.png" width="80%" style="display: block; margin: auto;" />

---
class: middle

# Wide and rapid adoption

<img src="exam2.jpg" width="80%" style="display: block; margin: auto;" />

---
class: middle, center

# The Two Illusions<sup>1</sup>

.footnote[
1: [(Mollick & Mollick, 2024)](https://dx.doi.org/10.2139/ssrn.4802463)
]

---
class: middle

# The Detection Illusion

- Believing we can still easily detect AI use, and can therefore prevent it from being used in schoolwork
- Leads us to rely on likely-outdated assessments, believing we can easily spot AI-generated work

---
class: middle, no-animation

# The Detection Illusion

The technology has far surpassed our ability to consistently identify it.

- Teachers could not, even though they thought they could [(Fleckenstein et al., 2024)](https://www.sciencedirect.com/science/article/pii/S2666920X24000109)

--

- Editors at top linguistics journals could not [(Casal & Kessler, 2023)](https://doi.org/10.1016/j.rmal.2023.100068)

--

- Detectors are biased against non-native English speakers [(Liang et al., 2023)](https://arxiv.org/abs/2304.02819)

--

- GPT-4 was judged to be _more human than human_ [(Rathi et al., 2024)](https://arxiv.org/pdf/2407.08853)

--

- GPT-4 gets it wrong 95% of the time [(Wu et al., 2024)](https://arxiv.org/pdf/2310.14724)

---
class: middle

# The Knowledge Illusion

- Perceived understanding without genuine comprehension
- Generative AI can produce convincing yet superficial responses
- Bigger difference between homework and exam scores
- Extremely easy to outsource any kind of mental effort

> “There is no reason to believe that the students are aware that their homework strategy lowers their exam score... they make the commonsense inference that any study strategy that raises their homework quiz score raises their exam score as well.
[(Glass & Kang, 2020)](https://www.tandfonline.com/doi/abs/10.1080/01443410.2020.1802645)”

---
class: middle, center, inverse

# Risks

---
class: middle, no-animation

# Risks: The Flaky Student's Ideal Course

A student looking to game the system would want:

- Many take-home Canvas quizzes
- ChatGPT/Gemini, NotebookLM

--

- Short take-home writing assignments requiring limited high-level skills

--

- A teacher who isn’t aware of the capabilities of AI

--

- **No** proper knowledge testing (for instance through in-person exams)

--

- **No** long essays or papers (*these require more prompting/effort...*)

--

- An assessment structure that **doesn’t** offset risky assessments with 'insulated' ones

---
class: middle, center

# [🔬 Course Risk Assessment](https://tlc-ai-risk.streamlit.app/)

---
class: middle, no-animation

# Risks: What we *can* focus on

Remember the **Detection Illusion**: trying to spot AI usage in writing is a losing battle (and punishment depends on student honesty)

- This allows us to focus on bigger questions
  - Which skills and assessment forms are still *necessary* and relevant?
  - Which skills can we responsibly *outsource/de-emphasize*?

--

- Intrinsic motivation of the student
- Making courses more engaging and directly relevant
- Working **with** GenAI to raise the level of education

---
class: middle, center, inverse

# Innovation

---
class: middle, center

# [🎓 UvA AI Chatbot](https://uva-aichat.azurewebsites.net/)

---
class: middle

# Formative Assessment/Feedback

![](feedback.jpeg)

---
class: middle

# Local/Private chatbots

Options: [Jan.AI](https://jan.ai/), [LM Studio](https://lmstudio.ai/), [GPT4All](https://www.nomic.ai/gpt4all), [LocalAI](https://localai.io/)

<center>
<video autoplay muted loop style="width:60%; height:60%;">
<source src="jan.mp4" type="video/mp4">
Your browser might not support the HTML video tag.
</video>
</center>

---
class: middle

# Pilots at TLC-FMG

.pull-left[

## 11 ongoing pilots

Topics include:

- Creating exam questions
- Updating case study scenarios by context
- Creating group presentations
- Checking LLM responses to assignments
- Using LLMs to give feedback on and grade code

]

.pull-right[

You can sign up:

![](pilots.png)

]

---
class: inverse, middle

# 💬 Thoughts/Questions