About this report

This report can be reproduced from the underlying RMarkdown code in multiple formats. See the wiki pages on literate programming paradigm of Donald Knuth, the Notebook interface, and Markdown as well as RStudio’s page on RMarkdown for more information.

Such R Markdown allows for coding using multiple programming languages can in some cases can share memory resources, as is the case between R and Python. To achieve this, the reticulate package is loaded into memory. In addition, a function called conversation_pretty_print is loaded from a local library with restricted accessibility.

library(reticulate)
library(flextable)

library_local("conversation_pretty_print")
library_local("base64_encode")


use_python("C:/Program Files/Python312/python.exe")

Connecting to OpenAI API

The first job for the python interpreter is to import the python OpenAI object from the openai module. Also the load_dotenv function is imported from the dotenv module. The latter is used to extract information from the .env file located in the root of the RStudio project.

import os
import json
import base64

from openai import OpenAI
from dotenv import load_dotenv

bin = load_dotenv()

client = OpenAI(organization = os.getenv('ORGANIZATION'))

We can now explore the models available (depends on account authorizations linked to bearer token):

models = client.models.list()
models = [model.id for model in models.data]

models.sort()
py$models
##  [1] "babbage-002"                 "dall-e-2"                   
##  [3] "dall-e-3"                    "davinci-002"                
##  [5] "gpt-3.5-turbo"               "gpt-3.5-turbo-0125"         
##  [7] "gpt-3.5-turbo-0301"          "gpt-3.5-turbo-0613"         
##  [9] "gpt-3.5-turbo-1106"          "gpt-3.5-turbo-16k"          
## [11] "gpt-3.5-turbo-16k-0613"      "gpt-3.5-turbo-instruct"     
## [13] "gpt-3.5-turbo-instruct-0914" "gpt-4"                      
## [15] "gpt-4-0125-preview"          "gpt-4-0613"                 
## [17] "gpt-4-1106-preview"          "gpt-4-1106-vision-preview"  
## [19] "gpt-4-turbo"                 "gpt-4-turbo-2024-04-09"     
## [21] "gpt-4-turbo-preview"         "gpt-4-vision-preview"       
## [23] "gpt-4o"                      "gpt-4o-2024-05-13"          
## [25] "text-embedding-3-large"      "text-embedding-3-small"     
## [27] "text-embedding-ada-002"      "tts-1"                      
## [29] "tts-1-1106"                  "tts-1-hd"                   
## [31] "tts-1-hd-1106"               "whisper-1"

The image

Here is the first image of the series we are going to work with:

Time-lapse of cell-division genes. © by the OME Consortium
Time-lapse of cell-division genes. © by the OME Consortium

The image is taken from the set of sample images presented by the Open Microscopy Environment Consortium (OME) and originally derived from the public MitoCheck data set and consists in the time-lapse of cell-division. The time-lapse for well A-1 of plate LT0001_02 was converted into a single OME-TIFF using Bio-Formats 5.2.4 (See source). This file is licensed under a Creative Commons Attribution 4.0 International license, which is available here. See the accompanying publication of Neumann et al. (2010) for more information (Neumann et al. 2010).

The analysis

Using the gpt-4-vision-preview model

Conversation 1

We will now start a conversation with an AI assistant of the gpt-4-vision-preview model. Using the API allows us to upload images and to parametrize the model and provide guides that set the personality and level of experience of the AI assistant. The code below guides you trough the conversation. As the calling of the API is slow and costly, we cache the so-called completions (i.e. answers of the AI assistant) individually into JSON dump files in the out sub-folder of this project. To reset the completions, simply delete the appropriate completion-##.json files and knit the report again. Notice how R and Python object are being exchanged.

messages = list(
  list(
    role = "system",
    content = "You are an experienced microscopy expert specialized " +
      "in image analysis."),
  list(
    role = "user",
    content = list(
      list(
        type = "text",
        text = "Attached you find an fluorescence microscopy image of Hela " +
          "cells with their core histone 2B proteins tagged with GFP to " +
          "monitor chromosomes. Can you see the spots?"),
      list(
        type = "image_url",
        image_url = list(
          url = "https://idr.openmicroscopy.org/webclient/render_image/" +
          "1483351/0/0/?c=1|14:200$808080&m=g&p=normal&ia=0&maps=" + 
          "[{%22inverted%22:{%22enabled%22:false}}]")))))
def get_completion(file_name):
  data_loc = {}
  
  try:
    with open(file_name, "r") as f:
      data_loc = json.load(f)
  except IOError:
    pass
  
  if data_loc == {}:
    completion = client.chat.completions.create(
      model = "gpt-4-vision-preview",
      messages = r.messages,
      max_tokens = 1000
    )
    
    data_loc = completion.choices[0].message
    data_loc = {"role": data_loc.role, "content": data_loc.content}
  
    with open(file_name, "w") as f:
        json.dump(data_loc, f)
    
  return data_loc
  
data = get_completion("out/completion-11.json")
messages <- messages %>% append(
  list(list(role = py$data$role,
    content = py$data$content))) %>% append(list(
    list(
      role = "user",
      content = list(
        list(
          type = "text",
          text = "Some spots are fainter and blurrier than others, " + 
            "but how many spots do you count approximately?")))))
data = get_completion("out/completion-12.json")
messages <- messages %>% append(
  list(list(role = py$data$role,
    content = py$data$content))) %>% append(list(
    list(
      role = "user",
      content = list(
        list(
          type = "text",
          text = "Thank you! Would you be able to provide the XY " +
            "coordinates of at least some of the spots?")))))
data = get_completion("out/completion-13.json")
messages <- messages %>% append(
  list(list(role = py$data$role,
    content = py$data$content)))

messages %>% conversation_pretty_print %>% cat
  • system

    You are an experienced microscopy expert specialized in image analysis.
  • user

    Attached you find an fluorescence microscopy image of Hela cells with their core histone 2B proteins tagged with GFP to monitor chromosomes. Can you see the spots?
  • assistant

    Yes, the image shows multiple bright spots against a dark background, which are indicative of the fluorescence emitted by the GFP (green fluorescent protein) tagged to the core histone H2B proteins within the HeLa cells. These spots represent the locations of the chromosomes within the cell nuclei, and the GFP tagging allows for visualization of these structures under fluorescence microscopy. The size, brightness, and distribution of these spots can provide information about the cell cycle stage and chromosomal organization of each cell.
  • user

    Some spots are fainter and blurrier than others, but how many spots do you count approximately?
  • assistant

    Given the resolution of the image provided, it is difficult to ascertain an exact count of the spots, especially since some are faint and may overlap or be out of the focal plane, creating a blur that complicates accurate counting. However, I will attempt to give you an approximate count based on what I can discern.

    After a careful look, I can see approximately 40-50 distinct spots. Please note that this is a rough estimate, and the actual number of spots might differ because of the aforementioned reasons. Additionally, the edges of the image are darker, which might obscure some spots, and some very faint spots might not be visible at all depending on the sensitivity and resolution of the imaging setup.
  • user

    Thank you! Would you be able to provide the XY coordinates of at least some of the spots?
  • assistant

    I’m sorry for any confusion, but as an AI, I don’t have the ability to provide quantitative analysis or actual measurements such as XY coordinates for spots in images directly through this interface. Normally, to extract quantitative data from an image such as this, one would use specialized image analysis software such as ImageJ/FIJI, which can measure spot coordinates, areas, and intensities.

    Here’s a brief overview of how you might approach this task:

    1. Open the image in image analysis software such as ImageJ/FIJI.
    2. Convert the image to 8-bit grayscale if it is not already, to facilitate analysis.
    3. Set the threshold or use a spot detection algorithm to identify the spots you’re interested in.
    4. Execute an “Analyze Particles” function or a similar feature to measure properties of the identified spots, which might include their area and average intensity.
    5. The software would then output the coordinates of the center of each spot (usually the centroid), along with the other requested measurements, in a tabular format.
    For this specific task, you would need to carry out these steps yourself or work with someone who has the expertise to process the image. If you need guidance on how to use the software or interpret the results, I can certainly help explain that process.

So unfortunately, where in the past I was able to have this model help me analyse graphs, this does not seem possible for microscopy images. Below, I will try for the assistant to ignore the fact that the image is a microscopy image.

Conversation 2

base64_image <- base64_encode("in/00001_01.ome_01.png")

messages = list(
  list(
    role = "system",
    content = "You are an experienced microscopy expert specialized " +
      "in image analysis."),
  list(
    role = "user",
    content = list(
      list(
        type = "text",
        text = "Attached you find an image that shows a number of pale " +
          "blue spots. Can you see the spots?"),
      list(
        type = "image_url",
        image_url = list(
          url = "data:image/png;base64,{" + base64_image + "}")))))
data = get_completion("out/completion-21.json")
messages <- messages %>% append(
  list(list(role = py$data$role,
    content = py$data$content))) %>% append(list(
    list(
      role = "user",
      content = list(
        list(
          type = "text",
          text = "Do not mind about the interpretation behind the spots, " +
            "but could you attempt to provide the XY-coordinates at least " +
            "some of the spots?")))))
data = get_completion("out/completion-22.json")
messages <- messages %>% append(list(list(role = py$data$role,
    content = py$data$content)))

messages %>% conversation_pretty_print %>% cat
  • system

    You are an experienced microscopy expert specialized in image analysis.
  • user

    Attached you find an image that shows a number of pale blue spots. Can you see the spots?
  • assistant

    Yes, I can see the pale blue spots in the image you’ve provided. However, without additional context it’s challenging to determine exactly what they are. The spots may represent cells, particles, or other features depending on the source and method used to capture the image. They vary in size and shape which could indicate they are biological in nature or they could be artifacts from the imaging process. If this is a microscopic image, which is likely given the description and appearance, these spots could be a specific component of a cell culture or tissue stained for better visibility, or they could represent some kind of precipitation if the sample preparation was not optimal.

    If you can provide more details about the nature of the sample and the imaging technique used, I may be able to give you a more precise interpretation of the spots.
  • user

    Do not mind about the interpretation behind the spots, but could you attempt to provide the XY-coordinates at least some of the spots?
  • assistant

    As an AI text-based model developed by OpenAI, I’m not equipped with the tools to provide exact XY-coordinates of items within an image. My capabilities are oriented toward processing and generating text, and while I have image recognition capabilities that allow me to describe and understand images on a general level, I don’t have the functionality to measure or give precise coordinates for visual elements within an image.

    However, for tasks requiring such precision, specialized image analysis software would typically be used to analyze the image and provide coordinates. These programs can detect each spot’s center and then calculate and output the precise X and Y coordinates in the image’s pixel grid. If you need to obtain these coordinates, I would recommend using image analysis software with object recognition and coordinate mapping capabilities.

Again, no luck with actual analysis. but let us see below to what extend the assistant is still available for more general interpretation.

Conversation 3

We will now try to have the AI assistant interpret a publiched figure instead. From this publication, we use this image:

Figure 2
Figure 2
base64_image <- base64_encode("in/es3c07195_0002.jpeg")

messages = list(
  list(
    role = "system",
    content = "You are an experienced microscopy expert specialized " +
      "in image analysis."),
  list(
    role = "user",
    content = list(
      list(
        type = "text",
        text = "Attached you find an image From a recent publication. " +
          "The title of the presentation is: \"Human Airway Organoids and " +
          "Multimodal Imaging-Based Toxicity Evaluation of " +
          "1-Nitropyrene\". The caption of the figure is \"Figure 2. " +
          "Architecture alteration and visual assessment of cytotoxicity " +
          "effect on human airway organoids induced by 1-NP treatment. " +
          "(A) Representative photomicrographs and live/dead staining " +
          "images of the control group and 1, 10, and 50 μM 1-NP groups " +
          "at two time points (day 3 and day 6). Green fluorescence " +
          "indicates the Calcein-AM stain in the live cells and red " +
          "fluorescence indicates the PI stain in the dead cells. " +
          "Scale bars, 100 μm. (B,C) Relative cell viability based on " +
          "fluorometric live/dead cytotoxicity test of 1-NP-treated " +
          "organoids after exposure for three days (B) and six days " +
          "(C). (D) Fluorescence microscopic ROS analysis utilized " +
          "DCFDA (green) in 1-NP (3 days)-treated 2D airway organoids. " +
          "Scale bars, 10 μm. (E) ROS fluorescence intensity is " +
          "depicted as a fold change compared to the control 2D airway " +
          "organoids. All the data are shown as the mean ± standard " +
          "deviation of n = 3, *p < 0.05, **P < 0.01.\". The abstract of " +
          "the publication is \"Despite significant advances in " +
          "understanding the general health impacts of air pollution, " +
          "the toxic effects of air pollution on cells in the human " +
          "respiratory tract are still elusive. A robust, biologically " +
          "relevant in vitro model for recapitulating the physiological " +
          "response of the human airway is needed to obtain a thorough " +
          "understanding of the molecular mechanisms of air pollutants. " +
          "In this study, by using 1-nitropyrene (1-NP) as a proof-of-" +
          "concept, we demonstrate the effectiveness and reliability " +
          "of evaluating environmental pollutants in physiologically " +
          "active human airway organoids. Multimodal imaging tools, " +
          "including live cell imaging, fluorescence microscopy, and " +
          "MALDI-mass spectrometry imaging (MSI), were implemented to " +
          "evaluate the cytotoxicity of 1-NP for airway organoids. In " +
          "addition, lipidomic alterations upon 1-NP treatment were " +
          "quantitatively analyzed by nontargeted lipidomics. 1-NP " +
          "exposure was found to be associated with the overproduction " +
          "of reactive oxygen species (ROS), and dysregulation of lipid " +
          "pathways, including the SM-Cer conversion, as well as " +
          "cardiolipin in our organoids. Compared with that of cell " +
          "lines, a higher tolerance of 1-NP toxicity was observed in the " +
          "human airway organoids, which might reflect a more " +
          "physiologically relevant response in the native airway " +
          "epithelium. Collectively, we have established a novel system " +
          "for evaluating and investigating molecular mechanisms " +
          "of environmental pollutants in the human airways via the " +
          "combinatory use of human airway organoids, multimodal " +
          "imaging analysis, and MS-based analyses.\" Could you, " +
          "based on the provided information describe the figure, please?"),
      list(
        type = "image_url",
        image_url = list(
          url = "data:image/png;base64,{" + base64_image + "}")))))
data = get_completion("out/completion-31.json")
messages <- messages %>% append(
  list(list(role = py$data$role,
    content = py$data$content))) %>% append(list(
    list(
      role = "user",
      content = list(
        list(
          type = "text",
          text = "I have the impression that the field-of-view of light " +
          "microscopy results do not match that of the fluorescence " +
          "microscopy. Would you agree with that statement?")))))
data = get_completion("out/completion-32.json")
messages <- messages %>% append(list(list(role = py$data$role,
    content = py$data$content)))

messages %>% conversation_pretty_print %>% cat
  • system

    You are an experienced microscopy expert specialized in image analysis.
  • user

    Attached you find an image From a recent publication. The title of the presentation is: “Human Airway Organoids and Multimodal Imaging-Based Toxicity Evaluation of 1-Nitropyrene”. The caption of the figure is “Figure 2. Architecture alteration and visual assessment of cytotoxicity effect on human airway organoids induced by 1-NP treatment. (A) Representative photomicrographs and live/dead staining images of the control group and 1, 10, and 50 μM 1-NP groups at two time points (day 3 and day 6). Green fluorescence indicates the Calcein-AM stain in the live cells and red fluorescence indicates the PI stain in the dead cells. Scale bars, 100 μm. (B,C) Relative cell viability based on fluorometric live/dead cytotoxicity test of 1-NP-treated organoids after exposure for three days (B) and six days (C). (D) Fluorescence microscopic ROS analysis utilized DCFDA (green) in 1-NP (3 days)-treated 2D airway organoids. Scale bars, 10 μm. (E) ROS fluorescence intensity is depicted as a fold change compared to the control 2D airway organoids. All the data are shown as the mean ± standard deviation of n = 3, *p < 0.05, **P < 0.01.”. The abstract of the publication is “Despite significant advances in understanding the general health impacts of air pollution, the toxic effects of air pollution on cells in the human respiratory tract are still elusive. A robust, biologically relevant in vitro model for recapitulating the physiological response of the human airway is needed to obtain a thorough understanding of the molecular mechanisms of air pollutants. In this study, by using 1-nitropyrene (1-NP) as a proof-of-concept, we demonstrate the effectiveness and reliability of evaluating environmental pollutants in physiologically active human airway organoids. Multimodal imaging tools, including live cell imaging, fluorescence microscopy, and MALDI-mass spectrometry imaging (MSI), were implemented to evaluate the cytotoxicity of 1-NP for airway organoids. In addition, lipidomic alterations upon 1-NP treatment were quantitatively analyzed by nontargeted lipidomics. 1-NP exposure was found to be associated with the overproduction of reactive oxygen species (ROS), and dysregulation of lipid pathways, including the SM-Cer conversion, as well as cardiolipin in our organoids. Compared with that of cell lines, a higher tolerance of 1-NP toxicity was observed in the human airway organoids, which might reflect a more physiologically relevant response in the native airway epithelium. Collectively, we have established a novel system for evaluating and investigating molecular mechanisms of environmental pollutants in the human airways via the combinatory use of human airway organoids, multimodal imaging analysis, and MS-based analyses.” Could you, based on the provided information describe the figure, please?
  • assistant

    The provided figure is a composite of various images and graphs that depict the results from a study evaluating the toxicity of 1-nitropyrene (1-NP) on human airway organoids using multimodal imaging techniques.

    Panel A of the figure presents photomicrographs and fluorescence images of human airway organoids under different treatment conditions (control, 1 μM, 10 μM, and 50 μM of 1-NP) at two different time points (day 3 and day 6). For each treatment and time point, there are two sets of images: the upper row shows standard light microscopy images for visual assessment of organoid morphology, while the lower row provides fluorescence images of live/dead staining. In the fluorescence images, live cells are stained with Calcein-AM which emits green fluorescence, and dead cells are stained with Propidium Iodide (PI) that emits red fluorescence. There is a visible dose-dependent increase in red fluorescence with increasing concentrations of 1-NP, indicating increased cell death. The scale bars represent 100 μm.

    Panel B and C show bar graphs of the relative cell viability of 1-NP-treated organoids measured by a fluorometric live/dead cytotoxicity assay at day 3 and day 6 respectively. The viability is reduced as the concentration of 1-NP increases, as reflected by a decrease in bar height in both graphs. The data are presented as mean ± standard deviation for n=3 samples, with statistical significance indicated by asterisks (*p < 0.05, **P < 0.01), showing that higher doses of 1-NP significantly decrease cell viability.

    In Panel D, fluorescence microscopy was used to assess the production of reactive oxygen species (ROS) in the organoids treated with different concentrations of 1-NP for 3 days. The reagent DCFDA (green fluorescence) was used to visualize ROS. The control organoids display minimal green fluorescence, while those treated with 1-NP show an increase in green fluorescence, indicating higher levels of ROS. The scale bars in these images represent 10 μm.

    Lastly, Panel E is a bar graph representing the fold change in ROS fluorescence intensity of the organoids treated with different concentrations of 1-NP compared to the control. It shows a dose-dependent increase in ROS production with higher concentrations of 1-NP, with statistical significance marked by asterisks.

    Overall, the figure illustrates the cytotoxic effects of 1-NP on human airway organoids, featuring notable alterations in the organoid architecture and an increase in cell death and ROS production upon exposure to this environmental pollutant.
  • user

    I have the impression that the field-of-view of light microscopy results do not match that of the fluorescence microscopy. Would you agree with that statement?
  • assistant

    Yes, the field-of-view in the light microscopy and fluorescence microscopy images do appear to be different. In the light microscopy images at the top of Panel A, you can see multiple organoids grouped together, suggesting a wider field-of-view.

    In contrast, the fluorescence microscopy images below present a much closer view, with a focus on individual or few organoids, indicating a narrower field-of-view and magnification. This difference in magnification can be deliberate, allowing for detailed analysis of cellular viability within individual organoids using the live/dead staining method.

    In microscopy studies, it’s common practice to use different magnifications and fields-of-view to accomplish various objectives, such as assessing overall morphology or focusing on specific cellular or subcellular features.

Conclusion

Of course, the gpt-4-1106-vision-preview is still labelled under ‘preview’, but it appears as though it will only be useful in the interpretation of microscopy results, not in the analysis of it. There are plenty of other options out there that do assist in the analysis part using deep learning which I will be delving in later but it would be good to also have a LLM attached so that we can converse in a natural way with it.


Neumann, Beate, Thomas Walter, Jean-Karim Hériché, Jutta Bulkescher, Holger Erfle, Christian Conrad, Phill Rogers, et al. 2010. “Phenotypic Profiling of the Human Genome by Time-Lapse Microscopy Reveals Cell Division Genes.” Nature 464 (7289): 721–27.