Type 2 diabetes is partly controlled by slowing how fast the body digests starch. Three plants known to help with blood sugar (Beta vulgaris, Syzygium cumini and Terminalia arjuna) were tested for useful plant chemicals. These chemicals were then checked against two starch-digesting enzymes, alpha-amylase (4GQR) and alpha-glucosidase (6C9X). The extracts were studied with simple chemical tests and GC-MS, and twelve compounds were docked using CB-Dock2. Tiliroside, betacyanin, rutin and naringin bound the enzymes most strongly, and safety checks looked good for most. Cross-referencing binding strength with predicted absorption and toxicity narrows the field to a small, realistic short-list of natural leads for new diabetes drugs.
Diabetes now affects hundreds of millions of people, and Type 2 diabetes drives more than 90% of adult cases (Tran, Pham, and Le 2020). After a meal, two gut enzymes — alpha-amylase and alpha-glucosidase — break starch down into glucose and push blood sugar up.
Blocking these two enzymes slows that rise, which is exactly how the drug acarbose works (Wu et al. 2020). Acarbose and similar drugs, however, often upset the stomach, so researchers have turned to plant chemicals, which hit several targets at once and usually cause fewer side effects (Tran, Pham, and Le 2020). Beetroot, jamun and arjuna have all been used for years to help control blood sugar, and each one carries a different set of active chemicals — flavonoids and phenolics in S. cumini (Ayyanar and Subash-Babu 2012), triterpenoids in T. arjuna (Jain et al. 2009), and betalains and flavonol glycosides in B. vulgaris (Zia, Sunita, and Sneha 2021).
Few studies, though, compare the chemicals from these three plants side by side, or test them against both key enzymes at the same time. It therefore stays unclear which compounds bind the tightest and remain safe enough to take forward (Mohamed et al. 2023).
Here we report a combined screen. The three extracts were profiled by simple chemical tests and GC-MS, twelve compounds were docked against alpha-amylase (4GQR) and alpha-glucosidase (6C9X), and each compound was then ranked by binding strength and checked for predicted absorption and safety.
The literature review was carried out using Google Scholar. Plant chemicals were extracted from dried, powdered plant material (20 g each) by Soxhlet extraction with ethanol, and the amount recovered was weighed. Each extract was then run through thirteen simple chemical tests and a GC-MS scan. For docking, twelve compounds were downloaded from PubChem, the two proteins were cleaned up in BIOVIA Discovery Studio, and docking was run with CB-Dock2 (Liu et al. 2022). SwissADME predicted how well the body would absorb each compound (Daina, Michielin, and Zoete 2017), and ProTox-II predicted its safety (Banerjee et al. 2018). The data behind every figure sit inside this document.
The extraction worked for all three plants. T. arjuna gave the most material and S. cumini the least (Figure 1). Each extract was then carried through the thirteen confirmatory colour-change and precipitation tests described in Methods. Figure 2 shows representative before-and-after test tubes from this screen, and the positive/negative calls read off them are summarised as a fingerprint in Figure 3. T. arjuna tested positive for saponins, phytosterols, phenols, tannins, glycosides and reducing sugars; S. cumini for alkaloids, flavonoids, phenols, tannins, carbohydrates, glycosides and reducing sugars; and B. vulgaris for alkaloids, flavonoids, carbohydrates and glycosides. GC-MS found the target compounds in all three samples over a 51-minute run; the main ones and their match (Q) values are listed in Table 1.
master |>
distinct(plant) |>
mutate(yield = c(80.00, 76.67, 83.33)[match(plant, names(plant_cols))],
err = c(2.00, 2.00, 1.67)[match(plant, names(plant_cols))]) |>
ggplot(aes(reorder(plant, yield), yield, fill = plant)) +
geom_col(width = .68) +
geom_errorbar(aes(ymin = yield - err, ymax = yield + err), width = .18,
colour = "grey30", linewidth = .8) +
geom_text(aes(label = sprintf("%.1f%%", yield)), hjust = -0.18, fontface = "bold",
colour = ink, size = 4.3) +
scale_fill_manual(values = plant_cols, guide = "none") +
coord_flip(ylim = c(0, 95)) +
labs(title = "Extraction yield by species", x = NULL, y = "Yield (% by weight)") +
theme_pub() +
theme(axis.text.y = element_text(face = "bold.italic", colour = ink, size = 11.5))

Figure 1. Average phytochemical recovery (% by weight) after the solvent was removed. Arjuna bark is the richest source; jamun seed the leanest.
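The percentages in Figure 1 can be tied back to the 20 g of powder used per plant. The sketch below assumes that yield is the extract mass divided by the starting powder mass. That definition, the variable names, and the pairing of each value with a species (inferred from the caption: arjuna highest, jamun lowest) are assumptions rather than details stated in the source.

```r
# Assumption: yield (% by weight) = extract mass / powder mass * 100,
# with 20 g of powder per plant as stated in Methods.
powder_g <- 20

# Reported mean yields; the species pairing is inferred from the Figure 1 caption.
yield_pct <- c("Beta vulgaris"     = 80.00,
               "Syzygium cumini"   = 76.67,
               "Terminalia arjuna" = 83.33)

# Extract mass (g) implied by each reported yield.
extract_g <- yield_pct / 100 * powder_g

# Forward check: recomputing the yield from those masses returns the inputs.
recomputed_pct <- extract_g / powder_g * 100

print(round(extract_g, 2))
richest <- names(which.max(yield_pct))
leanest <- names(which.min(yield_pct))
print(c(richest = richest, leanest = leanest))
```

Read this way, each reported yield corresponds to roughly 15 to 17 g of recovered material per 20 g of powder.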
Each panel below pairs the crude extract (before) with its reagent-treated counterpart (after); a colour shift, precipitate or turbidity change marks a positive call for that test, while an unchanged tube is scored negative — these are the raw readouts that feed directly into the fingerprint in Figure 3.
Figure 2. Representative before/after outcomes for twelve of the thirteen qualitative confirmatory tests run on the three extracts. A change in colour, the appearance of a precipitate, or a shift in turbidity after reagent addition flags a positive result for the metabolite class being probed; a tube that stays visually unchanged is scored negative.
classes <- c("Saponins","Alkaloids","Phenols","Tannins","Flavonoids",
"Phytosterols","Carbohydrates","Glycosides","Reducing sugars")
fp <- tibble::tribble(
~plant, ~class, ~present,
"Beta vulgaris","Saponins",0,"Beta vulgaris","Alkaloids",1,"Beta vulgaris","Phenols",0,
"Beta vulgaris","Tannins",0,"Beta vulgaris","Flavonoids",1,"Beta vulgaris","Phytosterols",0,
"Beta vulgaris","Carbohydrates",1,"Beta vulgaris","Glycosides",1,"Beta vulgaris","Reducing sugars",0,
"Terminalia arjuna","Saponins",1,"Terminalia arjuna","Alkaloids",0,"Terminalia arjuna","Phenols",1,
"Terminalia arjuna","Tannins",1,"Terminalia arjuna","Flavonoids",0,"Terminalia arjuna","Phytosterols",1,
"Terminalia arjuna","Carbohydrates",0,"Terminalia arjuna","Glycosides",1,"Terminalia arjuna","Reducing sugars",1,
"Syzygium cumini","Saponins",0,"Syzygium cumini","Alkaloids",1,"Syzygium cumini","Phenols",1,
"Syzygium cumini","Tannins",1,"Syzygium cumini","Flavonoids",1,"Syzygium cumini","Phytosterols",0,
"Syzygium cumini","Carbohydrates",1,"Syzygium cumini","Glycosides",1,"Syzygium cumini","Reducing sugars",1
) |>
mutate(plant = factor(plant, levels = c("Beta vulgaris","Syzygium cumini","Terminalia arjuna")),
class = factor(class, levels = rev(classes)),
fillc = ifelse(present == 1, as.character(plant_cols[as.character(plant)]), "#F1F0EC"))
ggplot(fp, aes(plant, class)) +
geom_tile(aes(fill = fillc), colour = "white", linewidth = 2.2, width = .92, height = .92) +
geom_text(data = subset(fp, present == 1), aes(label = "\u2713"),
colour = "white", fontface = "bold", size = 6) +
geom_text(data = subset(fp, present == 0), aes(label = "\u2013"),
colour = "#C2C7C2", size = 5) +
scale_fill_identity() +
scale_x_discrete(position = "top", labels = function(x) gsub(" ", "\n", x)) +
labs(title = "Phytochemical fingerprint", x = NULL, y = NULL) +
theme_pub() +
theme(panel.grid.major = element_blank(),
axis.text.y = element_text(face = "bold", colour = ink, size = 11.5),
axis.text.x = element_text(face = "bold.italic", colour = ink, size = 11.5, lineheight = .9))

Figure 3. Qualitative phytochemical fingerprint. Filled cells mark metabolite classes confirmed in each extract; colour encodes the source species. The three signatures are distinct and only partly overlapping.
gcms <- tibble(
ret_time = c(1.78,2.04,2.34,3.53,3.78,7.76,9.45,10.62,11.63,11.58,12.66,13.59),
compound = c("Astragalin","Betacyanin","Coumarins","Tiliroside","Caryophyllene","Myricetin",
"Rutin","Arjunolic acid","Arjunone","Baicalein","Naringin","Luteolin"),
est_conc = c(265.6,28.4,130.2,8.87,11.3,21.5,74.3,16.5,30.8,59.9,58.7,61.7),
q_value = c("94/35","86","86","NA","NA","82","93","NA","NA","93","83","97")
)
gcms |>
arrange(ret_time) |>
kable(col.names = c("Ret. time (min)", "Compound", "Est. conc.", "Q value"),
caption = "Table 1. Main plant compounds identified by GC-MS.")

| Ret. time (min) | Compound | Est. conc. | Q value |
|---|---|---|---|
| 1.78 | Astragalin | 265.60 | 94/35 |
| 2.04 | Betacyanin | 28.40 | 86 |
| 2.34 | Coumarins | 130.20 | 86 |
| 3.53 | Tiliroside | 8.87 | NA |
| 3.78 | Caryophyllene | 11.30 | NA |
| 7.76 | Myricetin | 21.50 | 82 |
| 9.45 | Rutin | 74.30 | 93 |
| 10.62 | Arjunolic acid | 16.50 | NA |
| 11.58 | Baicalein | 59.90 | 93 |
| 11.63 | Arjunone | 30.80 | NA |
| 12.66 | Naringin | 58.70 | 83 |
| 13.59 | Luteolin | 61.70 | 97 |
All twelve compounds bound well to both enzymes, with scores from −6.7 to −10.2 kcal/mol (a more negative score marks a tighter fit). Tiliroside bound the tightest against both targets (−10.2 with 4GQR and −9.7 with 6C9X). Betacyanin and rutin followed against alpha-amylase (−9.0 each), and naringin followed against alpha-glucosidase (−9.4). Figure 4 maps all 24 protein–ligand complexes; the darkest tiles flag the strongest predicted enzyme blockers.
hm <- master |>
select(compound, plant, amylase, glucosidase) |>
pivot_longer(c(amylase, glucosidase), names_to = "target", values_to = "score") |>
mutate(target = recode(target,
amylase = "alpha-Amylase\n(4GQR)",
glucosidase = "alpha-Glucosidase\n(6C9X)"))
ord <- master |> arrange(plant, mean_dock) |> pull(compound)
hm <- hm |> mutate(compound = factor(compound, levels = rev(ord)),
lab_col = ifelse(score <= -8.4, "white", "#0B3D34"))
ggplot(hm, aes(target, compound, fill = score)) +
geom_tile(colour = "white", linewidth = 1.4, width = .96, height = .9) +
geom_text(aes(label = sprintf("%.1f", score), colour = lab_col), fontface = "bold", size = 4.1) +
scale_colour_identity() +
scale_fill_gradientn(colours = c("#08362E","#15715C","#4FAE94","#9FD3C4","#D8ECE5"),
limits = c(-10.4, -6.5), name = "kcal/mol",
breaks = c(-10,-9,-8,-7)) +
facet_grid(plant ~ ., scales = "free_y", space = "free_y", switch = "y") +
labs(title = "Predicted binding affinity across both targets",
subtitle = "Lower (darker) = stronger predicted interaction", x = NULL, y = NULL) +
theme_pub() +
theme(panel.grid.major = element_blank(),
axis.text.y = element_text(face = "bold", colour = ink, size = 11),
axis.text.x = element_text(face = "bold", size = 10, lineheight = .9),
strip.placement = "outside",
strip.text.y.left = element_text(angle = 0, hjust = 0, face = "bold.italic", size = 10.5),
panel.spacing = unit(6, "pt"))

Figure 4. Binding-energy heatmap for all 24 complexes, grouped by plant and ordered by mean affinity. Every complex except arjunone with 4GQR (−6.7) scores −7 kcal/mol or better; the darkest tiles (tiliroside, betacyanin, rutin and naringin) are the strongest binders.
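To give a feel for what a gap of a few kcal/mol means, a docking score can be loosely translated into a dissociation constant through ΔG = RT ln Kd. This is a rough sketch only. It assumes the score behaves like a true binding free energy at 25 °C, which docking programs only approximate, and `score_to_kd` is a hypothetical helper name.

```r
# Assumption: the docking score is treated as an approximate binding free
# energy (dG), so that dG = R * T * ln(Kd). Real affinities must be measured.
R_kcal <- 1.987204e-3   # gas constant in kcal / (mol * K)
temp_k <- 298.15        # assumed temperature (25 degrees C)

# Hypothetical helper: convert a score in kcal/mol to an estimated Kd in mol/L.
score_to_kd <- function(score_kcal) {
  exp(score_kcal / (R_kcal * temp_k))
}

# The two extremes of the reported score range.
scores <- c(strongest = -10.2, weakest = -6.7)
kd_molar <- score_to_kd(scores)

# How many times tighter the strongest complex is predicted to bind.
fold <- kd_molar[["weakest"]] / kd_molar[["strongest"]]

print(signif(kd_molar, 2))
print(round(fold))
```

Under these assumptions, the 3.5 kcal/mol spread between the strongest and weakest complexes corresponds to a difference of a few hundred fold in predicted affinity, so even small gaps between scores are meaningful.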
Tiliroside is the standout lead: it has the strongest alpha-amylase score in the panel (−10.2 kcal/mol), holds firm against alpha-glucosidase (−9.7 kcal/mol), and pairs that potency with a high predicted LD₅₀ (5000 mg/kg) and toxicity class 5, so it offers efficacy without a safety trade-off.
Table 2 ranks every compound by score, but a single number hides why a ligand binds well. Figure 5 opens that up: 2D protein–ligand interaction diagrams for all 24 complexes, showing exactly which residues each compound touches and through what kind of contact (conventional and carbon hydrogen bonds, π-alkyl, π-sigma, π-anion and van der Waals). The pattern matches the heatmap in Figure 4 — the tightest binders (tiliroside, betacyanin, rutin, naringin) are the ones drawing three or more hydrogen bonds to key catalytic residues such as Asp197/Asp300, Glu233/Glu235 and Trp59/His101, whereas the looser-binding terpenoids (caryophyllene, arjunolic acid, arjunone) lean almost entirely on van der Waals and π-alkyl contacts with no, or very few, hydrogen bonds.
Figure 5. Two-dimensional protein–ligand interaction diagrams for all twelve compounds docked against alpha-amylase (4GQR, top row of each block) and alpha-glucosidase (6C9X, bottom row), grouped by plant of origin (A: S. cumini, B: B. vulgaris, C: T. arjuna). Dashed lines are colour-coded by interaction type (legend, right); residue shading marks polar, acidic, basic, hydrophobic or ‘greasy’ side chains.
master |>
transmute(Plant = plant, Ligand = compound, `4GQR` = amylase, `6C9X` = glucosidase) |>
arrange(Plant, `4GQR`) |>
kable(col.names = c("Plant", "Ligand", "4GQR (kcal/mol)", "6C9X (kcal/mol)"),
caption = "Table 2. Docking scores against the two diabetes targets.")

| Plant | Ligand | 4GQR (kcal/mol) | 6C9X (kcal/mol) |
|---|---|---|---|
| Beta vulgaris | Tiliroside | -10.2 | -9.7 |
| Beta vulgaris | Betacyanin | -9.0 | -9.3 |
| Beta vulgaris | Astragalin | -8.1 | -8.5 |
| Beta vulgaris | Coumarins | -7.7 | -9.2 |
| Syzygium cumini | Rutin | -9.0 | -9.2 |
| Syzygium cumini | Naringin | -8.5 | -9.4 |
| Syzygium cumini | Myricetin | -8.2 | -8.7 |
| Syzygium cumini | Caryophyllene | -7.2 | -7.5 |
| Terminalia arjuna | Luteolin | -8.4 | -8.5 |
| Terminalia arjuna | Baicalein | -8.0 | -8.3 |
| Terminalia arjuna | Arjunolic acid | -7.6 | -8.1 |
| Terminalia arjuna | Arjunone | -6.7 | -7.7 |
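The heatmap code orders ligands by a `mean_dock` column whose construction is not shown. A minimal sketch, assuming `mean_dock` is simply the average of the two scores in Table 2 (the data frame name `dock` is hypothetical), reproduces a ranking in base R and reports the top scorer for each target:

```r
# Scores copied from Table 2 (kcal/mol; more negative = tighter predicted fit).
dock <- data.frame(
  ligand = c("Tiliroside", "Betacyanin", "Astragalin", "Coumarins",
             "Rutin", "Naringin", "Myricetin", "Caryophyllene",
             "Luteolin", "Baicalein", "Arjunolic acid", "Arjunone"),
  amylase     = c(-10.2, -9.0, -8.1, -7.7, -9.0, -8.5, -8.2, -7.2,
                  -8.4, -8.0, -7.6, -6.7),
  glucosidase = c(-9.7, -9.3, -8.5, -9.2, -9.2, -9.4, -8.7, -7.5,
                  -8.5, -8.3, -8.1, -7.7),
  stringsAsFactors = FALSE
)

# Assumption: mean_dock is the plain average over the two targets.
dock$mean_dock <- rowMeans(dock[, c("amylase", "glucosidase")])

# Rank from strongest (most negative) to weakest mean score.
ranked <- dock[order(dock$mean_dock), ]

# Top scorer for each individual target.
best_amylase     <- dock$ligand[which.min(dock$amylase)]
best_glucosidase <- dock$ligand[which.min(dock$glucosidase)]

print(head(ranked[, c("ligand", "mean_dock")], 4))
print(c(amylase = best_amylase, glucosidase = best_glucosidase))
```

Averaging treats both enzymes as equally important; a weighted mean or a per-target minimum would be an equally defensible choice if one target mattered more.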
A strong docking score alone can mislead: the tightest binder is useless if it is poorly absorbed or toxic. Figure 6 plots binding strength against predicted toxicity and sizes each marker by bioavailability, turning twelve separate numbers into one decision map; Figure 7 then ranks drug-likeness directly. Coumarins, caryophyllene, myricetin, baicalein, luteolin and the arjuna triterpenoids absorb best (bioavailability score 0.55–0.56), whereas the remaining five compounds score only 0.11–0.17. Ten of the twelve compounds sit in the safe range (LD₅₀ ≥ 2000 mg/kg). The exceptions are betacyanin (305 mg/kg) and myricetin, which is the clearest warning: strong binding but a low LD₅₀ of 159 mg/kg (Table 3).
lead <- c("Tiliroside","Naringin","Rutin","Luteolin","Myricetin","Betacyanin")
ggplot(master, aes(mean_dock, ld50)) +
annotate("rect", xmin = -10.6, xmax = -8.5, ymin = 1800, ymax = 6300, fill = green2, alpha = .07) +
annotate("text", x = -9.55, y = 5950, hjust = .5, label = "PREFERRED LEAD SPACE",
colour = green2, fontface = "bold", size = 3.4) +
geom_point(aes(fill = plant, size = bioavail), shape = 21, colour = "white", stroke = 1.1) +
geom_text_repel(aes(label = compound, colour = plant), fontface = "bold", size = 3.6,
box.padding = .9, point.padding = .6, force = 4, seed = 12,
min.segment.length = 0, segment.colour = "#B7BEBA", max.overlaps = Inf) +
scale_fill_manual(values = plant_cols, name = "Plant source",
guide = guide_legend(override.aes = list(size = 4.5))) +
scale_colour_manual(values = plant_cols, guide = "none") +
scale_size_continuous(range = c(4.5, 13), breaks = c(0.11, 0.17, 0.55), name = "Bioavailability",
guide = guide_legend(override.aes = list(fill = "#6E8C84", colour = "white"))) +
scale_x_reverse(expand = expansion(mult = .07)) +
scale_y_log10(labels = label_comma(), breaks = c(159, 305, 1000, 2000, 5000)) +
labs(title = "Efficacy vs safety across twelve phytochemicals",
subtitle = "Stronger binding to the right · safer (higher LD₅₀) upward · marker area = bioavailability",
x = "Mean predicted binding energy (kcal/mol)", y = "Predicted oral LD₅₀ (mg/kg, log scale)") +
theme_pub()

Figure 6. The efficacy–safety landscape. Compounds drift right as binding strengthens and rise as predicted LD₅₀ (safety) increases; marker area scales with oral bioavailability. The shaded zone collects candidates that are both potent and well-tolerated.
d4 <- master |> arrange(plant, bioavail) |> mutate(compound = factor(compound, levels = compound))
ggplot(d4, aes(bioavail, compound)) +
annotate("rect", xmin = 0.50, xmax = 0.60, ymin = -Inf, ymax = Inf, fill = green2, alpha = .06) +
geom_vline(xintercept = 0.50, linetype = "22", colour = green2, linewidth = .7) +
geom_segment(aes(x = 0, xend = bioavail, yend = compound, colour = plant), linewidth = 1.1) +
geom_point(aes(fill = plant, size = ld50), shape = 21, colour = "white", stroke = 1) +
geom_text(aes(label = sprintf("LD50 %s", comma(ld50))), hjust = -0.22, size = 3,
colour = muted, nudge_x = .006) +
scale_fill_manual(values = plant_cols, name = "Plant source") +
scale_colour_manual(values = plant_cols, guide = "none") +
scale_size_continuous(range = c(3, 9), guide = "none") +
scale_x_continuous(limits = c(0, .82), breaks = seq(0, .6, .1)) +
labs(title = "Drug-likeness and safety profile",
subtitle = "SwissADME bioavailability score · marker size scaled to predicted LD50",
x = "Predicted oral bioavailability score", y = NULL) +
theme_pub() +
theme(axis.text.y = element_text(face = "bold", colour = ink, size = 10.5),
legend.position = "bottom")

Figure 7. Drug-likeness and acute-safety profile. Bars rank oral bioavailability; the shaded band marks compounds above the cut-off used here, and marker size scales with predicted LD₅₀.
master |>
transmute(Ligand = compound, Bioavailability = bioavail,
`LD50 (mg/kg)` = ld50, `Toxicity class` = tox_class) |>
kable(caption = "Table 3. Predicted absorption and safety values.")

| Ligand | Bioavailability | LD50 (mg/kg) | Toxicity class |
|---|---|---|---|
| Astragalin | 0.17 | 5000 | 5 |
| Betacyanin | 0.11 | 305 | 4 |
| Coumarins | 0.55 | 3890 | 5 |
| Tiliroside | 0.17 | 5000 | 5 |
| Caryophyllene | 0.55 | 5300 | 5 |
| Myricetin | 0.55 | 159 | 3 |
| Naringin | 0.17 | 2300 | 5 |
| Rutin | 0.17 | 5000 | 5 |
| Arjunolic acid | 0.56 | 2000 | 4 |
| Arjunone | 0.55 | 2000 | 4 |
| Baicalein | 0.55 | 3919 | 5 |
| Luteolin | 0.55 | 3919 | 5 |
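Cross-referencing Tables 2 and 3 can be written as a simple filter. The sketch below is illustrative only: the cut-offs (mean score of −8.5 kcal/mol or better, taken from the edge of the shaded zone in Figure 6, and LD₅₀ of at least 2000 mg/kg, taken from the safe range quoted in the text) and the names `screen`, `dock_cut` and `ld50_cut` are assumptions, not thresholds the source states as its selection rule.

```r
# Values copied from Table 2 (docking, kcal/mol) and Table 3 (LD50, bioavailability).
screen <- data.frame(
  ligand = c("Tiliroside", "Betacyanin", "Astragalin", "Coumarins",
             "Rutin", "Naringin", "Myricetin", "Caryophyllene",
             "Luteolin", "Baicalein", "Arjunolic acid", "Arjunone"),
  amylase     = c(-10.2, -9.0, -8.1, -7.7, -9.0, -8.5, -8.2, -7.2,
                  -8.4, -8.0, -7.6, -6.7),
  glucosidase = c(-9.7, -9.3, -8.5, -9.2, -9.2, -9.4, -8.7, -7.5,
                  -8.5, -8.3, -8.1, -7.7),
  ld50     = c(5000, 305, 5000, 3890, 5000, 2300, 159, 5300,
               3919, 3919, 2000, 2000),
  bioavail = c(0.17, 0.11, 0.17, 0.55, 0.17, 0.17, 0.55, 0.55,
               0.55, 0.55, 0.56, 0.55),
  stringsAsFactors = FALSE
)

# Assumption: mean docking score is the plain average over the two targets.
screen$mean_dock <- (screen$amylase + screen$glucosidase) / 2

# Assumed cut-offs (illustrative, see the text above the code).
dock_cut <- -8.5   # kcal/mol, keep this value or more negative
ld50_cut <- 2000   # mg/kg, keep this value or higher

# Short-list: compounds that pass both the binding and the safety filter.
keep <- screen$mean_dock <= dock_cut & screen$ld50 >= ld50_cut
shortlist <- screen[keep, c("ligand", "mean_dock", "ld50", "bioavail")]
shortlist <- shortlist[order(shortlist$mean_dock), ]

# Compounds that fail the safety filter regardless of binding strength.
low_ld50 <- screen$ligand[screen$ld50 < ld50_cut]

print(shortlist)
print(low_ld50)
```

With these cut-offs the filter returns tiliroside, rutin and naringin. All three carry a bioavailability score of 0.17, so the short-list still has to be weighed against absorption.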
The results reveal a clear pattern: the tightest binders (tiliroside, betacyanin, rutin and naringin) are all large molecules that carry many sugar and hydroxyl groups. These groups anchor the molecule with a web of hydrogen bonds inside the active site, while their flat rings stack against nearby residues for extra grip. Tiliroside, a sugar-linked flavonoid from B. vulgaris, gripped both enzymes the hardest (−10.2 kcal/mol for alpha-amylase and −9.7 for alpha-glucosidase), and naringin ranked second against alpha-glucosidase (−9.4). Both out-scored the smaller compound luteolin, which earlier work had highlighted (Davella and Mamidala 2021), and this echoes other studies where sugar-linked flavonoids repeatedly top the list of alpha-amylase binders (Mohamed et al. 2023).
Binding strength, however, does not tell the whole story. The very groups that grip the enzyme also weigh the molecule down and lower how well the gut can absorb it. The smaller, easily absorbed compounds (coumarins, caryophyllene) bind more loosely. Myricetin sounds a clear warning: it binds strongly yet carries a low safe dose, so a docking score alone must never decide which compound to chase.
Two limits deserve a plain statement. First, docking predicts how well a compound fits, not whether it truly blocks the enzyme, so laboratory tests remain essential. Second, the method holds the protein rigid and ignores water. Future work should pair lab activity tests with more detailed simulations of the leading compounds. Even with these caveats, the screen delivers a solid short-list — led by tiliroside, rutin and naringin — for building plant-based diabetes treatments.