Correlation and mutual information are both ways to measure how much two things are related to each other.

Correlation is a measure of how much two variables move together. If one variable tends to increase when the other increases, they are positively correlated. If one tends to decrease when the other increases, they are negatively correlated. If there’s no consistent linear pattern, they are uncorrelated. Correlation values range from -1 (perfect negative correlation) to 1 (perfect positive correlation), with 0 indicating no linear relationship.
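
As a small illustration of the sign and range, here is a sketch using base R's cor() on toy vectors. The data and the names x, y_up, y_down and y_noisy are made up for this example.

```r
# Hypothetical toy data, made up purely to illustrate the sign of a correlation
x <- 1:10
y_up <- 2 * x + 1        # rises exactly in step with x
y_down <- -3 * x + 5     # falls exactly as x rises
y_noisy <- x + c(3, -2, 4, -4, 1, -1, 5, -5, 2, -3)  # rises with x, but not perfectly

cor(x, y_up)     # perfect positive correlation
cor(x, y_down)   # perfect negative correlation
cor(x, y_noisy)  # positive, but below 1
```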

Mutual information, on the other hand, is a bit more complex. It measures how much knowing the value of one variable reduces uncertainty about the value of the other. In other words, it tells us how much information about one variable is contained in the other. Mutual information values are always non-negative: zero means the variables are independent, and larger values indicate a stronger relationship.
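
To make the definition concrete, the sketch below computes mutual information directly from a joint probability table in base R, using the standard formula: the sum over all cells of p(x, y) * log(p(x, y) / (p(x) * p(y))). The helper mi_from_joint and the two coin-flip tables are hypothetical, written only for this illustration.

```r
# Hypothetical helper: mutual information of a joint probability table.
# base = exp(1) gives nats; base = 2 gives bits.
mi_from_joint <- function(p_xy, base = exp(1)) {
  p_x <- rowSums(p_xy)            # marginal distribution of X
  p_y <- colSums(p_xy)            # marginal distribution of Y
  expected <- outer(p_x, p_y)     # joint distribution if X and Y were independent
  nz <- p_xy > 0                  # cells with zero probability contribute nothing
  sum(p_xy[nz] * log(p_xy[nz] / expected[nz], base = base))
}

# Two fair coins flipped independently: knowing one says nothing about the other
independent <- outer(c(0.5, 0.5), c(0.5, 0.5))

# One fair coin read twice: knowing one value removes all uncertainty about the other
copied_coin <- diag(c(0.5, 0.5))

mi_from_joint(independent)            # 0
mi_from_joint(copied_coin, base = 2)  # 1 bit
```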

While mutual information is a powerful tool for understanding the relationship between two variables within a given dataset or process, it’s not generally meaningful to compare mutual information values between different datasets or processes.

This is because the mutual information value is dependent on the specific distribution of the variables in question. Different processes or datasets can have different distributions, which can lead to different mutual information values even if the underlying relationships are similar.
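
A minimal sketch of this point, assuming two made-up "datasets" in which Y is an exact copy of X. The relationship is the same in both, but X has 2 equally likely values in one and 8 in the other, and the mutual information differs accordingly. The helper mi_bits is hypothetical.

```r
# Hypothetical helper: mutual information in bits from a joint probability table
mi_bits <- function(p_xy) {
  expected <- outer(rowSums(p_xy), colSums(p_xy))  # joint under independence
  nz <- p_xy > 0                                   # skip zero-probability cells
  sum(p_xy[nz] * log2(p_xy[nz] / expected[nz]))
}

# In both tables Y is an exact copy of X; only the number of equally
# likely values that X can take differs
copy_of_2 <- diag(rep(1 / 2, 2))   # X has 2 equally likely values
copy_of_8 <- diag(rep(1 / 8, 8))   # X has 8 equally likely values

mi_bits(copy_of_2)  # 1 bit
mi_bits(copy_of_8)  # 3 bits
```

Both relationships are as strong as they can be, so the gap between 1 bit and 3 bits reflects the distribution of X rather than the strength of the dependence.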

Moreover, mutual information is measured in bits (when the logarithm is taken to base 2) or nats (when the natural logarithm is used), which represent the reduction in uncertainty about one variable given knowledge of another. This reduction in uncertainty is specific to the variables and their distribution, and doesn’t necessarily translate to other contexts.
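
The two units differ only by a constant factor, since one nat equals 1 / log(2) bits. The conversion helpers below are hypothetical names written for this sketch. The infotheo package also documents a natstobits() function for the same purpose, but it is worth checking its help page before relying on it.

```r
# Hypothetical helpers: convert between nats (natural log) and bits (log base 2)
nats_to_bits <- function(nats) nats / log(2)
bits_to_nats <- function(bits) bits * log(2)

nats_to_bits(log(2))  # log(2) nats is 1 bit
bits_to_nats(1)       # about 0.693 nats
```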

So, while mutual information can provide valuable insights into the relationships within a dataset, it should be used with caution when comparing across datasets or processes. It’s always important to consider the specific context and characteristics of the data you’re working with.

Let’s look at some random variables I’ve generated to see how this works out in an example.


# Packages: install infotheo and MASS if they are missing, then load them
if (!require("infotheo")) {
  install.packages("infotheo")
}
if (!require("MASS")) {
  install.packages("MASS")
}
library(infotheo)
library(MASS)

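# Set seed for reproducibility
set.seed(123)

# Correlations of interest, from 0 to 1 in steps of 0.1
correlations <- seq(0, 1, by = 0.1)
mutual_information <- numeric(length(correlations))

for (i in seq_along(correlations)) {
  correlation <- correlations[i]

  # Draw 1000 points from a bivariate normal with unit variances,
  # so the off-diagonal covariance equals the correlation
  data <- mvrnorm(1000, mu = c(0, 0), Sigma = matrix(c(1, correlation, correlation, 1), nrow = 2))

  # Bin each variable with infotheo's default discretization, then
  # estimate the mutual information of the binned values
  mutual_information[i] <- mutinformation(discretize(data[, 1]), discretize(data[, 2]))
}

# Collect one row per correlation value
results <- data.frame(correlations, mutual_information)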

colnames(results) <- c("Correlation", "Mutual Information")
plot(results$Correlation, results$`Mutual Information`, 
     xlab = "Correlation", ylab = "Mutual Information", 
     main = "Correlation vs Mutual Information")

print(results)

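When the correlation is 0 (meaning the variables are not correlated), the mutual information is also quite low, indicating that knowing the value of one variable doesn’t tell us much about the other. The estimate is not exactly zero because it is computed from a finite sample of binned values, which tends to overstate mutual information slightly.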

As the correlation increases from 0 to 0.9, the mutual information also increases. This means that as the variables become more strongly correlated, knowing the value of one variable gives us more information about the other. I’ve also included a correlation of 1 for completeness; there the two variables are identical, so the estimate is limited only by how finely the data are binned.

The mutual information increases more rapidly as the correlation gets higher. This suggests that even small increases in correlation can lead to large increases in mutual information when the correlation is already high.
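
For a bivariate normal distribution this acceleration can be checked against a known closed form: the mutual information is -0.5 * log(1 - rho^2) nats, where rho is the correlation. The sketch below tabulates it in base R. The function name gaussian_mi is hypothetical. The simulated estimates will not match these values exactly, because they come from binned, finite samples.

```r
# Closed-form mutual information (in nats) of a bivariate normal with correlation rho
gaussian_mi <- function(rho) -0.5 * log(1 - rho^2)

rho <- seq(0, 0.9, by = 0.1)
theory <- data.frame(rho = rho, mi_nats = gaussian_mi(rho))

# Gain in mutual information from each 0.1 step in correlation
theory$increase <- c(NA, diff(theory$mi_nats))

print(round(theory, 3))
```

Each successive 0.1 step in rho adds more mutual information than the one before it, and the formula diverges as rho approaches 1.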

Note that the mutual information at a correlation of 0.6 (~0.24) is closer to the mutual information at a correlation of 0.0 (~0.03) than to the mutual information at a correlation of 0.8 (~0.49).

In summary, both correlation and mutual information are ways to measure the relationship between two variables, but they capture different aspects of this relationship. While correlation measures the degree to which the variables move together linearly, mutual information measures the amount of information they share.

A note for the curious: there are exceptions, but this is the rule.
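
One well-known kind of exception is a relationship that is strong but not linear. The note above does not say which exceptions it has in mind, so this is offered only as an illustration. In the base R sketch below, y is fully determined by x, yet the correlation is essentially zero while a binned estimate of mutual information is clearly positive. The data and the choice of 10 equal-width bins are assumptions made for this example.

```r
# Hypothetical illustration: y is completely determined by x, but the
# relationship is symmetric rather than linear
x <- seq(-3, 3, length.out = 601)
y <- x^2

# Bin both variables into 10 equal-width intervals and tabulate joint frequencies
p_xy <- unclass(table(cut(x, breaks = 10), cut(y, breaks = 10))) / length(x)

# Mutual information (in nats) of the binned values
expected <- outer(rowSums(p_xy), colSums(p_xy))  # joint under independence
nz <- p_xy > 0                                   # skip empty cells
mi_nats <- sum(p_xy[nz] * log(p_xy[nz] / expected[nz]))

cor(x, y)  # essentially zero
mi_nats    # clearly above zero
```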
