Humanity already has at its disposal a reliable and quite simple formula for counting combinations. I’m talking, of course, about the well known binomial coefficient. Despite it being known so well, I could not help but completely forget about it a couple years ago, when I was faced against a combinatorics problem. During that memory lapse I ended up developing my own counting procedure, which I aim to expose and explain on this document.
Regarding the relevance of my counting procedure, I mostly contemplate it as a curiosity. Perhaps it might have pedagogical value, but admiteddly very little else —at least with my current grasp on Mathematics I am unable to see much value in it—.
From a finite, non-empty set \(A\) we construct a set \(B\) that is specified as follows: \[\large B = \bigg\{\{a_i\}_{i = 1}^k \hspace{0.4mm} \bigg| \hspace{0.4mm} \forall i, \hspace{0.8mm} a_i \in A\boldsymbol{;} \hspace{2mm} a_i \neq a_j \hspace{1mm} \text{for} \hspace{1mm} i \neq j \bigg\} \hspace{9mm} \text{with} \hspace{3mm} k = 2, ..., \left| A\right|\]
Let me remark that the elements of set \(B\) are not tuples —they are sets—. In the world of tuples \((a, b, c)\) is not the same as \((c, a, b)\), nor is it the same as any of its other possible rearrangements. In the world of sets, however, \(\{a, b, c\}\) is in fact the same as \(\{c, a, b\}\), and it is also the same as each of its other possible rearrangements.
That is the first feature of the elements of \(B\): they are sequence-independent. The second feature of the elements of \(B\) is that each of them consists of \(\boldsymbol{k}\) elements of \(\boldsymbol{A}\) —being \(k\) an integer between 2 and the cardinality of \(A\)—. Lastly, the third feature that must be satisfied by every element of \(B\) is that within it no duplicate of an element of \(\boldsymbol{A}\) is allowed —that is, every element of \(A\) is only allowed to account for a single member of a collection that is in \(B\) (yet another way of saying this: it is invalid to create an element of \(\boldsymbol{B}\) by enlisting the same element of \(\boldsymbol{A}\) as a member of it more than once)—.
Given this recipe for constructing set \(B\) from set \(A\), we are asked what is the cardinality of \(B\) —i.e. how many elements does \(B\) comprise—.
The problem statement given above is certain to not be encountered in a real-world inspired setting. A much more ordinarily-written version of the problem is, for example, the following. There are three labels and five boxes. Considering that only a single label is to be adhered to a box, how many ways are there in which we can label three out of the five boxes?
We shall consider the set \(A\) to be \(\{1, 2, 3, 4, 5\}\). Set \(B\) is, then, defined as follows: \[\large B = \bigg\{\{a, b, c\} \hspace{0.4mm} \bigg| \hspace{0.4mm} a \in A; \hspace{2mm} b \in A - \{a\}; \hspace{2mm} c \in A - \{a, b\}\bigg\}\]
Let me reassert the features of the elements of \(B\) so that we can then verify whether the problem at hand satisfies those features.
Regarding the first feature, note that a possible member of \(B\) is, for example, the set \(\{1, 2, 4\}\). What this set represents is the outcome where the labeled boxes turn out to be the first, the second, and the fourth one. We only care about which boxes end up being labeled, with no regard for the order in which they are labeled. For our purposes —and that is why we define the elements of \(B\) to be sets instead of tuples—, the collection \(\{1, 2, 4\}\) is exactly the same as the collection \(\{4, 1, 2\}\), and it is also exactly the same as any of its other possible rearrangements.
Moving on to the second feature, it is very evident that in this case \(k\) is equal to \(3\), which indeed is an integer not bigger than \(5\) —the cardinality of \(A\)— and not smaller than \(2\).
Finally, as to the third feature, the very statement of our boxes and labels problem specifies that only a single label is to be adhered to a box. A set in \(B\) in which there where duplicate values of elements of \(A\) would come to mean that a certain box was labeled more than once, and thus the problem statement would be infringed.
With this we should be convinced that the boxes and labels problem is a particular incarnation of the general problem we introduced on Section 1. Up next, the counting method shall be explained.
Suppose the boxes are arranged side by side, one next to the other. The number \(1 \in A\) would come to represent the box that is placed furthest to the left. The number \(2 \in A\) would come to represent the box that is placed to the right of the previous box, and so on. On the other hand, we have three labels which we will be distributing among the boxes.
My proposal is to establish an arrangement of the labels. That is, we are to refer to one of the labels as the first label, another one is to be referred to as the second label, and so on. What makes the \(\boldsymbol{i}\)-th label be the \(\boldsymbol{i}\)-th label is that the \(\boldsymbol{(i+1)}\)-th label will never be allowed to be adhered to a box to the left of the box that the \(\boldsymbol{i}\)-th label is adhered to.
Since the restriction we just imposed on the placement of labels must hold but we also need all three labels to be adhered (each one to a different box), the \(i\)-th label can not be placed on just any box —it must leave a minimum number of boxes available to its right so that each of the remaining labels can find a valid box to be adhered to—. For example, if we adhere the first label to the fourth box, there will be only one available box for the remaining labels. On this box (which is the fifth one) the second label will be adhered, but then the third label will be left without a box. Therefore, the second label must make sure to leave at least one box to its right, and the first label must make sure to leave at least two boxes to its right.
Taking into account these considerations, we are going to distribute the first two labels among the boxes and then count the number of possible boxes that the third label can be adhered to. Below is a nice illustration of all possible cases under the condition that the first label is adhered to the first box.
Given that the first label is adhered to the first box, the second label can either be adhered to the second box, or to the third box, or to the fourth box. The second label can not be adhered to the fifth box because if it were then there would not be a box available for the third label to be adhered to. Let us consider the case when the second label is adhered to the second box. Then, given the placement of the first two labels, the third label has 3 options: it can either be placed on the third box, or on the fourth one, or on the fifth one. Similarly, from the Figure above we can appreciate that, maintaining the placement of the first label, if the second label is adhered to the third box then the third label has 2 options, and it has only 1 option if the second label is placed on the fourth box instead.
The first label is, under the restrictions we have established, by all means allowed to be placed on the second or third boxes. The Figure below illustrates all valid placements of the labels given that the first label is adhered to either of those boxes.
I believe the diagrams and the explanations provided should suffice for understanding the restrictions we imposed on the placement of the labels. Now, the question that must be addressed is why do we impose them. Recall that the restrictions are two.
The second restriction has to be specified in order to prevent the first restriction from leading us to run out of boxes without using up all our labels. We wouldn’t require it if we did not impose the first restriction. The really interesting thing is why do we require the first resctriction.
Allow me to explain it through an example. Consider the arrangement where the first label is adhered to the second box, the second label is adhered to the fourth box, and we leave the third label no option but to be placed on the fifth box. Why don’t we allow instead the third label to be placed either on the first box or on the third box? Suppose we do place the third label on the first box. This arrangement is represented by the collection \(\{2, 4, 1\}\). Please do note that this arrangement is exactly equal to the one given by \(\{1, 2, 4\}\) —that is, the arrangement where the first label is adhered to the first box, the second label is adhered to the second box, and the third label is adhered to the fourth box—. Thus, we have just obtained a redundant collection —a collection that was going to be found anyway while considering all valid arrangements that involve having the first two labels fixed on other positions (namely, first label on first position and second label on second position)—. And what about the other possibility? What about, while maintaining the first label fixed on the second box and the second label fixed on the fourth box, adhering the third label to the third box? The corresponding collection would be \(\{2, 4, 3\}\). This is exactly the same collection as \(\{2, 3, 4\}\), which was bound to be found by considering all valid arrangements that involve fixing the first label on the second box and fixing the second label on the third box.
Therefore, the first restriction is imposed in order to not obtain redundant collections.
So far so good. We now need to derive a nice formula from this thought process.
Let \(y\) be the number of boxes that remain to the right of the box to which the first label is adhered. We know, as per the second restriction, that the least value \(y\) can take is in this case \(\boldsymbol{2}\) —since there are two other labels that need to be assigned valid boxes—. The biggest value \(y\) can take is \(\boldsymbol{\left| A \right| - 1}\). In this case the cardinality of \(A\) is \(5\) —as there are five boxes—, and therefore the maximum value of \(y\) is \(4\).
Let \(x\) be the number of boxes that remain to the right of the box to which the second label is adhered. The minimum value of \(\boldsymbol{x}\) is \(\boldsymbol{1}\), for there is still one label that requires to be placed on a valid box. The maximum value of \(\boldsymbol{x}\) is, logically, \(\boldsymbol{y - 1}\). Think about it: if the second label is placed on the valid box that is adjacent to the box on which the first label was placed, then the amount of valid boxes diminishes by 1.
Therefore, the amount of possible ways in which we can distribute three labels among five boxes is
\[\large \sum_{y = 2}^4\sum_{x = 1}^{y - 1} x\]
Let us work out a little bit the formula we just wrote.
\[\begin{equation} \large \sum_{y = 2}^4\sum_{x = 1}^{y - 1} x = \sum_{x = 1}^1x + \sum_{x = 1}^2x + \sum_{x = 1}^{3}x \tag{4.1} \end{equation}\]
The first term on the right-hand side of Equation (4.1) accounts for \(y\) equal \(2\) —that is, for when the first label is adhered to the third box, leaving only \(\boldsymbol{2}\) valid boxes for the remaining labels—. The second term accounts for \(y\) equal \(3\) —that is, for when the first label is adhered to the second box, leaving \(\boldsymbol{3}\) valid boxes for the remaining labels—. The third term accounts for \(y\) equal \(4\) —that is, for when the first label is adhered to the first box, leaving \(\boldsymbol{4}\) valid boxes for the remaining labels—.
Going even further and focusing our attention on the third term on the right-hand side of Equation (4.1), our formula accurately conveys the idea that if the first label leaves four valid boxes then the second label can either leave \(3\), \(2\), or \(1\) valid boxes for the third label, and it also conveys the idea that we must sum those numbers in order to obtain the total number of collections obtainable by fixing the first label on the first box.
Similarly, all the terms on the right-hand side of the Equation convey a interpretation grounded on the thought process that was explained on Section 3.
Let me point out that computing the summations actually does yield the value of \(10\), which is equal to the binomial coefficient \(\large \binom{5}{3}\).
The formula depends on two arguments: \(\left| A \right|\) and \(k\). The argument \(\left|A \right|\) determines the upper limit of the outermost summation, while the argument \(k\) determines how many summations appear.
For example, for \(\left|A\right| = 13\) and \(k = 8\) the formula becomes \[\large \sum_{z = 7}^{12}\sum_{y = 6}^{z - 1}\sum_{x = 5}^{y - 1}\sum_{w = 4}^{x - 1}\sum_{v = 3}^{w - 1}\sum_{u = 2}^{v - 1}\sum_{t = 1}^{u - 1}t\]
Clearly, as the value of \(k\) increases not only does the notation become somewhat extravagant but also the calculation by hand becomes very tedious. Nevertheless, this counting method does fulfill its job. The usual way of determining how many combinations of size 8 can be made from a set of 13 elements is by computing \(\large \binom{13}{8}\). What this yields is 1287. If you were so kind to click on this link you would verify that my counting method correctly yields that same number.
It is now time for us to write the generic form of the formula, with \(\left|A \right|\) and \(k\) left unspecified. Due to its resemblance with \(C\), which is a notation used to refer to the binomial coefficient, I will denote my counting formula by \(\boldsymbol{\zeta}\).
Let \(\boldsymbol{m}\) be \(\boldsymbol{k - 2}\). Also, let us denote \(\boldsymbol{\left|A\right|}\) by \(\boldsymbol{n}\). Then, what from now on shall be called either the zeta formula or the zeta method for counting combinations is the following:
\[\begin{equation} \Large \zeta^n_m = \begin{cases} \sum_{s_{m+1} = m+1}^{n - 1}s_{m+1} & \text{if} \hspace{2.5mm} m = 0 \\ \\ \\ \sum_{s_{m+1} = m+1}^{n - 1}\bigg(\sum_{s_m = m}^{s_{m+1} - 1}s_m\bigg) & \text{if} \hspace{2.5mm} m = 1 \\ \\ \\ \sum_{s_{m+1} = m+1}^{n - 1}\bigg(\sum_{s_m = m}^{s_{m+1} - 1} \ldots \sum_{s_1 = 1}^{s_2 - 1}s_1\bigg) & \text{if} \hspace{2.5mm} m = 2, \ldots, n - 2 \end{cases} \tag{4.2} \end{equation}\]
Note that the convention just adopted of writing the zeta formula in terms of \(m\) instead of writing it in terms of \(k\) is helpful in order for the user to be aware of just how many summations have to be written. The outermost summation must always be present, followed by \(\boldsymbol{m}\) summations and then the parameter \(\boldsymbol{s_1}\).
The parameters \(s_i\) are placeholders; feel free to replace them with the symbols of your preference. I, for instance, on the example given above (the one with \(\left|A\right| = n = 13\), \(k = 8\)), replaced \(s_1\) with \(t\), then \(s_2\) with \(u\), etc. As for the interpretation of these parameters, it is easier to think of them in terms of our boxes and labels setup: \(\boldsymbol{s_i}\) is the number of valid boxes left after the \(\boldsymbol{(k-i)}\)-th label has been assigned a box.
For a reference on how to deal with the third case of Equation (4.2) you may check again the example above. It deals with \(k = 8\), which is the same as saying \(m = 6\).
Finally, even though this may be redundant at this point, let me state the following:
\[\begin{equation} \Large \zeta^n_m = C_{m+2}^n \tag{4.3} \end{equation}\]
It is well known that the binomial coefficient possesses a property of symmetry.
\[\begin{align} \Large C^n_k &= \Large C^n_{n-k} \\ \\ \Large C^n_{k-2+2} &= \Large C^n_{n-k-2+2} \end {align}\]
From this, by incorporating Equation (4.3) and replacing \(k = m + 2\) we obtain
\[\begin{equation} \Large \zeta^n_{m} = \zeta^n_{n - m - 4} \tag{4.4} \end{equation}\]
Equation (4.4) depicts the property of symmetry for the zeta formula. This is very useful because it allows us to trim the amount of summations that we deal with when making calculations. Instead of, for example, calculating \(\large\zeta^{20}_{13}\) —which involves \(m + 1 = 14\) summations— we can just calculate \(\large\zeta^{20}_3\), which is less intricate, as it involves only four summations.
Equation (4.3) depicts the relationship between the zeta formula and the binomial coefficient. On this Section we offer a proper demonstration to support the claim of equivalence between the zeta formula and the binomial coefficient.
A preliminary lemma required in our demonstration is the following:
\[\begin{align} \sum_{j = b}^a\frac{j!}{b!(j-b)!} &= \frac{(a+1)!}{(b+1)!(a-b)!} \tag{4.5} \end{align}\]
for \(a \in \mathbb{N}, \hspace{0.5mm} a \geq b\).
Let us demonstrate this lemma by mathematical induction.
As usual, the base case is trivial. Just replace \(a = b\) on Equation (4.5) and verify that the equation holds. It does.
Inductive hypothesis: \(\boldsymbol{\sum_{j = b}^{a - 1}\frac{j!}{b!(j-b)!} = \frac{a!}{(b+1)!(a-1-b)!}}\)
Inductive step:
\[\begin{align} \sum_{j = b}^a\frac{j!}{b!(j-b)!} &= \sum_{j = b}^{a - 1}\bigg(\frac{j!}{b!(j-b)!}\bigg) + \frac{a!}{b!(a-b)!} \\ \\ &= \frac{a!}{(b+1)!(a-1-b)!} + \frac{a!}{b!(a-b)!} \\ \\ &= \frac{a!(a-b)}{(b+1)b!(a-b)!} + \frac{a!}{b!(a-b)!} \\ \\ &= \frac{a!}{b!(a-b)!}\bigg(\frac{a-b}{b+1} + 1\bigg) \\ \\ &= \frac{a!}{b!(a-b)!}\bigg(\frac{a+1}{b+1}\bigg) \\ \\ &= \frac{(a+1)!}{(b+1)!(a-b)!} \end{align}\]
With this we have earned the right to invoke the lemma whenever we need it.
For \(m = 0\) the zeta formula becomes the sum of the first \(n-1\) natural numbers. This is known to be equal to \((n-1)n/2\). The binomial coefficient \(\large \binom{n}{2}\) is, on the other hand, \(n!/(2!(n-2)!) = n(n-1)(n-2)!/(2(n-2)!) = n(n-1)/2\). With this we have just proven Equation (4.3) for \(\boldsymbol{m = 0}\). This constitutes the base case in our proof by induction.
Inductive hypothesis:
\[\begin{align} \Large \zeta^n_{m-1} &= \Large C_{m+1}^n \\ \\ &= \frac{n!}{(m+1)!(n-m-1)!} \end{align}\]
and this holds true for any arbitrary natural number \(n\) greater than or equal to \(m+1\). We shall consider \(\boldsymbol{n = j}\) while performing the inductive step.
Inductive step:
Note that for \(m = 1, \ldots, n-2\) the zeta formula can be written as follows:
\[\begin{equation} \Large \zeta_m^n = \Large \sum_{j = m+1}^{n-1}\zeta_{m-1}^j \end{equation}\]
By incorporating the inductive hypothesis we obtain
\[\begin{equation} \Large \zeta_m^n = \large \sum_{j = m+1}^{n-1}\frac{j!}{(m+1)!(j-m-1)!} \end{equation}\]
Note that by considering \(\boldsymbol{b = m+1}\) and \(\boldsymbol{a = n-1}\) we have just found an instance where we can apply our lemma. Finally, then, by replacing Equation (4.5) on the right-hand side of the Equation above we obtain
\[\begin{equation} \Large \zeta_m^n = \normalsize \frac{n!}{(m + 2)!(n-m-2)!} \end{equation}\]
With this the proof has been fulfilled and we are now able to claim that the zeta formula is indeed an alternative take on the binomial coefficient.