Learning on LowDimensional Image Data (1)
Posted on Sat 18 April 2020 in learning
Introduction
Many problems in machine learning are difficult to interpret because the highdimensional state spaces elude intuition. Certain specific traits of these spaces  often referred to as the curse of dimensionality  influence the distribution of objects, metric properties and other aspects in a way that makes it difficult to apply geometric imagination and methods  leave alone differential (smooth) ones  to the manifolds that express decision regions of  in particular classification  problems in these spaces.
In this article we will restrict ourselves to two or threedimensional state spaces to see what we can learn from these much more accessible objects.
Decision Regions in Toy Spaces
How many classification problems can an image of two pixels create?
Let $N$ define the number of possible pixel values of a grayscale image and $p$ his number of pixels. The totality of all these images forms a $p$dimensional state space and is called the Classification Toy Space of dimension $p$ or $\bm{T_p}$.
Consider the following problems:

Binary classification $(k=2)$:
For $N = 256$ and $p = 2$, $N^p = 65536$ possible $2$pixel images exist. $T_2$ contains $2^{N^2}$ different subsets ^{1} ^{2} and every such subset $U$ together with his complement $\overline{U}$ can be interpreted as decision regions of a particular binary classification problem. Obviously, the two classification problems $(U,\overline{U})$ and $(\overline{U}, U)$ are equivalent. After removing the trivial cases $(T_2,\emptyset),(\emptyset,T_2)$ and considering equivalent partitions the same we have finally $2^{N^21}1$ nontrivial binary problems. Each of them can be represented as a single $N \times N$ image with two colors, marking the complementary decision regions. Note that the latter form a binary partition of $T_2$.

Classification with $k$ classes $(k \geq 2)$:
The Stirling Number is the number of partitions of a set of $n$ elements by $k$ nonempty sets: $S(n,k) = \bigg\{{n\atop k}\bigg\} = \frac{1}{k!}\sum_{j=0}^k (1)^j\binom{k}{j}(kj)^n$ For $k=2$ and $n=N^2$ we get the previous case. In general, the number of classification problems with $k$ categories in $T_p$ is $S(N^p,k)$. In the case $p=3, \: T_p$ is a cube.
Regression Toy Spaces
In order to consider regression problems, we will add an additional dimension to every element of classification toy spaces. This dimension contains values in [0,1] that should in general be interpreted as probabilities before the decision step of a classification problem. The corresponding space is called Regression Toy Space or $\bm{{\Theta}_p}$. In case, the context is unambiguous, we simply use the term Toy Space.
Examples
Spiralshaped regions for a binary classificator $(k=2)$  Decision regions for classification problem with multiple categories $(k=10)$ 
To be continued...