Course:CPSC 522/Self-Organizing Maps
Self-Organizing Maps
The self-organizing map is a dimensionality-reduction approach that stems from the similarity-based structure of the brain.
Principal Author: Svetlana Sodol
Collaborators:
Abstract
The self-organizing map is an automatic data-analysis method that provides insight into the topographical relationships of the input data.
We introduce the concept of a self-organizing map and its computing and biological backgrounds. We describe the algorithms for implementation, including a batch approach. We mention existing open-source implementations and list application areas.
Builds on
Self-organizing maps are a type of unsupervised artificial neural network.
Related Pages
Self-organizing maps are related to vector quantization methods, with some biological connections and opportunity for emergent properties. They can be compared to PCA methods for dimensionality reduction.
Content
Background
The formulation of the self-organizing map (SOM) is attributed to Teuvo Kohonen (and SOMs are also referred to as Kohonen maps)^{[1]}.
The main idea is to preserve the similarity structure (topology) of the high-dimensional input data in the lower-dimensional "map" produced.
Biological basis
Kohonen grounds the background for SOMs in biology. In the brain, specific areas specialize in processing inputs that are similar. It has been found that neural cells in the brain often form assemblies, called brain maps, in which their topographic location corresponds to some feature value of a specific stimulus in an orderly fashion.^{[2]} For example, all visual input is processed separately from auditory input, and inputs from visual cells that are close together stay close together (retinotopic organization). It has been shown that these maps are not predetermined genetically but are modified strongly by experience (Merzenich 1983) and thus are formed by learning.
Brain maps lead to some further important realizations about the brain:
1) this process is general to how the brain functions (it generalizes across types of input)
2) the input that brain cells receive has been preprocessed and is therefore abstract (not linearly related to the actual sensory experience)
Thus, abstract representations can form at any level of abstraction using such a similarity-based processing model, and we can model brain maps that operate on semantic items (the emergence of operations on meaning that are actually grounded in neurons)
and "how symbolic representation for concepts can be formed automatically", which has been a big idea for cognitive science.^{[3]}
This has led to the development of early artificial feature-sensitive cells through competitive learning neural networks.
Computational connection
An optimally tuned feature-sensitive filter obtained by competitive learning had already been demonstrated in abstract form in signal processing: the classical vector quantization (VQ) algorithm (alternatively known as k-means clustering, where we represent each data point by its closest centroid or codebook vector). SOM resembles VQ heavily, but differs from other artificial neural network approaches in that it does not use error correction. What distinguishes the SOM from VQ is that the centroids are also spatially and globally ordered.^{[2]}
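The VQ/k-means connection above can be made concrete with a minimal sketch: each point is represented by its nearest codebook vector, and each codebook vector is moved to the mean of its assigned points. The data, codebook size, and random seed here are illustrative choices, not from the text.

```python
import numpy as np

# Minimal vector quantization sketch: represent each data point by its
# closest codebook vector (centroid), as in one k-means assignment/update.
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 2))       # 100 two-dimensional inputs
codebook = rng.normal(size=(4, 2))     # 4 codebook vectors

# Assignment step: index of the closest codebook vector (Euclidean distance)
dists = np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2)
assignments = dists.argmin(axis=1)

# Update step: move each codebook vector to the mean of its assigned points
for k in range(len(codebook)):
    members = data[assignments == k]
    if len(members) > 0:
        codebook[k] = members.mean(axis=0)
```

The SOM described below differs from this loop only in that the update also pulls in spatial neighbours of the winner on the grid, which is what produces the global ordering.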
Algorithm
In a 2013 paper, Kohonen summarizes the SOM algorithm as follows:
"Every input data item shall select the model that matches best with
the input item, and this model, as well as a subset of its spatial
neighbors in the grid, shall be modified for better matching."^{[2]}
Similar to VQ (and k-means implementations), the modifications are centred on the winning model, and the local changes are propagated over time.
There are two versions of the SOM algorithm:
1) Incremental: recursive stepwise approximation where the inputs are fed to the network one at a time.
2) Batch: all data is applied as a batch where all model nodes are updated concurrently.
In the image on the right you can see a representation of what the SOM model accomplishes on the input data. This image is taken from the Wikipedia page on SOMs and was made by user Chompinha.
SOM Structure
SOM is usually structured as a 2D regular grid of nodes. Every node $i$ holds a representation (a model, denoted $m_i$) of the input observations. Random or pre-organized initializations of the model values work, depending on the task.
The inputs are denoted $x(t)$ for iteration $t$.
The grid is automatically organized such that similar models are placed closer to each other than to dissimilar ones, thus being "both a similarity graph and a clustering diagram".^{[3]}
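The structure above can be sketched directly: a 2D grid of nodes, each holding a model vector of the same dimensionality as the inputs. Random initialization is shown; the grid size, input dimension, and seed are illustrative assumptions.

```python
import numpy as np

# Sketch of the SOM structure: a rows x cols grid of nodes, each node i
# holding a model vector m_i with the same dimensionality as the inputs.
rng = np.random.default_rng(42)
rows, cols, dim = 10, 10, 3              # illustrative grid and input sizes
models = rng.uniform(size=(rows, cols, dim))  # random initialization
```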
Incremental algorithm
After initialization, the incremental algorithm has two main steps that are repeated for many iterations until convergence:
For each input $x(t)$:
1) Find the "winner" model for the input
2) Update the models
Finding the "winner"
The "winner" model $m_c$ is defined by the following condition:

$c = \arg\min_i \{\, \| x(t) - m_i(t) \| \,\}$

where $m_i(t)$ is the current model value at node $i$ and $x(t)$ is the value of the current observation.
Thus the "winner" model found in step 1 is the model that matches best (is closest) with the current input. The distance between the models and the inputs is usually Euclidean.
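Step 1 can be sketched as follows; the grid shape and test vectors are illustrative, and the `find_winner` name is mine, not from the text.

```python
import numpy as np

def find_winner(models, x):
    """Return the grid coordinates of the model closest to x (Euclidean)."""
    dists = np.linalg.norm(models - x, axis=-1)   # distance at every grid node
    return np.unravel_index(dists.argmin(), dists.shape)

# Tiny example: all models at the origin except node (2, 1)
models = np.zeros((3, 3, 2))
models[2, 1] = [1.0, 1.0]
winner = find_winner(models, np.array([0.9, 1.1]))   # -> (2, 1)
```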
Updating the models
For step 2, the following is the update function: at iteration $t$, calculate the value of each model $m_i$ at iteration $t+1$:

$m_i(t+1) = m_i(t) + h_{ci}(t) \left[ x(t) - m_i(t) \right]$

where $h_{ci}(t)$ is the neighbourhood function value for nodes $c$ and $i$,
and $c$ is the "winner" model's node for the current iteration from step 1.
Neighbourhood function
The neighbourhood function can be described simply as $h_{ci}(t) = \alpha(t)$ if node $i$ lies within a given radius of node $c$, and $h_{ci}(t) = 0$ otherwise.
Here, $\alpha(t)$ is the learning rate, which decreases with the number of iterations,
and $r_c, r_i \in \mathbb{R}^2$ (for the 2D case) are the grid locations of the winner model node and the model node being updated.
The radius of the nodes that need to be updated also decreases with the number of iterations. From this definition we can form the set of nodes that are within the radius and refer to it as the neighbourhood set $N_c$.
The neighbourhood function is often defined as a Gaussian:

$h_{ci}(t) = \alpha(t) \exp\!\left( - \frac{\| r_i - r_c \|^2}{2 \sigma^2(t)} \right)$

where $\sigma(t)$ is the neighbourhood width, which also decreases with the number of iterations.
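Step 2 with the Gaussian neighbourhood can be sketched as below. The function name and the particular values of the learning rate and width are illustrative; in a real run both would decrease over iterations as described above.

```python
import numpy as np

def som_update(models, x, winner, alpha, sigma):
    """One incremental SOM update: pull every model toward x, weighted by a
    Gaussian neighbourhood h_ci centred on the winner node."""
    rows, cols, _ = models.shape
    # Grid coordinates r_i of every node, shape (rows, cols, 2)
    r = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                             indexing="ij"), axis=-1)
    # h_ci = alpha * exp(-||r_i - r_c||^2 / (2 sigma^2))
    d2 = ((r - np.array(winner)) ** 2).sum(axis=-1)
    h = alpha * np.exp(-d2 / (2 * sigma ** 2))
    # m_i(t+1) = m_i(t) + h_ci(t) * (x(t) - m_i(t))
    return models + h[..., None] * (x - models)

models = np.zeros((3, 3, 2))
updated = som_update(models, np.array([1.0, 1.0]),
                     winner=(1, 1), alpha=0.5, sigma=1.0)
```

The winner node itself moves a fraction `alpha` of the way toward the input, while nodes farther away on the grid move less, which is exactly what preserves the topological ordering.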
Batch algorithm
The batch implementation does not require specification of the learning rate $\alpha(t)$ and is often faster in practice.
The batch algorithm is an iterative algorithm where the update is performed on all of the data at the same time; it may require many passes through the data until convergence is reached, yielding the final models $m_i^*$.
This is the update function^{[4]}:

$m_i(t+1) = \frac{\sum_j h_{c(j)i}(t)\, x_j}{\sum_j h_{c(j)i}(t)}$

where the sums run over all inputs $x_j$ and $c(j)$ is the winner node for input $x_j$ at iteration $t$.
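One batch iteration can be sketched as below: every model is recomputed at once as a neighbourhood-weighted mean of all inputs, so no learning rate appears. The function name, data, and Gaussian width are illustrative assumptions.

```python
import numpy as np

def batch_step(models, data, sigma):
    """One batch-SOM iteration: m_i = sum_j h_{c(j)i} x_j / sum_j h_{c(j)i}."""
    rows, cols, dim = models.shape
    flat = models.reshape(-1, dim)
    # Winner node index c(j) for every input x_j
    winners = np.linalg.norm(data[:, None, :] - flat[None, :, :],
                             axis=2).argmin(axis=1)
    # Grid coordinates of all nodes, and of each input's winner
    r = np.stack(np.unravel_index(np.arange(rows * cols), (rows, cols)),
                 axis=-1)
    d2 = ((r[winners][:, None, :] - r[None, :, :]) ** 2).sum(axis=-1)
    h = np.exp(-d2 / (2 * sigma ** 2))        # h_{c(j), i}, shape (n, nodes)
    # Weighted mean of all inputs for every node, all at once
    new_flat = (h.T @ data) / h.sum(axis=0)[:, None]
    return new_flat.reshape(rows, cols, dim)

rng = np.random.default_rng(1)
data = rng.uniform(size=(50, 2))
models = batch_step(rng.uniform(size=(4, 4, 2)), data, sigma=1.0)
```

Because each new model is a convex combination of the inputs, the models stay inside the range of the data, and repeated calls (with a shrinking `sigma`) drive the grid toward the stationary state.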
Convergence condition
For both algorithms, we run the update for many iterations until we reach a stable solution: the models converge, i.e. they do not change much between iterations.
In mathematical terms, we want to reach the following stationary state condition for every node $i$:

$m_i^* = \frac{E\{ h_{ci}\, x \}}{E\{ h_{ci} \}}$

where $E\{\cdot\}$ denotes the expectation over the input distribution and $m_i^*$ is the converged model at node $i$.
Implementations
Kohonen and his group have developed and released implementations and freeware software packages: SOM_PAK and SOM Toolbox. There is also an available parallel implementation, Somoclu.^{[4]}
Applications
Visualization
SOMs are very useful for visualization, as the spatial order of the map provides a convenient and quick visual inspection of the similarity relationships of the input data and their clustering tendency. This can also be used to validate samples, and with proper calibration, the clusterings/classes become explicit.^{[2]} See the figure to the right for an example. This image is taken from https://peltarion.com/platform and featured on the Wikipedia page for SOMs.
Management of large document collections
However, SOMs have mostly been used in the management of large document collections.
In 2000, Kohonen et al. used a SOM to manage an electronic version of the Encyclopaedia Britannica.
A total of 115 000 documents were used, with a vocabulary size of 39 058 words. Inverse document frequency was used to weight the words to represent the documents in an abstract way. The SOM grid contained 12 096 nodes. The end result is a map where each document is mapped onto its best-matching node. In a browsing mode, the user can click on a particular node and see a list of all of the documents associated with that node. For example, clicking on a node titled "shark", one would see links to articles titled "fox shark", "blacktip shark" and "shark: Description and habits". Other modes of usage can also be implemented.
Other applications
Some other application areas include:
- Industrial process control
- Finance analyses
- Bioinformatics and biomedical analysis
- Profiling criminal behavior
- Categorization of galaxies
- Clustering and classification of visual data^{[5]}
- Classification of diabetes^{[6]}
- Vowel recognition^{[6]}
Benefits
- SOM can handle large volumes of data
- SOM can work on large mappings in reasonable time, even on a personal computer
- Provides good visualizations, which are a hard problem for multidimensional data
- Flexible in applications: SOM can be used to cluster, classify, identify outliers, etc.
- Can use the stepwise algorithm to add new examples online
- Based on biology
Pitfalls
- SOM does not seem to be used as a standalone method; as with other learning methods, a lot of preprocessing is required
- Similarity (the main measure of closeness for SOM) is hard to define
- SOM does not build a generative model of the data; it focuses on modelling the trends it sees in the data without assumptions about where the data comes from (in the simple original SOM case)
- The biological basis is often ignored in implementations/extensions/applications
Annotated Bibliography
 ↑ Kohonen, Teuvo (1982). "Self-organized formation of topologically correct feature maps". Biological Cybernetics. 43: 59–69.
 ↑ ^{2.0} ^{2.1} ^{2.2} ^{2.3} Kohonen, Teuvo (January 2013). "Essentials of the self-organizing map". Neural Networks. 37: 52–65.
 ↑ ^{3.0} ^{3.1} Kohonen, Teuvo (May 1998). "The self-organizing map". Neurocomputing. 21 (1–3): 1–6.
 ↑ ^{4.0} ^{4.1} Wittek, Peter; Gao, Shi Chao (June 2017). "Somoclu: An Efficient Parallel Library for Self-Organizing Maps" (PDF). Journal of Statistical Software. 78 (9).
 ↑ Furao, Shen; Ogura, Tomotaka (2007). "An enhanced self-organizing incremental neural network for online unsupervised learning". Neural Networks. 20: 893–903.
 ↑ ^{6.0} ^{6.1} Deng, Da; Kasabov, Nikola (2003). "Online pattern analysis by evolving self-organizing maps". Neurocomputing. 51: 87–103.
To Add
Is the following needed?
- code/pseudocode
- example
- annotations
For some other pages:
- history of the formation of the SOM algorithms and basis for the decisions
- initialization of parameters
- theoretical implications
- applications and comparison to other approaches
- impact on brain modelling progress?
- big problem in AI/robotics (grounding of meaning): related?
