Classification of Parallel Computers
graph TD
PS[Parallel System] --> SIMD
PS --> MIMD
MIMD --> DM[Distributed Memory]
DM --> MPP
DM --> NOW
DM --> Clusters
MIMD --> SM[Shared Memory]
SM --> UMA
SM --> NUMA
SM --> COMA
NUMA --> ccNUMA
NUMA --> ncNUMA
Classification of Memory
Distributed Memory
- Architecture
- Programming Model: Message passing.
- Types
- MPP Massive Parallel System
- Specialized system
- Unified Operating system
- Most Super computers run this
- NOW Network of Workstations
- Loosely connected group of individual computers with “standard” network
- Cluster
- Network of Computers with high performance network.
- Separate instance of operating system.
- MPP Massive Parallel System
Shared Memory
- Programming model: Shared variables.
- Simultaneous access to the same variable is not allowed.
- Types
- NUMA
- Each Processor has has a local memory which can be accessed by other processors.
- DIR → Directory. Stores information regarding the files in cache. Like a registry.
- During memory access the DIR is read to know if the file should be accessed from the cache or from memory.
- ccNUMA: Cache Coherent non-Uniform Memory Access
- nccNUMA: non-ccNUMA
- COMA: Cache only memory access
- TAG → It is an identifier for each Datum.
- Datum migrates to where it is needed, hence you need a TAG to identify the data.
- Here the memory is accessed through the cache only.
- This means that this if the same data is being used, the access operation is much faster.
- NUMA
Understanding the architecture is essential to writing efficient programs.
- Example:
- Caches are accessed in form of cache lines of fixed size - like a bus transporting people, it has the same energy is used to transport 1 person and 100 passengers.
- To avoid Cache misses, data should be processed continuously. (cache miss: it occurs when the program looks at the DIR and doesn’t find what’s required hence it searches for it in memory which is expensive)
- Code example
data2
is faster. Because the loop is inside (continuousj
).data1
has to come out of the inner loop. Hence it is slower.
Networks
- Distance between nodes
- Shortest path in # edges between two nodes. (#→ number of something )
- Degree
- Largest number of edges at a node.
- Diameter
- Largest distance between two nodes.
Examples
- Fully connected Network
- Linear network
- Ring network or 1D Torus
- dD Torus (d→ dimension)
- D-Dimensional hypercube
- Nodes
- Node connected with neighbours.
- indexed by bits.
- Diameter = =
- Degree = =
- Node index differ from neighbor’s index by 1 bit.
- Patch through the network changes bit at a time
- 000→001→101→111
- Distance between nodes = # Bits different in the index
- Hamming Distance: number of non-matching bits two words of same length.(How many bits have to be changed to go from one data to another data.)