The Origin family of multiprocessor systems includes the entry-level Origin200™ system, and deskside and rackmounted Origin2000™ systems.[1]
The Origin family is both modular and scalable; that is, it can be increased in size (scaled) by adding nodes (or node boards) to the interconnection fabric. Each Node board can contain up to two R10000™ processors, with accompanying cache, directory, main memory, and interfaces to both I/O devices and the interconnection fabric.
The interconnection fabric (called the CrayLink™ Interconnect) replaces the shared bus of the Everest architecture with a web of point-to-point links that simultaneously connect the nodes to each other and present a multitude of paths from one node to another. For instance, as shown in Figure 1-1, R1 can communicate with R0, R2 with R3, R4 with R6, and R5 with R7, all without having to interface with any other node.
The Origin200 comes in a server module, while the Origin2000 has several types of modules:
graphics
server
peripheral
These Origin2000 modules are used in two types of systems:
deskside
rackmounted
This hierarchy, in order of increasing complexity, is shown in Figure 1-2.
An Origin2000 system can be a single node, or it can consist of a number of nodes mounted inside a deskside enclosure. Several deskside enclosures can be combined in a rack, and a system can be made up of a number of racks. Presently, the largest system available has 128 processors (a 128P system).
The MIPS® family of 64-bit processors provides a single uniform virtual address space for user processes. The Origin family uses the R10000 processor, which defines a 2^44-byte (16 terabyte, or 16 TB) user-addressable virtual address space labelled xuseg.
The Origin family has a distributed shared-memory architecture, in which shared main memory is distributed amongst the nodes. This shared memory is accessible to every processor in the system.
The following is a mapping of the virtual address bits as they are decoded in the Origin family. Detailed descriptions of the R10000 address spaces are given in the MIPS R10000 Microprocessor User's Manual.
VA[61:59] Cache Algorithm bits set the behavior of the processor when executing load and store instructions. There are five cache algorithms, as listed in Table 1-1:
Table 1-1. Origin Family Cache Algorithms
VA[61:59] | Cache Algorithm |
---|---|
000 | Reserved |
001 | Reserved |
010 | Uncached |
011 | Cacheable, noncoherent |
100 | Cacheable, coherent exclusive |
101 | Cacheable, coherent exclusive on write |
110 | Reserved |
111 | Uncached accelerated |
In kernel mode, when VA[63:62] are binary 10, the Uncached Attribute bits on VA[58:57] select among the four uncached spaces, as described in Table 2-1 in Chapter 2.
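The field decoding described above can be sketched in Python. This is a minimal illustration rather than code from the Origin documentation: the names `CACHE_ALGORITHMS`, `bits`, and `decode_va_fields` are hypothetical, and only the bit positions and the table contents come from the text.

```python
# Hypothetical sketch: bit positions follow the text, names are illustrative.
CACHE_ALGORITHMS = {
    0b000: "Reserved",
    0b001: "Reserved",
    0b010: "Uncached",
    0b011: "Cacheable, noncoherent",
    0b100: "Cacheable, coherent exclusive",
    0b101: "Cacheable, coherent exclusive on write",
    0b110: "Reserved",
    0b111: "Uncached accelerated",
}


def bits(value, high, low):
    """Return the inclusive bit range value[high:low] as an integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_va_fields(va):
    """Extract the virtual address fields named in the text from a 64-bit VA."""
    return {
        "va_63_62": bits(va, 63, 62),
        # VA[61:59] selects the cache algorithm from Table 1-1.
        "cache_algorithm": CACHE_ALGORITHMS[bits(va, 61, 59)],
        # VA[58:57] is only meaningful as the Uncached Attribute in kernel
        # mode when VA[63:62] is binary 10.
        "uncached_attribute": bits(va, 58, 57),
    }
```

For example, an address built with VA[63:62] set to binary 10, VA[61:59] set to 010, and VA[58:57] set to 11 decodes to the Uncached algorithm with an Uncached Attribute value of 3.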
To a processor, main memory appears as a single address space containing many individually-addressable blocks, or pages. Each node is allotted a static portion of this address space, which means there is a gap in the address space if a node is not present. Figure 1-5 shows an address space in which each node is allocated 4 GB of address space and Node 2 is missing, leaving a gap from 4 GB to 8 GB.
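A minimal sketch of this static allocation follows, assuming the 4 GB per-node span used in the Figure 1-5 example. The helper names are hypothetical, and slots are indexed from zero here, which is an assumption; the node labels in the figure may be numbered differently.

```python
# Hypothetical sketch of static per-node address allocation.
NODE_SPAN = 4 * 2**30  # 4 GB per node, as in the Figure 1-5 example


def slot_range(slot):
    """Return the [start, end) address range statically allotted to a slot."""
    start = slot * NODE_SPAN
    return start, start + NODE_SPAN


def owning_slot(address):
    """Return the zero-based slot whose static range contains the address."""
    return address // NODE_SPAN


def is_populated(address, present_slots):
    """True if the address falls in the range of a slot that is present."""
    return owning_slot(address) in present_slots
```

Because each range is fixed by the slot index alone, removing a node does not shift the ranges of the others; addresses in the missing slot's range simply have no memory behind them.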
Secondary cache lines are fixed in size at 32 words, or 128 bytes. Main memory page sizes are multiples of 4 KB, usually 16 KB. These two configurations are shown in Figure 1-6.
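The size relationship can be checked with a few lines of Python. This is an illustrative sketch only; the constant and function names are hypothetical, and the 4-byte word size is inferred from the statement that 32 words equal 128 bytes.

```python
# Hypothetical sketch: sizes are taken from the text, names are illustrative.
WORD_BYTES = 4                       # 32 words == 128 bytes implies 4-byte words
CACHE_LINE_BYTES = 32 * WORD_BYTES   # secondary cache line size
BASE_PAGE_BYTES = 4 * 1024           # page sizes are multiples of 4 KB


def lines_per_page(page_bytes):
    """Return how many secondary cache lines fit in one page."""
    if page_bytes <= 0 or page_bytes % BASE_PAGE_BYTES != 0:
        raise ValueError("page size must be a positive multiple of 4 KB")
    return page_bytes // CACHE_LINE_BYTES
```

With the usual 16 KB page, `lines_per_page(16 * 1024)` gives 128 cache lines per page.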
Architecturally, the physical address is divided into the fields shown in Figure 1-7. The upper 12 index bits of the NUMA Address Space Identifier (NASID) are used to select the node that contains the addressed physical memory. The lower 36 NASID offset bits are used to address memory blocks within a NASID's physical memory space.
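The architectural limit of physical addressing is:
4096 nodes (12-bit NASID index)
64 GB of memory per node (36-bit NASID offset)
256 TB of total physical memory (48-bit physical address)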
Note that Origin systems do not use this entire address space, but are implemented as subsets of these architectural limits, as described in the next section.
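The architectural field split can be illustrated with a short Python sketch. The function and constant names are hypothetical; only the 12-bit and 36-bit field widths come from the text, and the limits are derived from those widths.

```python
# Hypothetical sketch of the architectural 48-bit physical address layout.
NASID_BITS = 12    # upper bits: selects the node
OFFSET_BITS = 36   # lower bits: offset within the node's memory

MAX_NODES = 1 << NASID_BITS                           # nodes addressable
BYTES_PER_NODE = 1 << OFFSET_BITS                     # bytes per node
TOTAL_BYTES = 1 << (NASID_BITS + OFFSET_BITS)         # total physical bytes


def split_physical_address(pa):
    """Split a 48-bit physical address into (NASID index, NASID offset)."""
    if not 0 <= pa < TOTAL_BYTES:
        raise ValueError("address does not fit in 48 bits")
    nasid = pa >> OFFSET_BITS
    offset = pa & (BYTES_PER_NODE - 1)
    return nasid, offset
```

Since the fields do not overlap, the original address can always be recovered as `(nasid << OFFSET_BITS) | offset`.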
Although the architecture specifies a 48-bit address space, the initial implementation of the Origin family does not support this entire address range. Instead, the M Mode configuration, shown in Figure 1-8, is supported.
M Mode places an 8-bit NASID index in the upper 8 bits of the 40-bit physical address. The remaining 32 bits of the physical address are used as the offset within each NASID. The NASID index addresses up to 256 nodes (512 processors), and each node addresses up to 4 GB of main memory.
Inside the Hub ASIC, a 41-bit format is used to address physical memory. Figure 1-9 shows how the system converts the 40-bit M Mode address to the 41-bit Hub address: an extra bit is added to the M Mode NASID index, making it 9 bits wide, while the lower 32 bits are used as the address offset.
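As a hedged illustration of the M Mode layout, the sketch below composes a 40-bit physical address from a NASID index and an offset. The names are hypothetical, and the two-processors-per-node figure is taken from the Node board description earlier in this chapter.

```python
# Hypothetical sketch of the M Mode physical address layout.
M_NASID_BITS = 8         # upper 8 bits: NASID index
M_OFFSET_BITS = 32       # lower 32 bits: offset within the node
PROCESSORS_PER_NODE = 2  # each Node board holds up to two R10000 processors

M_MAX_NODES = 1 << M_NASID_BITS
M_MAX_PROCESSORS = M_MAX_NODES * PROCESSORS_PER_NODE
M_BYTES_PER_NODE = 1 << M_OFFSET_BITS


def m_mode_address(nasid, offset):
    """Compose a 40-bit M Mode physical address from its two fields."""
    if not 0 <= nasid < M_MAX_NODES:
        raise ValueError("NASID index does not fit in 8 bits")
    if not 0 <= offset < M_BYTES_PER_NODE:
        raise ValueError("offset does not fit in 32 bits")
    return (nasid << M_OFFSET_BITS) | offset
```

The range checks mirror the limits stated in the text: an index above 255 or an offset of 4 GB or more cannot be represented in this format.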
M Mode operations are described in the next chapter.
[1] For more information on the hardware aspects of the Origin family, please refer to Origin and Onyx2 Theory of Operations Manual, document number 007-3439-nnn.