Chapter 2. System Control


This chapter describes the functions of the system controllers.

The control system for the SGI Altix 3000 series servers manages power control and sequencing, provides environmental control and monitoring, initiates system resets, stores identification and configuration information, and provides console, diagnostic, and scan interfaces. Figure 2-1 shows a typical system control network.

Figure 2-1. SGI Altix 3000 Server System Control Network (Example)

Two Levels of System Control

The Altix 3000 server has two levels of control, as follows:

L1: brick-level controller. The L1 system controller is designed into all bricks except the TP900 storage module and the D–brick2; controller function varies slightly by brick.
L2: rack-level controller. This controller is standard in each rack containing R-bricks. The L2 controller allows remote maintenance, controls resource sharing, controls the L1 controllers in the system, and maintains controller configuration and topology information between itself and other L2 controllers.

Note: The D–brick2, which is not monitored by the L2 controller, has its own ESI/ops panel module with a microcontroller for monitoring and controlling all elements of the D–brick2.

System Controller Interaction

In all Altix 3000 servers with L2 controllers, the L1 controllers are slave devices to the L2 controller. The controllers communicate with each other in the following ways:

An L1 controller of an I/O brick communicates with an L1 controller of a C-brick.
An L1 controller of a C-brick that is connected to an R-brick communicates with the L2 controller via the R-brick.
If you have an Altix 3000 server with multiple L2 controllers, they and a system console (like the SGIconsole) can connect with each other through an Ethernet hub and through an Ethernet network as shown in Figure 2-2.

Figure 2-2 diagrams the paths for interaction between the L1 and L2 controllers.

Figure 2-2. Controller Network

L1 Controller

All bricks except TP900 storage modules and D–brick2 storage modules have L1 controllers. The following subsections describe the basic features of all L1 controllers:

“L1 Controller Functions”
“L1 Front Panel Display”

Note: For L1 controller commands, see the SGI L1 and L2 Controller Software User's Guide (007-3938-xxx).

L1 Controller Functions

Table 2-1 summarizes the control and monitoring functions that the L1 controller performs. Many of the L1 controller functions are common across all brick types; however, some functions are applicable to a specific brick type.

Table 2-1. L1 Controller Functions

Function	C–brick	R–brick	IX-brick	PX-brick
Controls voltage regulator modules (VRMs).	X	X	X	X
Controls voltage margining within the brick.	X	X	X	X
Controls and monitors fan speed.	X		X	X
Monitors voltage and reports failures.	X	X	X	X
Monitors and reports operating temperature and status of 48-VDC input power.	X	X	X	X
Monitors and controls LEDs.	X	X	X	X
Reads system identification (ID) PROMs.	X	X	X	X
Monitors the On/Off power switch.	X	X	X	X
Monitors the reset switch and the nonmaskable interrupt (NMI) switch.	X
Provides a USB hub chip with six master ports: one port connects internally to the R-brick's L1 controller, four ports connect to the L1 controllers of four C-bricks (via the NUMAlink 3 cable), and a master port connects to the L2 controller.		X
Reports the population of the PCI cards and the power levels of the PCI slots.			X	X
Powers on the PCI slots and their associated LEDs.			X	X

L1 Front Panel Display

Figure 2-3 shows the L1 controller front panel.

Figure 2-3. L1 Front Panel

The front panel display contains the following items:

2 x 12 character liquid crystal display (LCD). The display uniquely identifies the brick, shows system status, warns of required service, and identifies a failed component.
On/Off switch with LED (button with light-emitting diode [LED]).
Service required LED.
Failure LED.
Reset switch and non-maskable interrupt (NMI) button switch.

Note: The reset and NMI switches are available on the C-brick only.

L2 Controller

The L2 controller is a rack-level controller located at the top of the rack; it is a single-board computer that runs an embedded operating system out of flash memory.

The L2 system controller is standard in Altix 3700 server systems. The L2 system controller is required in a rack in the following circumstances:

The rack contains an R-brick.
The rack has an L2 controller touch display.
Remote maintenance of the system is required.

The L2 controller performs the following functions:

Controls resource sharing.
Controls all L1 controllers.
Maintains controller configuration and topology information between the L1 and L2 controllers.
Routes data between upstream devices and downstream devices.
Allows remote maintenance via a modem.

Upstream devices (for example, rack display, console, and modem) provide control for the system, initiate commands for the downstream devices, and act on the messages that they receive from downstream devices.

Downstream devices (for example, the USB hub of the R-brick, and L1 controllers of the bricks) perform the actions specified by the L2 controller commands, send responses to the L2 controller that indicate the status of the commands, and send error messages to the L2 controller.

In a system with more than one L2 controller, all L2 controllers are peers. Each L2 controller monitors its associated L1 controllers and propagates this information to the other L2 controllers.

Figure 2-4 diagrams the L2 controller and its interactions.

Figure 2-4. L2 Controller Interactions

The L2 controller is mounted at the top of the rack; it does not use configurable rack space. Figure 2-6 shows its location in a tall rack. The L2 controller consists of a touch display controller and L2 controller ports, which are described in these subsections:

“L2 Controller Touch Display”
“L2 Controller Ports”

Note: For L2 controller commands, see the SGI L1 and L2 Controller Software User's Guide (007-3938-xxx).

L2 Controller Touch Display

The L2 controller touch display (see Figure 2-5) is a touch-pad liquid crystal display (LCD) screen. The touch screen translates what the user touches into commands and displays the results of the commands. Sliding the contrast control to the right increases the contrast.

Figure 2-5. L2 Touch Display

The L2 controller touch display is located on the rear door of rack 001. The display is not visible when the rear door of the cabinet is closed. See Figure 2-6.

Figure 2-6. L2 Controller Touch Display

L2 Controller Ports

Table 2-2 describes the ports of the L2 controller.

Table 2-2. L2 Controller Ports

Quantity	Port	Connector Label	Connects To	Purpose or Notes
4	Standard downstream USB	L1 port 1 through L1 port 4	USB hubs of R–bricks	In a system with no R-brick, these ports connect to an L1 controller of a C-brick. The USB hub transfers status and control information between the L2 controller, which is the master of the USB ports, and the L1 controllers in the attached R-brick or C-bricks.
1	10/100-Base-T Ethernet, RJ45, autonegotiating	Enet	Ethernet hub	This port provides a means to connect multiple L2 controllers and to connect multiple L2 controllers to a console like the SGIconsole. The Ethernet hub provides eight Ethernet connectors. Any of these eight connectors can be used to cascade to another hub. See Figure 2-8.
1 each	RS-232 ports (DB-9; 115 Kbaud)	Console, Modem	Dumb terminal, modem	Console and modem ports enable the user to input text-based commands and to receive text-based results. These ports operate in one of the following modes: L2 mode: L2 controller forwards all commands to the specified L2 controller. L1 mode: L2 controller forwards all commands to the specified L1 controller, except commands that are prefixed with `Ctrl-T`; the L2 controller interprets these commands. Console mode: L2 controller forwards all commands to the system console, except commands prefixed with `Ctrl-T`; the L2 controller interprets these commands.
1	L2 controller touch display	LCD display	L2 controller touch display	The LCD display is a user interface used to enter power on, power off, and reset commands to the system. The LCD display panel is located on the rear door of your system.
1	Power	PWR	Power distribution strip	This power connector is connected to the PDS to provide power to the L2 controller.

Figure 2-7 shows the ports on the L2 controller.

Figure 2-7. L2 Controller Connectors

The L2 controller connects to a modem through the modem connector on the back of the L2 controller. This connection provides a means of connecting remote support hardware to the system; however, the use of an Ethernet hub is the preferred method of connecting remote support hardware to the system.

The Ethernet hub provides eight Ethernet connectors. Figure 2-8 shows sample connections between the Ethernet hub, L2 controllers, and an SGIconsole.

Figure 2-8. Ethernet Hub System Controller Connections (Example)

If a system has more than seven L2 controllers, the Ethernet hub can be cascaded to another Ethernet hub through the cascade connector on the Ethernet hub. The cascade connector provides Ethernet signals that can be cabled to one of the eight Ethernet ports on the second Ethernet hub. The remaining seven Ethernet ports of the second Ethernet hub can be cabled to seven L2 controllers. Large systems may require additional Ethernet hubs in order to connect an SGIconsole to all of the L2 controllers in the system.

Console Hardware Requirements

The type of console, and how it is connected to the Altix 3000 server, depends on whether the server has an L2 controller.

If you have an Altix 3300 server without an L2 controller, you connect a dumb terminal to the C-brick (console port). This connection enables you to view the status and error messages generated by the L1 controller and to enter L1 commands to manage and monitor your system.

If you have an Altix 3000 series server with an L2 controller, you can either connect an SGIconsole to the L2 controller (the Ethernet port) or connect a dumb terminal to the L2 controller console port. If you have multiple L2 controllers, you can interconnect the SGIconsole and the various L2 controllers with an Ethernet hub.

These console connections to the L2 controller enable you to view the status and error messages generated by both the L1 controllers and the L2 controller on your system. You can also use these consoles to input L1 and L2 commands to manage and monitor your system.

For more details on connecting a console to an Altix 3000 series server, see “Connecting a System Console” in Chapter 1. For more information on monitoring your server, see “Monitoring Your Server” in Chapter 1.

Using the L2 Controller Touch Display

The L2 controller touch display provides a simple graphical interface that allows you to perform basic functions, including the following:

Power up selected bricks or the entire system
Power down selected bricks or the entire system
Reset the system
Send a non-maskable interrupt (NMI) to the system
Select target destination(s)

Figure 2-9 illustrates the L2 controller touch display.

Figure 2-9. L2 Controller Touch Display Interface

Home Window

The home window of the L2 controller touch display, shown in Figure 2-10, includes a matrix of five buttons: Power UP, Power DOWN, RESET, NMI, and DEST:.

Power UP Button

The Power UP button powers on a single brick, multiple bricks, a partition, multiple partitions, or an entire system. The settings in the DEST: window determine the scope of the bricks powered on. See “Power UP Confirmation Window” for information on using the Power UP button.

Power DOWN Button

The Power DOWN button powers down a single brick, multiple bricks, a partition, multiple partitions, or an entire system. The settings in the DEST: window determine the scope of the bricks powered down. See “Power DOWN Confirmation Window” for information on using the Power DOWN button.

RESET Button

The RESET button resets the partition. If you issue a reset command to a partition, the main memory will be cleared and the registers will be set to default values in all bricks within the targeted partition(s). Systems with multiple partitions can reset single or multiple partitions without affecting the entire system. If you issue a reset command to a single brick within a partition, the entire partition will be reset. See “RESET Confirmation Window” for information on using the RESET button.

NMI Button

The NMI button issues a non-maskable interrupt command to a brick, multiple bricks, a partition, or multiple partitions. When the system hangs, you can send the affected brick or partition an NMI via the L2 console. The interrupt goes to PROM and causes the CPU state to be captured for each C-brick targeted. This information is saved in flash PROM and the system log, and it assists SGI technicians in debugging system hangs and customer problems. See “NMI Confirmation Window” for information on using the NMI button.

DEST: Button

The DEST: button sets the target destinations for the power up, power down, reset, and nmi commands. The text to the right of the DEST: button shows the current target destination. The target destination in Figure 2-10 is all racks and all slots (r * s *). Therefore, a power up, power down, reset, or nmi command is sent to all bricks in all racks of the system. The text ([7 Bricks]) indicates that seven bricks will be affected by any command. See “Destination Selection Window” for information on using the DEST: button.

Figure 2-10. Home Window

Power UP Confirmation Window

If you press the Power UP button in the home window, the Power UP confirmation window appears, as shown in Figure 2-11.

To initiate the power up command, press the OK button. To terminate the command, press the Cancel button. The confirmation window stays visible until the command is successfully executed. An unsuccessful command results from an L1/L2 error in processing the command or a time-out in waiting for a response.

Figure 2-11. Power UP Confirmation Window

The power up command affects only the list of bricks set in the destination selection window. To set or change the target list, press the Cancel button to return to the home window. Then press the DEST: button to change the target list. See “Destination Selection Window” for instructions on using the DEST: button.

Power DOWN Confirmation Window

If you press the Power DOWN button in the home window, the Power DOWN confirmation window appears, as shown in Figure 2-12.

To initiate the power down command, press the OK button. To terminate the command, press the Cancel button. The confirmation window stays visible until the command is successfully executed. An unsuccessful command results from an L1/L2 error in processing the command or a time-out in waiting for a response.

Figure 2-12. Power DOWN Confirmation Window

The power down command affects only the list of bricks set in the destination selection window. To set or change the target list, press the Cancel button to return to the home window. Then press the DEST: button to change the target list. See “Destination Selection Window” for instructions on using the DEST: button.

RESET Confirmation Window

If you press the RESET button in the home window, the RESET confirmation window appears, as shown in Figure 2-13.

To initiate the reset command, press the OK button. To terminate the command, press the Cancel button. The confirmation window stays visible until the command is successfully executed. An unsuccessful command results from an L1/L2 error in processing the command or a time-out in waiting for a response.

Figure 2-13. RESET Confirmation Window

Note: The reset command affects all bricks in the targeted partition(s). The target list is not enforced during the processing of a reset command.

NMI Confirmation Window

If you press the NMI (non-maskable interrupt) button in the home window, the NMI confirmation window appears, as shown in Figure 2-14.

To initiate the nmi command, press the OK button. To terminate the command, press the Cancel button. The confirmation window stays visible until the command is successfully executed. An unsuccessful command results from an L1/L2 error in processing the command or a time-out in waiting for a response.

Figure 2-14. NMI Confirmation Window

Command Error/Timeout Window

The command error/timeout window, shown in Figure 2-15, appears when a command fails because of an L1/L2 error in processing the command or a time-out in waiting for a response. The window appears in the main body of a command confirmation window. The content of the error message varies, depending on the type of error.

Figure 2-15. Command Error/Timeout Window

Destination Selection Window

If you press the DEST: button in the home window, the destination selection window appears, as shown in Figure 2-16. Use this window to select which bricks in the system will be affected by a command initiated from the home window. A brick is referenced by its rack and slot (unit position) number.

Targeting All Racks and All Bricks

To select all racks and all slots, press the ALL button. To scroll the rack list and brick list (not shown in Figure 2-16), press the arrow buttons below the Partition button on the display. The scroll buttons are active only when the number of racks exceeds the available space to display them.

Figure 2-16. Targeting All Bricks in a System

Once you have selected the bricks, press the Apply button to set the new destinations. The home window will then reappear. The new destinations are reflected in the target indicator across the bottom of the home window display. See Figure 2-10.

Before pressing the Apply button to set the new destinations, you can press the Reset DEST button to return the destination selection window to the last applied state.

Targeting a Single Brick

To select a single brick within a rack, see Figure 2-17 and follow these steps:

Press the Rack/Slot button.
Press the button for the rack (example: 001).
Press the button for the slot (example: 018).

Figure 2-17. Targeting a Single Brick

The new destination is reflected in the target indicator near the bottom of the display window (r1 s18). Once you have selected the bricks, press the Apply button to set the new destinations. The home window will then reappear. The new destinations are reflected in the target indicator across the bottom of the home window display.

Targeting a Range of Bricks

To select multiple bricks within a rack, see Figure 2-18 and follow these steps:

Press the Rack/Slot button.
Press the button for the rack (example: 001).
Press the buttons for the desired slots (example: 008, 011, and 018).

Figure 2-18. Targeting Multiple Bricks in a Rack

The new destination is reflected in the target indicator near the bottom of the display window (r1 s 8, 11, 18). Once you have selected the bricks, press the Apply button to set the new destinations. The home window will then reappear. The new destinations are reflected in the target indicator across the bottom of the home window display.

Targeting All Bricks Within a Rack

To select all bricks within a rack, see Figure 2-19 and follow these steps:

Press the Rack/Slot button.
Press the button for the rack (example: 001).

Figure 2-19. Target Selection Window - 2

The new destination is reflected in the target indicator near the bottom of the display window (r1 s *). Once you have selected the rack, press the Apply button to set the new destinations. The home window will then reappear. The new destinations are reflected in the target indicator across the bottom of the home window display.

Targeting a Partition

To select all bricks within a partition, see Figure 2-20 and follow these steps:

Press the Partition button.
Press the button for the partition number (example: 001).

Note: The buttons that were the rack numbers before are now the partition numbers.

Figure 2-20. Targeting a Partition

The new partition target is reflected in the target indicator near the bottom of the display window (p1). Once you have selected the partition, press the Apply button to set the new partition target. The home window will then reappear. The new partition target is reflected in the target indicator across the bottom of the home window display.

Use the same procedure to select multiple partitions. You can select one or more partitions. Press ALL to select all partitions. If you select the Partition button, the command is sent to all bricks within the selected partition(s).

About the L2 Controller Firmware

The L2 controller hardware includes L2 controller firmware. To access the L2 controller firmware, you must connect a console such as the SGIconsole or a dumb terminal to the L2 controller. For instructions to connect a console to the L2 controller, see “Connecting a System Console” in Chapter 1.

The L2 firmware is always running as long as power is supplied to the L2 controller. If you connect a system console to the L2 controller's console port, the L2 prompt appears.

Operating the L1

The L1 operates in one of these two modes, which are discussed in the sections that follow:

L1 Mode
The L1 prompt is visible and all input is directed to the L1 command processor.

Console Mode from L1
Output from the system is visible and all input is directed to the system.

Note: The “console mode from L1” mode is supported only if the system console is connected directly to the console port of the C-brick.

L1 Mode

If you see a prompt of the following form, the L1 is ready to accept commands.

001c19-L1>

Common operations are discussed in the following sections:

Viewing System Configuration (from a Brick's Perspective)

An L1 has limited knowledge of the system configuration. A C-brick only has information about its attached I/O brick and, if another C-brick is attached to it, information about that C-brick and its attached I/O brick. An I/O brick only has information about its attached C-brick. An R-brick only has information about itself.

You can view a brick's configuration information with the config command:

001c05-L1> config 
:0 - 001c05
:1 - 004i01
:2 - 002p01
001c05-L1>

This example is a system with one C-brick and two I/O bricks. The <number> that follows the colon (0, 1, and 2, from top to bottom in this example) refers to the L1 connection relative to the local brick. (The local brick is the brick that is processing the command.)

The C-brick has the following perspective:

:0 is the local brick.

A number greater than 0 indicates that it is attached directly to or indirectly to the local brick. A higher number generally indicates a more indirect connection to the local brick.

The I/O brick has the following perspective:

:0 is the local brick.

A number greater than 0 indicates that it is attached directly to or indirectly to the local brick. A higher number generally indicates a more indirect connection to the local brick.

The R-brick has the following perspective:

:0 is the local brick.
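
The index-to-brick mapping printed by config can be extracted mechanically. The following Python sketch is illustrative only: the function name parse_l1_config is hypothetical, and it assumes that every entry has the form ":<index> - <brick>" as in the example above.

```python
import re

# Matches one entry of L1 `config` output, e.g. ":1 - 004i01".
# The line shape is assumed from the example in this section.
_ENTRY = re.compile(r"^:(\d+)\s*-\s*(\S+)$")


def parse_l1_config(output):
    """Return {L1 index: brick ID}; index 0 is the local brick."""
    bricks = {}
    for line in output.splitlines():
        match = _ENTRY.match(line.strip())
        if match:  # prompt lines such as "001c05-L1>" are skipped
            bricks[int(match.group(1))] = match.group(2)
    return bricks


sample = """001c05-L1> config
:0 - 001c05
:1 - 004i01
:2 - 002p01
001c05-L1>"""

config = parse_l1_config(sample)
print(config[0])  # the local brick
```

Keying the result by index keeps the local brick at index 0, which matches the perspective described above.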

Command Targeting

All commands entered affect only the local brick. You can target a command to all bricks (including the local brick) by prefixing the command with an asterisk (*).

001c05-L1> * version 
001c05:
L1 0.7.37 (Image A), Built 11/24/2001 14:59:42 [2MB image]
004i01:
L1 0.7.37 (Image A), Built 11/24/2001 14:59:42 [2MB image]
002c01:
L1 0.7.37 (Image A), Built 11/24/2001 14:59:42 [2MB image]
001x01:
L1 0.7.37 (Image A), Built 11/24/2001 14:59:42 [2MB image]
001c05-L1>

You can also target commands to a single attached brick with either the nia, nib, iia, or iib command:

001c05-L1> iia version 
001i01:
L1 0.7.37 (Image A), Built 05/24/2001 14:59:42 [2MB image]

Viewing Information, Warnings, and Error Messages

All information, warnings, and error messages generated by any of the system controllers are in the following form:

001c05 ERROR: invalid arguments for `ver' command, try “help ver”

The general format of the message includes a brick identification (this is not present if the command was to the local brick only), type of message, and the message. These messages can be the result of an invalid command (as shown in the example) or from tasks running on the L1, such as the environmental monitor.
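
As a rough illustration of this message format, the Python sketch below splits a message into its three parts. The function name is hypothetical, and the pattern assumes that the message type is a single uppercase word (only ERROR appears in this chapter) and that the brick identification follows the rack, type, and slot form shown in the example.

```python
import re

# Optional brick ID (3-digit rack, type letter, 2-digit slot), then an
# uppercase message type, a colon, and the message text. The brick ID is
# absent when the command was sent to the local brick only.
_MESSAGE = re.compile(r"^(?:(\d{3}[a-z]\d{2})\s+)?([A-Z]+):\s*(.*)$")


def parse_controller_message(line):
    """Return (brick or None, type, text), or None if the line does not match."""
    match = _MESSAGE.match(line.strip())
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


print(parse_controller_message("001c05 ERROR: invalid arguments"))
```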

Each L1 has a log of local events. Use the L1 log command to view the events on any of the L1s.

Powering On, Powering Off, and Resetting the Brick

You can power on and power off the brick with the power command:

001c05-L1> power up 
001c05-L1>

If an L2 is not present, you must power on and power off the system and reset it from one of the C-bricks. You do this by targeting all bricks:

001c05-L1> * power up 
001c05-L1>

This command can require from several seconds to several minutes to complete.

Console Mode from L1

In console mode, output from the system is visible and all input is directed to the system.

To enter console mode, press Ctrl+D at the L1 prompt:

001c05-L1> Ctrl+D 
entering console mode 001c05 console, <CTRL-T> to escape to L1
.
<system output appears here> 
.

To return to L1 mode, press Ctrl+T:

Ctrl+T 
escaping to L1 system controller
001c05-L1>

While in L1 mode, you can enter any L1 command. Once the command is executed, the L1 returns to console mode:

re-entering console mode 001c05 console, <CTRL-T> to escape to L1

To permanently engage the L1 mode, press Ctrl+T and then enter the l1 command:

Ctrl+T 
escaping to L1 system controller
001c05-L1> l1 
L1 command processor engaged, <CTRL-D> for console mode.
001c05-L1>
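
The switching between L1 mode and console mode described above can be summarized as a small state machine. The Python sketch below is a model for illustration only, not controller firmware; the names Mode and next_mode and the event strings "ctrl-d" and "ctrl-t" are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    L1 = "L1 mode (engaged until Ctrl+D)"
    CONSOLE = "console mode"
    L1_ESCAPED = "L1 mode (temporary, after Ctrl+T)"


def next_mode(mode, event):
    """Return the mode that follows `event`.

    `event` is "ctrl-d", "ctrl-t", or the text of an L1 command.
    """
    if mode is Mode.L1:
        # Ctrl+D at the L1 prompt enters console mode; commands stay in L1.
        return Mode.CONSOLE if event == "ctrl-d" else Mode.L1
    if mode is Mode.CONSOLE:
        # Ctrl+T escapes to the L1; all other input goes to the system.
        return Mode.L1_ESCAPED if event == "ctrl-t" else Mode.CONSOLE
    # Escaped: the l1 command engages L1 mode permanently; any other
    # command runs once and the L1 then re-enters console mode.
    return Mode.L1 if event.strip() == "l1" else Mode.CONSOLE


mode = Mode.L1
for event in ["ctrl-d", "ctrl-t", "version", "ctrl-t", "l1"]:
    mode = next_mode(mode, event)
print(mode.name)
```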

Console Selection

The brick with which the L1 communicates in console mode is the system console or global master, and you can view and set it with the select command. By default, the C-brick attempts to communicate with its local CPUs when console mode is entered. If the system has been powered on and either one of the bricks received a request to be the system console, then the C-brick attempts to communicate with that brick. The select command by itself shows the current console mode settings:

001c05-L1> select 
console input: 001c05 console0
console output: not filtered.

The following are six common subchannels associated with console communications:

Subchannel 0A specifies Node 0, CPU A.
Subchannel 0C specifies Node 0, CPU B.
Subchannel 1A specifies Node 1, CPU A.
Subchannel 1C specifies Node 1, CPU B.
Node 0 console subchannel.
Node 1 console subchannel.

The output console input: 001c05 console0 shows that the L1 will send console input to brick 001c05 and the subchannel to be used is the console0 subchannel.

To change system console status from one brick to the attached C-brick, use the select <rack> <slot> command:

001c05-L1> select r 2 s 1 
console input: 001c05 console
console output: not filtered.
001c05-L1>

To change the subchannel used on the selected brick, use the select command followed by the subchannel number or the word console:

001c05-L1> select sub 0A
console input: 001c05 CPU 0A
console output: not filtered.
001c05-L1>

During the boot process on a multi-rack system, there is a window of time in which more than one C-brick is producing output, and this output can appear somewhat jumbled at the L1. However, you can filter the console output so that the L1 shows output from only the brick chosen to receive console input. You can turn filtering on and off with the select filter command.

If you attempt to communicate with a brick that is not responding, a time-out condition results:

001c05-L1> 

entering console mode 001c05 console, <CTRL-T> to escape to L1
no response from 001c05 junk bus console UART:UART_TIMEOUT

When this time-out condition occurs, either the brick is hung or the subchannel is incorrect.

Operating the L2

The L2 firmware operates in one of these three modes, each of which is discussed in the sections that follow.

L2 Mode. The L2 prompt is visible and all input is directed to the L2 command processor.
Console Mode from L2. Output from the system is visible and all input is directed to the system.
L1 Mode from L2. The prompt from a single L1 is visible, and all input is directed to that L1 command processor.

L2 Mode

After the connection to the L2 controller is established, the following prompt appears, indicating that the L2 is ready to accept commands:

L2>

Common operations are discussed in the following sections:

Viewing System Configuration

You can use the L2 config command to view the current system configuration from a brick level:

L2> config
L2 127.0.0.1: - 001 (LOCAL)
L1 127.0.0.1:0:0 	- 001c18
L1 127.0.0.1:1:0 	- 001r16
L1 127.0.0.1:2:0 	- 001r14
L1 127.0.0.1:3:0 	- 001c11
L1 127.0.0.1:4:0 	- 001c08
L1 127.0.0.1:5:0 	- 001c05
L1 127.0.0.1:5:1 	- 001i01
L2>

As shown above, config produces a list of bricks and their locations in the system and the system controller address of each brick. This is similar to the output from using the config command on the L1 with the addition of the L2 IP address and USB port number. The structure of the brick's address is as follows:

a.b.c.d:x:y

where:

a.b.c.d		is the IP address of the L2. (In the example above, the IP address is 127.0.0.1.)
x		is the USB port number. (In the example above, the port number is 0.)
y		is the L1 index, as follows:

0 is the local brick (the brick to which the USB cable is attached).

A number greater than 0 indicates that it is attached directly to or indirectly to the local brick. A higher number generally indicates a more indirect connection to the local brick.
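
The three fields of a system controller address can be separated as in the following Python sketch. It is a minimal illustration; the names ControllerAddress and parse_controller_address are hypothetical, and the sketch assumes the address always has exactly the a.b.c.d:x:y form shown above.

```python
import ipaddress
from typing import NamedTuple


class ControllerAddress(NamedTuple):
    l2_ip: str     # a.b.c.d, the IP address of the L2
    usb_port: int  # x, the USB port number
    l1_index: int  # y, 0 for the brick to which the USB cable is attached


def parse_controller_address(text):
    """Parse "a.b.c.d:x:y" as printed by the L2 config command."""
    ip, port, index = text.strip().split(":")  # ValueError if not 3 fields
    ipaddress.IPv4Address(ip)  # raises ValueError if the IP is malformed
    return ControllerAddress(ip, int(port), int(index))


addr = parse_controller_address("127.0.0.1:5:1")
print(addr.usb_port, addr.l1_index)
```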

A brick is identified by its rack, type, and slot (001c05). The structure of the brick location is as follows:

rrrbss.p

where:

rrr		is the rack number.
b		is the brick type.
ss		is the slot location of the brick.
p		is the partition of the brick (not present if the system is not partitioned). R-bricks are not associated with a partition.

In the example shown above, 001c05 is a C-brick in rack 001 and slot position 05.
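
A brick location can be decoded the same way. The Python sketch below is illustrative: the function name is hypothetical, the field widths are taken from the rrrbss.p pattern above, and allowing a partition number of more than one digit is an assumption.

```python
import re

# rrr = rack (3 digits), b = brick type (1 letter), ss = slot (2 digits),
# followed by an optional ".p" partition suffix.
_LOCATION = re.compile(r"^(\d{3})([a-z])(\d{2})(?:\.(\d+))?$")


def parse_brick_location(text):
    """Return (rack, brick type, slot, partition or None)."""
    match = _LOCATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a brick location: {text!r}")
    rack, kind, slot, partition = match.groups()
    return int(rack), kind, int(slot), None if partition is None else int(partition)


print(parse_brick_location("001c05"))
```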

Setting Command Targeting

If a command is not understood by the L2 system controller, in general it is passed to the L1 system controllers. The destination determines which L1s receive the command. A destination, specified by the following, is a range of racks and slots:

rack <rack list> slot <slot list>

The <rack list> specifies a list of racks. This can be a list delimited by commas, such that 2,4,7 specifies racks 2, 4, and 7. You can use a dash to specify a range of racks, such that 2-4 specifies racks 2, 3, and 4. Both nomenclatures can be combined, such that 2-4,7 specifies racks 2, 3, 4, and 7.

You can specify the <slot list> using the same nomenclature. The slot number, sometimes referred to as a bay number, is the unit position number located on the rack, slightly above where the bottom of the brick sits. Each rack unit position number is located toward the top of the two lines that mark the unit position that the number represents. For example, the rack numbering for a brick located in slot 10 would appear on the left front side of the rack as shown in Figure 2-21:

Figure 2-21. Rack Numbering

The slot <slot list> is optional; if not given, then all slots in the specified rack(s) are implied. You should avoid specifying a rack list and a slot list that includes multiple racks and slots, such as rack 2-4,7 slot 1-8,11,13. Generally, you specify a rack and slot together to specify an individual brick.

You can use the aliases r and s to specify rack and slot, respectively. You can use the alias all or * in both the <rack list> and the <slot list>, or by themselves, to specify all racks and all slots.
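
To make the list nomenclature concrete, the Python sketch below expands a destination into explicit rack and slot numbers. It only models the syntax described above and is not part of the L2 firmware; the function names are hypothetical, and the sketch assumes that a list contains no embedded spaces.

```python
def expand_list(spec):
    """Expand "2-4,7" to [2, 3, 4, 7]; return None for "all" or "*"."""
    if spec in ("all", "*"):
        return None  # None stands for every rack or every slot
    numbers = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            numbers.extend(range(int(low), int(high) + 1))
        else:
            numbers.append(int(part))
    return numbers


def parse_destination(text):
    """Parse "rack <rack list> [slot <slot list>]" (aliases r and s)."""
    tokens = text.split()
    if tokens in (["all"], ["*"]):
        return None, None  # all racks, all slots
    if len(tokens) not in (2, 4) or tokens[0] not in ("rack", "r"):
        raise ValueError(f"unrecognized destination: {text!r}")
    racks = expand_list(tokens[1])
    slots = None  # an omitted slot list implies all slots
    if len(tokens) == 4:
        if tokens[2] not in ("slot", "s"):
            raise ValueError(f"unrecognized destination: {text!r}")
        slots = expand_list(tokens[3])
    return racks, slots


print(parse_destination("r 2-4,7"))
```

Returning None for "all" avoids having to know how many racks or slots the system actually contains.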

To send a command to all bricks in a partition, enter the following:

partition <partition> <cmd>

Default Destination

When the L2 starts, the default destination is set to all racks and all slots. You can determine the default destination by using the destination command:

L2> destination 
all racks, all slots
L2>

The following command sets the default destination to racks 2 and 3, all slots:

L2> r 2,3 destination 
2 default destination(s) set
L2>

The following example shows what bricks are found in the default destination. If you enter a command not understood by the L2, the command is sent to these bricks.

Note: In the current implementation, if you add a brick to either rack 2 or 3, it would not be automatically included in the default destination. You would need to reset the default destination.

L2> destination 
002c05 (127.0.0.1:0:2)
003c05 (127.0.0.1:0:0)
L2>

The following command resets the default destination to all racks and all slots:

L2> destination reset 
default destination reset to all racks and slots
L2>

Current Destination

The current destination is a range of racks and slots for a given command. For example, the following command sends the command <L1 command> to all bricks in racks 2, 3, 4, and 7:

L2> r 2-4,7 <L1 command>

This is a one-time destination.

Command Interpretation

Some L2 commands are the same as the L1 commands. In many cases, this is intentional because the L2 provides sequencing that is necessary for a command to function correctly.

When L1 and L2 commands are similar, you can ensure that an L1 command is entered for the bricks in the current destination by preceding the command <L1 command> with the l1 command:

L2> r 2-4,7 l1 <L1 command>

This is a one-time destination.
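If you script the L2 command line, the targeting forms shown above can be assembled with a small helper. This is a sketch only: build_l2_command is a hypothetical name, and requiring a rack list whenever a slot list is given is a conservative assumption of the sketch, not a documented rule.

```python
from typing import Optional


def build_l2_command(command: str,
                     racks: Optional[str] = None,
                     slots: Optional[str] = None,
                     force_l1: bool = False) -> str:
    """Compose a command line with an optional one-time destination."""
    parts = []
    if slots is not None and racks is None:
        # Assumption of this sketch: a slot list is only used with a rack list.
        raise ValueError("a slot list requires a rack list")
    if racks is not None:
        parts += ["r", racks]
    if slots is not None:
        parts += ["s", slots]
    if force_l1:
        # The l1 prefix makes the L2 pass the command on to the L1s.
        parts.append("l1")
    parts.append(command)
    return " ".join(parts)


print(build_l2_command("power up", racks="2", slots="5"))
print(build_l2_command("<L1 command>", racks="2-4,7", force_l1=True))
```

The two calls produce the strings r 2 s 5 power up and r 2-4,7 l1 <L1 command>, matching the forms used in this chapter.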

Viewing Information, Warnings, and Error Messages

All information, warnings, and error messages generated by any of the system controllers are in the following form:

001c05 ERROR: invalid arguments for `ver' command, try “help ver”

The general format includes a brick identification and the type of message, followed by the message. A message may be the result of an invalid command, as shown in the example, or the result of tasks running on the L1, such as the environmental monitor.
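When you capture controller output in a log file, the general format can be split into its three parts with a regular expression. The sketch below assumes that the message type is a single uppercase word followed by a colon, as in the ERROR example; other type names are not listed here. The names ControllerMessage and parse_message are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class ControllerMessage(NamedTuple):
    brick: str
    kind: str
    text: str


# Assumed shape: <brick id> <TYPE>: <message text>
_MESSAGE = re.compile(r"^(\S+)\s+([A-Z]+):\s*(.*)$")


def parse_message(line: str) -> Optional[ControllerMessage]:
    """Return the parts of a controller message, or None if the line differs."""
    match = _MESSAGE.match(line.strip())
    if match is None:
        return None
    return ControllerMessage(match.group(1), match.group(2), match.group(3))


message = parse_message("001c05 ERROR: invalid arguments for `ver' command")
print(message.brick, message.kind)  # 001c05 ERROR
```

Lines that carry no brick identification are returned as None rather than guessed at.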

Each L1 has a log of local events. Use the L1 command log to view events on any of the L1s.

Powering On, Powering Off, and Resetting the System

You can power on and power off the system with the power command. This command is interpreted by the L2, because the bricks must be powered on in a specific order.

L2> power up 
L2>

The power command may require several seconds to several minutes to complete. In the example above, all racks and slots in the default destination are affected. Any errors or warnings are reported as described above in “Viewing Information, Warnings, and Error Messages”.

To power on or power off a specific brick, specify a current destination:

L2> r 2 s 5 power up 
L2>

To power on or power off all bricks in a partition, enter the following:

L2> partition <partition number> <power up or power down>

To reset the system, enter the following:

L2> reset
L2>

This command restarts the system by resetting all registers to their default settings and rebooting the system controllers. Resetting a running system causes the operating system to reboot, and all memory contents are lost.

Console Mode from L2

In console mode, all output from the system is visible and all input is directed to the system.

To enter console mode from L2, press Ctrl+D at the L2 prompt and observe the response:

L2> Ctrl+D 
entering system console mode (001c05 console0),
<CTRL_T> to escape to L2
.
<system output appears here>
.

To return to L2 mode from console mode, press Ctrl+T:

Ctrl+T 
escaping to L2 system controller 
L2>

At this point, you can enter any L2 or L1 command. When the command completes, the L2 returns to console mode:

Re-entering system console mode (001c05 console0),
<CTRL_T> to escape to L2

To permanently engage the L2 mode, press Ctrl+T and then enter the l2 command:

Ctrl+T 
escaping to L2 system controller
L2> l2 
L2 command processor engaged, <CTRL_D> for console mode.
L2>

Console Selection

When in console mode, the L2 communicates with the C-brick that has been designated, with the select command, as the system console (or global master). All input from the console is directed to that C-brick. You can set and view the system console with the select command.

The L2 chooses the C-brick as the default console in the following order of priority:

The C-brick in the lowest numbered rack and slot, which has produced console output, and has an attached IX-brick.
The C-brick in the lowest numbered rack and slot, which has an attached IX-brick.
The C-brick in the lowest numbered rack and slot.
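The priority order above can be sketched as a three-tier search. This is an illustration, not the L2 implementation: the CBrick record and its fields are hypothetical, and "lowest numbered rack and slot" is assumed to mean ordering by rack first and slot second.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CBrick:
    rack: int
    slot: int
    has_ix_brick: bool       # an IX-brick is attached
    produced_output: bool    # the brick has produced console output


def choose_default_console(bricks: List[CBrick]) -> Optional[CBrick]:
    """Pick the default console by trying each priority tier in turn."""
    tiers = [
        lambda b: b.produced_output and b.has_ix_brick,
        lambda b: b.has_ix_brick,
        lambda b: True,
    ]
    for qualifies in tiers:
        candidates = [b for b in bricks if qualifies(b)]
        if candidates:
            # Assumed ordering: lowest rack first, then lowest slot.
            return min(candidates, key=lambda b: (b.rack, b.slot))
    return None


bricks = [
    CBrick(rack=1, slot=5, has_ix_brick=False, produced_output=True),
    CBrick(rack=2, slot=5, has_ix_brick=True, produced_output=False),
]
print(choose_default_console(bricks))
```

In this example no brick satisfies the first tier, so the second tier selects the brick in rack 2, which is the only one with an attached IX-brick.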

The select command by itself shows the current console mode settings:

L2> select 
known system consoles (non-partitioned)

	001c05-L2 detected

current system console

console input: 001c05 CPU 0A
console output: not filtered

The following are six common subchannels associated with console communications:

Subchannel 0A specifies Node 0, CPU A.
Subchannel 0C specifies Node 0, CPU B.
Subchannel 1A specifies Node 1, CPU A.
Subchannel 1C specifies Node 1, CPU B.
Node 0 console subchannel.
Node 1 console subchannel.

The output console input: 001c05 CPU 0A shows that the L2 will send console input to brick 001c05 and that the CPU 0A subchannel will be used.

To change the brick that will be the system console, use the select <rack>.<slot> command, where <rack> is the rack and <slot> is the slot where the brick is located:

L2> select 3.1 
console input: 003c01 console
console output: not filtered
console detection: L2 detected

To change the subchannel used on the selected brick to be the system console, use the select subchannel <0A|0C|1A|1C> command. (Use the select subchannel console command to select the brick's console subchannel as the system console.) For example, to select subchannel 1A as the subchannel of the brick to be the system console, enter the following:

L2> select subchannel 1A
console input: 003c01 console CPU1A
console output: not filtered

During the boot process on a multibrick system, there is a window of time in which the C-bricks are all producing output. This can result in a somewhat jumbled output at the L2. However, you can filter console output so that the L2 will show output from only the brick chosen to receive console input. You can turn on filtering with the select filter on command and turn off filtering with the select filter off command.

If you attempt to communicate with a brick that is chosen to receive console input but is not responding, a time-out condition results:

L2> Ctrl+D 
entering console mode 001c05 CPU1A, <CTRL_T> to escape to L2

no response from 001c05 Junk bus CPU1A system not responding
no response from 001c05 Junk bus CPU1A system not responding

When this time-out condition occurs, either the brick is hung or the subchannel is not correct.

L1 Mode from L2

In L1 mode, the prompt from a single L1 is visible, and all input is directed to that L1 command processor.

To enter L1 mode, enter the rack and a slot followed by l1:

L2> r 2 s 1 l1 
entering L1 mode 002c01, <CTRL-T> to escape to L2

002c01-L1>

To return to L2 mode, press Ctrl+T:

002c01-L1> Ctrl+T
escaping to L2 system controller, <CTRL-T> to send escape to L1
L2>

At this point, you can enter any L2 command. Once the command is executed, the L2 returns to L1 mode:

re-entering L1 mode 002c01, <CTRL-T> to escape to L2
002c01-L1>

To permanently engage the L2 mode, press Ctrl+T and enter the l2 command:

002c01-L1> Ctrl+T 
escaping to L2 system controller, <CTRL-T> to send escape to L1
L2> l2 
L2 command processor engaged, <CTRL-D> for console mode.
L2>

Upgrading L1/L2 Firmware

The L1/L2 firmware is currently distributed as part of the snxsc_firmware package. To determine which version of the package is installed on your system console, enter the following command:

$> rpm -q snxsc_firmware

If the package is installed, the full package name (including the revision) is returned:

snxsc_firmware-1.18.3-1

The L1 and L2 firmware binary and the utilities used to update it are stored in /usr/cpu/firmware/sysco.

Upgrading L1 Firmware

The L1 firmware consists of three parts:

Boot image
A image
B image

At boot time, the boot image validates the A and B images and, unless instructed otherwise, executes the newer of the two. Because the L1 is running one of the two images, the image not in use is the one that is overwritten when the firmware is upgraded. After an update, you must reboot the L1 to run the new image, either by power-cycling the brick or by using the L1 command reboot_l1.
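The A/B image scheme can be sketched as follows. This is an illustration under stated assumptions, not the boot image's actual logic: the Image record is hypothetical, the version field stands in for whatever the firmware uses to decide which image is newer, and running the remaining image when one fails validation is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Image:
    name: str      # "A" or "B"
    valid: bool    # result of the boot image's validation
    version: int   # hypothetical stand-in for "newer"


def select_boot_image(a: Image, b: Image) -> Optional[Image]:
    """Return the newer valid image, or None if neither validates."""
    candidates = [image for image in (a, b) if image.valid]
    if not candidates:
        return None
    return max(candidates, key=lambda image: image.version)


def upgrade_target(a: Image, b: Image, running: Image) -> Image:
    """Return the image that an upgrade overwrites: the one not in use."""
    return b if running.name == a.name else a


image_a = Image("A", valid=True, version=3)
image_b = Image("B", valid=True, version=5)
running = select_boot_image(image_a, image_b)
print(running.name)                                    # B
print(upgrade_target(image_a, image_b, running).name)  # A
```

Because an upgrade only touches the image that is not running, the image in use remains available if the update is interrupted.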

Typically, you will upgrade the firmware through the network connection from the SGIconsole to the L2:

$> /usr/cpu/firmware/sysco/flashsc --l2 10.1.1.1 -p /usr/cpu/firmware/sysco/l1.bin all

This updates all the bricks in the system. The -p option instructs the firmware to flash the PROMs in parallel.

You can update individual bricks by replacing all with a rack and slot number:

$> /usr/cpu/firmware/sysco/flashsc --l2 10.1.1.1 /usr/cpu/firmware/sysco/l1.bin 1.19

This updates only the brick in rack 1, slot 19.

Upgrading L2 Firmware

The L2 firmware consists of two parts:

Boot image
Kernel image

Typically, you will upgrade the firmware through the network connection from the SGIconsole to the L2:

$> /usr/cpu/firmware/sysco/flashsc --l2 10.1.1.1 /usr/cpu/firmware/sysco/l2.bin local

Once this command is executed, you must power-cycle the L2 to run the new image. You can also do this with the L2 command reboot_l2.

If the L2 update fails, there is no second image to fall back to as there is with the L1. The L2 will, however, not run the kernel image if it is not valid. The L2 is intelligent enough at this point that you can upgrade it through its console port:

$> /usr/cpu/firmware/sysco/flashsc --l2recover /usr/cpu/firmware/sysco/l2.bin <device>

Output will indicate that the firmware image is being erased and then rewritten. The flash image is quite large (almost 2 MB), so updating the flash image takes several minutes. You must power-cycle the L2 to run the new image. You can also do this with the L2 command reboot_l2.

Identifying Bricks

Bricks are referenced by their rack and slot (or bay) locations. These values are stored in non-volatile memory on the L1. Virtually all system controller communications require that each brick have a valid and unique rack and slot.

If a brick is not set with its rack and slot number, it appears in the output of an L2 config command, as shown in the following example:

L2> config
137.38.88.82:1:0 ---c-- (no rack/slot set)
L2>

To set the rack and slot for a brick, address it by its IP address, USB port, and L1 controller index. Note the following example:

L2> 137.38.88.82:1:0 brick rack 1
L2> 137.38.88.82:1:0 brick slot 8
L2> 137.38.88.82:1:0 reboot_l1
INFO: closed USB /dev/sgil1_0
INFO: opened USB /dev/sgil1_0
L2> config
137.38.88.82:1:0 001c08
L2>

The following example shows how to set rack 1, slot 8, for the C-brick with an IP address 127.0.0.1:

L2> config 
127.0.0.1:
127.0.0.1:0:0 - ---c--
127.0.0.1:0:0 - 001i01
127.0.0.1:0:0 - 001c05
L2> :0:0 brick rack 1 
brick rack set to 001.
L2> :0:0 brick slot 8
brick slot set to 08.
L2> :0:0 reboot_l1 
INFO: closed USB /dev/sgil1_0
INFO: opened USB /dev/sgil1_0
L2>
L2> config 
127.0.0.1:
127.0.0.1:0:0 - 001c05
127.0.0.1:0:0 - 001i01
127.0.0.1:0:0 - 001c08
L2>

To set the rack and slot from the L1 prompt, simply use the brick rack and brick slot commands. To set the rack and slot on one of the attached bricks (an attached I/O brick, C-brick, or a C-brick's I/O brick), use the L1 targeting commands nia, nib, iia, or iib.

001c05-L1> config 
:0 - 001c05
:1 - ---i--
:5 - 001c08
:6 - 001p01
001c05-L1> iia brick rack 4 
---i--:
brick rack set to 004.
001c05-L1> iia brick slot 1
---i--:
brick slot set to 01.
001c05-L1> iia reboot_l1
001c05 ERROR: no response from ---i--
001c05-L1> config 
:0 - 001c05
:1 - 004i01
:5 - 001c08
:6 - 001p01
001c05-L1>

The number after the “:” indicates the following:

0 = local brick
1 = IIA
2 = IIB
5 = NIA
10 = NIB
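The index-to-port mapping can be applied to captured config output with a short script. This is a sketch: PORT_NAMES copies the list above, parse_config_line is a hypothetical helper, and any index that is not in the list (such as the :6 entry in the sample output) is reported without a name rather than guessed at.

```python
import re
from typing import Optional, Tuple

# Mapping taken from the list above.
PORT_NAMES = {0: "local brick", 1: "IIA", 2: "IIB", 5: "NIA", 10: "NIB"}

# Assumed line shape, from the sample output: ":<index> - <brick id>"
_CONFIG_LINE = re.compile(r"^:(\d+)\s+-\s+(\S+)$")


def parse_config_line(line: str) -> Optional[Tuple[int, Optional[str], str]]:
    """Return (index, port name or None, brick id) for one L1 config line."""
    match = _CONFIG_LINE.match(line.strip())
    if match is None:
        return None
    index = int(match.group(1))
    return index, PORT_NAMES.get(index), match.group(2)


for line in [":0 - 001c05", ":1 - ---i--", ":5 - 001c08"]:
    print(parse_config_line(line))
```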

To obtain a detailed configuration explanation from the L1 perspective, enter the following:

001c05-L1> config verbose
