Document revision date: 19 July 1999

1.5.3.2 Register Conflict

When overlapping the execution of instructions, the vector processor must deal with register conflict. This occurs when one instruction is intending to write a register while previously issued instructions are reading from that register. The following is an example of vector register conflict:

VVADDF V1, V2, V3
VVMULF V4, V5, V1

In the example, the VVADDF and VVMULF cannot both begin execution simultaneously because the elements of V1 generated by the VVMULF would overwrite the original elements of V1 required as input by the VVADDF. However, a vector processor implementation can still overlap the execution of these two instructions in a number of ways. One way would be by not starting the VVMULF until the first element of V1 has been read by the VVADDF. In this manner, as the VVADDF reads the next elements from V1 and V2, the VVMULF writes its product into the previous element of V1. This process continues until all the elements have been processed by both instructions. The VVADDF will finish execution while the VVMULF still has at least one product to store.
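The one-element lag described above can be sketched in a few lines of Python. This is only an illustrative model, not how any implementation is built: the function name `overlapped` and the step-by-step loop are assumptions, and the third operand of each instruction is taken to be the destination, as in the example.

```python
def overlapped(v1, v2, v4, v5):
    """Model VVADDF V1,V2,V3 overlapped with VVMULF V4,V5,V1.

    At step t the add reads element t of V1 and V2. The multiply runs
    one element behind and writes its product into element t-1 of V1.
    """
    n = len(v1)
    v1 = list(v1)              # working copy of the shared register
    v3 = [None] * n
    for t in range(n + 1):
        if t < n:
            v3[t] = v1[t] + v2[t]              # VVADDF reads V1[t] first
        if t >= 1:
            v1[t - 1] = v4[t - 1] * v5[t - 1]  # VVMULF writes behind it
    return v3, v1


# The add sees only the original V1 values, and V1 ends up holding the products.
sums, products = overlapped([1, 2, 3], [10, 20, 30], [2, 2, 2], [5, 6, 7])
```

The final loop iteration performs only a multiply write, which mirrors the statement that the VVADDF finishes while the VVMULF still has one product to store.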

In the case of the Vector Mask Register (VMR), the vector processor ensures that register conflict does not occur. This is often accomplished by making a copy of the VMR value under which a pending vector instruction is to execute, and using this copy when execution begins. This allows the vector processor to begin executing an instruction that writes VMR before it completes prior instructions that read VMR.
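As a rough illustration of the copy technique, the following Python sketch captures the mask when an instruction is issued and uses that copy when the instruction later executes. The helper name `issue_masked_add` and the integer encoding of VMR are assumptions made for this example only.

```python
def issue_masked_add(vmr, mtf):
    """Capture VMR at issue time and return a function that executes later."""
    snapshot = vmr  # Python ints are immutable, so this acts as a private copy

    def execute(va, vb, vc):
        out = list(vc)
        for i in range(len(va)):
            if (snapshot >> i) & 1 == mtf:   # test the copied mask bit
                out[i] = va[i] + vb[i]
        return out

    return execute


vmr = 0b0101
pending = issue_masked_add(vmr, mtf=1)
vmr = 0b1111  # a later instruction overwrites VMR before `pending` runs
result = pending([1, 2, 3, 4], [10, 20, 30, 40], [0, 0, 0, 0])
```

Only elements 0 and 2 are written, because the pending instruction still sees the mask value that was current when it was issued.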

1.5.3.3 Dependencies Among Vector Results

To achieve correct results and exception reporting during overlapped execution, the vector processor must maintain certain dependencies among the register elements and control register bits produced by various vector instructions. Because the vector processor reports exceptions asynchronously and can complete instructions out of order, these dependencies differ from those ensured by the VAX scalar processor. In addition, these dependencies are at the level of vector register elements and vector control register bits, rather than at the level of vector registers and vector control registers.

Among other things, these dependencies determine the exception reporting nature of the MFVP instruction. The value of the vector control register (VCR, VLR, VMR<31:0>, VMR<63:32>) delivered by an MFVP depends upon the value of certain vector register elements and vector control register bits. Unreported exceptions that occur in the production of these elements and control register bits are reported by the vector processor prior to the completion of the MFVP from the vector control register.

The dependencies are expressed formally for the various classes of vector instructions by the tables of pseudo-code in this section. These are the only dependencies that software should rely upon the vector processor to ensure.

A vector processor implementation is allowed to ensure more than just these dependencies, provided that this larger set of dependencies yields correct results and exception reporting.

Note

Note the implications of the following sequence for Table 1-7, Table 1-8, Table 1-9, Table 1-10, Table 1-11, Table 1-12, Table 1-13, and Table 1-14:

VVSUBF V5, V6, V7
VVADDF V1, V2, V7
VVMULF V7, V7, V3
VVDIVF V1, V4, V7

Implicit in statements of the form "result DEPENDS on B" is the requirement that the result depends only on the value of B generated by the most recent previously issued instruction relative to the result's own generating instruction. For instance, in the sequence above, the V3 produced by the VVMULF has the dependence "V3[i] DEPENDS on V7[i]". This means that the value of V3[i] produced by the VVMULF depends only on the value of V7[i] produced by the VVADDF, not on the V7[i] produced by the earlier VVSUBF or the later VVDIVF.
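The "most immediate previously issued instruction" rule amounts to a backward search through the instruction stream. The Python sketch below assumes each instruction is written as a tuple whose last operand is the destination register (as in the examples in this section); `producer_of` is a hypothetical helper name.

```python
def producer_of(program, consumer_index, register):
    """Return the index of the latest instruction issued before
    `consumer_index` that writes `register`, or None if there is none."""
    for idx in range(consumer_index - 1, -1, -1):
        if program[idx][3] == register:   # last operand is the destination
            return idx
    return None


program = [
    ("VVSUBF", "V5", "V6", "V7"),
    ("VVADDF", "V1", "V2", "V7"),
    ("VVMULF", "V7", "V7", "V3"),
    ("VVDIVF", "V1", "V4", "V7"),
]

# The V7 read by the VVMULF (index 2) is the one written by the VVADDF (index 1).
v7_source = producer_of(program, 2, "V7")
```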

Table 1-7 Dependencies for Vector Operate Instructions
Instructions Dependence

VVADDx, VSADDx, VVSUBx, VSSUBx, VVMULx, VSMULx, VVDIVx, VSDIVx, VVCVTxy, VVBICL, VSBICL, VVBISL, VSBISL, VVXORL, VSXORL, VVSLLL, VSSLLL, VVSRLL, VSSRLL
    for i = 0 to VLR-1
    begin
        Vc[i] DEPENDS on VLR;
        if {MOE EQL 1} then Vc[i] DEPENDS on VMR<i>;
        if ( {MOE EQL 1} AND {VMR<i> EQL MTF} ) OR {MOE EQL 0} then
        begin
            Vc[i] DEPENDS on Vb[i];
            if {Vector-Vector Operation} AND NOT {VVCVTxy} then
                Vc[i] DEPENDS on Va[i];
        end;
    end;
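To make the pseudo-code of Table 1-7 concrete, the following Python sketch enumerates the dependence set of each destination element. It is a reading aid under stated assumptions, not part of the architecture: the function name, the representation of VMR as an integer bit mask, and the string labels are all invented for the example.

```python
def operate_dependencies(vlr, moe, mtf, vmr, vector_vector=True, is_convert=False):
    """Return {destination element: set of things it DEPENDS on}."""
    deps = {}
    for i in range(vlr):
        d = {"VLR"}
        if moe:
            d.add(f"VMR<{i}>")
        # Element i is processed if masking is off or its mask bit matches MTF.
        enabled = (not moe) or (((vmr >> i) & 1) == mtf)
        if enabled:
            d.add(f"Vb[{i}]")
            if vector_vector and not is_convert:
                d.add(f"Va[{i}]")
        deps[f"Vc[{i}]"] = d
    return deps


# Two elements, masked operation, only mask bit 0 matches MTF.
example = operate_dependencies(vlr=2, moe=True, mtf=1, vmr=0b01)
```

In this example the masked-off element Vc[1] depends only on VLR and its own mask bit, which is what lets an implementation overlap instructions at element granularity.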

Table 1-8 Dependencies for Vector Load and Gather Instructions
Instructions Dependence

VLDx, VGATHx
    for i = 0 to VLR-1
    begin
        Vc[i] DEPENDS on VLR;
        if {MOE EQL 1} then Vc[i] DEPENDS on VMR<i>;
        if ( {MOE EQL 1} AND {VMR<i> EQL MTF} ) OR {MOE EQL 0} then
            if VGATH then
            begin
                Vc[i] DEPENDS on Vb[i];
                k = BASE + Vb[i];
            end
            else
                k = BASE + i * STRIDE;
        Vc[i] DEPENDS on LOAD_COMPLETED(k);
    end;

Table 1-9 Dependencies for Vector Store and Scatter Instructions
Instructions Dependence

VSTx, VSCATx
    j = 0;
    for i = 0 to VLR-1
    begin
        if ( {MOE EQL 1} AND {VMR<i> EQL MTF} ) OR {MOE EQL 0} then
        begin
            if {MOE EQL 1} then ELEMENT_STORED[j] DEPENDS on VMR<i>;
            ELEMENT_STORED[j] DEPENDS on Vc[i];
            ELEMENT_STORED[j] DEPENDS on VLR;
            if VSCAT then
            begin
                ELEMENT_STORED[j] DEPENDS on Vb[i];
                k = BASE + Vb[i];
            end
            else
                k = BASE + i * STRIDE;
            STORE_COMPLETED(k) DEPENDS on ELEMENT_STORED[j];
            j = j+1;
        end;
    end;
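The address calculation and the separate counter j in Table 1-9 can be illustrated with a short Python sketch. The function name `store_plan` and its parameters are assumptions for this example. Passing `vb` selects the scatter form, and omitting it selects the strided form.

```python
def store_plan(vlr, moe, mtf, vmr, base, stride=None, vb=None):
    """List (j, i, k) for each element that is actually stored.

    j counts stored elements, i is the register element index,
    and k is the memory address used for that element.
    """
    plan = []
    j = 0
    for i in range(vlr):
        if (not moe) or (((vmr >> i) & 1) == mtf):
            if vb is not None:
                k = base + vb[i]          # VSCAT: offset taken from Vb[i]
            else:
                k = base + i * stride     # VST: constant stride
            plan.append((j, i, k))
            j += 1
    return plan


strided = store_plan(4, moe=True, mtf=1, vmr=0b1010, base=1000, stride=8)
scattered = store_plan(3, moe=False, mtf=0, vmr=0, base=100, vb=[12, 0, 4])
```

In the masked strided case only elements 1 and 3 are stored, so j runs from 0 to 1 while the addresses still follow the element index i.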

Table 1-10 Dependencies for Vector Compare Instructions
Instructions Dependence

VVCMPx, VSCMPx
    for i = 0 to VLR-1
    begin
        VMR<i> DEPENDS on VLR;
        if {MOE EQL 1} then VMR<i> DEPENDS on VMR<i>;
        if ( {MOE EQL 1} AND {VMR<i> EQL MTF} ) OR {MOE EQL 0} then
        begin
            VMR<i> DEPENDS on Vb[i];
            if VVCMP then VMR<i> DEPENDS on Va[i];
        end;
    end;

Table 1-11 Dependencies for Vector MERGE Instructions
Instructions Dependence

VVMERGE, VSMERGE
    for i = 0 to VLR-1
    begin
        Vc[i] DEPENDS on VLR;
        Vc[i] DEPENDS on VMR<i>;
        if {VMR<i> EQL MTF} then
        begin
            if VVMERGE then Vc[i] DEPENDS on Va[i];
        end
        else
            Vc[i] DEPENDS on Vb[i];
    end;

Table 1-12 Dependencies for IOTA Instruction
Instruction Dependence

IOTA
    j = 0;
    for i = 0 to VLR-1
    begin
        Vc[j] DEPENDS on VLR;
        if {VMR<i> EQL MTF} then
        begin
            Vc[j] DEPENDS on VMR<0..i>;
            j = j+1;
        end;
    end;
    VCR DEPENDS on VMR<0..VLR-1>;

Table 1-13 Dependencies for MFVP Instructions
Instructions Dependence

MSYNC DEPENDS on the following:
- All STORE_COMPLETED(x) of previously issued VST and VSCAT instructions
- All LOAD_COMPLETED(x) of previously issued VLD and VGATH instructions

SYNC DEPENDS on the vector register elements and vector control register bits produced and stored by all previous vector instructions

MFVMRLO DEPENDS on VMR<0..31>

MFVMRHI DEPENDS on VMR<32..63>

MFVCR DEPENDS on VCR

MFVLR DEPENDS on VLR

Table 1-14 Miscellaneous Dependencies
Item Dependence

VSYNC DEPENDS on nothing but, for each memory location x, forces all subsequent LOAD_COMPLETED(x) and STORE_COMPLETED(x) to DEPEND on all previous LOAD_COMPLETED(x) and STORE_COMPLETED(x).

MTVP DEPENDS on nothing.

Value of a memory location: The value of a memory location DEPENDS on nothing and is not DEPENDED on by any vector instruction.

Transitive dependence: if {a DEPENDS on b} AND {b DEPENDS on c} then a DEPENDS on c
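The transitive rule means that the full dependence set of a result is the closure of its direct dependencies. A minimal Python sketch of that closure follows. The dictionary representation and the name `transitive_closure` are assumptions, and the direct dependencies in the example are written out by hand rather than derived from any instruction decoder.

```python
def transitive_closure(direct):
    """Expand {a: {b, ...}} so that a DEPENDS on c whenever
    a DEPENDS on b and b DEPENDS on c."""
    closure = {a: set(bs) for a, bs in direct.items()}
    changed = True
    while changed:
        changed = False
        for a in closure:
            extra = set()
            for b in closure[a]:
                extra |= closure.get(b, set())
            if not extra <= closure[a]:
                closure[a] |= extra
                changed = True
    return closure


direct = {
    "V3[0]": {"V7[0]"},                   # e.g. produced by a VVMULF
    "V7[0]": {"V1[0]", "V2[0]", "VLR"},   # e.g. produced by a VVADDF
}
full = transitive_closure(direct)
```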

1.6 Vector Processor Exceptions

There are two major classes of vector processor exceptions as follows:

Vector memory management
- Access control violation
  - Vector access control violation
  - Vector alignment
  - Vector I/O space reference
- Translation not valid
- Modify

Vector arithmetic
- Floating underflow
- Floating divide by zero
- Floating reserved operand
- Floating overflow
- Integer overflow

Floating underflow and integer overflow can be disabled on a per-instruction basis by clearing cntrl<EXC>.

Vector processor arithmetic exceptions cause the vector processor to disable itself (see Section 1.6.3, Vector Processor Disabled). The vector processor does not disable itself for vector processor memory management exceptions.

1.6.1 Vector Memory Management Exception Handling

Vector processor memory management exceptions are taken through the system control block (SCB) vector for their scalar counterparts. Figure 1-12 illustrates the memory management fault stack frame that contains the memory management fault parameter.

Figure 1-12 Memory Management Fault Stack Frame (as Sent by the Vector Processor)

The length (L) bit, the Page Table Entry (PTE) reference (P) bit, and the modify or write intent (M) bit are defined in the VAX Architecture Reference Manual. Vector processor memory management exceptions set these bits in the same way as required for scalar memory management exceptions.
The vector alignment exception (VAL) bit must be set when an access control violation has occurred due to a vector element not being properly aligned in memory.
The vector I/O space reference (VIO) bit is set by some implementations to indicate that an access control violation has occurred due to a vector instruction reference to I/O space.
The vector asynchronous memory management exception (VAS) bit must be set to indicate that a vector processor memory management exception has occurred when the following asynchronous memory management scheme is implemented.

If more than one kind of memory management exception could occur on a reference to a single page, then access control violation takes precedence over both translation not valid and modify. If more than one kind of access control violation could occur, the precedence of vector access control violation, vector alignment exception, and vector I/O space reference is UNPREDICTABLE.

The architecture allows an implementation to choose one of two methods for dealing with vector processor memory management exceptions. The two methods are as follows:

Synchronous memory management handling and restart from the beginning.
Asynchronous memory management handling and store/reload implementation-specific state using VSAR.

With the synchronous method, no new instructions are processed by the vector or the scalar processor until the vector memory access instruction is guaranteed to complete without incurring memory management exceptions. In such an implementation, the vector memory access instruction is backed up when a memory management exception occurs, and a normal VAX memory management (access control violation, translation not valid, modify) fault is taken with the program counter (PC) pointing to the faulting vector memory access instruction. If the synchronous method is implemented, VSAR is omitted. After fixing the vector processor memory management exception, software may REI back to the faulting vector instruction. Alternately, software may context switch to another process. For further details, see Section 1.6.4.

With the asynchronous method, vector memory management exceptions set VPSR<PMF> and VPSR<MF>. The vector processor does not inform the scalar processor of the exception condition; the scalar processor continues processing instructions. All pending vector instructions that have started execution are allowed to complete if their source data is valid. The scalar processor is notified of an exception condition or conditions when it sends the next vector instruction to the vector processor and a normal VAX memory management fault is taken. The saved PC points to this instruction, which is not the vector memory access instruction that incurred the memory management exception. At this point, the vector processor clears VPSR<PMF>. After fixing the vector processor memory management exception, software may allow the current scalar/vector process to continue. Before vector processor instruction execution resumes using state that already exists in the vector processor, the vector processor clears VPSR<MF> and the faulting memory reference is retried. Alternately, software may context switch to another process. For further details, see Section 1.6.4.

When a vector processor memory management exception is encountered by a VLD or VGATH instruction, the contents of the destination vector register elements are UNPREDICTABLE. When a vector processor memory management exception is encountered by a VSTL or VSCAT instruction, it is UNPREDICTABLE whether the vector processor writes any result location for which an exception did not occur. In either case, if the fault condition can be eliminated by software and the instruction restarted, then the vector processor will ensure that all destination register elements or result locations are written.

1.6.2 Vector Arithmetic Exceptions

Vector operate instructions are always executed to completion, even if a vector arithmetic exception occurs. If an exception occurs, a default result is written. The default result is as follows:

The low-order 32 bits of the true result for integer overflow.
Zero for floating underflow if exceptions are disabled.
An encoded reserved operand for floating divide by zero, floating overflow, reserved operand, and enabled floating underflow. For vector convert instructions that convert floating-point data to integer data, where the source element is a reserved operand, the value written to the destination element is UNPREDICTABLE.
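The default results listed above can be summarized in a small Python sketch. The function and exception names are hypothetical, and because this section does not give the bit pattern of the encoded reserved operand, the sketch uses a placeholder string for it.

```python
RESERVED_OPERAND = "encoded-reserved-operand"  # placeholder, not a real encoding


def default_result(exception, true_result=None, exc_enabled=True):
    """Return the default value written for a vector arithmetic exception."""
    if exception == "integer_overflow":
        return true_result & 0xFFFFFFFF        # low-order 32 bits
    if exception == "floating_underflow" and not exc_enabled:
        return 0.0
    if exception in {
        "floating_underflow",                  # enabled case
        "floating_divide_by_zero",
        "floating_overflow",
        "floating_reserved_operand",
    }:
        return RESERVED_OPERAND
    raise ValueError(f"unknown exception type: {exception}")


wrapped = default_result("integer_overflow", true_result=0x7FFFFFFF + 1)
```

Here `wrapped` is 0x80000000, the low-order 32 bits of the true sum.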

The exception condition type and destination register number are always recorded in the Vector Arithmetic Exception Register (VAER) when a vector arithmetic exception occurs. Refer to Section 1.2.3, Internal Processor Registers, for more information.

1.6.3 Vector Processor Disabled

As a result of error conditions or software control, the vector processor signals the scalar processor not to issue any more vector instructions. The vector processor is disabled when this signal is generated and its state is reflected in VPSR<VEN>. Because the scalar and vector processors can execute asynchronously, the scalar processor may not receive this signal immediately. As a result, the scalar processor may continue to view the vector processor as enabled and send it vector instructions. Once the scalar processor receives this signal, it will view the vector processor as disabled and will not send it any more vector instructions (including MFVP/MTVP). While the vector processor is disabled, and in the absence of hardware errors, it will complete all pending instructions in its instruction queue including those sent by the scalar processor after the vector processor became disabled.

The vector processor can either disable itself or be disabled by software. The following error conditions cause the vector processor to disable itself:

Vector arithmetic exception (flagged by VPSR<AEX>)
Hardware error (flagged by VPSR<IMP> in some implementations)
On some implementations, receipt of an illegal vector opcode (flagged by VPSR<IVO>)

In these cases, the vector processor clears VPSR<VEN> and flags the error condition by setting the appropriate bit in VPSR. (See Table 1-1.)

Software disables the vector processor by writing a zero into VPSR<VEN> using an MTPR instruction. Once the vector processor is disabled, only software can enable it. The software does this by writing a one to VPSR<VEN> using an MTPR. Recall that after performing an MTPR to VPSR, software must then issue an MFPR from VPSR to ensure that the new state of VPSR will affect the execution of subsequently issued vector instructions. The MFPR will not complete in this case until the new state of the vector processor becomes visible to the scalar processor.
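The MTPR-then-MFPR idiom can be pictured with a toy Python model in which the scalar processor's view of VPSR<VEN> lags the real bit until an MFPR completes. This is a deliberate simplification: the class and method names are invented, and real hardware may make the new state visible earlier. The MFPR is simply what guarantees it.

```python
class VectorProcessorModel:
    """Toy model: the scalar processor's view of VEN can be stale."""

    def __init__(self):
        self.ven = 1           # actual VPSR<VEN>
        self.scalar_view = 1   # what the scalar processor currently believes

    def mtpr_vpsr_ven(self, value):
        """MTPR to VPSR: changes the real bit, not yet the scalar view."""
        self.ven = value

    def mfpr_vpsr(self):
        """MFPR from VPSR: completes only once the new state is visible."""
        self.scalar_view = self.ven
        return self.ven

    def scalar_would_issue(self):
        """True if the scalar processor still views the unit as enabled."""
        return self.scalar_view == 1


vp = VectorProcessorModel()
vp.mtpr_vpsr_ven(0)                        # software disables the vector processor
stale_before_read = vp.scalar_would_issue()
vp.mfpr_vpsr()                             # read back VPSR to synchronize
stale_after_read = vp.scalar_would_issue()
```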

When the vector processor disables itself due to a hardware error, it is implementation dependent whether the vector processor completes any pending vector instruction. However, in this case, the vector processor ensures that, when it is reenabled, all incomplete instructions have been flushed from the instruction queue.

If the scalar processor attempts to issue a vector instruction after it views the vector processor as disabled, then a vector processor disabled fault occurs. The vector processor disabled fault uses SCB offset 68 (hex). The exception handling software (running on the scalar processor) can then read the vector internal processor registers (IPRs) with MFPR instructions to determine what exception conditions are recorded in the vector processor and if the vector processor is still busy processing other unfinished instructions.

Once the scalar processor views the vector processor as disabled, the only operations that can be issued to the vector processor are MTPR and MFPR to and from the vector IPRs.

1.6.4 Handling Disabled Faults and Vector Context Switching

The following flow outlines the required steps for handling a vector processor disabled fault.

If the new process executing on the scalar processor has a vector instruction to execute, saving and restoring the state of the vector processor---that is, vector context switching---is done as part of handling a subsequent vector processor disabled fault.

If a vector processor disabled fault occurs and the current scalar process is also the current vector process, then software must perform the following procedure:

Obtain the vector processor status by reading the VPSR using the MFPR instruction.
Perform the following checks to see if any of these conditions caused the vector processor to be disabled. If any of these conditions exist, a decision to not continue this flow may occur.
1. If VPSR<IVO> is set, then write one to clear VPSR<IVO> using the MTPR instruction, and report an illegal vector opcode error.
2. If VPSR<IMP> is set, then write one to clear VPSR<IMP> using the MTPR instruction, and report an implementation-specific error.
3. If VPSR<AEX> is set, then write one to clear VPSR<AEX> using the MTPR instruction, and enter the vector arithmetic exception handler with information in VAER.
If the software scalar-context-switch flag is set, indicating that a scalar context switch has been done, then perform the following:
1. Make sure the vector processor has access to correct P0LR, P0BR, P1LR, and P1BR values.
2. If any vector translation buffer needs to be invalidated, then write zero into the VTBIA IPR using the MTPR instruction. Vector translation buffer flushing is required if the process was swapped out and the mapping change has not yet been made known to the vector translation buffer.
3. Clear the software scalar-context-switch flag.
Enable the vector processor by writing one to VPSR<VEN> using the MTPR instruction. Ensure the new state of the vector processor becomes visible to the scalar processor by reading VPSR with the MFPR instruction.
REI to retry the vector instruction that was being issued at the time of the vector processor disabled fault. If there is an asynchronous memory management exception pending, it is taken when that vector instruction is reissued to the vector processor.
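The procedure above, for the case where the current scalar process is also the current vector process, can be traced with the following Python sketch. It models VPSR as a dictionary of bits and records each step as a string. The function name, the `tb_stale` parameter, and the step labels are assumptions, and the sketch always continues the flow even where the text allows software to stop.

```python
def handle_disabled_fault_same_process(vpsr, context_switched, tb_stale=False):
    """Return (steps, new_vpsr, new_context_switched_flag)."""
    vpsr = dict(vpsr)                      # work on a copy of the VPSR bits
    steps = ["MFPR VPSR"]                  # obtain vector processor status
    reports = {
        "IVO": "report illegal vector opcode error",
        "IMP": "report implementation-specific error",
        "AEX": "enter vector arithmetic exception handler (VAER)",
    }
    for bit in ("IVO", "IMP", "AEX"):      # same order as the checks above
        if vpsr.get(bit):
            vpsr[bit] = 0                  # write-one-to-clear through MTPR
            steps += [f"MTPR clear VPSR<{bit}>", reports[bit]]
    if context_switched:
        steps.append("refresh P0LR, P0BR, P1LR, P1BR")
        if tb_stale:
            steps.append("MTPR 0 to VTBIA")
        context_switched = False           # clear the software flag
    vpsr["VEN"] = 1                        # re-enable the vector processor
    steps += ["MTPR set VPSR<VEN>", "MFPR VPSR", "REI"]
    return steps, vpsr, context_switched


steps, new_vpsr, flag = handle_disabled_fault_same_process(
    {"VEN": 0, "AEX": 1}, context_switched=True, tb_stale=True
)
```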

If a vector processor disabled fault occurs and the current scalar process is not the current vector process, then software must perform the following procedure:

Check if there is a current vector process. If there is one, then perform the following procedure:
1. Wait for VPSR<BSY> to be clear using the MFPR instruction.
2. Perform the following checks to see if any of these conditions caused the vector processor to be disabled. If any of these conditions exist, a decision to not continue this flow may occur.
 1. If VPSR<IMP> is set, then report an implementation-specific error.
 2. If VPSR<IVO> is set, then set a software IVO flag for this process. The illegal vector opcode error is handled when this process next tries to execute in the vector processor.
 3. If VPSR<AEX> is set, then set a software AEX flag for this process, and save vector arithmetic exception state from VAER using the MFPR instruction. Any vector arithmetic exception conditions are handled when this process next tries to execute in the vector processor.
3. At this point there cannot be a synchronous memory management exception pending. But, if asynchronous memory management handling is implemented, there may be an asynchronous memory management exception pending. Because scalar/vector memory synchronization was required before scalar context switching, all such pending exceptions are known at this time. So, if VPSR<PMF> is set, then perform the following procedure:
 1. Set a software async-memory-exception-pending flag for this process.
 2. Store implementation-specific vector state in memory starting at the address in VSAR by writing one to VPSR<STS> using the MTPR instruction.
4. Reset the vector processor state to clear VAER and VPSR, and enable the vector processor. Writing a one to both VPSR<RST> and VPSR<VEN> using the same MTPR instruction accomplishes this. Ensure the new state of the vector processor becomes visible to the scalar processor by reading VPSR with the MFPR instruction.
5. Store the current vector (V0--V15) and vector control (VLR, VMR, and VCR) register values using VST and MFVP instructions.
6. Read the VMAC IPR using the MFPR instruction. This ensures scalar/vector memory synchronization and that all hardware errors encountered by previous vector memory instructions have been reported.
Make the current scalar process also the current vector process.
Clear the software scalar-context-switch flag.
Make sure the vector processor has access to correct P0LR, P0BR, P1LR, and P1BR values, and invalidate any vector translation buffer by writing zero to the VTBIA IPR using the MTPR instruction.
Load the saved vector (V0--V15) and vector control (VLR, VMR, and VCR) register values using VLD and MTVP instructions.
If the software IMP, IVO, or AEX flags for this process are set, perform the following procedure:
1. Disable the vector processor by writing zero to VPSR<VEN> using the MTPR instruction. Ensure the new state of the vector processor becomes visible to the scalar processor by reading VPSR with the MFPR instruction.
2. If set, clear the software IMP flag for this process and finish handling the implementation-specific error. A decision to not continue this flow may occur.
3. If set, clear the software IVO flag for this process and report an illegal vector opcode error occurred. A decision to not continue this flow may occur.
4. If set, clear the software AEX flag for this process and enter the vector arithmetic exception handler with saved VAER state. A decision to not continue this flow may occur.
If the software async-memory-exception-pending flag for this process is set, perform the following procedure:
1. Clear the software async-memory-exception-pending flag for this process.
2. Send the vector processor the memory address that points to implementation-specific vector state for this process by writing VSAR using the MTPR instruction.
3. Reload the implementation-specific vector state for this process and leave the vector processor enabled by writing one to both VPSR<RLD> and VPSR<VEN> using the same MTPR instruction. From this state, the vector processor determines if VPSR<PMF>, VPSR<MF>, or both need to be set, and does it. Ensure the new state of the vector processor becomes visible to the scalar processor by reading VPSR with the MFPR instruction.
REI to retry the vector instruction that was being issued at the time of the vector processor disabled fault. If there is an asynchronous memory management exception pending, it is taken when that vector instruction is reissued to the vector processor.
