Method mm256_dp_ps
mm256_dp_ps(v256, v256, int)
Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits of imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8. The operation is performed independently on each 128-bit half of the vectors.
Declaration
public static v256 mm256_dp_ps(v256 a, v256 b, int imm8)
Parameters
Type | Name | Description |
---|---|---|
v256 | a | Vector a |
v256 | b | Vector b |
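int | imm8 | 8-bit immediate mask. The high 4 bits select which products are summed; the low 4 bits select which destination elements receive the sum. |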
Returns
Type | Description |
---|---|
v256 | Vector |
Remarks
**VDPPS ymm1, ymm2, ymm3/m256, imm8**

Multiplies the packed single-precision floating-point values in the first source operand with the packed single-precision floating-point values in the second source operand. Within each 128-bit half, each of the four products is included in the sum only if the corresponding bit of the mask taken from the high 4 bits of the immediate operand is set. The sum is then broadcast to each of the 4 positions in that half of the destination whose corresponding bit in the low 4 bits of the immediate operand is "1". If the corresponding low bit (0-3) is zero, that destination element is set to zero. The same process, with the same mask, is applied independently to the high four elements of the destination.
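The masking rules above can be easier to follow as a scalar reference model. The sketch below is illustrative only: `DpPsModel` and `DpPs` are hypothetical names that are not part of the Burst API, and plain `float[8]` arrays stand in for `v256`. It assumes the products are added pairwise, as in Intel's published pseudocode for the instruction, and it does not attempt to reproduce exception or denormal behavior.

```csharp
using System;

public static class DpPsModel
{
    // Hypothetical scalar model of the 256-bit dot product described above.
    // a and b hold 8 floats each: elements 0-3 are the low 128-bit half,
    // elements 4-7 are the high 128-bit half.
    public static float[] DpPs(float[] a, float[] b, int imm8)
    {
        if (a.Length != 8 || b.Length != 8)
            throw new ArgumentException("Expected 8 elements per vector.");

        var dst = new float[8];
        var p = new float[4];

        for (int half = 0; half < 2; half++)
        {
            int offset = half * 4;

            // High 4 bits of imm8: keep product i only if bit (4 + i) is set.
            for (int i = 0; i < 4; i++)
            {
                bool selected = (imm8 & (1 << (4 + i))) != 0;
                p[i] = selected ? a[offset + i] * b[offset + i] : 0f;
            }

            // Pairwise summation (assumed ordering, see the lead-in).
            float sum = (p[0] + p[1]) + (p[2] + p[3]);

            // Low 4 bits of imm8: write the sum to element i if bit i is set,
            // otherwise write zero.
            for (int i = 0; i < 4; i++)
            {
                bool store = (imm8 & (1 << i)) != 0;
                dst[offset + i] = store ? sum : 0f;
            }
        }

        return dst;
    }
}
```

For example, with a = {1, 2, 3, 4, 5, 6, 7, 8}, b = {1, 1, 1, 1, 2, 2, 2, 2} and imm8 = 0xF1, all four products are summed in each half and the result is written only to the first element of each half, giving {10, 0, 0, 0, 52, 0, 0, 0}. In Burst code the equivalent call would be along the lines of `X86.Avx.mm256_dp_ps(a, b, 0xF1)`, typically guarded by a check of `X86.Avx.IsAvxSupported`.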