Method mm256_dp_ps

mm256_dp_ps(v256, v256, int)

Within each 128-bit lane, conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits of imm8, sum the four products, and conditionally store the sum in each element of dst using the low 4 bits of imm8. Elements whose low bit is clear are set to zero.

Declaration

public static v256 mm256_dp_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a	Vector a
v256	b	Vector b
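int	imm8	8-bit immediate mask. Bits 4-7 select which products are summed; bits 0-3 select which destination elements receive the sum.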

Returns

Type	Description
v256	Vector holding, for each 128-bit lane, the masked sum in the selected elements and zero elsewhere

Remarks

VDPPS ymm1, ymm2, ymm3/m256, imm8

Multiplies the packed single-precision floating-point values in the first source operand with the packed single-precision floating-point values in the second source operand. Each of the four resulting products is included in the sum only if the corresponding bit in the high 4 bits of the immediate operand is set. The sum is broadcast to each of the 4 positions in the destination whose corresponding bit in the low 4 bits of the immediate operand is "1". If the corresponding low bit (0-3) is zero, that destination element is set to zero. The same process, with the same immediate, is applied independently to the upper 128-bit lane of the operands.
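The masking rules in the remarks can be easier to follow as scalar code. The following is a minimal sketch of a reference model in plain C#, not the Burst implementation: the class name DpPsReference and the method Compute are hypothetical, and float arrays of length 8 stand in for v256 (elements 0-3 are the low lane, 4-7 the high lane). The pairwise summation order is an assumption modelled on the instruction's documented pseudocode.

```csharp
using System;

// Scalar reference model of the VDPPS behaviour described above.
// DpPsReference and Compute are hypothetical names used for illustration;
// they are not part of Unity.Burst.Intrinsics.
public static class DpPsReference
{
    // a and b each hold 8 floats: elements 0-3 form the low 128-bit lane,
    // elements 4-7 form the high 128-bit lane.
    public static float[] Compute(float[] a, float[] b, int imm8)
    {
        if (a.Length != 8 || b.Length != 8)
            throw new ArgumentException("Expected 8 elements per vector.");

        var dst = new float[8];

        // The same imm8 is applied to both lanes independently.
        for (int lane = 0; lane < 2; lane++)
        {
            int offset = lane * 4;
            var products = new float[4];

            // Bits 4-7 of imm8 decide which products take part in the sum.
            for (int i = 0; i < 4; i++)
            {
                bool include = (imm8 & (1 << (4 + i))) != 0;
                products[i] = include ? a[offset + i] * b[offset + i] : 0f;
            }

            // Pairwise summation (assumed order, see lead-in).
            float sum = (products[0] + products[1]) + (products[2] + products[3]);

            // Bits 0-3 of imm8 decide which elements receive the sum;
            // the remaining elements are zeroed.
            for (int i = 0; i < 4; i++)
            {
                bool store = (imm8 & (1 << i)) != 0;
                dst[offset + i] = store ? sum : 0f;
            }
        }

        return dst;
    }
}
```

For example, with a = (1, 2, 3, 4, 5, 6, 7, 8), b = (1, 1, 1, 1, 2, 2, 2, 2) and imm8 = 0xF1, all four products are summed per lane and only element 0 of each lane is written: the model returns 10 in element 0, 52 in element 4, and zero elsewhere. With imm8 = 0x7F, the fourth product is skipped and the sum is broadcast, giving 6 in elements 0-3 and 36 in elements 4-7.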