Class X86.Avx

AVX intrinsics

Inheritance

Object

X86.Avx

Inherited Members

Object.Equals(Object)

Object.Equals(Object, Object)

Object.GetHashCode()

Object.GetType()

Object.MemberwiseClone()

Object.ReferenceEquals(Object, Object)

Object.ToString()

Namespace: Unity.Burst.Intrinsics

Syntax

public static class Avx

Properties

IsAvxSupported

Evaluates to true at compile time if AVX intrinsics are supported.

Declaration

public static bool IsAvxSupported { get; }

Property Value

Type	Description
Boolean

Methods

broadcast_ss(Void*)

Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

Declaration

public static v128 broadcast_ss(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v128

Remarks

**** VBROADCASTSS xmm1, m32

cmp_pd(v128, v128, Int32)

Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

Declaration

public static v128 cmp_pd(v128 a, v128 b, int imm8)

Parameters

Type	Name	Description
v128	a
v128	b
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VCMPPD xmm1, xmm2, xmm3/v128, imm8 Performs a SIMD compare of the two packed double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each of the pairs of packed values. For the 128-bit intrinsic function with compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.
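The predicate values 0-7 mentioned above can be sketched per lane. The following is a scalar model only, not Burst code: the class and method names are hypothetical, and the predicate meanings (EQ, LT, LE, UNORD, NEQ, NLT, NLE, ORD) come from the x86 instruction set rather than from this page. A lane that satisfies the predicate becomes all ones, otherwise all zeros.

```csharp
using System;

// Hypothetical scalar model of one 64-bit lane of cmp_pd (predicates 0-7 only).
// Not part of Unity.Burst; the names are illustrative.
public static class CmpPdModel
{
    public static ulong CmpLane(double a, double b, int imm8)
    {
        if (imm8 < 0 || imm8 > 7)
            throw new ArgumentOutOfRangeException(nameof(imm8), "only predicates 0-7 are modeled");

        bool unordered = double.IsNaN(a) || double.IsNaN(b);
        bool result;
        switch (imm8)
        {
            case 0: result = a == b; break;        // EQ: equal, false if either is NaN
            case 1: result = a < b; break;         // LT: less than
            case 2: result = a <= b; break;        // LE: less than or equal
            case 3: result = unordered; break;     // UNORD: either operand is NaN
            case 4: result = !(a == b); break;     // NEQ: not equal, true if either is NaN
            case 5: result = !(a < b); break;      // NLT: not less than
            case 6: result = !(a <= b); break;     // NLE: not less than or equal
            default: result = !unordered; break;   // 7, ORD: neither operand is NaN
        }
        // A true lane is all ones, a false lane is all zeros.
        return result ? ulong.MaxValue : 0UL;
    }
}
```

The all-ones or all-zeros result is what makes the output usable as a mask for the blendv and maskload intrinsics, which test the high bit of each element.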

cmp_ps(v128, v128, Int32)

Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

Declaration

public static v128 cmp_ps(v128 a, v128 b, int imm8)

Parameters

Type	Name	Description
v128	a
v128	b
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VCMPPS xmm1, xmm2, xmm3/v128, imm8 Performs a SIMD compare of the packed single-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each of the pairs of packed values. For the 128-bit intrinsic function with compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

cmp_sd(v128, v128, Int32)

Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.

Declaration

public static v128 cmp_sd(v128 a, v128 b, int imm8)

Parameters

Type	Name	Description
v128	a
v128	b
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VCMPSD xmm1, xmm2, xmm3/m64, imm8 Compares the low double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate operand) specifies the type of comparison performed. For compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

cmp_ss(v128, v128, Int32)

Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.

Declaration

public static v128 cmp_ss(v128 a, v128 b, int imm8)

Parameters

Type	Name	Description
v128	a
v128	b
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VCMPSS xmm1, xmm2, xmm3/m32, imm8 Compares the low single-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate operand) specifies the type of comparison performed. For compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

maskload_pd(Void*, v128)

Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

Declaration

public static v128 maskload_pd(void *mem_addr, v128 mask)

Parameters

Type	Name	Description
Void*	mem_addr
v128	mask

Returns

Type	Description
v128

Remarks

**** VMASKMOVPD xmm1, xmm2, v128
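The masking rule can be illustrated with a scalar model. This is a sketch under assumptions, not Burst code: plain arrays stand in for memory and for the v128 mask, and the class and method names are hypothetical.

```csharp
using System;

// Hypothetical scalar model of maskload_pd; arrays stand in for memory and v128.
public static class MaskLoadModel
{
    public static double[] MaskLoadPd(double[] memory, long[] mask)
    {
        var dst = new double[2];
        for (int i = 0; i < 2; i++)
        {
            // The high (sign) bit of each 64-bit mask element selects the lane.
            bool selected = mask[i] < 0;
            // Unselected lanes are zeroed rather than loaded.
            dst[i] = selected ? memory[i] : 0.0;
        }
        return dst;
    }
}
```

With memory { 1.5, 2.5 } and mask { -1, 0 }, only the first lane has its high bit set, so the model returns { 1.5, 0.0 }.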

maskload_ps(Void*, v128)

Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

Declaration

public static v128 maskload_ps(void *mem_addr, v128 mask)

Parameters

Type	Name	Description
Void*	mem_addr
v128	mask

Returns

Type	Description
v128

Remarks

**** VMASKMOVPS xmm1, xmm2, v128

maskstore_pd(Void*, v128, v128)

Store packed double-precision (64-bit) floating-point elements from a into memory using mask.

Declaration

public static void maskstore_pd(void *mem_addr, v128 mask, v128 a)

Parameters

Type	Name	Description
Void*	mem_addr
v128	mask
v128	a

Remarks

**** VMASKMOVPD v128, xmm1, xmm2

maskstore_ps(Void*, v128, v128)

Store packed single-precision (32-bit) floating-point elements from a into memory using mask.

Declaration

public static void maskstore_ps(void *mem_addr, v128 mask, v128 a)

Parameters

Type	Name	Description
Void*	mem_addr
v128	mask
v128	a

Remarks

**** VMASKMOVPS v128, xmm1, xmm2

mm256_add_pd(v256, v256)

Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_add_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_add_ps(v256, v256)

Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_add_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_addsub_pd(v256, v256)

Alternately add and subtract packed double-precision (64-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.

Declaration

public static v256 mm256_addsub_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256
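The lane pattern of mm256_addsub_pd can be sketched as a scalar model. The names below are hypothetical, and the even-subtract, odd-add ordering is taken from the behavior of the x86 VADDSUBPD instruction, which this page does not spell out.

```csharp
// Hypothetical scalar model of mm256_addsub_pd; arrays stand in for v256.
public static class AddSubModel
{
    public static double[] AddSubPd(double[] a, double[] b)
    {
        var dst = new double[4];
        for (int i = 0; i < 4; i++)
        {
            // Even lanes subtract, odd lanes add.
            dst[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
        }
        return dst;
    }
}
```

For a = { 1, 2, 3, 4 } and b = { 10, 20, 30, 40 } the model yields { -9, 22, -27, 44 }.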

mm256_addsub_ps(v256, v256)

Alternately add and subtract packed single-precision (32-bit) floating-point elements in a to/from packed elements in b, and store the results in dst.

Declaration

public static v256 mm256_addsub_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_and_pd(v256, v256)

Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_and_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_and_ps(v256, v256)

Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_and_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_andnot_pd(v256, v256)

Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b, and store the results in dst.

Declaration

public static v256 mm256_andnot_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256
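Note that the NOT applies to a, not to b. A common use of this operation is clearing the sign bit to take an absolute value. The following single-lane model is a sketch with hypothetical names, operating on the raw 64-bit pattern of each double.

```csharp
using System;

// Hypothetical scalar model of one 64-bit lane of mm256_andnot_pd: (~a) & b.
public static class AndNotModel
{
    public static double AndNotLane(double a, double b)
    {
        long bitsA = BitConverter.DoubleToInt64Bits(a);
        long bitsB = BitConverter.DoubleToInt64Bits(b);
        // Invert a, then AND with b, on the raw bit patterns.
        return BitConverter.Int64BitsToDouble(~bitsA & bitsB);
    }

    // Illustrative use: a holds only the sign bit, so (~a) & b clears the sign of b.
    public static double Abs(double b)
    {
        double signMask = BitConverter.Int64BitsToDouble(long.MinValue);
        return AndNotLane(signMask, b);
    }
}
```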

mm256_andnot_ps(v256, v256)

Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b, and store the results in dst.

Declaration

public static v256 mm256_andnot_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

mm256_blend_pd(v256, v256, Int32)

Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

Declaration

public static v256 mm256_blend_pd(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VBLENDPD ymm1, ymm2, ymm3/v256, imm8 Double-Precision Floating-Point values from the second source operand are conditionally merged with values from the first source operand and written to the destination. The immediate bits [3:0] determine whether the corresponding Double-Precision Floating Point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is "1", then the Double-Precision Floating-Point value in the second source operand is copied, else the value in the first source operand is copied
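The selection rule in the remark can be sketched as a scalar model. The class and method names are hypothetical and arrays stand in for v256: bit i of imm8 picks lane i from b when set and from a when clear.

```csharp
// Hypothetical scalar model of mm256_blend_pd; arrays stand in for v256.
public static class BlendModel
{
    public static double[] BlendPd(double[] a, double[] b, int imm8)
    {
        var dst = new double[4];
        for (int i = 0; i < 4; i++)
        {
            // Bit i of the immediate selects the source of lane i.
            dst[i] = ((imm8 >> i) & 1) != 0 ? b[i] : a[i];
        }
        return dst;
    }
}
```

With imm8 = 0x5 (bits 0 and 2 set), lanes 0 and 2 come from b and lanes 1 and 3 come from a.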

mm256_blend_ps(v256, v256, Int32)

Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

Declaration

public static v256 mm256_blend_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VBLENDPS ymm1, ymm2, ymm3/v256, imm8 Single precision floating point values from the second source operand are conditionally merged with values from the first source operand and written to the destination. The immediate bits [7:0] determine whether the corresponding single precision floating-point value in the destination is copied from the second source or first source. If a bit in the mask, corresponding to a word, is "1", then the single-precision floating-point value in the second source operand is copied, else the value in the first source operand is copied

mm256_blendv_pd(v256, v256, v256)

Blend packed double-precision (64-bit) floating-point elements from a and b using mask, and store the results in dst.

Declaration

public static v256 mm256_blendv_pd(v256 a, v256 b, v256 mask)

Parameters

Type	Name	Description
v256	a
v256	b
v256	mask

Returns

Type	Description
v256

Remarks

**** VBLENDVPD ymm1, ymm2, ymm3/v256, ymm4 Conditionally copy each quadword data element of double-precision floating-point value from the second source operand (third operand) and the first source operand (second operand) depending on mask bits defined in the mask register operand (fourth operand).
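Unlike mm256_blend_pd, the selection here is driven by a runtime mask vector. A scalar model, with hypothetical names and with the assumption (from the x86 VBLENDVPD definition) that only the high bit of each 64-bit mask element is tested:

```csharp
// Hypothetical scalar model of mm256_blendv_pd; arrays stand in for v256.
public static class BlendVModel
{
    public static double[] BlendVPd(double[] a, double[] b, long[] mask)
    {
        var dst = new double[4];
        for (int i = 0; i < 4; i++)
        {
            // High (sign) bit set: take the lane from b, otherwise from a.
            dst[i] = mask[i] < 0 ? b[i] : a[i];
        }
        return dst;
    }
}
```

Because compare intrinsics such as mm256_cmp_pd produce all-ones or all-zeros lanes, their result can serve directly as the mask argument.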

mm256_blendv_ps(v256, v256, v256)

Blend packed single-precision (32-bit) floating-point elements from a and b using mask, and store the results in dst.

Declaration

public static v256 mm256_blendv_ps(v256 a, v256 b, v256 mask)

Parameters

Type	Name	Description
v256	a
v256	b
v256	mask

Returns

Type	Description
v256

Remarks

Blend Packed Single Precision Floating-Point Values **** VBLENDVPS ymm1, ymm2, ymm3/v256, ymm4 Conditionally copy each dword data element of single-precision floating-point value from the second source operand (third operand) and the first source operand (second operand) depending on mask bits defined in the mask register operand (fourth operand).

mm256_broadcast_pd(Void*)

Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.

Declaration

public static v256 mm256_broadcast_pd(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VBROADCASTF128 ymm1, v128
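"All elements of dst" here means both 128-bit halves of the result. A scalar sketch with hypothetical names, where an array of two doubles stands in for the 128 bits in memory:

```csharp
// Hypothetical scalar model of mm256_broadcast_pd; arrays stand in for memory and v256.
public static class BroadcastModel
{
    public static double[] BroadcastPd(double[] memory)
    {
        // The same 128-bit pair fills the low half and the high half.
        return new[] { memory[0], memory[1], memory[0], memory[1] };
    }
}
```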

mm256_broadcast_ps(Void*)

Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.

Declaration

public static v256 mm256_broadcast_ps(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VBROADCASTF128 ymm1, v128

mm256_broadcast_sd(Void*)

Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.

Declaration

public static v256 mm256_broadcast_sd(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VBROADCASTSD ymm1, m64

mm256_broadcast_ss(Void*)

Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

Declaration

public static v256 mm256_broadcast_ss(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VBROADCASTSS ymm1, m32

mm256_castpd_ps(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castpd_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castpd_si256(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castpd_si256(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castpd128_pd256(v128)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castpd128_pd256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

mm256_castpd256_pd128(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v128 mm256_castpd256_pd128(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

mm256_castps_pd(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castps_pd(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castps_si256(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castps_si256(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castps128_ps256(v128)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castps128_ps256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

mm256_castps256_ps128(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v128 mm256_castps256_ps128(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

mm256_castsi128_si256(v128)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castsi128_si256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

mm256_castsi256_pd(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castsi256_pd(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castsi256_ps(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v256 mm256_castsi256_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

mm256_castsi256_si128(v256)

For compatibility with C++ code only. This is a no-op in Burst.

Declaration

public static v128 mm256_castsi256_si128(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

mm256_ceil_pd(v256)

Round the packed double-precision (64-bit) floating-point elements in val up to an integer value, and store the results as packed double-precision floating-point elements in dst.

Declaration

public static v256 mm256_ceil_pd(v256 val)

Parameters

Type	Name	Description
v256	val

Returns

Type	Description
v256

mm256_ceil_ps(v256)

Round the packed single-precision (32-bit) floating-point elements in val up to an integer value, and store the results as packed single-precision floating-point elements in dst.

Declaration

public static v256 mm256_ceil_ps(v256 val)

Parameters

Type	Name	Description
v256	val

Returns

Type	Description
v256

mm256_cmp_pd(v256, v256, Int32)

Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

Declaration

public static v256 mm256_cmp_pd(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VCMPPD ymm1, ymm2, ymm3/v256, imm8 Performs an SIMD compare of the four packed double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each of the pairs of packed values.

mm256_cmp_ps(v256, v256, Int32)

Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

Declaration

public static v256 mm256_cmp_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VCMPPS ymm1, ymm2, ymm3/v256, imm8

mm256_cvtepi32_pd(v128)

Convert packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

Declaration

public static v256 mm256_cvtepi32_pd(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

Remarks

**** VCVTDQ2PD ymm1, xmm2/v128 Converts four packed signed doubleword integers in the source operand to four packed double-precision floating-point values in the destination

mm256_cvtepi32_ps(v256)

Convert packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

Declaration

public static v256 mm256_cvtepi32_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VCVTDQ2PS ymm1, ymm2/v256 Converts eight packed signed doubleword integers in the source operand to eight packed single-precision floating-point values in the destination

mm256_cvtpd_epi32(v256)

Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

Declaration

public static v128 mm256_cvtpd_epi32(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

Remarks

**** VCVTPD2DQ xmm1, ymm2/v256 Converts four packed double-precision floating-point values in the source operand to four packed signed doubleword integers in the destination

mm256_cvtpd_ps(v256)

Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

Declaration

public static v128 mm256_cvtpd_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

Remarks

**** VCVTPD2PS xmm1, ymm2/v256 Converts four packed double-precision floating-point values in the source operand to four packed single-precision floating-point values in the destination

mm256_cvtps_epi32(v256)

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

Declaration

public static v256 mm256_cvtps_epi32(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VCVTPS2DQ ymm1, ymm2/v256 Converts eight packed single-precision floating-point values in the source operand to eight signed doubleword integers in the destination

mm256_cvtps_pd(v128)

Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

Declaration

public static v256 mm256_cvtps_pd(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

Remarks

**** VCVTPS2PD ymm1, xmm2/v128 Converts four packed single-precision floating-point values in the source operand to four packed double-precision floating-point values in the destination

mm256_cvtss_f32(v256)

Copy the lower single-precision (32-bit) floating-point element of a to dst.

Declaration

public static float mm256_cvtss_f32(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
Single

Remarks

Identical in HPC# to accessing Float0, kept for compatibility with existing code while porting.

mm256_cvttpd_epi32(v256)

Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

Declaration

public static v128 mm256_cvttpd_epi32(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v128

Remarks

**** VCVTTPD2DQ xmm1, ymm2/v256 Converts four packed double-precision floating-point values in the source operand to four packed signed doubleword integers in the destination. When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned
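The difference between this truncating conversion and mm256_cvtpd_epi32 can be sketched per lane. The model below is illustrative only: the names are hypothetical, and the non-truncating lane assumes the default round-to-nearest-even rounding mode, which this page does not state.

```csharp
using System;

// Hypothetical scalar model of one lane of the double-to-int32 conversions.
public static class CvtModel
{
    // The "indefinite integer" value 80000000H described in the remark.
    const int Indefinite = unchecked((int)0x80000000);

    // Models mm256_cvtpd_epi32, assuming round-to-nearest-even (an assumption).
    public static int CvtLane(double x)
    {
        double r = Math.Round(x, MidpointRounding.ToEven);
        return InRange(r) ? (int)r : Indefinite;
    }

    // Models mm256_cvttpd_epi32: always rounds toward zero.
    public static int CvttLane(double x)
    {
        double t = Math.Truncate(x);
        return InRange(t) ? (int)t : Indefinite;
    }

    // NaN fails both comparisons, so it also maps to the indefinite value.
    static bool InRange(double v)
    {
        return v >= int.MinValue && v <= int.MaxValue;
    }
}
```

Under these assumptions 2.5 converts to 2 with either function, while 3.5 converts to 4 with rounding and 3 with truncation.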

mm256_cvttps_epi32(v256)

Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

Declaration

public static v256 mm256_cvttps_epi32(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VCVTTPS2DQ ymm1, ymm2/v256 Converts eight packed single-precision floating-point values in the source operand to eight signed doubleword integers in the destination. When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned

mm256_div_pd(v256, v256)

Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.

Declaration

public static v256 mm256_div_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VDIVPD ymm1, ymm2, ymm3/v256 Performs an SIMD divide of the four packed double-precision floating-point values in the first source operand by the four packed double-precision floating-point values in the second source operand

mm256_div_ps(v256, v256)

Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.

Declaration

public static v256 mm256_div_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

Divide Packed Single-Precision Floating-Point Values **** VDIVPS ymm1, ymm2, ymm3/v256 Performs an SIMD divide of the eight packed single-precision floating-point values in the first source operand by the eight packed single-precision floating-point values in the second source operand

mm256_dp_ps(v256, v256, Int32)

Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8.

Declaration

public static v256 mm256_dp_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VDPPS ymm1, ymm2, ymm3/v256, imm8 Multiplies the packed single precision floating point values in the first source operand with the packed single-precision floats in the second source. Each of the four resulting single-precision values is conditionally summed depending on a mask extracted from the high 4 bits of the immediate operand. This sum is broadcast to each of 4 positions in the destination if the corresponding bit of the mask selected from the low 4 bits of the immediate operand is "1". If the corresponding low bit 0-3 of the mask is zero, the destination is set to zero. The process is replicated for the high elements of the destination.
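The two-mask behavior described above can be sketched as a scalar model. The names are hypothetical and arrays stand in for v256. The pairwise summation order follows the x86 pseudocode for VDPPS and is an assumption as far as this page is concerned.

```csharp
// Hypothetical scalar model of mm256_dp_ps; arrays stand in for v256.
public static class DpModel
{
    public static float[] DpPs(float[] a, float[] b, int imm8)
    {
        var dst = new float[8];
        // The same operation runs independently on each 128-bit half.
        for (int half = 0; half < 2; half++)
        {
            int o = half * 4;
            var p = new float[4];
            for (int i = 0; i < 4; i++)
            {
                // High 4 bits of imm8: which products take part in the sum.
                p[i] = ((imm8 >> (4 + i)) & 1) != 0 ? a[o + i] * b[o + i] : 0f;
            }
            float sum = (p[0] + p[1]) + (p[2] + p[3]);
            for (int i = 0; i < 4; i++)
            {
                // Low 4 bits of imm8: which destination lanes receive the sum.
                dst[o + i] = ((imm8 >> i) & 1) != 0 ? sum : 0f;
            }
        }
        return dst;
    }
}
```

With imm8 = 0xF1 every product is summed and only lane 0 of each half receives the result, so a = { 1, ..., 8 } against a vector of ones gives 10 in lane 0 and 26 in lane 4.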

mm256_extract_epi32(v256, Int32)

Extract a 32-bit integer from a, selected with index (which must be a constant), and store the result in dst.

Declaration

public static int mm256_extract_epi32(v256 a, int index)

Parameters

Type	Name	Description
v256	a
Int32	index

Returns

Type	Description
Int32

mm256_extract_epi64(v256, Int32)

Extract a 64-bit integer from a, selected with index (which must be a constant), and store the result in dst.

Declaration

public static long mm256_extract_epi64(v256 a, int index)

Parameters

Type	Name	Description
v256	a
Int32	index

Returns

Type	Description
Int64

mm256_extractf128_pd(v256, Int32)

Extract 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

Declaration

public static v128 mm256_extractf128_pd(v256 a, int imm8)

Parameters

Type	Name	Description
v256	a
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VEXTRACTF128 xmm1/v128, ymm2, imm8
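The selection by imm8 can be sketched as follows. This is a scalar model with hypothetical names; it assumes, following the x86 VEXTRACTF128 definition, that bit 0 of imm8 chooses the low half (0) or the high half (1).

```csharp
// Hypothetical scalar model of mm256_extractf128_pd; arrays stand in for v256 and v128.
public static class ExtractModel
{
    public static double[] ExtractF128Pd(double[] a, int imm8)
    {
        // Bit 0 of imm8 selects the 128-bit half: 0 = low, 1 = high.
        int offset = (imm8 & 1) * 2;
        return new[] { a[offset], a[offset + 1] };
    }
}
```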

mm256_extractf128_ps(v256, Int32)

Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

Declaration

public static v128 mm256_extractf128_ps(v256 a, int imm8)

Parameters

Type	Name	Description
v256	a
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VEXTRACTF128 xmm1/v128, ymm2, imm8

mm256_extractf128_si256(v256, Int32)

Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.

Declaration

public static v128 mm256_extractf128_si256(v256 a, int imm8)

Parameters

Type	Name	Description
v256	a
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VEXTRACTF128 xmm1/v128, ymm2, imm8

mm256_floor_pd(v256)

Round the packed double-precision (64-bit) floating-point elements in val down to an integer value, and store the results as packed double-precision floating-point elements in dst.

Declaration

public static v256 mm256_floor_pd(v256 val)

Parameters

Type	Name	Description
v256	val

Returns

Type	Description
v256

mm256_floor_ps(v256)

Round the packed single-precision (32-bit) floating-point elements in val down to an integer value, and store the results as packed single-precision floating-point elements in dst.

Declaration

public static v256 mm256_floor_ps(v256 val)

Parameters

Type	Name	Description
v256	val

Returns

Type	Description
v256

mm256_hadd_pd(v256, v256)

Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

Declaration

public static v256 mm256_hadd_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VHADDPD ymm1, ymm2, ymm3/v256 Adds pairs of adjacent double-precision floating-point values in the first source operand and second source operand and stores results in the destination
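The packing order of the results is easy to get wrong, because pairs from a and b interleave within each 128-bit half. A scalar sketch with hypothetical names; the lane layout follows the x86 VHADDPD definition rather than this page.

```csharp
// Hypothetical scalar model of mm256_hadd_pd; arrays stand in for v256.
public static class HAddModel
{
    public static double[] HAddPd(double[] a, double[] b)
    {
        return new[]
        {
            a[0] + a[1], // low half: pair from a
            b[0] + b[1], // low half: pair from b
            a[2] + a[3], // high half: pair from a
            b[2] + b[3], // high half: pair from b
        };
    }
}
```

For a = { 1, 2, 3, 4 } and b = { 10, 20, 30, 40 } the model yields { 3, 30, 7, 70 }.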

mm256_hadd_ps(v256, v256)

Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

Declaration

public static v256 mm256_hadd_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VHADDPS ymm1, ymm2, ymm3/v256 Adds pairs of adjacent single-precision floating-point values in the first source operand and second source operand and stores results in the destination

mm256_hsub_pd(v256, v256)

Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

Declaration

public static v256 mm256_hsub_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VHSUBPD ymm1, ymm2, ymm3/v256 Subtracts pairs of adjacent double-precision floating-point values in the first source operand and second source operand and stores results in the destination

mm256_hsub_ps(v256, v256)

Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

Declaration

public static v256 mm256_hsub_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VHSUBPS ymm1, ymm2, ymm3/v256 Subtract pairs of adjacent single-precision floating-point values in the first source operand and second source operand and stores results in the destination.
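For the eight-lane case, each 128-bit half holds two differences from a followed by two from b, and the odd element is subtracted from the even one. A scalar sketch with hypothetical names; the lane layout follows the x86 VHSUBPS definition rather than this page.

```csharp
// Hypothetical scalar model of mm256_hsub_ps; arrays stand in for v256.
public static class HSubModel
{
    public static float[] HSubPs(float[] a, float[] b)
    {
        var dst = new float[8];
        for (int half = 0; half < 2; half++)
        {
            int o = half * 4;
            dst[o + 0] = a[o + 0] - a[o + 1]; // first pair of a
            dst[o + 1] = a[o + 2] - a[o + 3]; // second pair of a
            dst[o + 2] = b[o + 0] - b[o + 1]; // first pair of b
            dst[o + 3] = b[o + 2] - b[o + 3]; // second pair of b
        }
        return dst;
    }
}
```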

mm256_insert_epi16(v256, Int32, Int32)

Copy a to dst, and insert the 16-bit integer i into dst at the location specified by index (which must be a constant).

Declaration

public static v256 mm256_insert_epi16(v256 a, int i, int index)

Parameters

Type	Name	Description
v256	a
Int32	i
Int32	index

Returns

Type	Description
v256

mm256_insert_epi32(v256, Int32, Int32)

Copy a to dst, and insert the 32-bit integer i into dst at the location specified by index (which must be a constant).

Declaration

public static v256 mm256_insert_epi32(v256 a, int i, int index)

Parameters

Type	Name	Description
v256	a
Int32	i
Int32	index

Returns

Type	Description
v256

mm256_insert_epi64(v256, Int64, Int32)

Copy a to dst, and insert the 64-bit integer i into dst at the location specified by index (which must be a constant).

Declaration

public static v256 mm256_insert_epi64(v256 a, long i, int index)

Parameters

Type	Name	Description
v256	a
Int64	i
Int32	index

Returns

Type	Description
v256

Remarks

This intrinsic requires a 64-bit processor.

mm256_insert_epi8(v256, Int32, Int32)

Copy a to dst, and insert the 8-bit integer i into dst at the location specified by index (which must be a constant).

Declaration

public static v256 mm256_insert_epi8(v256 a, int i, int index)

Parameters

Type	Name	Description
v256	a
Int32	i
Int32	index

Returns

Type	Description
v256

mm256_insertf128_pd(v256, v128, Int32)

Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.

Declaration

public static v256 mm256_insertf128_pd(v256 a, v128 b, int imm8)

Parameters

Type	Name	Description
v256	a
v128	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8 Performs an insertion of 128 bits of packed floating-point values from the second source operand into the destination at the 128-bit offset given by imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand.
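The insertion can be sketched as a scalar model. The names are hypothetical and arrays stand in for v256 and v128: bit 0 of imm8 chooses which 128-bit half of a is replaced by b, and the other half is copied through unchanged.

```csharp
// Hypothetical scalar model of mm256_insertf128_pd; arrays stand in for v256 and v128.
public static class InsertModel
{
    public static double[] InsertF128Pd(double[] a, double[] b, int imm8)
    {
        // Start from a copy of a, so the untouched half comes from a.
        var dst = (double[])a.Clone();
        // imm8[0] selects the half to overwrite: 0 = low, 1 = high.
        int offset = (imm8 & 1) * 2;
        dst[offset] = b[0];
        dst[offset + 1] = b[1];
        return dst;
    }
}
```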

mm256_insertf128_ps(v256, v128, Int32)

Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.

Declaration

public static v256 mm256_insertf128_ps(v256 a, v128 b, int imm8)

Parameters

Type	Name	Description
v256	a
v128	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8

mm256_insertf128_si256(v256, v128, Int32)

Copy a to dst, then insert 128 bits of integer data from b into dst at the location specified by imm8.

Declaration

public static v256 mm256_insertf128_si256(v256 a, v128 b, int imm8)

Parameters

Type	Name	Description
v256	a
v128	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8

mm256_lddqu_si256(Void*)

Load 256-bits of integer data from unaligned memory into dst. This intrinsic may perform better than mm256_loadu_si256 when the data crosses a cache line boundary.

Declaration

public static v256 mm256_lddqu_si256(void *mem_addr)

Parameters

Type	Name	Description
Void*	mem_addr

Returns

Type	Description
v256

Remarks

**** VLDDQU ymm1, v256

mm256_load_pd(Void*)

Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory.

Declaration

public static v256 mm256_load_pd(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

mm256_load_ps(Void*)

Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory

Declaration

public static v256 mm256_load_ps(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

mm256_load_si256(Void*)

Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.

Declaration

public static v256 mm256_load_si256(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVDQU ymm1, v256 Burst only generates unaligned loads.

mm256_loadu_pd(Void*)

Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory

Declaration

public static v256 mm256_loadu_pd(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

mm256_loadu_ps(Void*)

Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory

Declaration

public static v256 mm256_loadu_ps(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

mm256_loadu_si256(Void*)

Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.

Declaration

public static v256 mm256_loadu_si256(void *ptr)

Parameters

Type	Name	Description
Void*	ptr

Returns

Type	Description
v256

Remarks

**** VMOVDQU ymm1, v256 Burst only generates unaligned loads.

mm256_loadu2_m128(Void*, Void*)

Load two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static v256 mm256_loadu2_m128(void *hiaddr, void *loaddr)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr

Returns

Type	Description
v256

Remarks

This is a composite function which can generate more than one instruction.
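As a rough illustration of how the two halves are combined, the following scalar sketch models memory as a flat float array. The element offsets `hiIndex` and `loIndex` are hypothetical stand-ins for hiaddr and loaddr, and the class name is illustrative only.

```csharp
using System;

static class LoadU2Sketch
{
    // Scalar model of mm256_loadu2_m128: index 0 of the result is the lowest element.
    public static float[] LoadU2M128(float[] memory, int hiIndex, int loIndex)
    {
        var dst = new float[8];
        Array.Copy(memory, loIndex, dst, 0, 4); // low 128 bits come from loaddr
        Array.Copy(memory, hiIndex, dst, 4, 4); // high 128 bits come from hiaddr
        return dst;
    }
}
```

Note that the high address is the first argument, so the argument order is the reverse of the element order in the result.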

mm256_loadu2_m128d(Void*, Void*)

Load two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static v256 mm256_loadu2_m128d(void *hiaddr, void *loaddr)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr

Returns

Type	Description
v256

Remarks

This is a composite function which can generate more than one instruction.

mm256_loadu2_m128i(Void*, Void*)

Load two 128-bit values (composed of integer data) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static v256 mm256_loadu2_m128i(void *hiaddr, void *loaddr)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr

Returns

Type	Description
v256

Remarks

This is a composite function which can generate more than one instruction.

mm256_maskload_pd(Void*, v256)

Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

Declaration

public static v256 mm256_maskload_pd(void *mem_addr, v256 mask)

Parameters

Type	Name	Description
Void*	mem_addr
v256	mask

Returns

Type	Description
v256

Remarks

**** VMASKMOVPD ymm1, ymm2, v256

mm256_maskload_ps(Void*, v256)

Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

Declaration

public static v256 mm256_maskload_ps(void *mem_addr, v256 mask)

Parameters

Type	Name	Description
Void*	mem_addr
v256	mask

Returns

Type	Description
v256

Remarks

**** VMASKMOVPS ymm1, ymm2, v256
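The masking rule can be sketched in scalar code. This is a model of the behaviour described above, not Burst's implementation: the mask is represented as eight 32-bit integers, so "high bit set" is the same as the integer being negative. The names are hypothetical.

```csharp
static class MaskLoadSketch
{
    // Scalar model of mm256_maskload_ps.
    public static float[] MaskLoadPs(float[] memory, int[] mask)
    {
        var dst = new float[8];
        for (int i = 0; i < 8; i++)
        {
            bool highBitSet = mask[i] < 0;         // bit 31 of the mask element
            dst[i] = highBitSet ? memory[i] : 0f;  // unselected elements are zeroed
        }
        return dst;
    }
}
```

Only the top bit of each mask element matters; a mask value of 1 does not select its element.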

mm256_maskstore_pd(Void*, v256, v256)

Store packed double-precision (64-bit) floating-point elements from a into memory using mask (elements are not stored when the high bit of the corresponding mask element is not set).

Declaration

public static void mm256_maskstore_pd(void *mem_addr, v256 mask, v256 a)

Parameters

Type	Name	Description
Void*	mem_addr
v256	mask
v256	a

Remarks

**** VMASKMOVPD v256, ymm1, ymm2

mm256_maskstore_ps(Void*, v256, v256)

Store packed single-precision (32-bit) floating-point elements from a into memory using mask (elements are not stored when the high bit of the corresponding mask element is not set).

Declaration

public static void mm256_maskstore_ps(void *mem_addr, v256 mask, v256 a)

Parameters

Type	Name	Description
Void*	mem_addr
v256	mask
v256	a

Remarks

**** VMASKMOVPS v256, ymm1, ymm2
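A scalar sketch of the masked store, under the same modelling assumptions as for a masked load (mask as eight 32-bit integers, memory as a float array, hypothetical names):

```csharp
static class MaskStoreSketch
{
    // Scalar model of mm256_maskstore_ps: only selected elements are written.
    public static void MaskStorePs(float[] memory, int[] mask, float[] a)
    {
        for (int i = 0; i < 8; i++)
        {
            if (mask[i] < 0)          // high bit of the mask element is set
            {
                memory[i] = a[i];
            }
            // otherwise memory[i] keeps its previous contents
        }
    }
}
```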

mm256_max_pd(v256, v256)

Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.

Declaration

public static v256 mm256_max_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMAXPD ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination

mm256_max_ps(v256, v256)

Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.

Declaration

public static v256 mm256_max_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMAXPS ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination

mm256_min_pd(v256, v256)

Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.

Declaration

public static v256 mm256_min_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMINPD ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination

mm256_min_ps(v256, v256)

Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.

Declaration

public static v256 mm256_min_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMINPS ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination

mm256_movedup_pd(v256)

Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.

Declaration

public static v256 mm256_movedup_pd(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VMOVDDUP ymm1, ymm2/v256

mm256_movehdup_ps(v256)

Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

Declaration

public static v256 mm256_movehdup_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VMOVSHDUP ymm1, ymm2/v256

mm256_moveldup_ps(v256)

Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

Declaration

public static v256 mm256_moveldup_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VMOVSLDUP ymm1, ymm2/v256
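The even and odd duplication performed by mm256_moveldup_ps and mm256_movehdup_ps can be sketched as follows. This is a scalar model with hypothetical names, where index 0 is the lowest element of the vector.

```csharp
static class MoveDupSketch
{
    // Model of mm256_moveldup_ps: each pair takes the value of its even-indexed element.
    public static float[] MoveLDupPs(float[] a)
    {
        var dst = new float[8];
        for (int i = 0; i < 8; i++) dst[i] = a[i & ~1];
        return dst;
    }

    // Model of mm256_movehdup_ps: each pair takes the value of its odd-indexed element.
    public static float[] MoveHDupPs(float[] a)
    {
        var dst = new float[8];
        for (int i = 0; i < 8; i++) dst[i] = a[i | 1];
        return dst;
    }
}
```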

mm256_movemask_pd(v256)

Set each bit of mask dst based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.

Declaration

public static int mm256_movemask_pd(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
Int32

Remarks

**** VMOVMSKPD r32, ymm2 Extracts the sign bits from the packed double-precision floating-point values in the source operand, formats them into a 4-bit mask, and stores the mask in the destination

mm256_movemask_ps(v256)

Set each bit of mask dst based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.

Declaration

public static int mm256_movemask_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
Int32

Remarks

**** VMOVMSKPS r32, ymm2 Extracts the sign bits from the packed single-precision floating-point values in the source operand, formats them into an 8-bit mask, and stores the mask in the destination.
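A scalar sketch of the mask construction, assuming element i of the vector maps to bit i of the result. The class and method names are hypothetical.

```csharp
using System;

static class MoveMaskSketch
{
    // Scalar model of mm256_movemask_ps.
    public static int MoveMaskPs(float[] a)
    {
        int mask = 0;
        for (int i = 0; i < 8; i++)
        {
            // Reinterpret the float as a 32-bit integer; a negative integer means the sign bit is set.
            int bits = BitConverter.ToInt32(BitConverter.GetBytes(a[i]), 0);
            if (bits < 0) mask |= 1 << i;
        }
        return mask;
    }
}
```

A common use is to test the result of a comparison: a mask of 0 means no element matched, and 0xFF means all eight did.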

mm256_mul_pd(v256, v256)

Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_mul_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMULPD ymm1, ymm2, ymm3/v256 Performs a SIMD multiply of the four packed double-precision floating-point values in the first source operand by those in the second source operand, and stores the packed double-precision floating-point results in the destination.

mm256_mul_ps(v256, v256)

Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_mul_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VMULPS ymm1, ymm2, ymm3/v256 Performs a SIMD multiply of the eight packed single-precision floating-point values in the first source operand by those in the second source operand, and stores the packed single-precision floating-point results in the destination.

mm256_or_pd(v256, v256)

Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_or_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VORPD ymm1, ymm2, ymm3/v256 Performs a bitwise logical OR of the four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination

mm256_or_ps(v256, v256)

Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_or_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VORPS ymm1, ymm2, ymm3/v256 Performs a bitwise logical OR of the eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination

mm256_permute_pd(v256, Int32)

Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

Declaration

public static v256 mm256_permute_pd(v256 a, int imm8)

Parameters

Type	Name	Description
v256	a
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VPERMILPD ymm1, ymm2/v256, imm8 Permute double-precision floating-point values in the first source operand using four 1-bit control fields in the low 4 bits of the 8-bit immediate, and store the results in the destination.

mm256_permute_ps(v256, Int32)

Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

Declaration

public static v256 mm256_permute_ps(v256 a, int imm8)

Parameters

Type	Name	Description
v256	a
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VPERMILPS ymm1, ymm2/v256, imm8 Permute Single-Precision Floating-Point values in the first source operand using four 2-bit control fields in the 8-bit immediate and store results in the destination
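The way the four 2-bit control fields select elements can be sketched in scalar code. This is a model under the assumption that the same immediate is applied to both 128-bit lanes, with hypothetical names; index 0 is the lowest element.

```csharp
static class PermutePsSketch
{
    // Scalar model of mm256_permute_ps.
    public static float[] PermutePs(float[] a, int imm8)
    {
        var dst = new float[8];
        for (int lane = 0; lane < 8; lane += 4)       // two 128-bit lanes of 4 floats
        {
            for (int j = 0; j < 4; j++)
            {
                int sel = (imm8 >> (2 * j)) & 3;      // 2-bit field j picks a source element
                dst[lane + j] = a[lane + sel];        // selection never crosses the lane
            }
        }
        return dst;
    }
}
```

For example, imm8 = 0x1B reverses the four elements inside each lane, and imm8 = 0x00 broadcasts the lowest element of each lane.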

mm256_permute2f128_pd(v256, v256, Int32)

Shuffle 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

Declaration

public static v256 mm256_permute2f128_pd(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8 Permute 128 bit floating-point-containing fields from the first source operand and second source operand using bits in the 8-bit immediate and store results in the destination
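The immediate encoding is not spelled out above. The following scalar sketch assumes the layout documented for the VPERM2F128 instruction: imm8[1:0] selects the source of the low half, imm8[5:4] the source of the high half, and imm8[3] or imm8[7] zeroes the respective half. The class and method names are hypothetical.

```csharp
static class Permute2F128Sketch
{
    // control[1:0]: 0 = a low, 1 = a high, 2 = b low, 3 = b high; control[3] zeroes the result.
    static double[] Select(double[] a, double[] b, int control)
    {
        if ((control & 8) != 0) return new double[2];
        double[] src = (control & 2) == 0 ? a : b;
        int offset = (control & 1) * 2;
        return new[] { src[offset], src[offset + 1] };
    }

    // Scalar model of mm256_permute2f128_pd.
    public static double[] Permute2F128Pd(double[] a, double[] b, int imm8)
    {
        double[] lo = Select(a, b, imm8 & 0xF);
        double[] hi = Select(a, b, (imm8 >> 4) & 0xF);
        return new[] { lo[0], lo[1], hi[0], hi[1] };
    }
}
```

Under this model, imm8 = 0x20 concatenates the low halves of a and b, and imm8 = 0x31 concatenates their high halves.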

mm256_permute2f128_ps(v256, v256, Int32)

Shuffle 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

Declaration

public static v256 mm256_permute2f128_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8

mm256_permute2f128_si256(v256, v256, Int32)

Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.

Declaration

public static v256 mm256_permute2f128_si256(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8

mm256_permutevar_pd(v256, v256)

Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

Declaration

public static v256 mm256_permutevar_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VPERMILPD ymm1, ymm2, ymm3/v256 Permute double-precision floating-point values in the first source operand using the 1-bit control field in bit 1 of each 64-bit element of the second source operand, and store the results in the destination.

mm256_permutevar_ps(v256, v256)

Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

Declaration

public static v256 mm256_permutevar_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VPERMILPS ymm1, ymm2, ymm3/v256 Permute single-precision floating-point values in the first source operand using the 2-bit control fields in the low 2 bits of the corresponding 32-bit elements of the shuffle control (the second source operand), and store the results in the destination.

mm256_rcp_ps(v256)

Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

Declaration

public static v256 mm256_rcp_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VRCPPS ymm1, ymm2/v256

mm256_round_pd(v256, Int32)

Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements in dst.

Declaration

public static v256 mm256_round_pd(v256 a, int rounding)

Parameters

Type	Name	Description
v256	a
Int32	rounding

Returns

Type	Description
v256

Remarks

**** VROUNDPD ymm1, ymm2/v256, imm8 Rounding is done according to the rounding parameter, which can be one of:

(_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC) // round to nearest, and suppress exceptions
(_MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC) // round down, and suppress exceptions
(_MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC) // round up, and suppress exceptions
(_MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC) // truncate, and suppress exceptions
_MM_FROUND_CUR_DIRECTION // use MXCSR.RC; see _MM_SET_ROUNDING_MODE
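The rounding modes can be illustrated with a scalar sketch. The constant values mirror the Intel _MM_FROUND_* definitions, but the C# names below are hypothetical and are not Burst identifiers. The sketch also assumes MXCSR.RC is at its default (round to nearest even) when _MM_FROUND_CUR_DIRECTION is used.

```csharp
using System;

static class RoundSketch
{
    public const int ToNearestInt = 0x00;
    public const int ToNegInf = 0x01;
    public const int ToPosInf = 0x02;
    public const int ToZero = 0x03;
    public const int CurDirection = 0x04;
    public const int NoExc = 0x08;   // only affects exception signalling, not the result

    // Scalar model of the per-element rounding done by mm256_round_pd.
    public static double RoundOne(double x, int rounding)
    {
        if ((rounding & CurDirection) != 0)
            return Math.Round(x, MidpointRounding.ToEven); // assumed default MXCSR.RC
        switch (rounding & 3)
        {
            case ToNegInf: return Math.Floor(x);
            case ToPosInf: return Math.Ceiling(x);
            case ToZero: return Math.Truncate(x);
            default: return Math.Round(x, MidpointRounding.ToEven);
        }
    }
}
```

Round to nearest uses ties-to-even, so 2.5 rounds to 2 rather than 3.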

mm256_round_ps(v256, Int32)

Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements in dst.

Declaration

public static v256 mm256_round_ps(v256 a, int rounding)

Parameters

Type	Name	Description
v256	a
Int32	rounding

Returns

Type	Description
v256

Remarks

**** VROUNDPS ymm1, ymm2/v256, imm8 Round the eight single-precision floating-point values in the source operand by the rounding mode specified in the immediate operand and place the result in the destination. The rounding process rounds the input to an integral value and returns the result as a single-precision floating-point value. The Precision Floating Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN.

mm256_rsqrt_ps(v256)

Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

Declaration

public static v256 mm256_rsqrt_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VRSQRTPS ymm1, ymm2/v256

mm256_set_epi16(Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16)

Set packed short elements in dst with the supplied values.

Declaration

public static v256 mm256_set_epi16(short e15_, short e14_, short e13_, short e12_, short e11_, short e10_, short e9_, short e8_, short e7_, short e6_, short e5_, short e4_, short e3_, short e2_, short e1_, short e0_)

Parameters

Type	Name	Description
Int16	e15_
Int16	e14_
Int16	e13_
Int16	e12_
Int16	e11_
Int16	e10_
Int16	e9_
Int16	e8_
Int16	e7_
Int16	e6_
Int16	e5_
Int16	e4_
Int16	e3_
Int16	e2_
Int16	e1_
Int16	e0_

Returns

Type	Description
v256

mm256_set_epi32(Int32, Int32, Int32, Int32, Int32, Int32, Int32, Int32)

Set packed int elements in dst with the supplied values.

Declaration

public static v256 mm256_set_epi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)

Parameters

Type	Name	Description
Int32	e7
Int32	e6
Int32	e5
Int32	e4
Int32	e3
Int32	e2
Int32	e1
Int32	e0

Returns

Type	Description
v256

mm256_set_epi64x(Int64, Int64, Int64, Int64)

Set packed 64-bit integers in dst with the supplied values.

Declaration

public static v256 mm256_set_epi64x(long e3, long e2, long e1, long e0)

Parameters

Type	Name	Description
Int64	e3
Int64	e2
Int64	e1
Int64	e0

Returns

Type	Description
v256

mm256_set_epi8(Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte)

Set packed byte elements in dst with the supplied values.

Declaration

public static v256 mm256_set_epi8(byte e31_, byte e30_, byte e29_, byte e28_, byte e27_, byte e26_, byte e25_, byte e24_, byte e23_, byte e22_, byte e21_, byte e20_, byte e19_, byte e18_, byte e17_, byte e16_, byte e15_, byte e14_, byte e13_, byte e12_, byte e11_, byte e10_, byte e9_, byte e8_, byte e7_, byte e6_, byte e5_, byte e4_, byte e3_, byte e2_, byte e1_, byte e0_)

Parameters

Type	Name	Description
Byte	e31_
Byte	e30_
Byte	e29_
Byte	e28_
Byte	e27_
Byte	e26_
Byte	e25_
Byte	e24_
Byte	e23_
Byte	e22_
Byte	e21_
Byte	e20_
Byte	e19_
Byte	e18_
Byte	e17_
Byte	e16_
Byte	e15_
Byte	e14_
Byte	e13_
Byte	e12_
Byte	e11_
Byte	e10_
Byte	e9_
Byte	e8_
Byte	e7_
Byte	e6_
Byte	e5_
Byte	e4_
Byte	e3_
Byte	e2_
Byte	e1_
Byte	e0_

Returns

Type	Description
v256

mm256_set_m128(v128, v128)

Set packed v256 vector with the supplied values.

Declaration

public static v256 mm256_set_m128(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_set_m128d(v128, v128)

Set packed v256 vector with the supplied values.

Declaration

public static v256 mm256_set_m128d(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_set_m128i(v128, v128)

Set packed v256 vector with the supplied values.

Declaration

public static v256 mm256_set_m128i(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_set_pd(Double, Double, Double, Double)

Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.

Declaration

public static v256 mm256_set_pd(double d, double c, double b, double a)

Parameters

Type	Name	Description
Double	d
Double	c
Double	b
Double	a

Returns

Type	Description
v256

mm256_set_ps(Single, Single, Single, Single, Single, Single, Single, Single)

Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.

Declaration

public static v256 mm256_set_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)

Parameters

Type	Name	Description
Single	e7
Single	e6
Single	e5
Single	e4
Single	e3
Single	e2
Single	e1
Single	e0

Returns

Type	Description
v256

mm256_set1_epi16(Int16)

Broadcast 16-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastw instruction.

Declaration

public static v256 mm256_set1_epi16(short a)

Parameters

Type	Name	Description
Int16	a

Returns

Type	Description
v256

mm256_set1_epi32(Int32)

Broadcast 32-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastd instruction.

Declaration

public static v256 mm256_set1_epi32(int a)

Parameters

Type	Name	Description
Int32	a

Returns

Type	Description
v256

mm256_set1_epi64x(Int64)

Broadcast 64-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastq instruction.

Declaration

public static v256 mm256_set1_epi64x(long a)

Parameters

Type	Name	Description
Int64	a

Returns

Type	Description
v256

mm256_set1_epi8(Char)

Broadcast 8-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastb instruction.

Declaration

public static v256 mm256_set1_epi8(char a)

Parameters

Type	Name	Description
Char	a

Returns

Type	Description
v256

mm256_set1_pd(Double)

Broadcast double-precision (64-bit) floating-point value a to all elements of dst.

Declaration

public static v256 mm256_set1_pd(double a)

Parameters

Type	Name	Description
Double	a

Returns

Type	Description
v256

mm256_set1_ps(Single)

Broadcast single-precision (32-bit) floating-point value a to all elements of dst.

Declaration

public static v256 mm256_set1_ps(float a)

Parameters

Type	Name	Description
Single	a

Returns

Type	Description
v256

mm256_setr_epi16(Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16)

Set packed short elements in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_epi16(short e15_, short e14_, short e13_, short e12_, short e11_, short e10_, short e9_, short e8_, short e7_, short e6_, short e5_, short e4_, short e3_, short e2_, short e1_, short e0_)

Parameters

Type	Name	Description
Int16	e15_
Int16	e14_
Int16	e13_
Int16	e12_
Int16	e11_
Int16	e10_
Int16	e9_
Int16	e8_
Int16	e7_
Int16	e6_
Int16	e5_
Int16	e4_
Int16	e3_
Int16	e2_
Int16	e1_
Int16	e0_

Returns

Type	Description
v256

mm256_setr_epi32(Int32, Int32, Int32, Int32, Int32, Int32, Int32, Int32)

Set packed int elements in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_epi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)

Parameters

Type	Name	Description
Int32	e7
Int32	e6
Int32	e5
Int32	e4
Int32	e3
Int32	e2
Int32	e1
Int32	e0

Returns

Type	Description
v256
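The difference between mm256_set_epi32 and mm256_setr_epi32 is only the order in which arguments map to elements. The sketch below assumes Burst follows the Intel convention: for set, the last argument becomes the lowest element; for setr, the first argument does. The array index stands for the element index, and the names are hypothetical.

```csharp
static class SetOrderSketch
{
    // Model of mm256_set_epi32: the last argument lands in element 0.
    public static int[] SetEpi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    {
        return new[] { e0, e1, e2, e3, e4, e5, e6, e7 };
    }

    // Model of mm256_setr_epi32: the first argument lands in element 0.
    public static int[] SetrEpi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    {
        return new[] { e7, e6, e5, e4, e3, e2, e1, e0 };
    }
}
```

The setr form is convenient when the arguments are written in memory order.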

mm256_setr_epi64x(Int64, Int64, Int64, Int64)

Set packed 64-bit integers in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_epi64x(long e3, long e2, long e1, long e0)

Parameters

Type	Name	Description
Int64	e3
Int64	e2
Int64	e1
Int64	e0

Returns

Type	Description
v256

mm256_setr_epi8(Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte)

Set packed byte elements in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_epi8(byte e31_, byte e30_, byte e29_, byte e28_, byte e27_, byte e26_, byte e25_, byte e24_, byte e23_, byte e22_, byte e21_, byte e20_, byte e19_, byte e18_, byte e17_, byte e16_, byte e15_, byte e14_, byte e13_, byte e12_, byte e11_, byte e10_, byte e9_, byte e8_, byte e7_, byte e6_, byte e5_, byte e4_, byte e3_, byte e2_, byte e1_, byte e0_)

Parameters

Type	Name	Description
Byte	e31_
Byte	e30_
Byte	e29_
Byte	e28_
Byte	e27_
Byte	e26_
Byte	e25_
Byte	e24_
Byte	e23_
Byte	e22_
Byte	e21_
Byte	e20_
Byte	e19_
Byte	e18_
Byte	e17_
Byte	e16_
Byte	e15_
Byte	e14_
Byte	e13_
Byte	e12_
Byte	e11_
Byte	e10_
Byte	e9_
Byte	e8_
Byte	e7_
Byte	e6_
Byte	e5_
Byte	e4_
Byte	e3_
Byte	e2_
Byte	e1_
Byte	e0_

Returns

Type	Description
v256

mm256_setr_m128(v128, v128)

Set packed v256 vector with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_m128(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_setr_m128d(v128, v128)

Set packed v256 vector with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_m128d(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_setr_m128i(v128, v128)

Set packed v256 vector with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_m128i(v128 hi, v128 lo)

Parameters

Type	Name	Description
v128	hi
v128	lo

Returns

Type	Description
v256

mm256_setr_pd(Double, Double, Double, Double)

Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_pd(double d, double c, double b, double a)

Parameters

Type	Name	Description
Double	d
Double	c
Double	b
Double	a

Returns

Type	Description
v256

mm256_setr_ps(Single, Single, Single, Single, Single, Single, Single, Single)

Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.

Declaration

public static v256 mm256_setr_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)

Parameters

Type	Name	Description
Single	e7
Single	e6
Single	e5
Single	e4
Single	e3
Single	e2
Single	e1
Single	e0

Returns

Type	Description
v256

mm256_setzero_pd()

Return a vector with all elements set to zero.

Declaration

public static v256 mm256_setzero_pd()

Returns

Type	Description
v256

mm256_setzero_ps()

Return a vector with all elements set to zero.

Declaration

public static v256 mm256_setzero_ps()

Returns

Type	Description
v256

mm256_setzero_si256()

Return a vector with all elements set to zero.

Declaration

public static v256 mm256_setzero_si256()

Returns

Type	Description
v256

mm256_shuffle_pd(v256, v256, Int32)

Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.

Declaration

public static v256 mm256_shuffle_pd(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VSHUFPD ymm1, ymm2, ymm3/v256, imm8 Moves either of the two packed double-precision floating-point values from each double quadword in the first source operand into the low quadword of each double quadword of the destination; moves either of the two packed double-precision floating-point values from the second source operand into the high quadword of each double quadword of the destination operand. The selector operand determines which values are moved to the destination
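The selection rule in the remark can be sketched in scalar code, assuming bit i of imm8 controls element i of the result: even result elements come from a, odd ones from b, each within its own 128-bit lane. The names are hypothetical.

```csharp
static class ShufflePdSketch
{
    // Scalar model of mm256_shuffle_pd.
    public static double[] ShufflePd(double[] a, double[] b, int imm8)
    {
        var dst = new double[4];
        for (int lane = 0; lane < 4; lane += 2)   // two 128-bit lanes of 2 doubles
        {
            dst[lane] = a[lane + ((imm8 >> lane) & 1)];           // low quadword from a
            dst[lane + 1] = b[lane + ((imm8 >> (lane + 1)) & 1)]; // high quadword from b
        }
        return dst;
    }
}
```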

mm256_shuffle_ps(v256, v256, Int32)

Shuffle single-precision (32-bit) floating-point elements in a and b within 128-bit lanes using the control in imm8, and store the results in dst.

Declaration

public static v256 mm256_shuffle_ps(v256 a, v256 b, int imm8)

Parameters

Type	Name	Description
v256	a
v256	b
Int32	imm8

Returns

Type	Description
v256

Remarks

**** VSHUFPS ymm1, ymm2, ymm3/v256, imm8 Moves two of the four packed single-precision floating-point values from each double qword of the first source operand into the low quadword of each double qword of the destination; moves two of the four packed single-precision floating-point values from each double qword of the second source operand into the high quadword of each double qword of the destination. The selector operand determines which values are moved to the destination.
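A scalar sketch of this selection, assuming the four 2-bit fields of imm8 are applied identically to both 128-bit lanes: the two low result elements of each lane are picked from a and the two high ones from b. The names are hypothetical.

```csharp
static class ShufflePsSketch
{
    // Scalar model of mm256_shuffle_ps.
    public static float[] ShufflePs(float[] a, float[] b, int imm8)
    {
        var dst = new float[8];
        for (int lane = 0; lane < 8; lane += 4)   // two 128-bit lanes of 4 floats
        {
            dst[lane] = a[lane + (imm8 & 3)];
            dst[lane + 1] = a[lane + ((imm8 >> 2) & 3)];
            dst[lane + 2] = b[lane + ((imm8 >> 4) & 3)];
            dst[lane + 3] = b[lane + ((imm8 >> 6) & 3)];
        }
        return dst;
    }
}
```

For example, imm8 = 0x44 takes the two lowest elements of each lane of a followed by the two lowest elements of the same lane of b.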

mm256_sqrt_pd(v256)

Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

Declaration

public static v256 mm256_sqrt_pd(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VSQRTPD ymm1, ymm2/v256

mm256_sqrt_ps(v256)

Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

Declaration

public static v256 mm256_sqrt_ps(v256 a)

Parameters

Type	Name	Description
v256	a

Returns

Type	Description
v256

Remarks

**** VSQRTPS ymm1, ymm2/v256

mm256_store_pd(Void*, v256)

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory

Declaration

public static void mm256_store_pd(void *ptr, v256 a)

Parameters

Type	Name	Description
Void*	ptr
v256	a

Remarks

**** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

mm256_store_ps(Void*, v256)

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from val into memory.

Declaration

public static void mm256_store_ps(void *ptr, v256 val)

Parameters

Type	Name	Description
Void*	ptr
v256	val

Remarks

**** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

mm256_store_si256(Void*, v256)

Store 256-bits (composed of 8 packed 32-bit integer elements) from v into memory.

Declaration

public static void mm256_store_si256(void *ptr, v256 v)

Parameters

Type	Name	Description
Void*	ptr
v256	v

Remarks

**** VMOVDQU v256, ymm1 Burst only generates unaligned stores.

mm256_storeu_pd(Void*, v256)

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory

Declaration

public static void mm256_storeu_pd(void *ptr, v256 a)

Parameters

Type	Name	Description
Void*	ptr
v256	a

Remarks

**** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

mm256_storeu_ps(Void*, v256)

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory

Declaration

public static void mm256_storeu_ps(void *ptr, v256 a)

Parameters

Type	Name	Description
Void*	ptr
v256	a

Remarks

**** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

mm256_storeu_si256(Void*, v256)

Store 256-bits (composed of 8 packed 32-bit integer elements) from v into memory.

Declaration

public static void mm256_storeu_si256(void *ptr, v256 v)

Parameters

Type	Name	Description
Void*	ptr
v256	v

Remarks

**** VMOVDQU v256, ymm1 Burst only generates unaligned stores.

mm256_storeu2_m128(Void*, Void*, v256)

Store the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static void mm256_storeu2_m128(void *hiaddr, void *loaddr, v256 val)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr
v256	val

Remarks

This is a composite function which can generate more than one instruction.

mm256_storeu2_m128d(Void*, Void*, v256)

Store the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static void mm256_storeu2_m128d(void *hiaddr, void *loaddr, v256 val)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr
v256	val

Remarks

This is a composite function which can generate more than one instruction.

mm256_storeu2_m128i(Void*, Void*, v256)

Store the high and low 128-bit halves (each composed of integer data) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

Declaration

public static void mm256_storeu2_m128i(void *hiaddr, void *loaddr, v256 val)

Parameters

Type	Name	Description
Void*	hiaddr
Void*	loaddr
v256	val

Remarks

This is a composite function which can generate more than one instruction.

mm256_stream_pd(Void*, v256)

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

Declaration

public static void mm256_stream_pd(void *mem_addr, v256 a)

Parameters

Type	Name	Description
Void*	mem_addr
v256	a

Remarks

**** VMOVNTPD v256, ymm1

mm256_stream_ps(Void*, v256)

Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

Declaration

public static void mm256_stream_ps(void *mem_addr, v256 a)

Parameters

Type	Name	Description
Void*	mem_addr
v256	a

Remarks

**** VMOVNTPS v256, ymm1
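
A sketch of a streaming fill follows. It is illustrative only: StreamExample.FillStreaming is a hypothetical helper, and it assumes the caller has checked IsAvxSupported, that dst is 32-byte aligned as required above, and that count is a multiple of 8.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static unsafe class StreamExample
{
    // Hypothetical helper: fills `count` floats at dst with `value`
    // using non-temporal stores.
    // Assumptions: dst is 32-byte aligned, count is a multiple of 8,
    // and the caller has checked X86.Avx.IsAvxSupported.
    public static void FillStreaming(float* dst, int count, float value)
    {
        v256 v = new v256(value);
        for (int i = 0; i < count; i += 8)
        {
            // dst + i stays 32-byte aligned because each step is 8 floats (32 bytes).
            mm256_stream_ps(dst + i, v);
        }
    }
}
```

One way to obtain suitably aligned memory in Unity is UnsafeUtility.Malloc with an alignment argument of 32. Non-temporal stores tend to suit large buffers that will not be read again soon.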

mm256_stream_si256(Void*, v256)

Store 256-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

Declaration

public static void mm256_stream_si256(void *mem_addr, v256 a)

Parameters

Type	Name	Description
Void*	mem_addr
v256	a

Remarks

**** VMOVNTDQ v256, ymm1

mm256_sub_pd(v256, v256)

Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

Declaration

public static v256 mm256_sub_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VSUBPD ymm1, ymm2, ymm3/v256 Performs an SIMD subtract of the four packed double-precision floating-point values in the second source operand from the first source operand, and stores the packed double-precision floating-point results in the destination.

mm256_sub_ps(v256, v256)

Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

Declaration

public static v256 mm256_sub_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VSUBPS ymm1, ymm2, ymm3/v256 Performs an SIMD subtract of the eight packed single-precision floating-point values in the second source operand from the first source operand, and stores the packed single-precision floating-point results in the destination.
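
The operand order (a minus b) is easy to confirm with a small sketch. SubExample.Difference is a hypothetical name, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class SubExample
{
    // Hypothetical helper: returns a - b for two fixed vectors.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static v256 Difference()
    {
        v256 a = new v256(10f, 20f, 30f, 40f, 50f, 60f, 70f, 80f);
        v256 b = new v256(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        // Each lane holds a[i] - b[i]: 9, 18, 27, 36, 45, 54, 63, 72.
        return mm256_sub_ps(a, b);
    }
}
```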

mm256_testc_pd(v256, v256)

Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

Declaration

public static int mm256_testc_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD ymm1, ymm2/v256

mm256_testc_ps(v256, v256)

Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

Declaration

public static int mm256_testc_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS ymm1, ymm2/v256

mm256_testc_si256(v256, v256)

Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.

Declaration

public static int mm256_testc_si256(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32
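
Because CF is set when (NOT a) AND b is all zeros, the return value answers the question "is every bit set in b also set in a?". The sketch below wraps that reading in a hypothetical helper (MaskExample.ContainsAll) and assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class MaskExample
{
    // Hypothetical helper: true when every bit set in `required`
    // is also set in `flags`.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static bool ContainsAll(v256 flags, v256 required)
    {
        // CF = 1 exactly when (~flags & required) == 0.
        return mm256_testc_si256(flags, required) != 0;
    }
}
```

With flags = new v256(0xFFu) and required = new v256(0x0Fu) the helper returns true; swapping the two arguments returns false.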

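mm256_testnzc_pd(v256, v256)

Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.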

Declaration

public static int mm256_testnzc_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD ymm1, ymm2/v256

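mm256_testnzc_ps(v256, v256)

Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.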

Declaration

public static int mm256_testnzc_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS ymm1, ymm2/v256

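mm256_testnzc_si256(v256, v256)

Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.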

Declaration

public static int mm256_testnzc_si256(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

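mm256_testz_pd(v256, v256)

Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.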

Declaration

public static int mm256_testz_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD ymm1, ymm2/v256

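mm256_testz_ps(v256, v256)

Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.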

Declaration

public static int mm256_testz_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS ymm1, ymm2/v256

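mm256_testz_si256(v256, v256)

Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the ZF value.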

Declaration

public static int mm256_testz_si256(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
Int32
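
Assuming the usual semantics of the underlying test instruction (the result is 1 when a AND b is all zeros), two common uses can be sketched as follows. The names ZeroTestExample, IsAllZero, and AreDisjoint are hypothetical, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class ZeroTestExample
{
    // Hypothetical helper: true when all 256 bits of v are zero.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static bool IsAllZero(v256 v)
    {
        // v AND v equals v, so ZF reports whether v itself is zero.
        return mm256_testz_si256(v, v) != 0;
    }

    // Hypothetical helper: true when a and b share no set bits.
    public static bool AreDisjoint(v256 a, v256 b)
    {
        return mm256_testz_si256(a, b) != 0;
    }
}
```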

mm256_undefined_pd()

Return a 256-bit vector with undefined contents.

Declaration

public static v256 mm256_undefined_pd()

Returns

Type	Description
v256

mm256_undefined_ps()

Return a 256-bit vector with undefined contents.

Declaration

public static v256 mm256_undefined_ps()

Returns

Type	Description
v256

mm256_undefined_si256()

Return a 256-bit vector with undefined contents.

Declaration

public static v256 mm256_undefined_si256()

Returns

Type	Description
v256

mm256_unpackhi_pd(v256, v256)

Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

Declaration

public static v256 mm256_unpackhi_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VUNPCKHPD ymm1, ymm2, ymm3/v256

mm256_unpackhi_ps(v256, v256)

Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

Declaration

public static v256 mm256_unpackhi_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VUNPCKHPS ymm1, ymm2, ymm3/v256

mm256_unpacklo_pd(v256, v256)

Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

Declaration

public static v256 mm256_unpacklo_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VUNPCKLPD ymm1, ymm2, ymm3/v256

mm256_unpacklo_ps(v256, v256)

Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

Declaration

public static v256 mm256_unpacklo_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VUNPCKLPS ymm1,ymm2,ymm3/v256
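
The per-lane behaviour of the unpack intrinsics is easiest to see with concrete values. The sketch below (UnpackExample.Interleave, a hypothetical name) assumes the caller has checked IsAvxSupported. Note that the interleave works within each 128-bit lane separately, not across the whole 256-bit vector.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class UnpackExample
{
    // Hypothetical helper showing which elements each unpack picks.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static void Interleave(out v256 lo, out v256 hi)
    {
        v256 a = new v256(0f, 1f, 2f, 3f, 4f, 5f, 6f, 7f);
        v256 b = new v256(10f, 11f, 12f, 13f, 14f, 15f, 16f, 17f);

        // Low half of each lane:  0, 10, 1, 11 | 4, 14, 5, 15
        lo = mm256_unpacklo_ps(a, b);

        // High half of each lane: 2, 12, 3, 13 | 6, 16, 7, 17
        hi = mm256_unpackhi_ps(a, b);
    }
}
```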

mm256_xor_pd(v256, v256)

Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_xor_pd(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VXORPD ymm1, ymm2, ymm3/v256 Performs a bitwise logical XOR of the four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination.

mm256_xor_ps(v256, v256)

Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

Declaration

public static v256 mm256_xor_ps(v256 a, v256 b)

Parameters

Type	Name	Description
v256	a
v256	b

Returns

Type	Description
v256

Remarks

**** VXORPS ymm1, ymm2, ymm3/v256 Performs a bitwise logical XOR of the eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination.
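
A common use of a floating-point XOR is flipping sign bits. The following sketch negates all eight lanes by XORing with a mask that has only the top bit of each 32-bit element set. XorExample.Negate is a hypothetical name, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class XorExample
{
    // Hypothetical helper: flips the sign of every single-precision lane.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static v256 Negate(v256 x)
    {
        // 0x80000000 is the IEEE 754 sign bit of a 32-bit float.
        v256 signMask = new v256(0x80000000u);
        return mm256_xor_ps(x, signMask);
    }
}
```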

mm256_zeroall()

Zero the contents of all YMM registers.

Declaration

public static void mm256_zeroall()

Remarks

**** VZEROALL

mm256_zeroupper()

Zero the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.

Declaration

public static void mm256_zeroupper()

Remarks

**** VZEROUPPER

mm256_zextpd128_pd256(v128)

Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

Declaration

public static v256 mm256_zextpd128_pd256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

mm256_zextps128_ps256(v128)

Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

Declaration

public static v256 mm256_zextps128_ps256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256

mm256_zextsi128_si256(v128)

Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

Declaration

public static v256 mm256_zextsi128_si256(v128 a)

Parameters

Type	Name	Description
v128	a

Returns

Type	Description
v256
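
The zero-extension can be checked with a small sketch. ZextExample.Widen is a hypothetical name, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class ZextExample
{
    // Hypothetical helper: widens a 128-bit integer vector to 256 bits.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static v256 Widen()
    {
        v128 lo = new v128(1, 2, 3, 4);
        // Lower four 32-bit elements are 1, 2, 3, 4; upper four are 0.
        return mm256_zextsi128_si256(lo);
    }
}
```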

permute_pd(v128, Int32)

Shuffle double-precision (64-bit) floating-point elements in a using the control in imm8, and store the results in dst.

Declaration

public static v128 permute_pd(v128 a, int imm8)

Parameters

Type	Name	Description
v128	a
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VPERMILPD xmm1, xmm2/v128, imm8 Permute double-precision floating-point values in the first source operand using two 1-bit control fields in the low 2 bits of the 8-bit immediate and store results in the destination.

permute_ps(v128, Int32)

Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst.

Declaration

public static v128 permute_ps(v128 a, int imm8)

Parameters

Type	Name	Description
v128	a
Int32	imm8

Returns

Type	Description
v128

Remarks

**** VPERMILPS xmm1, xmm2/v128, imm8 Permute single-precision floating-point values in the first source operand using four 2-bit control fields in the 8-bit immediate and store results in the destination.
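
To make the imm8 encoding concrete, the sketch below reverses the four elements of a vector. Each 2-bit field of imm8, starting from the lowest bits, selects the source element for destination elements 0 through 3. PermuteExample.Reverse is a hypothetical name, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class PermuteExample
{
    // Hypothetical helper: returns (a3, a2, a1, a0).
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static v128 Reverse(v128 a)
    {
        // 0x1B = 0b00_01_10_11
        //   bits 1:0 = 3 -> dst[0] = a[3]
        //   bits 3:2 = 2 -> dst[1] = a[2]
        //   bits 5:4 = 1 -> dst[2] = a[1]
        //   bits 7:6 = 0 -> dst[3] = a[0]
        return permute_ps(a, 0x1B);
    }
}
```

The immediate is written as a literal here because the instruction encodes it directly.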

permutevar_pd(v128, v128)

Shuffle double-precision (64-bit) floating-point elements in a using the control in b, and store the results in dst.

Declaration

public static v128 permutevar_pd(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
v128

Remarks

**** VPERMILPD xmm1, xmm2, xmm3/v128 Permute double-precision floating-point values in the first source operand using bit 1 of each corresponding 64-bit element of the second source operand as the control, and store results in the destination.

permutevar_ps(v128, v128)

Shuffle single-precision (32-bit) floating-point elements in a using the control in b, and store the results in dst.

Declaration

public static v128 permutevar_ps(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
v128

Remarks

**** VPERMILPS xmm1, xmm2, xmm3/v128 Permute single-precision floating-point values in the first source operand using the low 2 bits of each corresponding 32-bit element of the second source operand (the shuffle control), and store results in the destination.

testc_pd(v128, v128)

Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

Declaration

public static int testc_pd(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD xmm1, xmm2/v128

testc_ps(v128, v128)

Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

Declaration

public static int testc_ps(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS xmm1, xmm2/v128

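testnzc_pd(v128, v128)

Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.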

Declaration

public static int testnzc_pd(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD xmm1, xmm2/v128

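testnzc_ps(v128, v128)

Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.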

Declaration

public static int testnzc_ps(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS xmm1, xmm2/v128

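testz_pd(v128, v128)

Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.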

Declaration

public static int testz_pd(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPD xmm1, xmm2/v128

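testz_ps(v128, v128)

Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.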

Declaration

public static int testz_ps(v128 a, v128 b)

Parameters

Type	Name	Description
v128	a
v128	b

Returns

Type	Description
Int32

Remarks

**** VTESTPS xmm1, xmm2/v128
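
Since the floating-point test variants look only at sign bits, passing the same vector twice gives a quick "no sign bit set" check. The sketch below assumes the usual semantics of the underlying instruction (the result is 1 when the sign bit of every element of a AND b is zero). SignTestExample.NoSignBitsSet is a hypothetical name, and the code assumes the caller has checked IsAvxSupported.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class SignTestExample
{
    // Hypothetical helper: true when no element of a has its sign bit set.
    // Assumes the caller has checked X86.Avx.IsAvxSupported.
    public static bool NoSignBitsSet(v128 a)
    {
        // a AND a equals a, so ZF reflects the sign bits of a itself.
        return testz_ps(a, a) != 0;
    }
}
```

Negative zero also has its sign bit set, so this is a sign-bit check rather than a strict "all values >= 0" comparison.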

undefined_pd()

Return a 128-bit vector with undefined contents.

Declaration

public static v128 undefined_pd()

Returns

Type	Description
v128

undefined_ps()

Return a 128-bit vector with undefined contents.

Declaration

public static v128 undefined_ps()

Returns

Type	Description
v128

undefined_si128()

Return a 128-bit vector with undefined contents.

Declaration

public static v128 undefined_si128()

Returns

Type	Description
v128