Class X86.Avx | Burst | 1.3.9

    Class X86.Avx

    AVX intrinsics

    Inheritance
    Object
    X86.Avx
    Inherited Members
    Object.Equals(Object)
    Object.Equals(Object, Object)
    Object.GetHashCode()
    Object.GetType()
    Object.MemberwiseClone()
    Object.ReferenceEquals(Object, Object)
    Object.ToString()
    Namespace: Unity.Burst.Intrinsics
    Syntax
    public static class Avx

    Properties

    IsAvxSupported

    Evaluates to true at compile time if AVX intrinsics are supported.

    Declaration
    public static bool IsAvxSupported { get; }
    Property Value
    Type Description
    Boolean
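
    A minimal sketch of guarding an AVX code path with this property. The class and method names (AvxGuardExample, AddOne) are hypothetical, and the sketch assumes a Unity project with the Burst package installed.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class AvxGuardExample
{
    // Hypothetical helper: adds 1.0f to each of the eight float lanes.
    public static v256 AddOne(v256 x)
    {
        if (IsAvxSupported)
        {
            // AVX path: a single 256-bit packed add.
            return mm256_add_ps(x, new v256(1.0f));
        }

        // Fallback path: plain scalar arithmetic on the v256 fields.
        return new v256(
            x.Float0 + 1.0f, x.Float1 + 1.0f, x.Float2 + 1.0f, x.Float3 + 1.0f,
            x.Float4 + 1.0f, x.Float5 + 1.0f, x.Float6 + 1.0f, x.Float7 + 1.0f);
    }
}
```

    Both branches produce the same lanes, so callers do not need to know which one was compiled in.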

    Methods

    broadcast_ss(Void*)

    Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

    Declaration
    public static v128 broadcast_ss(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v128
    Remarks

    VBROADCASTSS xmm1, m32

    cmp_pd(v128, v128, Int32)

    Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    Declaration
    public static v128 cmp_pd(v128 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VCMPPD xmm1, xmm2, xmm3/m128, imm8
    Performs a SIMD compare of the two packed double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each pair of packed values. For the 128-bit intrinsic with compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

    cmp_ps(v128, v128, Int32)

    Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    Declaration
    public static v128 cmp_ps(v128 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VCMPPS xmm1, xmm2, xmm3/m128, imm8
    Performs a SIMD compare of the four packed single-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each pair of packed values. For the 128-bit intrinsic with compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

    cmp_sd(v128, v128, Int32)

    Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.

    Declaration
    public static v128 cmp_sd(v128 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VCMPSD xmm1, xmm2, xmm3/m64, imm8
    Compares the low double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the result of the comparison to the destination operand (first operand). The comparison predicate operand (immediate operand) specifies the type of comparison performed. For compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.

    cmp_ss(v128, v128, Int32)

    Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.

    Declaration
    public static v128 cmp_ss(v128 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VCMPSS xmm1, xmm2, xmm3/m32, imm8
    Compares the low single-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the result of the comparison to the destination operand (first operand). The comparison predicate operand (immediate operand) specifies the type of comparison performed. For compare predicate values in the range 0-7, the compiler may generate SSE2 instructions if it is warranted for performance reasons.
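
    The compare intrinsics above return a per-lane bit mask rather than a boolean. The following sketch shows this for cmp_ps, using predicate value 1, which is LT_OS (less-than, ordered, signaling) in Intel's VCMPPS predicate encoding. The class and method names are hypothetical.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class CmpExample
{
    // Predicate 1 = LT_OS in Intel's encoding; imm8 must be a compile-time constant.
    const int LessThan = 1;

    public static v128 LessThanMask()
    {
        var a = new v128(1.0f, 5.0f, 3.0f, 7.0f);
        var b = new v128(2.0f, 4.0f, 3.0f, 8.0f);
        // Each 32-bit lane becomes 0xFFFFFFFF where a < b, and 0 otherwise.
        // Lanes here: true, false, false, true.
        return cmp_ps(a, b, LessThan);
    }
}
```

    The resulting mask can be fed directly to blend or mask-load style intrinsics, which only inspect the high bit of each lane.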

    maskload_pd(Void*, v128)

    Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding mask element is not set).

    Declaration
    public static v128 maskload_pd(void *mem_addr, v128 mask)
    Parameters
    Type Name Description
    Void* mem_addr
    v128 mask
    Returns
    Type Description
    v128
    Remarks

    VMASKMOVPD xmm1, xmm2, m128

    maskload_ps(Void*, v128)

    Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding mask element is not set).

    Declaration
    public static v128 maskload_ps(void *mem_addr, v128 mask)
    Parameters
    Type Name Description
    Void* mem_addr
    v128 mask
    Returns
    Type Description
    v128
    Remarks

    VMASKMOVPS xmm1, xmm2, m128

    maskstore_pd(Void*, v128, v128)

    Store packed double-precision (64-bit) floating-point elements from a into memory using mask (elements are not stored when the high bit of the corresponding mask element is not set).

    Declaration
    public static void maskstore_pd(void *mem_addr, v128 mask, v128 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v128 mask
    v128 a
    Remarks

    VMASKMOVPD m128, xmm1, xmm2

    maskstore_ps(Void*, v128, v128)

    Store packed single-precision (32-bit) floating-point elements from a into memory using mask (elements are not stored when the high bit of the corresponding mask element is not set).

    Declaration
    public static void maskstore_ps(void *mem_addr, v128 mask, v128 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v128 mask
    v128 a
    Remarks

    VMASKMOVPS m128, xmm1, xmm2
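
    A sketch of how the masked load and store intrinsics can be combined to touch only part of a four-float block, for example the tail of an array. The names MaskExample and CopyFirstThree are hypothetical, and the project is assumed to allow unsafe code.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static unsafe class MaskExample
{
    // Copies only the first three floats from src to dst.
    public static void CopyFirstThree(float* src, float* dst)
    {
        // -1 sets every bit of a lane, including the high bit that selects it.
        // Lanes 0..2 are selected; lane 3 is not.
        var mask = new v128(-1, -1, -1, 0);

        // Lane 3 of the loaded vector is zero; src[3] is not read.
        v128 loaded = maskload_ps(src, mask);

        // dst[3] keeps whatever value it had before the call.
        maskstore_ps(dst, mask, loaded);
    }
}
```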

    mm256_add_pd(v256, v256)

    Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_add_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_add_ps(v256, v256)

    Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_add_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_addsub_pd(v256, v256)

    Alternately subtract and add packed double-precision (64-bit) floating-point elements: even-indexed elements of dst are a - b, odd-indexed elements are a + b.

    Declaration
    public static v256 mm256_addsub_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_addsub_ps(v256, v256)

    Alternately subtract and add packed single-precision (32-bit) floating-point elements: even-indexed elements of dst are a - b, odd-indexed elements are a + b.

    Declaration
    public static v256 mm256_addsub_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
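
    A small worked example of the alternating pattern used by mm256_addsub_ps: even-indexed lanes hold a - b and odd-indexed lanes hold a + b. The class and method names are hypothetical.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class AddSubExample
{
    public static v256 AddSub()
    {
        var a = new v256(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        var b = new v256(10f); // 10 in every lane

        // Even lanes: a - b. Odd lanes: a + b.
        // Result: (-9, 12, -7, 14, -5, 16, -3, 18)
        return mm256_addsub_ps(a, b);
    }
}
```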

    mm256_and_pd(v256, v256)

    Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_and_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_and_ps(v256, v256)

    Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_and_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_andnot_pd(v256, v256)

    Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b, and store the results in dst.

    Declaration
    public static v256 mm256_andnot_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256

    mm256_andnot_ps(v256, v256)

    Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b, and store the results in dst.

    Declaration
    public static v256 mm256_andnot_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
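
    One common use of the and-not intrinsics is clearing the sign bit of every lane to obtain absolute values. The sketch below assumes IEEE 754 single-precision layout (sign in bit 31); AndNotExample and Abs are hypothetical names.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class AndNotExample
{
    public static v256 Abs(v256 x)
    {
        // int.MinValue is 0x80000000: only the sign bit is set in each 32-bit lane.
        var signMask = new v256(int.MinValue);

        // mm256_andnot_ps computes (~signMask) & x, which keeps every bit except the sign.
        return mm256_andnot_ps(signMask, x);
    }
}
```

    Note the argument order: the first argument is the one that gets inverted.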

    mm256_blend_pd(v256, v256, Int32)

    Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

    Declaration
    public static v256 mm256_blend_pd(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    VBLENDPD ymm1, ymm2, ymm3/m256, imm8
    Double-precision floating-point values from the second source operand are conditionally merged with values from the first source operand and written to the destination. The immediate bits [3:0] determine whether the corresponding double-precision floating-point value in the destination is copied from the second source or the first source. If the mask bit corresponding to an element is "1", the value in the second source operand is copied; otherwise the value in the first source operand is copied.

    mm256_blend_ps(v256, v256, Int32)

    Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

    Declaration
    public static v256 mm256_blend_ps(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    VBLENDPS ymm1, ymm2, ymm3/m256, imm8
    Single-precision floating-point values from the second source operand are conditionally merged with values from the first source operand and written to the destination. The immediate bits [7:0] determine whether the corresponding single-precision floating-point value in the destination is copied from the second source or the first source. If the mask bit corresponding to an element is "1", the value in the second source operand is copied; otherwise the value in the first source operand is copied.
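
    To make the immediate-mask rule concrete, the sketch below blends two vectors with imm8 = 0xA5 (binary 1010 0101), so bits 0, 2, 5 and 7 select from b and the remaining lanes come from a. The class and method names are hypothetical.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class BlendExample
{
    public static v256 Blend()
    {
        var a = new v256(0f, 1f, 2f, 3f, 4f, 5f, 6f, 7f);
        var b = new v256(10f, 11f, 12f, 13f, 14f, 15f, 16f, 17f);

        // Bit i of imm8 set -> lane i comes from b; clear -> lane i comes from a.
        // 0xA5 has bits 0, 2, 5, 7 set.
        // Result: (10, 1, 12, 3, 4, 15, 6, 17)
        return mm256_blend_ps(a, b, 0xA5);
    }
}
```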

    mm256_blendv_pd(v256, v256, v256)

    Blend packed double-precision (64-bit) floating-point elements from a and b using mask, and store the results in dst.

    Declaration
    public static v256 mm256_blendv_pd(v256 a, v256 b, v256 mask)
    Parameters
    Type Name Description
    v256 a
    v256 b
    v256 mask
    Returns
    Type Description
    v256
    Remarks

    VBLENDVPD ymm1, ymm2, ymm3/m256, ymm4
    Conditionally copies each quadword data element of double-precision floating-point values from the second source operand (third operand) or the first source operand (second operand), depending on the mask bits defined in the mask register operand (fourth operand). The high bit of each mask element selects the source: when it is set the element is taken from b, otherwise from a.

    mm256_blendv_ps(v256, v256, v256)

    Blend packed single-precision (32-bit) floating-point elements from a and b using mask, and store the results in dst.

    Declaration
    public static v256 mm256_blendv_ps(v256 a, v256 b, v256 mask)
    Parameters
    Type Name Description
    v256 a
    v256 b
    v256 mask
    Returns
    Type Description
    v256
    Remarks

    VBLENDVPS ymm1, ymm2, ymm3/m256, ymm4
    Conditionally copies each dword data element of single-precision floating-point values from the second source operand (third operand) or the first source operand (second operand), depending on the mask bits defined in the mask register operand (fourth operand). The high bit of each mask element selects the source: when it is set the element is taken from b, otherwise from a.
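
    A variable blend pairs naturally with a compare mask. The sketch below builds a per-lane maximum from mm256_cmp_ps and mm256_blendv_ps; it is only an illustration of mask-driven selection, and the names BlendvExample and Max are hypothetical. Predicate value 1 is LT_OS in Intel's encoding.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class BlendvExample
{
    const int LessThan = 1; // LT_OS

    public static v256 Max(v256 a, v256 b)
    {
        // All-ones in lanes where a < b, zero elsewhere.
        v256 mask = mm256_cmp_ps(a, b, LessThan);

        // High bit of a mask lane set -> take b, otherwise take a.
        return mm256_blendv_ps(a, b, mask);
    }
}
```

    For a = (1, 5, 3, 7, 2, 2, 9, 0) and b = 4 in every lane, the result is (4, 5, 4, 7, 4, 4, 9, 4).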

    mm256_broadcast_pd(Void*)

    Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.

    Declaration
    public static v256 mm256_broadcast_pd(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    VBROADCASTF128 ymm1, m128

    mm256_broadcast_ps(Void*)

    Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.

    Declaration
    public static v256 mm256_broadcast_ps(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    VBROADCASTF128 ymm1, m128

    mm256_broadcast_sd(Void*)

    Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.

    Declaration
    public static v256 mm256_broadcast_sd(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    VBROADCASTSD ymm1, m64

    mm256_broadcast_ss(Void*)

    Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

    Declaration
    public static v256 mm256_broadcast_ss(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    VBROADCASTSS ymm1, m32
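
    The broadcast intrinsics take a pointer rather than a value. A minimal sketch, assuming unsafe code is enabled; BroadcastExample and Splat are hypothetical names.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static unsafe class BroadcastExample
{
    // Returns a vector with 'value' in all eight float lanes.
    public static v256 Splat(float value)
    {
        // A value parameter has a fixed address in unsafe code,
        // and float* converts implicitly to void*.
        return mm256_broadcast_ss(&value);
    }
}
```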

    mm256_castpd_ps(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castpd_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castpd_si256(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castpd_si256(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castpd128_pd256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castpd128_pd256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    mm256_castpd256_pd128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v128 mm256_castpd256_pd128(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128

    mm256_castps_pd(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castps_pd(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castps_si256(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castps_si256(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castps128_ps256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castps128_ps256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    mm256_castps256_ps128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v128 mm256_castps256_ps128(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128

    mm256_castsi128_si256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castsi128_si256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    mm256_castsi256_pd(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castsi256_pd(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castsi256_ps(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v256 mm256_castsi256_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256

    mm256_castsi256_si128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    Declaration
    public static v128 mm256_castsi256_si128(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128

    mm256_ceil_pd(v256)

    Round the packed double-precision (64-bit) floating-point elements in val up to an integer value, and store the results as packed double-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_ceil_pd(v256 val)
    Parameters
    Type Name Description
    v256 val
    Returns
    Type Description
    v256

    mm256_ceil_ps(v256)

    Round the packed single-precision (32-bit) floating-point elements in val up to an integer value, and store the results as packed single-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_ceil_ps(v256 val)
    Parameters
    Type Name Description
    v256 val
    Returns
    Type Description
    v256

    mm256_cmp_pd(v256, v256, Int32)

    Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    Declaration
    public static v256 mm256_cmp_pd(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    VCMPPD ymm1, ymm2, ymm3/m256, imm8
    Performs a SIMD compare of the four packed double-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each pair of packed values.

    mm256_cmp_ps(v256, v256, Int32)

    Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    Declaration
    public static v256 mm256_cmp_ps(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    VCMPPS ymm1, ymm2, ymm3/m256, imm8
    Performs a SIMD compare of the eight packed single-precision floating-point values in the second source operand (third operand) and the first source operand (second operand) and returns the results of the comparison to the destination operand (first operand). The comparison predicate operand (immediate) specifies the type of comparison performed on each pair of packed values.

    mm256_cvtepi32_pd(v128)

    Convert packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

    Declaration
    public static v256 mm256_cvtepi32_pd(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256
    Remarks

    VCVTDQ2PD ymm1, xmm2/m128
    Converts four packed signed doubleword integers in the source operand to four packed double-precision floating-point values in the destination.

    mm256_cvtepi32_ps(v256)

    Convert packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

    Declaration
    public static v256 mm256_cvtepi32_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    VCVTDQ2PS ymm1, ymm2/m256
    Converts eight packed signed doubleword integers in the source operand to eight packed single-precision floating-point values in the destination.

    mm256_cvtpd_epi32(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

    Declaration
    public static v128 mm256_cvtpd_epi32(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128
    Remarks

    VCVTPD2DQ xmm1, ymm2/m256
    Converts four packed double-precision floating-point values in the source operand to four packed signed doubleword integers in the destination.

    mm256_cvtpd_ps(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

    Declaration
    public static v128 mm256_cvtpd_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128
    Remarks

    VCVTPD2PS xmm1, ymm2/m256
    Converts four packed double-precision floating-point values in the source operand to four packed single-precision floating-point values in the destination.

    mm256_cvtps_epi32(v256)

    Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

    Declaration
    public static v256 mm256_cvtps_epi32(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    VCVTPS2DQ ymm1, ymm2/m256
    Converts eight packed single-precision floating-point values in the source operand to eight signed doubleword integers in the destination.

    mm256_cvtps_pd(v128)

    Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

    Declaration
    public static v256 mm256_cvtps_pd(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256
    Remarks

    VCVTPS2PD ymm1, xmm2/m128
    Converts four packed single-precision floating-point values in the source operand to four packed double-precision floating-point values in the destination.

    mm256_cvtss_f32(v256)

    Copy the lower single-precision (32-bit) floating-point element of a to dst.

    Declaration
    public static float mm256_cvtss_f32(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    Single
    Remarks

    Identical in HPC# to accessing Float0, kept for compatibility with existing code while porting.

    mm256_cvttpd_epi32(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

    Declaration
    public static v128 mm256_cvttpd_epi32(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v128
    Remarks

    VCVTTPD2DQ xmm1, ymm2/m256
    Converts four packed double-precision floating-point values in the source operand to four packed signed doubleword integers in the destination. When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.

    mm256_cvttps_epi32(v256)

    Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

    Declaration
    public static v256 mm256_cvttps_epi32(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    VCVTTPS2DQ ymm1, ymm2/m256
    Converts eight packed single-precision floating-point values in the source operand to eight signed doubleword integers in the destination. When a conversion is inexact, a truncated (round toward zero) value is returned. If a converted result is larger than the maximum signed doubleword integer, the floating-point invalid exception is raised, and if this exception is masked, the indefinite integer value (80000000H) is returned.
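
    The difference between mm256_cvtps_epi32 and mm256_cvttps_epi32 only shows up on inexact inputs. The sketch below wraps both so they can be compared side by side. ConvertExample, Nearest and Truncated are hypothetical names, and the values quoted afterwards assume the hardware's default rounding mode (round to nearest, ties to even).

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class ConvertExample
{
    // Rounds according to the current rounding mode.
    public static v256 Nearest(v256 x)
    {
        return mm256_cvtps_epi32(x);
    }

    // Always rounds toward zero.
    public static v256 Truncated(v256 x)
    {
        return mm256_cvttps_epi32(x);
    }
}
```

    For inputs whose first four lanes are (2.5, 3.5, -2.7, 2.7), Nearest is expected to give SInt0..SInt3 = (2, 4, -3, 3) under the default rounding mode, while Truncated gives (2, 3, -2, 2).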

    mm256_div_pd(v256, v256)

    Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.

    Declaration
    public static v256 mm256_div_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VDIVPD ymm1, ymm2, ymm3/m256
    Performs a SIMD divide of the four packed double-precision floating-point values in the first source operand by the four packed double-precision floating-point values in the second source operand.

    mm256_div_ps(v256, v256)

    Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.

    Declaration
    public static v256 mm256_div_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VDIVPS ymm1, ymm2, ymm3/m256
    Performs a SIMD divide of the eight packed single-precision floating-point values in the first source operand by the eight packed single-precision floating-point values in the second source operand.

    mm256_dp_ps(v256, v256, Int32)

    Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8.

    Declaration
    public static v256 mm256_dp_ps(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    VDPPS ymm1, ymm2, ymm3/m256, imm8
    Multiplies the packed single-precision floating-point values in the first source operand with the packed single-precision floating-point values in the second source operand. Each of the four resulting products is conditionally summed depending on a mask extracted from the high 4 bits of the immediate operand. This sum is broadcast to each of the 4 positions in the destination whose corresponding bit in the low 4 bits of the immediate operand is "1". If the corresponding low bit (0-3) of the mask is zero, that destination element is set to zero. The process is replicated for the high 128 bits of the destination.
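
    A worked example of the imm8 encoding described above. With imm8 = 0xF1 the high nibble (0xF) includes all four products in the sum and the low nibble (0x1) writes the sum only to element 0 of each 128-bit half. The class and method names are hypothetical.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class DotProductExample
{
    public static v256 TwoDots()
    {
        var a = new v256(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        var b = new v256(1f); // 1 in every lane, so each dot product is a plain sum

        // Low half:  1 + 2 + 3 + 4 = 10 -> Float0, Float1..Float3 = 0
        // High half: 5 + 6 + 7 + 8 = 26 -> Float4, Float5..Float7 = 0
        return mm256_dp_ps(a, b, 0xF1);
    }
}
```

    The two halves are independent: the instruction produces two 4-element dot products, not one 8-element dot product.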

    mm256_extract_epi32(v256, Int32)

    Extract a 32-bit integer from a, selected with index (which must be a constant), and store the result in dst.

    Declaration
    public static int mm256_extract_epi32(v256 a, int index)
    Parameters
    Type Name Description
    v256 a
    Int32 index
    Returns
    Type Description
    Int32

    mm256_extract_epi64(v256, Int32)

    Extract a 64-bit integer from a, selected with index (which must be a constant), and store the result in dst.

    Declaration
    public static long mm256_extract_epi64(v256 a, int index)
    Parameters
    Type Name Description
    v256 a
    Int32 index
    Returns
    Type Description
    Int64

    mm256_extractf128_pd(v256, Int32)

    Extract 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

    Declaration
    public static v128 mm256_extractf128_pd(v256 a, int imm8)
    Parameters
    Type Name Description
    v256 a
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VEXTRACTF128 xmm1/m128, ymm2, imm8

    mm256_extractf128_ps(v256, Int32)

    Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

    Declaration
    public static v128 mm256_extractf128_ps(v256 a, int imm8)
    Parameters
    Type Name Description
    v256 a
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VEXTRACTF128 xmm1/m128, ymm2, imm8

    mm256_extractf128_si256(v256, Int32)

    Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.

    Declaration
    public static v128 mm256_extractf128_si256(v256 a, int imm8)
    Parameters
    Type Name Description
    v256 a
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    VEXTRACTF128 xmm1/m128, ymm2, imm8
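
    For the extractf128 family, imm8 = 0 selects the low 128 bits and imm8 = 1 selects the high 128 bits. A short sketch using the single-precision variant; ExtractHalfExample and HighHalf are hypothetical names.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class ExtractHalfExample
{
    public static v128 HighHalf()
    {
        var a = new v256(0f, 1f, 2f, 3f, 4f, 5f, 6f, 7f);

        // imm8 = 1 picks lanes 4..7, giving (4, 5, 6, 7).
        return mm256_extractf128_ps(a, 1);
    }
}
```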

    mm256_floor_pd(v256)

    Round the packed double-precision (64-bit) floating-point elements in val down to an integer value, and store the results as packed double-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_floor_pd(v256 val)
    Parameters
    Type Name Description
    v256 val
    Returns
    Type Description
    v256

    mm256_floor_ps(v256)

    Round the packed single-precision (32-bit) floating-point elements in val down to an integer value, and store the results as packed single-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_floor_ps(v256 val)
    Parameters
    Type Name Description
    v256 val
    Returns
    Type Description
    v256

    mm256_hadd_pd(v256, v256)

    Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

    Declaration
    public static v256 mm256_hadd_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VHADDPD ymm1, ymm2, ymm3/m256
    Adds pairs of adjacent double-precision floating-point values in the first source operand and second source operand and stores the results in the destination.

    mm256_hadd_ps(v256, v256)

    Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

    Declaration
    public static v256 mm256_hadd_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VHADDPS ymm1, ymm2, ymm3/m256
    Adds pairs of adjacent single-precision floating-point values in the first source operand and second source operand and stores the results in the destination.

    mm256_hsub_pd(v256, v256)

    Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

    Declaration
    public static v256 mm256_hsub_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VHSUBPD ymm1, ymm2, ymm3/m256
    Subtracts pairs of adjacent double-precision floating-point values in the first source operand and second source operand and stores the results in the destination.

    mm256_hsub_ps(v256, v256)

    Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

    Declaration
    public static v256 mm256_hsub_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    VHSUBPS ymm1, ymm2, ymm3/m256
    Subtracts pairs of adjacent single-precision floating-point values in the first source operand and second source operand and stores the results in the destination.
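
    The lane order of the 256-bit horizontal operations is easy to get wrong, because results from a and b are interleaved per 128-bit half rather than concatenated. The sketch below spells it out for the single-precision forms; the class and method names are hypothetical.

```csharp
using Unity.Burst.Intrinsics;
using static Unity.Burst.Intrinsics.X86.Avx;

public static class HorizontalExample
{
    // Output layout for both operations (op is + or -):
    // (a0 op a1, a2 op a3, b0 op b1, b2 op b3,
    //  a4 op a5, a6 op a7, b4 op b5, b6 op b7)

    public static v256 HAdd()
    {
        var a = new v256(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        var b = new v256(10f, 20f, 30f, 40f, 50f, 60f, 70f, 80f);
        // Result: (3, 7, 30, 70, 11, 15, 110, 150)
        return mm256_hadd_ps(a, b);
    }

    public static v256 HSub()
    {
        var a = new v256(1f, 2f, 3f, 4f, 5f, 6f, 7f, 8f);
        var b = new v256(10f, 20f, 30f, 40f, 50f, 60f, 70f, 80f);
        // Result: (-1, -1, -10, -10, -1, -1, -10, -10)
        return mm256_hsub_ps(a, b);
    }
}
```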

    mm256_insert_epi16(v256, Int32, Int32)

    Copy a to dst, and insert the 16-bit integer i into dst at the location specified by index (which must be a constant).

    Declaration
    public static v256 mm256_insert_epi16(v256 a, int i, int index)
    Parameters
    Type Name Description
    v256 a
    Int32 i
    Int32 index
    Returns
    Type Description
    v256

    mm256_insert_epi32(v256, Int32, Int32)

    Copy a to dst, and insert the 32-bit integer i into dst at the location specified by index (which must be a constant).

    Declaration
    public static v256 mm256_insert_epi32(v256 a, int i, int index)
    Parameters
    Type Name Description
    v256 a
    Int32 i
    Int32 index
    Returns
    Type Description
    v256

    mm256_insert_epi64(v256, Int64, Int32)

    Copy a to dst, and insert the 64-bit integer i into dst at the location specified by index (which must be a constant).

    Declaration
    public static v256 mm256_insert_epi64(v256 a, long i, int index)
    Parameters
    Type Name Description
    v256 a
    Int64 i
    Int32 index
    Returns
    Type Description
    v256
    Remarks

    This intrinsic requires a 64-bit processor.

    mm256_insert_epi8(v256, Int32, Int32)

    Copy a to dst, and insert the 8-bit integer i into dst at the location specified by index (which must be a constant).

    Declaration
    public static v256 mm256_insert_epi8(v256 a, int i, int index)
    Parameters
    Type Name Description
    v256 a
    Int32 i
    Int32 index
    Returns
    Type Description
    v256

    mm256_insertf128_pd(v256, v128, Int32)

    Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.

    Declaration
    public static v256 mm256_insertf128_pd(v256 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8 Performs an insertion of 128 bits of packed floating-point values from the second source operand into the destination at a 128-bit offset selected by imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand.

    mm256_insertf128_ps(v256, v128, Int32)

    Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.

    Declaration
    public static v256 mm256_insertf128_ps(v256 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8 Performs an insertion of 128 bits of packed floating-point values from the second source operand into the destination at a 128-bit offset selected by imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand.

    mm256_insertf128_si256(v256, v128, Int32)

    Copy a to dst, then insert 128 bits of integer data from b into dst at the location specified by imm8.

    Declaration
    public static v256 mm256_insertf128_si256(v256 a, v128 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v128 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VINSERTF128 ymm1, ymm2, xmm3/v128, imm8 Performs an insertion of 128 bits of packed floating-point values from the second source operand into the destination at a 128-bit offset selected by imm8[0]. The remaining portions of the destination are written by the corresponding fields of the first source operand.
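
    For the three insertf128 variants, only bit 0 of imm8 matters: it chooses which 128-bit half of the copy of a is overwritten by b. The following plain C# scalar model illustrates this under the assumption that the behavior matches Intel's VINSERTF128; InsertF128Model is a hypothetical name, and the vectors are represented as float arrays with index 0 as the lowest element.

```csharp
using System;

// Scalar model: 'a' is a 256-bit vector seen as 8 floats, 'b' a 128-bit vector seen as 4 floats.
// Only bit 0 of imm8 is used: 0 replaces the low half, 1 replaces the high half.
float[] InsertF128Model(float[] a, float[] b, int imm8)
{
    var dst = (float[])a.Clone();       // "Copy a to dst"
    int offset = (imm8 & 1) * 4;        // element offset of the selected 128-bit half
    Array.Copy(b, 0, dst, offset, 4);   // overwrite that half with b
    return dst;
}

float[] wide = { 0, 1, 2, 3, 4, 5, 6, 7 };
float[] narrow = { 10, 11, 12, 13 };
Console.WriteLine(string.Join(" ", InsertF128Model(wide, narrow, 0))); // 10 11 12 13 4 5 6 7
Console.WriteLine(string.Join(" ", InsertF128Model(wide, narrow, 1))); // 0 1 2 3 10 11 12 13
```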

    mm256_lddqu_si256(Void*)

    Load 256-bits of integer data from unaligned memory into dst. This intrinsic may perform better than mm256_loadu_si256 when the data crosses a cache line boundary.

    Declaration
    public static v256 mm256_lddqu_si256(void *mem_addr)
    Parameters
    Type Name Description
    Void* mem_addr
    Returns
    Type Description
    v256
    Remarks

    **** VLDDQU ymm1, v256

    mm256_load_pd(Void*)

    Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory.

    Declaration
    public static v256 mm256_load_pd(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

    mm256_load_ps(Void*)

    Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory

    Declaration
    public static v256 mm256_load_ps(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

    mm256_load_si256(Void*)

    Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.

    Declaration
    public static v256 mm256_load_si256(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVDQU ymm1, v256 Burst only generates unaligned loads.

    mm256_loadu_pd(Void*)

    Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory

    Declaration
    public static v256 mm256_loadu_pd(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

    mm256_loadu_ps(Void*)

    Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory

    Declaration
    public static v256 mm256_loadu_ps(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVUPS ymm1, v256 Burst only generates unaligned loads.

    mm256_loadu_si256(Void*)

    Load 256-bits (composed of 8 packed 32-bit integer elements) from memory.

    Declaration
    public static v256 mm256_loadu_si256(void *ptr)
    Parameters
    Type Name Description
    Void* ptr
    Returns
    Type Description
    v256
    Remarks

    **** VMOVDQU ymm1, v256 Burst only generates unaligned loads.

    mm256_loadu2_m128(Void*, Void*)

    Load two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static v256 mm256_loadu2_m128(void *hiaddr, void *loaddr)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    Returns
    Type Description
    v256
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_loadu2_m128d(Void*, Void*)

    Load two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static v256 mm256_loadu2_m128d(void *hiaddr, void *loaddr)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    Returns
    Type Description
    v256
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_loadu2_m128i(Void*, Void*)

    Load two 128-bit values (composed of integer data) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static v256 mm256_loadu2_m128i(void *hiaddr, void *loaddr)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    Returns
    Type Description
    v256
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_maskload_pd(Void*, v256)

    Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding mask element is not set).

    Declaration
    public static v256 mm256_maskload_pd(void *mem_addr, v256 mask)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 mask
    Returns
    Type Description
    v256
    Remarks

    **** VMASKMOVPD ymm1, ymm2, v256

    mm256_maskload_ps(Void*, v256)

    Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding mask element is not set).

    Declaration
    public static v256 mm256_maskload_ps(void *mem_addr, v256 mask)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 mask
    Returns
    Type Description
    v256
    Remarks

    **** VMASKMOVPS ymm1, ymm2, v256

    mm256_maskstore_pd(Void*, v256, v256)

    Store packed double-precision (64-bit) floating-point elements from a into memory using mask.

    Declaration
    public static void mm256_maskstore_pd(void *mem_addr, v256 mask, v256 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 mask
    v256 a
    Remarks

    **** VMASKMOVPD v256, ymm1, ymm2

    mm256_maskstore_ps(Void*, v256, v256)

    Store packed single-precision (32-bit) floating-point elements from a into memory using mask.

    Declaration
    public static void mm256_maskstore_ps(void *mem_addr, v256 mask, v256 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 mask
    v256 a
    Remarks

    **** VMASKMOVPS v256, ymm1, ymm2
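
    The masked load and store intrinsics select elements by the high (sign) bit of each mask element. A common use is processing the tail of an array that is not a multiple of the vector width. The plain C# scalar model below sketches that selection rule; it is not Burst code, the function names are hypothetical, and the mask is represented as an int array in which a negative value means the high bit is set.

```csharp
using System;

// Scalar model of a masked load: elements whose mask high bit is clear read as zero.
float[] MaskLoadPsModel(float[] mem, int[] mask)
{
    var dst = new float[8];
    for (int i = 0; i < 8; i++)
        dst[i] = mask[i] < 0 ? mem[i] : 0f; // negative int <=> high bit of the 32-bit element is set
    return dst;
}

// Scalar model of a masked store: elements whose mask high bit is clear leave memory untouched.
void MaskStorePsModel(float[] mem, int[] mask, float[] a)
{
    for (int i = 0; i < 8; i++)
        if (mask[i] < 0) mem[i] = a[i];
}

// Process only the first 5 of 8 elements.
int[] tailMask = { -1, -1, -1, -1, -1, 0, 0, 0 };
float[] input = { 1, 2, 3, 4, 5, 6, 7, 8 };
float[] output = { -1, -1, -1, -1, -1, -1, -1, -1 };

float[] loaded = MaskLoadPsModel(input, tailMask);
Console.WriteLine(string.Join(" ", loaded)); // 1 2 3 4 5 0 0 0

MaskStorePsModel(output, tailMask, loaded);
Console.WriteLine(string.Join(" ", output)); // 1 2 3 4 5 -1 -1 -1
```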

    mm256_max_pd(v256, v256)

    Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.

    Declaration
    public static v256 mm256_max_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMAXPD ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination

    mm256_max_ps(v256, v256)

    Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.

    Declaration
    public static v256 mm256_max_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMAXPS ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the maximum value for each pair of values to the destination

    mm256_min_pd(v256, v256)

    Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.

    Declaration
    public static v256 mm256_min_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMINPD ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed double-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination

    mm256_min_ps(v256, v256)

    Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.

    Declaration
    public static v256 mm256_min_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMINPS ymm1, ymm2, ymm3/v256 Performs an SIMD compare of the packed single-precision floating-point values in the first source operand and the second source operand and returns the minimum value for each pair of values to the destination

    mm256_movedup_pd(v256)

    Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.

    Declaration
    public static v256 mm256_movedup_pd(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VMOVDDUP ymm1, ymm2/v256

    mm256_movehdup_ps(v256)

    Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

    Declaration
    public static v256 mm256_movehdup_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VMOVSHDUP ymm1, ymm2/v256

    mm256_moveldup_ps(v256)

    Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

    Declaration
    public static v256 mm256_moveldup_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VMOVSLDUP ymm1, ymm2/v256

    mm256_movemask_pd(v256)

    Set each bit of mask dst based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.

    Declaration
    public static int mm256_movemask_pd(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    Int32
    Remarks

    **** VMOVMSKPD r32, ymm2 Extracts the sign bits from the packed double-precision floating-point values in the source operand, formats them into a 4-bit mask, and stores the mask in the destination

    mm256_movemask_ps(v256)

    Set each bit of mask dst based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.

    Declaration
    public static int mm256_movemask_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    Int32
    Remarks

    **** VMOVMSKPS r32, ymm2 Extracts the sign bits from the packed single-precision floating-point values in the source operand, formats them into an 8-bit mask, and stores the mask in the destination.
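
    The movemask intrinsics are typically used to turn the result of a vector comparison into an integer that ordinary branching code can test. As a rough illustration, the plain C# scalar model below builds the same 8-bit mask from the sign bits of 8 floats; MoveMaskPsModel is a hypothetical name and index 0 is taken to be the lowest element.

```csharp
using System;

// Scalar model: bit i of the result is the sign bit of element i.
int MoveMaskPsModel(float[] a)
{
    int mask = 0;
    for (int i = 0; i < 8; i++)
    {
        int bits = BitConverter.SingleToInt32Bits(a[i]); // raw IEEE 754 bit pattern
        if (bits < 0) mask |= 1 << i;                    // sign bit set
    }
    return mask;
}

float[] values = { -1, 2, -3, 4, 5, -6, 7, 8 };
Console.WriteLine(MoveMaskPsModel(values)); // 37 (bits 0, 2 and 5 set)
```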

    mm256_mul_pd(v256, v256)

    Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_mul_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMULPD ymm1, ymm2, ymm3/v256 Performs a SIMD multiply of the four packed double-precision floating-point values in the first source operand by the corresponding values in the second source operand, and stores the packed double-precision floating-point results in the destination.

    mm256_mul_ps(v256, v256)

    Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_mul_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VMULPS ymm1, ymm2, ymm3/v256 Performs a SIMD multiply of the eight packed single-precision floating-point values in the first source operand by the corresponding values in the second source operand, and stores the packed single-precision floating-point results in the destination.

    mm256_or_pd(v256, v256)

    Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_or_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VORPD ymm1, ymm2, ymm3/v256 Performs a bitwise logical OR of the four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination

    mm256_or_ps(v256, v256)

    Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_or_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VORPS ymm1, ymm2, ymm3/v256 Performs a bitwise logical OR of the eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination

    mm256_permute_pd(v256, Int32)

    Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

    Declaration
    public static v256 mm256_permute_pd(v256 a, int imm8)
    Parameters
    Type Name Description
    v256 a
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VPERMILPD ymm1, ymm2/v256, imm8 Permute Double-Precision Floating-Point values in the first source operand using four 1-bit control fields in the low 4 bits of the 8-bit immediate and store results in the destination

    mm256_permute_ps(v256, Int32)

    Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

    Declaration
    public static v256 mm256_permute_ps(v256 a, int imm8)
    Parameters
    Type Name Description
    v256 a
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VPERMILPS ymm1, ymm2/v256, imm8 Permute Single-Precision Floating-Point values in the first source operand using four 2-bit control fields in the 8-bit immediate and store results in the destination
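
    The imm8 encoding differs between the two permute intrinsics: mm256_permute_ps reuses the same four 2-bit selectors in both 128-bit lanes, while mm256_permute_pd uses one selector bit per destination element. The plain C# scalar models below sketch both, assuming the semantics of Intel's VPERMILPS and VPERMILPD; the function names are hypothetical and index 0 is the lowest element.

```csharp
using System;

// Scalar model of the single-precision in-lane permute.
float[] PermutePsModel(float[] a, int imm8)
{
    var dst = new float[8];
    for (int lane = 0; lane < 2; lane++)
    {
        for (int j = 0; j < 4; j++)
        {
            int sel = (imm8 >> (2 * j)) & 0x3;     // same four 2-bit selectors for both lanes
            dst[lane * 4 + j] = a[lane * 4 + sel]; // never crosses the 128-bit lane boundary
        }
    }
    return dst;
}

// Scalar model of the double-precision in-lane permute.
double[] PermutePdModel(double[] a, int imm8)
{
    var dst = new double[4];
    for (int i = 0; i < 4; i++)
    {
        int lane = i / 2;
        int sel = (imm8 >> i) & 0x1;   // one selector bit per destination element
        dst[i] = a[lane * 2 + sel];
    }
    return dst;
}

float[] singles = { 0, 1, 2, 3, 4, 5, 6, 7 };
double[] doubles = { 0, 1, 2, 3 };
Console.WriteLine(string.Join(" ", PermutePsModel(singles, 0x1B))); // 3 2 1 0 7 6 5 4
Console.WriteLine(string.Join(" ", PermutePdModel(doubles, 0x5)));  // 1 0 3 2
```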

    mm256_permute2f128_pd(v256, v256, Int32)

    Shuffle 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

    Declaration
    public static v256 mm256_permute2f128_pd(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8 Permute 128 bit floating-point-containing fields from the first source operand and second source operand using bits in the 8-bit immediate and store results in the destination

    mm256_permute2f128_ps(v256, v256, Int32)

    Shuffle 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

    Declaration
    public static v256 mm256_permute2f128_ps(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8 Permute 128 bit floating-point-containing fields from the first source operand and second source operand using bits in the 8-bit immediate and store results in the destination

    mm256_permute2f128_si256(v256, v256, Int32)

    Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.

    Declaration
    public static v256 mm256_permute2f128_si256(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VPERM2F128 ymm1, ymm2, ymm3/v256, imm8 Permute 128 bit floating-point-containing fields from the first source operand and second source operand using bits in the 8-bit immediate and store results in the destination
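
    The imm8 control of the permute2f128 intrinsics is split into two 4-bit fields, one per 128-bit half of the result. The plain C# scalar model below is a sketch of that encoding as defined for Intel's VPERM2F128 (bits 1:0 pick the source half, bit 3 zeroes the result half); the function names are hypothetical and the vectors are shown as float arrays.

```csharp
using System;

// Picks one 128-bit half according to a 4-bit control field.
float[] Select128Model(float[] a, float[] b, int control)
{
    var half = new float[4];
    if ((control & 0x8) != 0) return half;       // bit 3: force this half to zero
    float[] src = (control & 0x2) == 0 ? a : b;  // bit 1: take from a or from b
    int offset = (control & 0x1) * 4;            // bit 0: low or high half of that source
    Array.Copy(src, offset, half, 0, 4);
    return half;
}

float[] Permute2F128Model(float[] a, float[] b, int imm8)
{
    var dst = new float[8];
    Array.Copy(Select128Model(a, b, imm8 & 0xF), 0, dst, 0, 4);        // low half from imm8[3:0]
    Array.Copy(Select128Model(a, b, (imm8 >> 4) & 0xF), 0, dst, 4, 4); // high half from imm8[7:4]
    return dst;
}

float[] first = { 0, 1, 2, 3, 4, 5, 6, 7 };
float[] second = { 10, 11, 12, 13, 14, 15, 16, 17 };
Console.WriteLine(string.Join(" ", Permute2F128Model(first, second, 0x21))); // 4 5 6 7 10 11 12 13
Console.WriteLine(string.Join(" ", Permute2F128Model(first, second, 0x08))); // 0 0 0 0 0 1 2 3
```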

    mm256_permutevar_pd(v256, v256)

    Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

    Declaration
    public static v256 mm256_permutevar_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VPERMILPD ymm1, ymm2, ymm3/v256 Permute Double-Precision Floating-Point values in the first source operand using 8-bit control fields in the low bytes of the second source operand and store results in the destination

    mm256_permutevar_ps(v256, v256)

    Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

    Declaration
    public static v256 mm256_permutevar_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VPERMILPS ymm1, ymm2, ymm3/v256 Permute Single-Precision Floating-Point values in the first source operand using 8-bit control fields in the low bytes of the corresponding elements of the shuffle control (second source operand) and store results in the destination

    mm256_rcp_ps(v256)

    Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

    Declaration
    public static v256 mm256_rcp_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VRCPPS ymm1, ymm2/v256

    mm256_round_pd(v256, Int32)

    Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_round_pd(v256 a, int rounding)
    Parameters
    Type Name Description
    v256 a
    Int32 rounding
    Returns
    Type Description
    v256
    Remarks

    **** VROUNDPD ymm1,ymm2/v256,imm8 Rounding is done according to the rounding parameter, which can be one of:

    (_MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC) // round to nearest, and suppress exceptions
    (_MM_FROUND_TO_NEG_INF |_MM_FROUND_NO_EXC) // round down, and suppress exceptions
    (_MM_FROUND_TO_POS_INF |_MM_FROUND_NO_EXC) // round up, and suppress exceptions
    (_MM_FROUND_TO_ZERO |_MM_FROUND_NO_EXC) // truncate, and suppress exceptions
    _MM_FROUND_CUR_DIRECTION // use MXCSR.RC; see _MM_SET_ROUNDING_MODE
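
    To make the effect of each rounding mode concrete, the plain C# sketch below applies them to a single double. It is a scalar model only: the constant values follow Intel's _MM_FROUND_* macros and are an assumption here (this page does not list them), the constant and function names are hypothetical, and the current-direction mode is modeled as the default MXCSR setting, round to nearest even.

```csharp
using System;

// Values follow Intel's _MM_FROUND_* macros (assumed, not taken from this page).
const int ToNearestInt = 0x00;
const int ToNegInf = 0x01;
const int ToPosInf = 0x02;
const int ToZero = 0x03;
const int CurDirection = 0x04;
const int NoExc = 0x08;   // only affects exception signaling, not the numeric result

double RoundModel(double x, int rounding)
{
    // Bit 2 defers to MXCSR.RC; this sketch assumes the default mode, round to nearest even.
    if ((rounding & CurDirection) != 0)
        return Math.Round(x, MidpointRounding.ToEven);

    switch (rounding & 0x03)
    {
        case ToNegInf: return Math.Floor(x);
        case ToPosInf: return Math.Ceiling(x);
        case ToZero: return Math.Truncate(x);
        default: return Math.Round(x, MidpointRounding.ToEven); // ToNearestInt
    }
}

Console.WriteLine(RoundModel(2.5, ToNearestInt | NoExc)); // 2 (ties go to the even integer)
Console.WriteLine(RoundModel(-2.5, ToNegInf | NoExc));    // -3
Console.WriteLine(RoundModel(-2.5, ToPosInf | NoExc));    // -2
Console.WriteLine(RoundModel(2.7, ToZero | NoExc));       // 2
```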

    mm256_round_ps(v256, Int32)

    Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements in dst.

    Declaration
    public static v256 mm256_round_ps(v256 a, int rounding)
    Parameters
    Type Name Description
    v256 a
    Int32 rounding
    Returns
    Type Description
    v256
    Remarks

    **** VROUNDPS ymm1,ymm2/v256,imm8 Rounds the eight single-precision floating-point values in the source operand by the rounding mode specified in the immediate operand and places the result in the destination. The rounding process rounds the input to an integral value and returns the result as a single-precision floating-point value. The Precision Floating Point Exception is signaled according to the immediate operand. If any source operand is an SNaN then it will be converted to a QNaN.

    mm256_rsqrt_ps(v256)

    Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

    Declaration
    public static v256 mm256_rsqrt_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VRSQRTPS ymm1, ymm2/v256

    mm256_set_epi16(Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16)

    Set packed short elements in dst with the supplied values.

    Declaration
    public static v256 mm256_set_epi16(short e15_, short e14_, short e13_, short e12_, short e11_, short e10_, short e9_, short e8_, short e7_, short e6_, short e5_, short e4_, short e3_, short e2_, short e1_, short e0_)
    Parameters
    Type Name Description
    Int16 e15_
    Int16 e14_
    Int16 e13_
    Int16 e12_
    Int16 e11_
    Int16 e10_
    Int16 e9_
    Int16 e8_
    Int16 e7_
    Int16 e6_
    Int16 e5_
    Int16 e4_
    Int16 e3_
    Int16 e2_
    Int16 e1_
    Int16 e0_
    Returns
    Type Description
    v256

    mm256_set_epi32(Int32, Int32, Int32, Int32, Int32, Int32, Int32, Int32)

    Set packed int elements in dst with the supplied values.

    Declaration
    public static v256 mm256_set_epi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    Parameters
    Type Name Description
    Int32 e7
    Int32 e6
    Int32 e5
    Int32 e4
    Int32 e3
    Int32 e2
    Int32 e1
    Int32 e0
    Returns
    Type Description
    v256

    mm256_set_epi64x(Int64, Int64, Int64, Int64)

    Set packed 64-bit integers in dst with the supplied values.

    Declaration
    public static v256 mm256_set_epi64x(long e3, long e2, long e1, long e0)
    Parameters
    Type Name Description
    Int64 e3
    Int64 e2
    Int64 e1
    Int64 e0
    Returns
    Type Description
    v256

    mm256_set_epi8(Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte)

    Set packed byte elements in dst with the supplied values.

    Declaration
    public static v256 mm256_set_epi8(byte e31_, byte e30_, byte e29_, byte e28_, byte e27_, byte e26_, byte e25_, byte e24_, byte e23_, byte e22_, byte e21_, byte e20_, byte e19_, byte e18_, byte e17_, byte e16_, byte e15_, byte e14_, byte e13_, byte e12_, byte e11_, byte e10_, byte e9_, byte e8_, byte e7_, byte e6_, byte e5_, byte e4_, byte e3_, byte e2_, byte e1_, byte e0_)
    Parameters
    Type Name Description
    Byte e31_
    Byte e30_
    Byte e29_
    Byte e28_
    Byte e27_
    Byte e26_
    Byte e25_
    Byte e24_
    Byte e23_
    Byte e22_
    Byte e21_
    Byte e20_
    Byte e19_
    Byte e18_
    Byte e17_
    Byte e16_
    Byte e15_
    Byte e14_
    Byte e13_
    Byte e12_
    Byte e11_
    Byte e10_
    Byte e9_
    Byte e8_
    Byte e7_
    Byte e6_
    Byte e5_
    Byte e4_
    Byte e3_
    Byte e2_
    Byte e1_
    Byte e0_
    Returns
    Type Description
    v256

    mm256_set_m128(v128, v128)

    Set packed v256 vector with the supplied values.

    Declaration
    public static v256 mm256_set_m128(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_set_m128d(v128, v128)

    Set packed v256 vector with the supplied values.

    Declaration
    public static v256 mm256_set_m128d(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_set_m128i(v128, v128)

    Set packed v256 vector with the supplied values.

    Declaration
    public static v256 mm256_set_m128i(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_set_pd(Double, Double, Double, Double)

    Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.

    Declaration
    public static v256 mm256_set_pd(double d, double c, double b, double a)
    Parameters
    Type Name Description
    Double d
    Double c
    Double b
    Double a
    Returns
    Type Description
    v256

    mm256_set_ps(Single, Single, Single, Single, Single, Single, Single, Single)

    Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.

    Declaration
    public static v256 mm256_set_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)
    Parameters
    Type Name Description
    Single e7
    Single e6
    Single e5
    Single e4
    Single e3
    Single e2
    Single e1
    Single e0
    Returns
    Type Description
    v256

    mm256_set1_epi16(Int16)

    Broadcast 16-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastw instruction.

    Declaration
    public static v256 mm256_set1_epi16(short a)
    Parameters
    Type Name Description
    Int16 a
    Returns
    Type Description
    v256

    mm256_set1_epi32(Int32)

    Broadcast 32-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastd instruction.

    Declaration
    public static v256 mm256_set1_epi32(int a)
    Parameters
    Type Name Description
    Int32 a
    Returns
    Type Description
    v256

    mm256_set1_epi64x(Int64)

    Broadcast 64-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastq instruction.

    Declaration
    public static v256 mm256_set1_epi64x(long a)
    Parameters
    Type Name Description
    Int64 a
    Returns
    Type Description
    v256

    mm256_set1_epi8(Char)

    Broadcast 8-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastb instruction.

    Declaration
    public static v256 mm256_set1_epi8(char a)
    Parameters
    Type Name Description
    Char a
    Returns
    Type Description
    v256

    mm256_set1_pd(Double)

    Broadcast double-precision (64-bit) floating-point value a to all elements of dst.

    Declaration
    public static v256 mm256_set1_pd(double a)
    Parameters
    Type Name Description
    Double a
    Returns
    Type Description
    v256

    mm256_set1_ps(Single)

    Broadcast single-precision (32-bit) floating-point value a to all elements of dst.

    Declaration
    public static v256 mm256_set1_ps(float a)
    Parameters
    Type Name Description
    Single a
    Returns
    Type Description
    v256

    mm256_setr_epi16(Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16, Int16)

    Set packed short elements in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_epi16(short e15_, short e14_, short e13_, short e12_, short e11_, short e10_, short e9_, short e8_, short e7_, short e6_, short e5_, short e4_, short e3_, short e2_, short e1_, short e0_)
    Parameters
    Type Name Description
    Int16 e15_
    Int16 e14_
    Int16 e13_
    Int16 e12_
    Int16 e11_
    Int16 e10_
    Int16 e9_
    Int16 e8_
    Int16 e7_
    Int16 e6_
    Int16 e5_
    Int16 e4_
    Int16 e3_
    Int16 e2_
    Int16 e1_
    Int16 e0_
    Returns
    Type Description
    v256

    mm256_setr_epi32(Int32, Int32, Int32, Int32, Int32, Int32, Int32, Int32)

    Set packed int elements in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_epi32(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    Parameters
    Type Name Description
    Int32 e7
    Int32 e6
    Int32 e5
    Int32 e4
    Int32 e3
    Int32 e2
    Int32 e1
    Int32 e0
    Returns
    Type Description
    v256
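
    The difference between the set and setr families is only the order in which arguments map to elements. The plain C# scalar model below sketches this for the 32-bit integer variants, assuming the convention of Intel's _mm256_set_epi32 and _mm256_setr_epi32: with set, the last argument becomes the lowest element; with setr, the first argument does. The function names are hypothetical and index 0 of the array stands for the lowest element.

```csharp
using System;

// set: arguments are listed from the highest element down to the lowest.
int[] SetEpi32Model(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    => new[] { e0, e1, e2, e3, e4, e5, e6, e7 };

// setr: arguments are listed from the lowest element up, whatever the parameters are called.
int[] SetrEpi32Model(int e7, int e6, int e5, int e4, int e3, int e2, int e1, int e0)
    => new[] { e7, e6, e5, e4, e3, e2, e1, e0 };

// Both calls describe the same vector, with element i holding the value i.
Console.WriteLine(string.Join(" ", SetEpi32Model(7, 6, 5, 4, 3, 2, 1, 0)));  // 0 1 2 3 4 5 6 7
Console.WriteLine(string.Join(" ", SetrEpi32Model(0, 1, 2, 3, 4, 5, 6, 7))); // 0 1 2 3 4 5 6 7
```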

    mm256_setr_epi64x(Int64, Int64, Int64, Int64)

    Set packed 64-bit integers in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_epi64x(long e3, long e2, long e1, long e0)
    Parameters
    Type Name Description
    Int64 e3
    Int64 e2
    Int64 e1
    Int64 e0
    Returns
    Type Description
    v256

    mm256_setr_epi8(Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte, Byte)

    Set packed byte elements in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_epi8(byte e31_, byte e30_, byte e29_, byte e28_, byte e27_, byte e26_, byte e25_, byte e24_, byte e23_, byte e22_, byte e21_, byte e20_, byte e19_, byte e18_, byte e17_, byte e16_, byte e15_, byte e14_, byte e13_, byte e12_, byte e11_, byte e10_, byte e9_, byte e8_, byte e7_, byte e6_, byte e5_, byte e4_, byte e3_, byte e2_, byte e1_, byte e0_)
    Parameters
    Type Name Description
    Byte e31_
    Byte e30_
    Byte e29_
    Byte e28_
    Byte e27_
    Byte e26_
    Byte e25_
    Byte e24_
    Byte e23_
    Byte e22_
    Byte e21_
    Byte e20_
    Byte e19_
    Byte e18_
    Byte e17_
    Byte e16_
    Byte e15_
    Byte e14_
    Byte e13_
    Byte e12_
    Byte e11_
    Byte e10_
    Byte e9_
    Byte e8_
    Byte e7_
    Byte e6_
    Byte e5_
    Byte e4_
    Byte e3_
    Byte e2_
    Byte e1_
    Byte e0_
    Returns
    Type Description
    v256

    mm256_setr_m128(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_m128(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_setr_m128d(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_m128d(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_setr_m128i(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_m128i(v128 hi, v128 lo)
    Parameters
    Type Name Description
    v128 hi
    v128 lo
    Returns
    Type Description
    v256

    mm256_setr_pd(Double, Double, Double, Double)

    Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_pd(double d, double c, double b, double a)
    Parameters
    Type Name Description
    Double d
    Double c
    Double b
    Double a
    Returns
    Type Description
    v256

    mm256_setr_ps(Single, Single, Single, Single, Single, Single, Single, Single)

    Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.

    Declaration
    public static v256 mm256_setr_ps(float e7, float e6, float e5, float e4, float e3, float e2, float e1, float e0)
    Parameters
    Type Name Description
    Single e7
    Single e6
    Single e5
    Single e4
    Single e3
    Single e2
    Single e1
    Single e0
    Returns
    Type Description
    v256

    mm256_setzero_pd()

    Return a vector with all elements set to zero.

    Declaration
    public static v256 mm256_setzero_pd()
    Returns
    Type Description
    v256

    mm256_setzero_ps()

    Return a vector with all elements set to zero.

    Declaration
    public static v256 mm256_setzero_ps()
    Returns
    Type Description
    v256

    mm256_setzero_si256()

    Return a vector with all elements set to zero.

    Declaration
    public static v256 mm256_setzero_si256()
    Returns
    Type Description
    v256

    mm256_shuffle_pd(v256, v256, Int32)

    Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.

    Declaration
    public static v256 mm256_shuffle_pd(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VSHUFPD ymm1, ymm2, ymm3/v256, imm8 Moves either of the two packed double-precision floating-point values from each double quadword in the first source operand into the low quadword of each double quadword of the destination; moves either of the two packed double-precision floating-point values from the second source operand into the high quadword of each double quadword of the destination operand. The selector operand determines which values are moved to the destination.
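
    The selector decoding described in the remarks can be sketched as a scalar model. It follows Intel's published pseudocode for VSHUFPD, in which only the low four bits of imm8 are used; `ShufflePdModel` is a hypothetical helper, and arrays stand in for v256 with index 0 as the lowest element.

```csharp
// Scalar model of the 256-bit VSHUFPD selection (an illustration, not Burst's implementation).
public static class ShufflePdModel
{
    // a, b: four doubles each, index 0 = lowest element.
    public static double[] Shuffle(double[] a, double[] b, int imm8)
    {
        var dst = new double[4];
        for (int lane = 0; lane < 2; lane++)
        {
            int first = lane * 2;                     // first element of this 128-bit lane
            int selA = (imm8 >> (lane * 2)) & 1;      // bit 0 (lane 0) or bit 2 (lane 1)
            int selB = (imm8 >> (lane * 2 + 1)) & 1;  // bit 1 (lane 0) or bit 3 (lane 1)
            dst[first] = a[first + selA];             // low quadword comes from a
            dst[first + 1] = b[first + selB];         // high quadword comes from b
        }
        return dst;
    }
}
```

    For example, with a = {0, 1, 2, 3}, b = {10, 11, 12, 13}, and imm8 = 0b0101, the model yields {1, 10, 3, 12}.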

    mm256_shuffle_ps(v256, v256, Int32)

    Shuffle single-precision (32-bit) floating-point elements in a and b within 128-bit lanes using the control in imm8, and store the results in dst.

    Declaration
    public static v256 mm256_shuffle_ps(v256 a, v256 b, int imm8)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Int32 imm8
    Returns
    Type Description
    v256
    Remarks

    **** VSHUFPS ymm1, ymm2, ymm3/v256, imm8 Moves two of the four packed single-precision floating-point values from each double qword of the first source operand into the low quadword of each double qword of the destination; moves two of the four packed single-precision floating-point values from each double qword of the second source operand into the high quadword of each double qword of the destination. The selector operand determines which values are moved to the destination.
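
    A scalar sketch of how imm8 is decoded may make the remarks easier to follow. It is based on Intel's pseudocode for VSHUFPS: imm8 holds four 2-bit selectors, and the same selectors are applied to both 128-bit lanes. `ShufflePsModel` is a hypothetical helper; arrays stand in for v256 with index 0 as the lowest element.

```csharp
// Scalar model of the 256-bit VSHUFPS selection (an illustration, not Burst's implementation).
public static class ShufflePsModel
{
    // a, b: eight floats each, index 0 = lowest element.
    public static float[] Shuffle(float[] a, float[] b, int imm8)
    {
        var dst = new float[8];
        for (int lane = 0; lane < 8; lane += 4)
        {
            // The two low results in each lane are picked from a ...
            dst[lane + 0] = a[lane + (imm8 & 3)];
            dst[lane + 1] = a[lane + ((imm8 >> 2) & 3)];
            // ... and the two high results are picked from b.
            dst[lane + 2] = b[lane + ((imm8 >> 4) & 3)];
            dst[lane + 3] = b[lane + ((imm8 >> 6) & 3)];
        }
        return dst;
    }
}
```

    With a = {0..7}, b = {10..17}, and imm8 = 0x4E, the model yields {2, 3, 10, 11, 6, 7, 14, 15}.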

    mm256_sqrt_pd(v256)

    Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

    Declaration
    public static v256 mm256_sqrt_pd(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VSQRTPD ymm1, ymm2/v256

    mm256_sqrt_ps(v256)

    Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

    Declaration
    public static v256 mm256_sqrt_ps(v256 a)
    Parameters
    Type Name Description
    v256 a
    Returns
    Type Description
    v256
    Remarks

    **** VSQRTPS ymm1, ymm2/v256

    mm256_store_pd(Void*, v256)

    Store 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.

    Declaration
    public static void mm256_store_pd(void *ptr, v256 a)
    Parameters
    Type Name Description
    Void* ptr
    v256 a
    Remarks

    **** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

    mm256_store_ps(Void*, v256)

    Store 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from val into memory.

    Declaration
    public static void mm256_store_ps(void *ptr, v256 val)
    Parameters
    Type Name Description
    Void* ptr
    v256 val
    Remarks

    **** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

    mm256_store_si256(Void*, v256)

    Store 256 bits (composed of 8 packed 32-bit integer elements) from v into memory.

    Declaration
    public static void mm256_store_si256(void *ptr, v256 v)
    Parameters
    Type Name Description
    Void* ptr
    v256 v
    Remarks

    **** VMOVDQU v256, ymm1 Burst only generates unaligned stores.

    mm256_storeu_pd(Void*, v256)

    Store 256 bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.

    Declaration
    public static void mm256_storeu_pd(void *ptr, v256 a)
    Parameters
    Type Name Description
    Void* ptr
    v256 a
    Remarks

    **** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

    mm256_storeu_ps(Void*, v256)

    Store 256 bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory.

    Declaration
    public static void mm256_storeu_ps(void *ptr, v256 a)
    Parameters
    Type Name Description
    Void* ptr
    v256 a
    Remarks

    **** VMOVUPS v256, ymm1 Burst only generates unaligned stores.

    mm256_storeu_si256(Void*, v256)

    Store 256 bits (composed of 8 packed 32-bit integer elements) from v into memory.

    Declaration
    public static void mm256_storeu_si256(void *ptr, v256 v)
    Parameters
    Type Name Description
    Void* ptr
    v256 v
    Remarks

    **** VMOVDQU v256, ymm1 Burst only generates unaligned stores.

    mm256_storeu2_m128(Void*, Void*, v256)

    Store the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static void mm256_storeu2_m128(void *hiaddr, void *loaddr, v256 val)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    v256 val
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_storeu2_m128d(Void*, Void*, v256)

    Store the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static void mm256_storeu2_m128d(void *hiaddr, void *loaddr, v256 val)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    v256 val
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_storeu2_m128i(Void*, Void*, v256)

    Store the high and low 128-bit halves (each composed of integer data) from val into two different 128-bit memory locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    Declaration
    public static void mm256_storeu2_m128i(void *hiaddr, void *loaddr, v256 val)
    Parameters
    Type Name Description
    Void* hiaddr
    Void* loaddr
    v256 val
    Remarks

    This is a composite function which can generate more than one instruction.

    mm256_stream_pd(Void*, v256)

    Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

    Declaration
    public static void mm256_stream_pd(void *mem_addr, v256 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 a
    Remarks

    **** VMOVNTPD v256, ymm1

    mm256_stream_ps(Void*, v256)

    Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

    Declaration
    public static void mm256_stream_ps(void *mem_addr, v256 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 a
    Remarks

    **** VMOVNTPS v256, ymm1

    mm256_stream_si256(Void*, v256)

    Store 256-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

    Declaration
    public static void mm256_stream_si256(void *mem_addr, v256 a)
    Parameters
    Type Name Description
    Void* mem_addr
    v256 a
    Remarks

    **** VMOVNTDQ v256, ymm1
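
    Because the three stream intrinsics require a 32-byte aligned mem_addr, it can be useful to check or round an address before calling them. The sketch below works on a plain numeric address so it runs without unsafe code; in real code the value would come from casting the pointer to ulong. `StreamAlignment`, `IsAligned32`, and `AlignUp32` are hypothetical helpers, not part of the Burst API.

```csharp
// Address arithmetic for the 32-byte alignment the stream stores require.
public static class StreamAlignment
{
    // True when the address is a multiple of 32 bytes.
    public static bool IsAligned32(ulong address)
    {
        return (address & 31UL) == 0;
    }

    // Rounds the address up to the next multiple of 32 (unchanged if already aligned).
    public static ulong AlignUp32(ulong address)
    {
        return (address + 31UL) & ~31UL;
    }
}
```

    A caller could fall back to one of the storeu intrinsics when `IsAligned32` returns false.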

    mm256_sub_pd(v256, v256)

    Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

    Declaration
    public static v256 mm256_sub_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VSUBPD ymm1, ymm2, ymm3/v256 Performs an SIMD subtract of the four packed double-precision floating-point values of the second source operand from the first source operand, and stores the packed double-precision floating-point results in the destination.

    mm256_sub_ps(v256, v256)

    Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

    Declaration
    public static v256 mm256_sub_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VSUBPS ymm1, ymm2, ymm3/v256 Performs an SIMD subtract of the eight packed single-precision floating-point values in the second source operand from the first source operand, and stores the packed single-precision floating-point results in the destination.

    mm256_testc_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    Declaration
    public static int mm256_testc_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD ymm1, ymm2/v256

    mm256_testc_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    Declaration
    public static int mm256_testc_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS ymm1, ymm2/v256

    mm256_testc_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.

    Declaration
    public static int mm256_testc_si256(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32

    mm256_testnzc_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    Declaration
    public static int mm256_testnzc_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD ymm1, ymm2/v256

    mm256_testnzc_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    Declaration
    public static int mm256_testnzc_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS ymm1, ymm2/v256

    mm256_testnzc_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    Declaration
    public static int mm256_testnzc_si256(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32

    mm256_testz_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    Declaration
    public static int mm256_testz_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD ymm1, ymm2/v256

    mm256_testz_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    Declaration
    public static int mm256_testz_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS ymm1, ymm2/v256
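
    The ZF and CF rules shared by mm256_testz_ps, mm256_testc_ps, and mm256_testnzc_ps can be sketched as a scalar model that looks only at sign bits. The lanes are passed as raw 32-bit patterns so that no float reinterpretation is needed; `TestPsModel` and its members are hypothetical names, and the logic simply restates the descriptions above.

```csharp
// Scalar model of the sign-bit tests performed by VTESTPS (an illustration only).
public static class TestPsModel
{
    private const uint SignBit = 0x80000000u;

    // True when no element of (a AND b), or of (NOT a AND b), has its sign bit set.
    private static bool AllSignBitsClear(uint[] a, uint[] b, bool invertA)
    {
        for (int i = 0; i < a.Length; i++)
        {
            uint lhs = invertA ? ~a[i] : a[i];
            if (((lhs & b[i]) & SignBit) != 0) return false;
        }
        return true;
    }

    // Returns ZF.
    public static int TestZ(uint[] a, uint[] b) { return AllSignBitsClear(a, b, false) ? 1 : 0; }

    // Returns CF.
    public static int TestC(uint[] a, uint[] b) { return AllSignBitsClear(a, b, true) ? 1 : 0; }

    // Returns 1 only when both ZF and CF are zero.
    public static int TestNzc(uint[] a, uint[] b)
    {
        return (TestZ(a, b) == 0 && TestC(a, b) == 0) ? 1 : 0;
    }
}
```

    One common use, under these semantics, is passing a comparison mask as both arguments: TestZ(mask, mask) returns 1 exactly when no lane has its sign bit set.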

    mm256_testz_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the ZF value.

    Declaration
    public static int mm256_testz_si256(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    Int32
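
    The integer variants (mm256_testz_si256, mm256_testc_si256, mm256_testnzc_si256) differ from the floating-point ones in that every bit participates, not just sign bits. A minimal scalar sketch of the described flag logic follows, with four 64-bit words standing in for a v256; `TestSi256Model` and its members are hypothetical names.

```csharp
// Scalar model of the all-bits tests on 256 bits of integer data (an illustration only).
public static class TestSi256Model
{
    // ZF: 1 when (a AND b) is zero across all 256 bits.
    public static int TestZ(ulong[] a, ulong[] b)
    {
        for (int i = 0; i < 4; i++) { if ((a[i] & b[i]) != 0) return 0; }
        return 1;
    }

    // CF: 1 when ((NOT a) AND b) is zero across all 256 bits.
    public static int TestC(ulong[] a, ulong[] b)
    {
        for (int i = 0; i < 4; i++) { if ((~a[i] & b[i]) != 0) return 0; }
        return 1;
    }

    // 1 only when both ZF and CF are zero.
    public static int TestNzc(ulong[] a, ulong[] b)
    {
        return (TestZ(a, b) == 0 && TestC(a, b) == 0) ? 1 : 0;
    }
}
```

    Under this model, TestZ(v, v) is an "is all zero" check, and TestC(a, b) returns 1 when every bit set in b is also set in a.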

    mm256_undefined_pd()

    Return a 256-bit vector with undefined contents.

    Declaration
    public static v256 mm256_undefined_pd()
    Returns
    Type Description
    v256

    mm256_undefined_ps()

    Return a 256-bit vector with undefined contents.

    Declaration
    public static v256 mm256_undefined_ps()
    Returns
    Type Description
    v256

    mm256_undefined_si256()

    Return a 256-bit vector with undefined contents.

    Declaration
    public static v256 mm256_undefined_si256()
    Returns
    Type Description
    v256

    mm256_unpackhi_pd(v256, v256)

    Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_unpackhi_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VUNPCKHPD ymm1,ymm2,ymm3/v256

    mm256_unpackhi_ps(v256, v256)

    Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_unpackhi_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VUNPCKHPS ymm1,ymm2,ymm3/v256

    mm256_unpacklo_pd(v256, v256)

    Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_unpacklo_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VUNPCKLPD ymm1,ymm2,ymm3/v256

    mm256_unpacklo_ps(v256, v256)

    Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_unpacklo_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VUNPCKLPS ymm1,ymm2,ymm3/v256
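
    The per-lane interleaving done by mm256_unpacklo_ps and mm256_unpackhi_ps can be sketched with a scalar model. It follows Intel's pseudocode for VUNPCKLPS and VUNPCKHPS; `UnpackPsModel` is a hypothetical helper, and arrays stand in for v256 with index 0 as the lowest element.

```csharp
// Scalar model of 256-bit unpack/interleave for single-precision data (an illustration only).
public static class UnpackPsModel
{
    // offset 0 reads the low half of each 128-bit lane, offset 2 the high half.
    private static float[] Unpack(float[] a, float[] b, int offset)
    {
        var dst = new float[8];
        for (int lane = 0; lane < 8; lane += 4)
        {
            dst[lane + 0] = a[lane + offset];
            dst[lane + 1] = b[lane + offset];
            dst[lane + 2] = a[lane + offset + 1];
            dst[lane + 3] = b[lane + offset + 1];
        }
        return dst;
    }

    public static float[] UnpackLo(float[] a, float[] b) { return Unpack(a, b, 0); }

    public static float[] UnpackHi(float[] a, float[] b) { return Unpack(a, b, 2); }
}
```

    With a = {0..7} and b = {10..17}, UnpackLo gives {0, 10, 1, 11, 4, 14, 5, 15} and UnpackHi gives {2, 12, 3, 13, 6, 16, 7, 17}. The interleave never crosses the 128-bit lane boundary.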

    mm256_xor_pd(v256, v256)

    Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_xor_pd(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VXORPD ymm1, ymm2, ymm3/v256 Performs a bitwise logical XOR of the four packed double-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination.

    mm256_xor_ps(v256, v256)

    Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    Declaration
    public static v256 mm256_xor_ps(v256 a, v256 b)
    Parameters
    Type Name Description
    v256 a
    v256 b
    Returns
    Type Description
    v256
    Remarks

    **** VXORPS ymm1, ymm2, ymm3/v256 Performs a bitwise logical XOR of the eight packed single-precision floating-point values from the first source operand and the second source operand, and stores the result in the destination.
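
    A bitwise XOR on floating-point data operates on the raw bit patterns, which is why it is often used to flip sign bits. The scalar sketch below shows that idea for one element at a time; `XorPsModel`, `XorBits`, and `Negate` are hypothetical helpers, and the sign-mask use case is an illustration rather than something the documentation above prescribes.

```csharp
using System;

// Scalar model of XOR applied to the bit patterns of floats (an illustration only).
public static class XorPsModel
{
    // XORs the 32-bit pattern of x with mask and reinterprets the result as a float.
    public static float XorBits(float x, uint mask)
    {
        uint bits = BitConverter.ToUInt32(BitConverter.GetBytes(x), 0);
        return BitConverter.ToSingle(BitConverter.GetBytes(bits ^ mask), 0);
    }

    // Flips the sign bit of every element, which negates each value.
    public static float[] Negate(float[] a)
    {
        var dst = new float[a.Length];
        for (int i = 0; i < a.Length; i++)
        {
            dst[i] = XorBits(a[i], 0x80000000u);
        }
        return dst;
    }
}
```

    With the intrinsic, the same effect would come from XORing a vector with another vector whose elements all hold the sign-bit pattern.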

    mm256_zeroall()

    Zero the contents of all YMM registers.

    Declaration
    public static void mm256_zeroall()
    Remarks

    **** VZEROALL

    mm256_zeroupper()

    Zero the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.

    Declaration
    public static void mm256_zeroupper()
    Remarks

    **** VZEROUPPER

    mm256_zextpd128_pd256(v128)

    Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

    Declaration
    public static v256 mm256_zextpd128_pd256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    mm256_zextps128_ps256(v128)

    Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

    Declaration
    public static v256 mm256_zextps128_ps256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    mm256_zextsi128_si256(v128)

    Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

    Declaration
    public static v256 mm256_zextsi128_si256(v128 a)
    Parameters
    Type Name Description
    v128 a
    Returns
    Type Description
    v256

    permute_pd(v128, Int32)

    Shuffle double-precision (64-bit) floating-point elements in a using the control in imm8, and store the results in dst.

    Declaration
    public static v128 permute_pd(v128 a, int imm8)
    Parameters
    Type Name Description
    v128 a
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    **** VPERMILPD xmm1, xmm2/v128, imm8 Permute Double-Precision Floating-Point values in the first source operand using two 1-bit control fields in the low 2 bits of the 8-bit immediate and store results in the destination.

    permute_ps(v128, Int32)

    Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst.

    Declaration
    public static v128 permute_ps(v128 a, int imm8)
    Parameters
    Type Name Description
    v128 a
    Int32 imm8
    Returns
    Type Description
    v128
    Remarks

    **** VPERMILPS xmm1, xmm2/v128, imm8 Permute Single-Precision Floating-Point values in the first source operand using four 2-bit control fields in the 8-bit immediate and store results in the destination.
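
    The "four 2-bit control fields" can be sketched as a scalar model: field i of imm8 selects which source element is copied to destination element i. This follows Intel's pseudocode for the 128-bit VPERMILPS; `PermutePsModel` is a hypothetical helper, and a four-element array stands in for v128.

```csharp
// Scalar model of 128-bit VPERMILPS with an immediate control (an illustration only).
public static class PermutePsModel
{
    // a: four floats, index 0 = lowest element.
    public static float[] Permute(float[] a, int imm8)
    {
        var dst = new float[4];
        for (int i = 0; i < 4; i++)
        {
            int selector = (imm8 >> (2 * i)) & 3; // bits [2i+1 : 2i]
            dst[i] = a[selector];
        }
        return dst;
    }
}
```

    For example, imm8 = 0x1B (binary 00 01 10 11) reverses the elements: {0, 1, 2, 3} becomes {3, 2, 1, 0}.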

    permutevar_pd(v128, v128)

    Shuffle double-precision (64-bit) floating-point elements in a using the control in b, and store the results in dst.

    Declaration
    public static v128 permutevar_pd(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    v128
    Remarks

    **** VPERMILPD xmm1, xmm2, xmm3/v128 Permute Double-Precision Floating-Point values in the first source operand using 8-bit control fields in the low bytes of the second source operand and store results in the destination.
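
    One detail worth spelling out, going by Intel's pseudocode for the variable form of VPERMILPD, is that the selector for each 64-bit element is bit 1 of the corresponding control element, not bit 0. The scalar sketch below models that; `PermuteVarPdModel` is a hypothetical helper, with two-element arrays standing in for v128.

```csharp
// Scalar model of 128-bit VPERMILPD with a vector control (an illustration only).
public static class PermuteVarPdModel
{
    // a: two doubles; control: two 64-bit control elements. Index 0 = lowest element.
    public static double[] Permute(double[] a, ulong[] control)
    {
        var dst = new double[2];
        for (int i = 0; i < 2; i++)
        {
            int selector = (int)((control[i] >> 1) & 1UL); // bit 1 picks a[0] or a[1]
            dst[i] = a[selector];
        }
        return dst;
    }
}
```

    Under this model a control of {2, 0} swaps the two elements, while {1, 1} leaves both results equal to a[0] because bit 0 is ignored.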

    permutevar_ps(v128, v128)

    Shuffle single-precision (32-bit) floating-point elements in a using the control in b, and store the results in dst.

    Declaration
    public static v128 permutevar_ps(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    v128
    Remarks

    **** VPERMILPS xmm1, xmm2, xmm3/v128 Permute Single-Precision Floating-Point values in the first source operand using 8-bit control fields in the low bytes of the corresponding elements of the shuffle control and store results in the destination.

    testc_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    Declaration
    public static int testc_pd(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD xmm1, xmm2/v128

    testc_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    Declaration
    public static int testc_ps(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS xmm1, xmm2/v128

    testnzc_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    Declaration
    public static int testnzc_pd(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD xmm1, xmm2/v128

    testnzc_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    Declaration
    public static int testnzc_ps(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS xmm1, xmm2/v128

    testz_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    Declaration
    public static int testz_pd(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPD xmm1, xmm2/v128

    testz_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    Declaration
    public static int testz_ps(v128 a, v128 b)
    Parameters
    Type Name Description
    v128 a
    v128 b
    Returns
    Type Description
    Int32
    Remarks

    **** VTESTPS xmm1, xmm2/v128

    undefined_pd()

    Return a 128-bit vector with undefined contents.

    Declaration
    public static v128 undefined_pd()
    Returns
    Type Description
    v128

    undefined_ps()

    Return a 128-bit vector with undefined contents.

    Declaration
    public static v128 undefined_ps()
    Returns
    Type Description
    v128

    undefined_si128()

    Return a 128-bit vector with undefined contents.

    Declaration
    public static v128 undefined_si128()
    Returns
    Type Description
    v128
    Copyright © 2023 Unity Technologies — Terms of use
    Generated by DocFX on 18 October 2023