
    Class X86.Avx

    AVX intrinsics

    Inheritance
    object
    X86.Avx
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Unity.Burst.Intrinsics
    Assembly: Unity.Burst.dll
    Syntax
    public static class X86.Avx

    Properties

    Name Description
    IsAvxSupported

    Evaluates to true at compile time if AVX intrinsics are supported.

    Methods

    Name Description
    broadcast_ss(void*)

    Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

    cmp_pd(v128, v128, int)

    Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    cmp_ps(v128, v128, int)

    Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    cmp_sd(v128, v128, int)

    Compare the lower double-precision (64-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper element from a to the upper element of dst.

    cmp_ss(v128, v128, int)

    Compare the lower single-precision (32-bit) floating-point element in a and b based on the comparison operand specified by imm8, store the result in the lower element of dst, and copy the upper 3 packed elements from a to the upper elements of dst.
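    The compare intrinsics above produce a per-element bit mask rather than a boolean: an element of dst is all ones where the predicate holds and all zeros where it does not. The plain C sketch below models the packed single-precision case with scalar code. It is not Burst code; model_cmp_ps and the model_pred enum are hypothetical names used only for illustration, and the enum values are not the real imm8 encodings.

```c
#include <stdint.h>

/* Hypothetical predicate names for this sketch only. The real imm8
   encodings are defined by the instruction set, not by this enum. */
typedef enum { MODEL_EQ, MODEL_LT, MODEL_LE } model_pred;

/* Scalar model of a packed single-precision compare over 4 elements:
   each output element is all ones when the predicate holds, else zero. */
void model_cmp_ps(const float a[4], const float b[4],
                  model_pred pred, uint32_t dst[4])
{
    for (int i = 0; i < 4; ++i) {
        int hit = 0;
        switch (pred) {
        case MODEL_EQ: hit = (a[i] == b[i]); break;
        case MODEL_LT: hit = (a[i] <  b[i]); break;
        case MODEL_LE: hit = (a[i] <= b[i]); break;
        }
        dst[i] = hit ? 0xFFFFFFFFu : 0u;
    }
}
```

    A mask in this form can be fed directly to the bitwise and blend operations listed further down.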

    maskload_pd(void*, v128)

    Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

    maskload_ps(void*, v128)

    Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

    maskstore_pd(void*, v128, v128)

    Store packed double-precision (64-bit) floating-point elements from a into memory using mask.

    maskstore_ps(void*, v128, v128)

    Store packed single-precision (32-bit) floating-point elements from a into memory using mask.
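    The masked load and store entries above select elements by the high bit of each mask element. The following plain C sketch models the 4-element single-precision case. The model_ names are hypothetical, and the sketch illustrates the selection rule only, not the Burst API.

```c
#include <stdint.h>

/* Scalar model of a 4-element masked load: an element is read from memory
   only when the high bit (bit 31) of its mask element is set; otherwise
   the destination element is zero. */
void model_maskload_ps(const float *mem, const uint32_t mask[4], float dst[4])
{
    for (int i = 0; i < 4; ++i)
        dst[i] = (mask[i] & 0x80000000u) ? mem[i] : 0.0f;
}

/* Matching masked store: only elements whose mask high bit is set are
   written; the other memory locations keep their previous contents. */
void model_maskstore_ps(float *mem, const uint32_t mask[4], const float a[4])
{
    for (int i = 0; i < 4; ++i)
        if (mask[i] & 0x80000000u)
            mem[i] = a[i];
}
```

    Only bit 31 matters in this model, so a mask element of 0x7FFFFFFF still counts as "not selected".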

    mm256_add_pd(v256, v256)

    Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    mm256_add_ps(v256, v256)

    Add packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    mm256_addsub_pd(v256, v256)

    Alternately subtract and add packed double-precision (64-bit) floating-point elements in b from/to packed elements in a (even-indexed elements are subtracted, odd-indexed elements are added), and store the results in dst.

    mm256_addsub_ps(v256, v256)

    Alternately subtract and add packed single-precision (32-bit) floating-point elements in b from/to packed elements in a (even-indexed elements are subtracted, odd-indexed elements are added), and store the results in dst.
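    The add/subtract pattern is easy to get backwards, so here is a minimal scalar sketch in plain C of the 8-element single-precision form. The function name model_addsub_ps is hypothetical; it models the element pattern only.

```c
/* Scalar model of the 8-element add/sub pattern: even-indexed elements
   hold a[i] - b[i], odd-indexed elements hold a[i] + b[i]. */
void model_addsub_ps(const float a[8], const float b[8], float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = (i % 2 == 0) ? a[i] - b[i] : a[i] + b[i];
}
```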

    mm256_and_pd(v256, v256)

    Compute the bitwise AND of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    mm256_and_ps(v256, v256)

    Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    mm256_andnot_pd(v256, v256)

    Compute the bitwise NOT of packed double-precision (64-bit) floating-point elements in a and then AND with b, and store the results in dst.

    mm256_andnot_ps(v256, v256)

    Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in a and then AND with b, and store the results in dst.
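    In the andnot operations it is the first operand, a, that is inverted. A common use is clearing the sign bit to obtain an absolute value. The plain C sketch below shows this on a single 32-bit element; both function names are hypothetical and the code is a scalar illustration rather than Burst code.

```c
#include <stdint.h>
#include <string.h>

/* Model of andnot on one 32-bit element: (NOT a) AND b. */
uint32_t model_andnot_u32(uint32_t a, uint32_t b)
{
    return ~a & b;
}

/* With a set to the sign-bit mask, andnot clears the sign of a float,
   which yields its absolute value. */
float model_abs_via_andnot(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret float as bits */
    bits = model_andnot_u32(0x80000000u, bits);
    memcpy(&x, &bits, sizeof x);               /* reinterpret bits as float */
    return x;
}
```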

    mm256_blend_pd(v256, v256, int)

    Blend packed double-precision (64-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

    mm256_blend_ps(v256, v256, int)

    Blend packed single-precision (32-bit) floating-point elements from a and b using control mask imm8, and store the results in dst.

    mm256_blendv_pd(v256, v256, v256)

    Blend packed double-precision (64-bit) floating-point elements from a and b using mask, and store the results in dst.

    mm256_blendv_ps(v256, v256, v256)

    Blend packed single-precision (32-bit) floating-point elements from a and b using mask, and store the results in dst.
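    The two blend families differ only in where the selector comes from: an immediate bit per element, or the high bit of each element of a mask vector. In both cases a set selector picks the element from b. A minimal plain C sketch of the 8-element single-precision forms follows; the model_ names are hypothetical.

```c
#include <stdint.h>

/* Immediate blend: bit i of imm8 set selects b[i], clear selects a[i]. */
void model_blend_ps(const float a[8], const float b[8], int imm8, float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = ((imm8 >> i) & 1) ? b[i] : a[i];
}

/* Variable blend: the high bit of mask[i] set selects b[i], clear selects a[i]. */
void model_blendv_ps(const float a[8], const float b[8],
                     const uint32_t mask[8], float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = (mask[i] & 0x80000000u) ? b[i] : a[i];
}
```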

    mm256_broadcast_pd(void*)

    Broadcast 128 bits from memory (composed of 2 packed double-precision (64-bit) floating-point elements) to all elements of dst.

    mm256_broadcast_ps(void*)

    Broadcast 128 bits from memory (composed of 4 packed single-precision (32-bit) floating-point elements) to all elements of dst.

    mm256_broadcast_sd(void*)

    Broadcast a double-precision (64-bit) floating-point element from memory to all elements of dst.

    mm256_broadcast_ss(void*)

    Broadcast a single-precision (32-bit) floating-point element from memory to all elements of dst.

    mm256_castpd128_pd256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castpd256_pd128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castpd_ps(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castpd_si256(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castps128_ps256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castps256_ps128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castps_pd(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castps_si256(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castsi128_si256(v128)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castsi256_pd(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castsi256_ps(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_castsi256_si128(v256)

    For compatibility with C++ code only. This is a no-op in Burst.

    mm256_ceil_pd(v256)

    Round the packed double-precision (64-bit) floating-point elements in a up to an integer value, and store the results as packed double-precision floating-point elements in dst.

    mm256_ceil_ps(v256)

    Round the packed single-precision (32-bit) floating-point elements in a up to an integer value, and store the results as packed single-precision floating-point elements in dst.

    mm256_cmp_pd(v256, v256, int)

    Compare packed double-precision (64-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    mm256_cmp_ps(v256, v256, int)

    Compare packed single-precision (32-bit) floating-point elements in a and b based on the comparison operand specified by imm8, and store the results in dst.

    mm256_cvtepi32_pd(v128)

    Convert packed 32-bit integers in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

    mm256_cvtepi32_ps(v256)

    Convert packed 32-bit integers in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

    mm256_cvtpd_epi32(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

    mm256_cvtpd_ps(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed single-precision (32-bit) floating-point elements, and store the results in dst.

    mm256_cvtps_epi32(v256)

    Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers, and store the results in dst.

    mm256_cvtps_pd(v128)

    Convert packed single-precision (32-bit) floating-point elements in a to packed double-precision (64-bit) floating-point elements, and store the results in dst.

    mm256_cvtss_f32(v256)

    Copy the lower single-precision (32-bit) floating-point element of a to dst.

    mm256_cvttpd_epi32(v256)

    Convert packed double-precision (64-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

    mm256_cvttps_epi32(v256)

    Convert packed single-precision (32-bit) floating-point elements in a to packed 32-bit integers with truncation, and store the results in dst.

    mm256_div_pd(v256, v256)

    Divide packed double-precision (64-bit) floating-point elements in a by packed elements in b, and store the results in dst.

    mm256_div_ps(v256, v256)

    Divide packed single-precision (32-bit) floating-point elements in a by packed elements in b, and store the results in dst.

    mm256_dp_ps(v256, v256, int)

    Conditionally multiply the packed single-precision (32-bit) floating-point elements in a and b using the high 4 bits in imm8, sum the four products, and conditionally store the sum in dst using the low 4 bits of imm8.
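    The role of the two halves of imm8 in the dot product can be sketched in plain C as below. The sketch assumes the operation is applied independently to each 128-bit lane of four elements with the same imm8, as in the corresponding C intrinsic. The model_ names are hypothetical, and the sequential summation order is a simplification, so rounding may differ from hardware.

```c
/* Scalar model of the conditional dot product on ONE 128-bit lane
   (4 floats). High nibble of imm8: which products enter the sum.
   Low nibble of imm8: which destination elements receive the sum. */
void model_dp_lane(const float a[4], const float b[4], int imm8, float dst[4])
{
    float sum = 0.0f;
    for (int j = 0; j < 4; ++j)
        if ((imm8 >> (4 + j)) & 1)
            sum += a[j] * b[j];
    for (int j = 0; j < 4; ++j)
        dst[j] = ((imm8 >> j) & 1) ? sum : 0.0f;
}

/* 256-bit form: the same imm8 is applied to the low and high lanes. */
void model_dp_ps256(const float a[8], const float b[8], int imm8, float dst[8])
{
    model_dp_lane(a, b, imm8, dst);
    model_dp_lane(a + 4, b + 4, imm8, dst + 4);
}
```

    For example, imm8 = 0x71 sums the first three products of each lane and writes the sum to the lowest element of that lane only.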

    mm256_extract_epi32(v256, int)

    Extract a 32-bit integer from a, selected with index (which must be a constant), and store the result in dst.

    mm256_extract_epi64(v256, int)

    Extract a 64-bit integer from a, selected with index (which must be a constant), and store the result in dst.

    mm256_extractf128_pd(v256, int)

    Extract 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

    mm256_extractf128_ps(v256, int)

    Extract 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from a, selected with imm8, and store the result in dst.

    mm256_extractf128_si256(v256, int)

    Extract 128 bits (composed of integer data) from a, selected with imm8, and store the result in dst.

    mm256_floor_pd(v256)

    Round the packed double-precision (64-bit) floating-point elements in a down to an integer value, and store the results as packed double-precision floating-point elements in dst.

    mm256_floor_ps(v256)

    Round the packed single-precision (32-bit) floating-point elements in a down to an integer value, and store the results as packed single-precision floating-point elements in dst.

    mm256_hadd_pd(v256, v256)

    Horizontally add adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

    mm256_hadd_ps(v256, v256)

    Horizontally add adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.

    mm256_hsub_pd(v256, v256)

    Horizontally subtract adjacent pairs of double-precision (64-bit) floating-point elements in a and b, and pack the results in dst.

    mm256_hsub_ps(v256, v256)

    Horizontally subtract adjacent pairs of single-precision (32-bit) floating-point elements in a and b, and pack the results in dst.
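    The packing order of the horizontal operations is not obvious from the one-line descriptions: in the 256-bit forms, results from a and b are interleaved per 128-bit lane rather than across the whole vector. A minimal plain C sketch of the single-precision horizontal add follows, with the hypothetical name model_hadd_ps. The horizontal subtract has the same layout, with each pair computed as first minus second.

```c
/* Scalar model of the 8-element horizontal add. Within each 128-bit lane
   the output is { a0+a1, a2+a3, b0+b1, b2+b3 } for that lane. */
void model_hadd_ps(const float a[8], const float b[8], float dst[8])
{
    for (int lane = 0; lane < 2; ++lane) {
        const float *al = a + 4 * lane;
        const float *bl = b + 4 * lane;
        float *d = dst + 4 * lane;
        d[0] = al[0] + al[1];
        d[1] = al[2] + al[3];
        d[2] = bl[0] + bl[1];
        d[3] = bl[2] + bl[3];
    }
}
```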

    mm256_insert_epi16(v256, int, int)

    Copy a to dst, and insert the 16-bit integer i into dst at the location specified by index (which must be a constant).

    mm256_insert_epi32(v256, int, int)

    Copy a to dst, and insert the 32-bit integer i into dst at the location specified by index (which must be a constant).

    mm256_insert_epi64(v256, long, int)

    Copy a to dst, and insert the 64-bit integer i into dst at the location specified by index (which must be a constant).

    mm256_insert_epi8(v256, int, int)

    Copy a to dst, and insert the 8-bit integer i into dst at the location specified by index (which must be a constant).

    mm256_insertf128_pd(v256, v128, int)

    Copy a to dst, then insert 128 bits (composed of 2 packed double-precision (64-bit) floating-point elements) from b into dst at the location specified by imm8.

    mm256_insertf128_ps(v256, v128, int)

    Copy a to dst, then insert 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from b into dst at the location specified by imm8.

    mm256_insertf128_si256(v256, v128, int)

    Copy a to dst, then insert 128 bits of integer data from b into dst at the location specified by imm8.
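    For the 128-bit extract and insert operations, imm8 selects which half of the 256-bit vector is read or replaced. The plain C sketch below assumes, as in the corresponding C intrinsics, that only the lowest bit of imm8 is used (0 = low half, 1 = high half). The model_ names are hypothetical.

```c
#include <string.h>

/* Extract the low (imm8 bit 0 = 0) or high (imm8 bit 0 = 1) four floats. */
void model_extractf128_ps(const float a[8], int imm8, float dst[4])
{
    memcpy(dst, a + 4 * (imm8 & 1), 4 * sizeof(float));
}

/* Copy a to dst, then overwrite the selected half with the four floats of b. */
void model_insertf128_ps(const float a[8], const float b[4], int imm8, float dst[8])
{
    memcpy(dst, a, 8 * sizeof(float));
    memcpy(dst + 4 * (imm8 & 1), b, 4 * sizeof(float));
}
```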

    mm256_lddqu_si256(void*)

    Load 256-bits of integer data from unaligned memory into dst. This intrinsic may perform better than mm256_loadu_si256 when the data crosses a cache line boundary.

    mm256_load_pd(void*)

    Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into dst.

    mm256_load_ps(void*)

    Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into dst.

    mm256_load_si256(void*)

    Load 256-bits (composed of 8 packed 32-bit integer elements) from memory into dst.

    mm256_loadu2_m128(void*, void*)

    Load two 128-bit values (composed of 4 packed single-precision (32-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_loadu2_m128d(void*, void*)

    Load two 128-bit values (composed of 2 packed double-precision (64-bit) floating-point elements) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_loadu2_m128i(void*, void*)

    Load two 128-bit values (composed of integer data) from memory, and combine them into a 256-bit value in dst. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_loadu_pd(void*)

    Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into dst.

    mm256_loadu_ps(void*)

    Load 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from memory into dst.

    mm256_loadu_si256(void*)

    Load 256-bits (composed of 8 packed 32-bit integer elements) from memory into dst.

    mm256_maskload_pd(void*, v256)

    Load packed double-precision (64-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

    mm256_maskload_ps(void*, v256)

    Load packed single-precision (32-bit) floating-point elements from memory into dst using mask (elements are zeroed out when the high bit of the corresponding element is not set).

    mm256_maskstore_pd(void*, v256, v256)

    Store packed double-precision (64-bit) floating-point elements from a into memory using mask.

    mm256_maskstore_ps(void*, v256, v256)

    Store packed single-precision (32-bit) floating-point elements from a into memory using mask.

    mm256_max_pd(v256, v256)

    Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed maximum values in dst.

    mm256_max_ps(v256, v256)

    Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed maximum values in dst.

    mm256_min_pd(v256, v256)

    Compare packed double-precision (64-bit) floating-point elements in a and b, and store packed minimum values in dst.

    mm256_min_ps(v256, v256)

    Compare packed single-precision (32-bit) floating-point elements in a and b, and store packed minimum values in dst.

    mm256_movedup_pd(v256)

    Duplicate even-indexed double-precision (64-bit) floating-point elements from a, and store the results in dst.

    mm256_movehdup_ps(v256)

    Duplicate odd-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.

    mm256_moveldup_ps(v256)

    Duplicate even-indexed single-precision (32-bit) floating-point elements from a, and store the results in dst.
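    The duplicate operations copy each even-indexed (or odd-indexed) element into the neighbouring slot of its pair. A minimal plain C sketch of the two single-precision forms, using hypothetical model_ names:

```c
/* Even-indexed duplicate: dst = { a0, a0, a2, a2, a4, a4, a6, a6 }. */
void model_moveldup_ps(const float a[8], float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = a[i & ~1];   /* clear bit 0: round index down to even */
}

/* Odd-indexed duplicate: dst = { a1, a1, a3, a3, a5, a5, a7, a7 }. */
void model_movehdup_ps(const float a[8], float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = a[i | 1];    /* set bit 0: round index up to odd */
}
```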

    mm256_movemask_pd(v256)

    Set each bit of mask dst based on the most significant bit of the corresponding packed double-precision (64-bit) floating-point element in a.

    mm256_movemask_ps(v256)

    Set each bit of mask dst based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in a.
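    The movemask operations collapse a vector into a small integer, one bit per element. The plain C sketch below models the 8-element single-precision form. The name model_movemask_ps is hypothetical, and the sketch assumes bit i of the result corresponds to element i.

```c
#include <stdint.h>
#include <string.h>

/* Scalar model: bit i of the result is the sign bit (bit 31) of a[i]. */
int model_movemask_ps(const float a[8])
{
    int mask = 0;
    for (int i = 0; i < 8; ++i) {
        uint32_t bits;
        memcpy(&bits, &a[i], sizeof bits);   /* reinterpret float as bits */
        mask |= (int)(bits >> 31) << i;
    }
    return mask;
}
```

    Because only the sign bit is inspected, -0.0f contributes a set bit even though it compares equal to zero.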

    mm256_mul_pd(v256, v256)

    Multiply packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    mm256_mul_ps(v256, v256)

    Multiply packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    mm256_or_pd(v256, v256)

    Compute the bitwise OR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    mm256_or_ps(v256, v256)

    Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    mm256_permute2f128_pd(v256, v256, int)

    Shuffle 128-bits (composed of 2 packed double-precision (64-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

    mm256_permute2f128_ps(v256, v256, int)

    Shuffle 128-bits (composed of 4 packed single-precision (32-bit) floating-point elements) selected by imm8 from a and b, and store the results in dst.

    mm256_permute2f128_si256(v256, v256, int)

    Shuffle 128-bits (composed of integer data) selected by imm8 from a and b, and store the results in dst.

    mm256_permute_pd(v256, int)

    Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

    mm256_permute_ps(v256, int)

    Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in imm8, and store the results in dst.

    mm256_permutevar_pd(v256, v256)

    Shuffle double-precision (64-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.

    mm256_permutevar_ps(v256, v256)

    Shuffle single-precision (32-bit) floating-point elements in a within 128-bit lanes using the control in b, and store the results in dst.
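    For the single-precision permute with an immediate, imm8 holds four 2-bit selectors, one per destination element, and the same selectors are applied inside each 128-bit lane. A minimal plain C sketch with the hypothetical name model_permute_ps:

```c
/* Scalar model of the in-lane permute: destination element j of each lane
   takes the source element of the same lane chosen by bits [2j+1:2j]
   of imm8. */
void model_permute_ps(const float a[8], int imm8, float dst[8])
{
    for (int lane = 0; lane < 8; lane += 4)
        for (int j = 0; j < 4; ++j)
            dst[lane + j] = a[lane + ((imm8 >> (2 * j)) & 3)];
}
```

    With imm8 = 0x1B the four elements of each lane are reversed, and elements never cross from one lane to the other.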

    mm256_rcp_ps(v256)

    Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

    mm256_round_pd(v256, int)

    Round the packed double-precision (64-bit) floating-point elements in a using the rounding parameter, and store the results as packed double-precision floating-point elements in dst.

    mm256_round_ps(v256, int)

    Round the packed single-precision (32-bit) floating-point elements in a using the rounding parameter, and store the results as packed single-precision floating-point elements in dst.

    mm256_rsqrt_ps(v256)

    Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst. The maximum relative error for this approximation is less than 1.5*2^-12.

    mm256_set1_epi16(short)

    Broadcast 16-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastw instruction.

    mm256_set1_epi32(int)

    Broadcast 32-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastd instruction.

    mm256_set1_epi64x(long)

    Broadcast 64-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastq instruction.

    mm256_set1_epi8(byte)

    Broadcast 8-bit integer a to all elements of dst. This intrinsic may generate the vpbroadcastb instruction.

    mm256_set1_pd(double)

    Broadcast double-precision (64-bit) floating-point value a to all elements of dst.

    mm256_set1_ps(float)

    Broadcast single-precision (32-bit) floating-point value a to all elements of dst.

    mm256_set_epi16(short, short, short, short, short, short, short, short, short, short, short, short, short, short, short, short)

    Set packed short elements in dst with the supplied values.

    mm256_set_epi32(int, int, int, int, int, int, int, int)

    Set packed int elements in dst with the supplied values.

    mm256_set_epi64x(long, long, long, long)

    Set packed 64-bit integers in dst with the supplied values.

    mm256_set_epi8(byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte)

    Set packed byte elements in dst with the supplied values.

    mm256_set_m128(v128, v128)

    Set packed v256 vector with the supplied values.

    mm256_set_m128d(v128, v128)

    Set packed v256 vector with the supplied values.

    mm256_set_m128i(v128, v128)

    Set packed v256 vector with the supplied values.

    mm256_set_pd(double, double, double, double)

    Set packed double-precision (64-bit) floating-point elements in dst with the supplied values.

    mm256_set_ps(float, float, float, float, float, float, float, float)

    Set packed single-precision (32-bit) floating-point elements in dst with the supplied values.

    mm256_setr_epi16(short, short, short, short, short, short, short, short, short, short, short, short, short, short, short, short)

    Set packed short elements in dst with the supplied values in reverse order.

    mm256_setr_epi32(int, int, int, int, int, int, int, int)

    Set packed int elements in dst with the supplied values in reverse order.

    mm256_setr_epi64x(long, long, long, long)

    Set packed 64-bit integers in dst with the supplied values in reverse order.

    mm256_setr_epi8(byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte, byte)

    Set packed byte elements in dst with the supplied values in reverse order.

    mm256_setr_m128(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    mm256_setr_m128d(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    mm256_setr_m128i(v128, v128)

    Set packed v256 vector with the supplied values in reverse order.

    mm256_setr_pd(double, double, double, double)

    Set packed double-precision (64-bit) floating-point elements in dst with the supplied values in reverse order.

    mm256_setr_ps(float, float, float, float, float, float, float, float)

    Set packed single-precision (32-bit) floating-point elements in dst with the supplied values in reverse order.
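    The difference between the set and setr families is only the argument order. The plain C sketch below assumes the Burst overloads follow the ordering of the corresponding C intrinsics, where set takes the highest element first and setr takes the lowest element first. The model_ names and parameter names are hypothetical.

```c
/* set: the first argument becomes the highest element (index 7). */
void model_set_ps(float e7, float e6, float e5, float e4,
                  float e3, float e2, float e1, float e0, float dst[8])
{
    const float v[8] = { e0, e1, e2, e3, e4, e5, e6, e7 };
    for (int i = 0; i < 8; ++i)
        dst[i] = v[i];
}

/* setr: the first argument becomes the lowest element (index 0). */
void model_setr_ps(float e0, float e1, float e2, float e3,
                   float e4, float e5, float e6, float e7, float dst[8])
{
    model_set_ps(e7, e6, e5, e4, e3, e2, e1, e0, dst);
}
```

    Passing the same argument list to both therefore produces vectors that are element-wise reversed with respect to each other.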

    mm256_setzero_pd()

    Return a vector with all elements set to zero.

    mm256_setzero_ps()

    Return a vector with all elements set to zero.

    mm256_setzero_si256()

    Return a vector with all elements set to zero.

    mm256_shuffle_pd(v256, v256, int)

    Shuffle double-precision (64-bit) floating-point elements within 128-bit lanes using the control in imm8, and store the results in dst.

    mm256_shuffle_ps(v256, v256, int)

    Shuffle single-precision (32-bit) floating-point elements in a and b within 128-bit lanes using the control in imm8, and store the results in dst.
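    Unlike the single-source permute, the two-source single-precision shuffle fills the low two elements of each lane from a and the high two from b. The plain C sketch below models this; the name model_shuffle_ps is hypothetical.

```c
/* Scalar model of the two-source in-lane shuffle. Per 128-bit lane:
   dst[0] and dst[1] are picked from a, dst[2] and dst[3] from b,
   each by a 2-bit selector taken from imm8. */
void model_shuffle_ps(const float a[8], const float b[8], int imm8, float dst[8])
{
    for (int lane = 0; lane < 8; lane += 4) {
        dst[lane + 0] = a[lane + ((imm8 >> 0) & 3)];
        dst[lane + 1] = a[lane + ((imm8 >> 2) & 3)];
        dst[lane + 2] = b[lane + ((imm8 >> 4) & 3)];
        dst[lane + 3] = b[lane + ((imm8 >> 6) & 3)];
    }
}
```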

    mm256_sqrt_pd(v256)

    Compute the square root of packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

    mm256_sqrt_ps(v256)

    Compute the square root of packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

    mm256_store_pd(void*, v256)

    Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.

    mm256_store_ps(void*, v256)

    Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory.

    mm256_store_si256(void*, v256)

    Store 256-bits (composed of 8 packed 32-bit integer elements) from a into memory.

    mm256_storeu2_m128(void*, void*, v256)

    Store the high and low 128-bit halves (each composed of 4 packed single-precision (32-bit) floating-point elements) from a into memory at two different 128-bit locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_storeu2_m128d(void*, void*, v256)

    Store the high and low 128-bit halves (each composed of 2 packed double-precision (64-bit) floating-point elements) from a into memory at two different 128-bit locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_storeu2_m128i(void*, void*, v256)

    Store the high and low 128-bit halves (each composed of integer data) from a into memory at two different 128-bit locations. hiaddr and loaddr do not need to be aligned on any particular boundary.

    mm256_storeu_pd(void*, v256)

    Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory.

    mm256_storeu_ps(void*, v256)

    Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory.

    mm256_storeu_si256(void*, v256)

    Store 256-bits (composed of 8 packed 32-bit integer elements) from a into memory.

    mm256_stream_pd(void*, v256)

    Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

    mm256_stream_ps(void*, v256)

    Store 256-bits (composed of 8 packed single-precision (32-bit) floating-point elements) from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.

    mm256_stream_si256(void*, v256)

    Store 256-bits of integer data from a into memory using a non-temporal memory hint. mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
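    Since the streaming stores require a 32-byte-aligned address, it can be useful to check a pointer before using them. The helper below is a minimal plain C sketch of such a check. The name is_aligned_32 is hypothetical and is not part of the Burst API.

```c
#include <stdint.h>

/* Returns 1 when p is aligned on a 32-byte boundary, else 0. */
int is_aligned_32(const void *p)
{
    return ((uintptr_t)p & 31u) == 0;
}
```

    A float pointer that is 32-byte aligned stays aligned when advanced by multiples of 8 elements, which matches the width of one 256-bit single-precision vector.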

    mm256_sub_pd(v256, v256)

    Subtract packed double-precision (64-bit) floating-point elements in b from packed double-precision (64-bit) floating-point elements in a, and store the results in dst.

    mm256_sub_ps(v256, v256)

    Subtract packed single-precision (32-bit) floating-point elements in b from packed single-precision (32-bit) floating-point elements in a, and store the results in dst.

    mm256_testc_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    mm256_testc_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    mm256_testc_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the CF value.

    mm256_testnzc_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    mm256_testnzc_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    mm256_testnzc_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    mm256_testz_pd(v256, v256)

    Compute the bitwise AND of 256 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    mm256_testz_ps(v256, v256)

    Compute the bitwise AND of 256 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 256-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    mm256_testz_si256(v256, v256)

    Compute the bitwise AND of 256 bits (representing integer data) in a and b, and set ZF to 1 if the result is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, and set CF to 1 if the result is zero, otherwise set CF to 0. Return the ZF value.
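    The three integer test operations share the same two intermediate flags and differ only in which one they return. The plain C sketch below models them on a 256-bit value stored as four 64-bit words. The model_ names are hypothetical. The floating-point variants described above follow the same pattern but inspect only the sign bit of each element.

```c
#include <stdint.h>

/* ZF: 1 when (a AND b) is all zeros. */
int model_testz_si256(const uint64_t a[4], const uint64_t b[4])
{
    uint64_t acc = 0;
    for (int i = 0; i < 4; ++i)
        acc |= a[i] & b[i];
    return acc == 0;
}

/* CF: 1 when ((NOT a) AND b) is all zeros, i.e. every bit set in b
   is also set in a. */
int model_testc_si256(const uint64_t a[4], const uint64_t b[4])
{
    uint64_t acc = 0;
    for (int i = 0; i < 4; ++i)
        acc |= ~a[i] & b[i];
    return acc == 0;
}

/* 1 only when both ZF and CF are zero. */
int model_testnzc_si256(const uint64_t a[4], const uint64_t b[4])
{
    return !model_testz_si256(a, b) && !model_testc_si256(a, b);
}
```

    Read this way, testz answers "do a and b share no set bits?" and testc answers "is b a subset of a?".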

    mm256_undefined_pd()

    Return a 256-bit vector with undefined contents.

    mm256_undefined_ps()

    Return a 256-bit vector with undefined contents.

    mm256_undefined_si256()

    Return a 256-bit vector with undefined contents.

    mm256_unpackhi_pd(v256, v256)

    Unpack and interleave double-precision (64-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

    mm256_unpackhi_ps(v256, v256)

    Unpack and interleave single-precision (32-bit) floating-point elements from the high half of each 128-bit lane in a and b, and store the results in dst.

    mm256_unpacklo_pd(v256, v256)

    Unpack and interleave double-precision (64-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.

    mm256_unpacklo_ps(v256, v256)

    Unpack and interleave single-precision (32-bit) floating-point elements from the low half of each 128-bit lane in a and b, and store the results in dst.
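    The interleaving performed by the unpack operations is per 128-bit lane, which the following plain C sketch of the single-precision forms makes explicit. The model_ names are hypothetical.

```c
/* Low-half interleave: each lane of dst is { a0, b0, a1, b1 } of that lane. */
void model_unpacklo_ps(const float a[8], const float b[8], float dst[8])
{
    for (int lane = 0; lane < 8; lane += 4) {
        dst[lane + 0] = a[lane + 0];
        dst[lane + 1] = b[lane + 0];
        dst[lane + 2] = a[lane + 1];
        dst[lane + 3] = b[lane + 1];
    }
}

/* High-half interleave: each lane of dst is { a2, b2, a3, b3 } of that lane. */
void model_unpackhi_ps(const float a[8], const float b[8], float dst[8])
{
    for (int lane = 0; lane < 8; lane += 4) {
        dst[lane + 0] = a[lane + 2];
        dst[lane + 1] = b[lane + 2];
        dst[lane + 2] = a[lane + 3];
        dst[lane + 3] = b[lane + 3];
    }
}
```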

    mm256_xor_pd(v256, v256)

    Compute the bitwise XOR of packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

    mm256_xor_ps(v256, v256)

    Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in a and b, and store the results in dst.

    mm256_zeroall()

    Zero the contents of all YMM registers.

    mm256_zeroupper()

    Zero the upper 128 bits of all YMM registers; the lower 128-bits of the registers are unmodified.

    mm256_zextpd128_pd256(v128)

    Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.

    mm256_zextps128_ps256(v128)

    Cast a vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, so it has zero latency.

    mm256_zextsi128_si256(v128)

    Casts vector of type v128 to type v256; the upper 128 bits of the result are zeroed. This intrinsic is only used for compilation and does not generate any instructions, thus it has zero latency.
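
    The effect of the three zero-extending casts on the stored bits can be sketched as follows. This is a byte-level model in Python, not Burst C#: the 128-bit input is 16 bytes in memory order, with the lower 128 bits of the result coming first. The name `zext128_to_256_model` is hypothetical.

```python
def zext128_to_256_model(v128: bytes) -> bytes:
    """Widen a 16-byte vector to 32 bytes, filling the upper half with zeros."""
    if len(v128) != 16:
        raise ValueError("expected exactly 16 bytes (128 bits)")
    # Lower 128 bits are copied unchanged; upper 128 bits are zero.
    return v128 + bytes(16)
```

    The three variants differ only in the element type they nominally carry; in this model the bit-level result is the same for all of them.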

    permute_pd(v128, int)

    Shuffle double-precision (64-bit) floating-point elements in a using the control in imm8, and store the results in dst.

    permute_ps(v128, int)

    Shuffle single-precision (32-bit) floating-point elements in a using the control in imm8, and store the results in dst.
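
    How imm8 selects source elements in the two immediate-controlled shuffles can be shown with a scalar model. This is a sketch in Python, not Burst C#: vectors are lists with element 0 as the lowest element, and the names `permute_pd_model` and `permute_ps_model` are hypothetical. The model assumes the usual encoding for these instructions: one control bit per result element for the double-precision form, and two control bits per result element for the single-precision form.

```python
def permute_pd_model(a, imm8):
    """Result element i takes a[bit i of imm8] (two 64-bit elements)."""
    return [a[(imm8 >> i) & 1] for i in range(2)]


def permute_ps_model(a, imm8):
    """Result element i takes a[bits 2i+1..2i of imm8] (four 32-bit elements)."""
    return [a[(imm8 >> (2 * i)) & 3] for i in range(4)]
```

    For example, in this model an imm8 of 0b00011011 reverses the four single-precision elements, and an imm8 of 0 broadcasts element 0 to every position.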

    permutevar_pd(v128, v128)

    Shuffle double-precision (64-bit) floating-point elements in a using the control in b, and store the results in dst.

    permutevar_ps(v128, v128)

    Shuffle single-precision (32-bit) floating-point elements in a using the control in b, and store the results in dst.
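
    The variable-control shuffles read their selectors from the integer bit patterns of b instead of an immediate. A scalar sketch in Python, not Burst C#, with the control vector given as a list of integers; the names `permutevar_pd_model` and `permutevar_ps_model` are hypothetical. The model assumes the encoding of the underlying instructions: the single-precision form uses the low two bits of each 32-bit control element, while the double-precision form uses bit 1 (not bit 0) of each 64-bit control element.

```python
def permutevar_ps_model(a, ctrl):
    """Result element i takes a[low two bits of ctrl[i]]."""
    return [a[c & 3] for c in ctrl]


def permutevar_pd_model(a, ctrl):
    """Result element i takes a[bit 1 of ctrl[i]]; bit 0 is ignored."""
    return [a[(c >> 1) & 1] for c in ctrl]
```

    The bit-1 selector in the double-precision form is easy to miss: in this model a control value of 1 still selects element 0, and a value of 2 selects element 1.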

    testc_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    testc_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the CF value.

    testnzc_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    testnzc_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return 1 if both the ZF and CF values are zero, otherwise return 0.

    testz_pd(v128, v128)

    Compute the bitwise AND of 128 bits (representing double-precision (64-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 64-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    testz_ps(v128, v128)

    Compute the bitwise AND of 128 bits (representing single-precision (32-bit) floating-point elements) in a and b, producing an intermediate 128-bit value, and set ZF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set ZF to 0. Compute the bitwise NOT of a and then AND with b, producing an intermediate value, and set CF to 1 if the sign bit of each 32-bit element in the intermediate value is zero, otherwise set CF to 0. Return the ZF value.

    undefined_pd()

    Return a 128-bit vector with undefined contents.

    undefined_ps()

    Return a 128-bit vector with undefined contents.

    undefined_si128()

    Return a 128-bit vector with undefined contents.

    Copyright © 2025 Unity Technologies — Trademarks and terms of use