
    Class X86.Sse

    SSE intrinsics

    Inheritance
    object
    X86.Sse
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Unity.Burst.Intrinsics
    Assembly: Unity.Burst.dll
    Syntax
    public static class X86.Sse

    Properties

    Name Description
    IsSseSupported

    Evaluates to true at compile time if SSE intrinsics are supported.

    Methods

    Name Description
    SHUFFLE(int, int, int, int)

    Return a shuffle immediate suitable for use with shuffle_ps and similar instructions.
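
As a rough illustration, the immediate packs four 2-bit lane selectors into one byte, in the same layout as the C macro `_MM_SHUFFLE`. The sketch below is a plain-Python model, not Burst code. The name `shuffle_imm` is hypothetical, and the argument order (highest selector first, lowest last) is an assumption based on that macro.

```python
def shuffle_imm(d: int, c: int, b: int, a: int) -> int:
    """Pack four 2-bit lane indices into an 8-bit immediate.

    Assumed layout: `a` occupies bits 0-1 (lowest lane of the result),
    `d` occupies bits 6-7 (highest lane of the result).
    """
    return ((d & 3) << 6) | ((c & 3) << 4) | ((b & 3) << 2) | (a & 3)


identity = shuffle_imm(3, 2, 1, 0)  # every lane selects its own index
reverse = shuffle_imm(0, 1, 2, 3)   # lane 0 selects index 3, and so on
print(hex(identity), hex(reverse))  # prints: 0xe4 0x1b
```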

    TRANSPOSE4_PS(ref v128, ref v128, ref v128, ref v128)

    Transpose a 4x4 matrix of single-precision (32-bit) floating-point values in place (_MM_TRANSPOSE4_PS).
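
One common way to build a 4x4 transpose from the unpack and move operations listed further down this page can be sketched in plain Python. This is a model of the lane movements only, with each vector written as a list whose index 0 is the lowest lane. It is not necessarily how Burst implements the method, and `transpose4` is a hypothetical name.

```python
def unpacklo_ps(a, b):
    # Interleave the low halves: a0, b0, a1, b1
    return [a[0], b[0], a[1], b[1]]


def unpackhi_ps(a, b):
    # Interleave the high halves: a2, b2, a3, b3
    return [a[2], b[2], a[3], b[3]]


def movelh_ps(a, b):
    # Low half of a stays low, low half of b moves to the high half
    return [a[0], a[1], b[0], b[1]]


def movehl_ps(a, b):
    # High half of b moves to the low half, high half of a stays high
    return [b[2], b[3], a[2], a[3]]


def transpose4(r0, r1, r2, r3):
    """Return the four rows of the transposed matrix."""
    t0 = unpacklo_ps(r0, r1)
    t1 = unpackhi_ps(r0, r1)
    t2 = unpacklo_ps(r2, r3)
    t3 = unpackhi_ps(r2, r3)
    return (movelh_ps(t0, t2), movehl_ps(t2, t0),
            movelh_ps(t1, t3), movehl_ps(t3, t1))


rows = transpose4([0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15])
for row in rows:
    print(row)
```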

    add_ps(v128, v128)

    Add packed single-precision (32-bit) floating-point elements in "a" and "b", and store the results in "dst".

    add_ss(v128, v128)

    Add the lower single-precision (32-bit) floating-point element in "a" and "b", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".
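
The difference between the packed (`_ps`) and scalar (`_ss`) forms, which recurs throughout this page, can be sketched with a plain-Python model. Vectors are lists with index 0 as the lowest lane. The model uses Python floats, so it does not reproduce single-precision rounding.

```python
def add_ps(a, b):
    """Packed form: operate on all four lanes."""
    return [x + y for x, y in zip(a, b)]


def add_ss(a, b):
    """Scalar form: operate on lane 0 only, copy lanes 1-3 from a."""
    return [a[0] + b[0]] + list(a[1:])


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(add_ps(a, b))  # prints: [11.0, 22.0, 33.0, 44.0]
print(add_ss(a, b))  # prints: [11.0, 2.0, 3.0, 4.0]
```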

    and_ps(v128, v128)

    Compute the bitwise AND of packed single-precision (32-bit) floating-point elements in "a" and "b", and store the results in "dst".

    andnot_ps(v128, v128)

    Compute the bitwise NOT of packed single-precision (32-bit) floating-point elements in "a" and then AND with "b", and store the results in "dst".
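
The bitwise operations act on the raw 32-bit pattern of each lane, not on its numeric value. A minimal single-lane sketch in plain Python follows. The helper names are hypothetical, and the absolute-value trick is a common use of this operation rather than something stated in the source.

```python
import struct


def f32_bits(x: float) -> int:
    """Raw 32-bit pattern of a single-precision value."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(u: int) -> float:
    """Single-precision value for a raw 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", u & 0xFFFFFFFF))[0]


def andnot_lane(a: float, b: float) -> float:
    """One lane of andnot_ps: (NOT a) AND b on the bit patterns."""
    return bits_f32(~f32_bits(a) & f32_bits(b))


SIGN_MASK = -0.0  # bit pattern 0x80000000: only the sign bit is set

# Clearing the sign bit yields the absolute value.
print(andnot_lane(SIGN_MASK, -3.5))  # prints: 3.5
```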

    cmpeq_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for equality, and store the results in "dst".
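
The "results" of the packed comparisons are per-lane bit masks: all ones (0xFFFFFFFF) where the comparison holds and zero where it does not. The plain-Python sketch below models that, then uses the mask for a branch-free select in the style of and_ps, andnot_ps, and or_ps. The `select` helper is hypothetical.

```python
import struct

ALL_ONES = 0xFFFFFFFF


def f32_bits(x: float) -> int:
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(u: int) -> float:
    return struct.unpack("<f", struct.pack("<I", u & ALL_ONES))[0]


def cmpeq_ps(a, b):
    """Per-lane mask: 0xFFFFFFFF where a == b, 0 otherwise (NaN never equal)."""
    return [ALL_ONES if x == y else 0 for x, y in zip(a, b)]


def select(mask, if_true, if_false):
    """Models or_ps(and_ps(mask, if_true), andnot_ps(mask, if_false))."""
    return [
        bits_f32((m & f32_bits(t)) | (~m & ALL_ONES & f32_bits(f)))
        for m, t, f in zip(mask, if_true, if_false)
    ]


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 0.0, 3.0, 0.0]
mask = cmpeq_ps(a, b)
print([hex(m) for m in mask])
print(select(mask, [9.0] * 4, a))  # prints: [9.0, 2.0, 9.0, 4.0]
```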

    cmpeq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for equality, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpge_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for greater-than-or-equal, and store the results in "dst".

    cmpge_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for greater-than-or-equal, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpgt_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for greater-than, and store the results in "dst".

    cmpgt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for greater-than, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmple_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for less-than-or-equal, and store the results in "dst".

    cmple_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for less-than-or-equal, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmplt_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for less-than, and store the results in "dst".

    cmplt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for less-than, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpneq_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for not-equal, and store the results in "dst".

    cmpneq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for not-equal, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpnge_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for not-greater-than-or-equal, and store the results in "dst".

    cmpnge_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for not-greater-than-or-equal, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpngt_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for not-greater-than, and store the results in "dst".

    cmpngt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for not-greater-than, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpnle_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for not-less-than-or-equal, and store the results in "dst".

    cmpnle_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for not-less-than-or-equal, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpnlt_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" for not-less-than, and store the results in "dst".

    cmpnlt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" for not-less-than, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".
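
The negated comparisons are not simply aliases of their positive counterparts: they differ when an operand is NaN. A single-lane sketch in plain Python (hypothetical helper names) shows that "not-less-than" is true for NaN, while "greater-than-or-equal" is false.

```python
import math

ALL_ONES = 0xFFFFFFFF


def cmpge_lane(a: float, b: float) -> int:
    """a >= b: false whenever either operand is NaN."""
    return ALL_ONES if a >= b else 0


def cmpnlt_lane(a: float, b: float) -> int:
    """NOT (a < b): true whenever either operand is NaN."""
    return ALL_ONES if not (a < b) else 0


print(hex(cmpge_lane(2.0, 1.0)), hex(cmpnlt_lane(2.0, 1.0)))            # same result
print(hex(cmpge_lane(math.nan, 1.0)), hex(cmpnlt_lane(math.nan, 1.0)))  # results differ
```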

    cmpord_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" to see if neither is NaN, and store the results in "dst".

    cmpord_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" to see if neither is NaN, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cmpunord_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b" to see if either is NaN, and store the results in "dst".

    cmpunord_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b" to see if either is NaN, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    comieq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for equality, and return the boolean result (0 or 1).

    comige_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for greater-than-or-equal, and return the boolean result (0 or 1).

    comigt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for greater-than, and return the boolean result (0 or 1).

    comile_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for less-than-or-equal, and return the boolean result (0 or 1).

    comilt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for less-than, and return the boolean result (0 or 1).

    comineq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for not-equal, and return the boolean result (0 or 1).

    cvt_ss2si(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 32-bit integer, and store the result in "dst". Rounds to nearest, with midpoints rounded to even.

    cvtsi32_ss(v128, int)

    Convert the 32-bit integer "b" to a single-precision (32-bit) floating-point element, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cvtsi64_ss(v128, long)

    Convert the 64-bit integer "b" to a single-precision (32-bit) floating-point element, store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    cvtss_f32(v128)

    Copy the lower single-precision (32-bit) floating-point element of "a" to "dst".

    cvtss_si32(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 32-bit integer, and store the result in "dst". Rounds to nearest, with midpoints rounded to even.

    cvtss_si64(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 64-bit integer, and store the result in "dst". Rounds to nearest, with midpoints rounded to even.

    cvtt_ss2si(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 32-bit integer with truncation, and store the result in "dst".

    cvttss_si32(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 32-bit integer with truncation, and store the result in "dst".

    cvttss_si64(v128)

    Convert the lower single-precision (32-bit) floating-point element in "a" to a 64-bit integer with truncation, and store the result in "dst".
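
The two conversion families differ only in how they round. The plain-Python sketch below models this for in-range values, assuming the round-to-nearest-even behavior described above. The function names are hypothetical, and out-of-range or NaN inputs are not modeled.

```python
import math


def convert_rounding(x: float) -> int:
    """Models cvtss_si32 / cvt_ss2si: round to nearest, midpoints to even."""
    return round(x)  # Python's round() also rounds midpoints to even


def convert_truncating(x: float) -> int:
    """Models cvttss_si32 / cvtt_ss2si: discard the fractional part."""
    return math.trunc(x)


for value in (2.5, 3.5, -2.5, 2.7, -2.7):
    print(value, convert_rounding(value), convert_truncating(value))
```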

    div_ps(v128, v128)

    Divide packed single-precision (32-bit) floating-point elements in "a" by packed elements in "b", and store the results in "dst".

    div_ss(v128, v128)

    Divide the lower single-precision (32-bit) floating-point element in "a" by the lower single-precision (32-bit) floating-point element in "b", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    load_ps(void*)

    Load 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory into "dst".

    loadu_ps(void*)

    Load 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from memory into "dst". "mem_addr" does not need to be aligned on any particular boundary.

    loadu_si16(void*)

    Load an unaligned 16-bit integer from memory into the first element of "dst".

    loadu_si64(void*)

    Load an unaligned 64-bit integer from memory into the first element of "dst".

    max_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b", and store packed maximum values in "dst".

    max_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b", store the maximum value in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    min_ps(v128, v128)

    Compare packed single-precision (32-bit) floating-point elements in "a" and "b", and store packed minimum values in "dst".

    min_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point elements in "a" and "b", store the minimum value in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    move_ss(v128, v128)

    Move the lower single-precision (32-bit) floating-point element from "b" to the lower element of "dst", and copy the upper 3 elements from "a" to the upper elements of "dst".

    movehl_ps(v128, v128)

    Move the upper 2 single-precision (32-bit) floating-point elements from "b" to the lower 2 elements of "dst", and copy the upper 2 elements from "a" to the upper 2 elements of "dst".

    movelh_ps(v128, v128)

    Move the lower 2 single-precision (32-bit) floating-point elements from "b" to the upper 2 elements of "dst", and copy the lower 2 elements from "a" to the lower 2 elements of "dst".

    movemask_ps(v128)

    Set each bit of mask "dst" based on the most significant bit of the corresponding packed single-precision (32-bit) floating-point element in "a".
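
A plain-Python model of the mask construction: bit i of the result is the sign bit of lane i (index 0 is the lowest lane). Applied to a comparison result, this gives one bit per lane that passed.

```python
import struct


def movemask_ps(a) -> int:
    """Collect the sign bit of each of the four lanes into bits 0-3."""
    mask = 0
    for lane, value in enumerate(a):
        sign = struct.unpack("<I", struct.pack("<f", value))[0] >> 31
        mask |= sign << lane
    return mask


# Negative zero has its sign bit set, so lane 2 contributes a bit.
print(bin(movemask_ps([-1.0, 2.0, -0.0, 3.0])))  # prints: 0b101
```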

    mul_ps(v128, v128)

    Multiply packed single-precision (32-bit) floating-point elements in "a" and "b", and store the results in "dst".

    mul_ss(v128, v128)

    Multiply the lower single-precision (32-bit) floating-point element in "a" and "b", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    or_ps(v128, v128)

    Compute the bitwise OR of packed single-precision (32-bit) floating-point elements in "a" and "b", and store the results in "dst".

    rcp_ps(v128)

    Compute the approximate reciprocal of packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst". The maximum relative error for this approximation is less than 1.5*2^-12.

    rcp_ss(v128)

    Compute the approximate reciprocal of the lower single-precision (32-bit) floating-point element in "a", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". The maximum relative error for this approximation is less than 1.5*2^-12.

    rsqrt_ps(v128)

    Compute the approximate reciprocal square root of packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst". The maximum relative error for this approximation is less than 1.5*2^-12.

    rsqrt_ss(v128)

    Compute the approximate reciprocal square root of the lower single-precision (32-bit) floating-point element in "a", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst". The maximum relative error for this approximation is less than 1.5*2^-12.
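
When the quoted error bound is too loose, a widely used remedy is one Newton-Raphson step on the approximate result. This technique is not described in the source. The sketch below is plain Python with hypothetical names, and it stands in for the hardware estimate with an exact value perturbed by the quoted bound.

```python
import math

MAX_REL_ERR = 1.5 * 2.0 ** -12  # bound quoted for the rcp and rsqrt methods


def refine_rcp(a: float, x0: float) -> float:
    """One Newton-Raphson step for x ~ 1/a."""
    return x0 * (2.0 - a * x0)


def refine_rsqrt(a: float, y0: float) -> float:
    """One Newton-Raphson step for y ~ 1/sqrt(a)."""
    return y0 * (1.5 - 0.5 * a * y0 * y0)


a = 3.0
x0 = (1.0 / a) * (1.0 + MAX_REL_ERR)             # stand-in for an rcp estimate
y0 = (1.0 / math.sqrt(a)) * (1.0 + MAX_REL_ERR)  # stand-in for an rsqrt estimate
x1 = refine_rcp(a, x0)
y1 = refine_rsqrt(a, y0)

print(abs(x0 * a - 1.0), abs(x1 * a - 1.0))
print(abs(y0 * math.sqrt(a) - 1.0), abs(y1 * math.sqrt(a) - 1.0))
```

Each step roughly squares the relative error, at the cost of a few extra multiplies and subtractions.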

    set1_ps(float)

    Broadcast single-precision (32-bit) floating-point value "a" to all elements of "dst".

    set_ps(float, float, float, float)

    Set packed single-precision (32-bit) floating-point elements in "dst" with the supplied values.

    set_ps1(float)

    Broadcast single-precision (32-bit) floating-point value "a" to all elements of "dst".

    set_ss(float)

    Copy single-precision (32-bit) floating-point element "a" to the lower element of "dst", and zero the upper 3 elements.

    setr_ps(float, float, float, float)

    Set packed single-precision (32-bit) floating-point elements in "dst" with the supplied values in reverse order.
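
The argument order of the set methods is easy to get backwards. The plain-Python model below assumes the convention of the corresponding C intrinsics: set_ps takes the highest lane first, setr_ps takes the lowest lane first. Vectors are lists with index 0 as the lowest lane, and the parameter names are illustrative.

```python
def set_ps(e3, e2, e1, e0):
    """Last argument lands in the lowest lane."""
    return [e0, e1, e2, e3]


def setr_ps(e0, e1, e2, e3):
    """First argument lands in the lowest lane ("reverse" of set_ps)."""
    return [e0, e1, e2, e3]


def set1_ps(a):
    """Broadcast one value to all four lanes."""
    return [a, a, a, a]


def set_ss(a):
    """Value in the lowest lane, zeros above."""
    return [a, 0.0, 0.0, 0.0]


print(set_ps(4.0, 3.0, 2.0, 1.0))   # prints: [1.0, 2.0, 3.0, 4.0]
print(setr_ps(1.0, 2.0, 3.0, 4.0))  # prints: [1.0, 2.0, 3.0, 4.0]
```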

    setzero_ps()

    Return vector of type v128 with all elements set to zero.

    shuffle_ps(v128, v128, int)

    Shuffle single-precision (32-bit) floating-point elements in "a" and "b" using the control in "imm8", and store the results in "dst". The lower 2 elements of "dst" are selected from "a" and the upper 2 elements from "b".
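
A plain-Python model of the lane selection may help when choosing an immediate. Each 2-bit field of the control picks a source index: the two low fields index into the first operand and the two high fields index into the second. Vectors are lists with index 0 as the lowest lane.

```python
def shuffle_ps(a, b, imm8: int):
    """Lanes 0-1 are picked from a, lanes 2-3 are picked from b."""
    return [
        a[imm8 & 3],
        a[(imm8 >> 2) & 3],
        b[(imm8 >> 4) & 3],
        b[(imm8 >> 6) & 3],
    ]


a = [10.0, 11.0, 12.0, 13.0]
b = [20.0, 21.0, 22.0, 23.0]

print(shuffle_ps(a, b, 0xE4))  # prints: [10.0, 11.0, 22.0, 23.0]
print(shuffle_ps(a, a, 0x00))  # broadcast lane 0: [10.0, 10.0, 10.0, 10.0]
print(shuffle_ps(a, a, 0x1B))  # reverse: [13.0, 12.0, 11.0, 10.0]
```

Passing the same vector for both operands turns the operation into an arbitrary permutation of that one vector.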

    sqrt_ps(v128)

    Compute the square root of packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst".

    sqrt_ss(v128)

    Compute the square root of the lower single-precision (32-bit) floating-point element in "a", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    store_ps(void*, v128)

    Store 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from "a" into memory.

    storeu_ps(void*, v128)

    Store 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from "a" into memory. "mem_addr" does not need to be aligned on any particular boundary.

    storeu_si16(void*, v128)

    Store a 16-bit integer from the first element of "a" into memory. "mem_addr" does not need to be aligned on any particular boundary.

    storeu_si64(void*, v128)

    Store a 64-bit integer from the first element of "a" into memory. "mem_addr" does not need to be aligned on any particular boundary.

    stream_ps(void*, v128)

    Store 128 bits (composed of 4 packed single-precision (32-bit) floating-point elements) from "a" into memory using a non-temporal memory hint. "mem_addr" must be aligned on a 16-byte boundary or a general-protection exception will be generated.

    sub_ps(v128, v128)

    Subtract packed single-precision (32-bit) floating-point elements in "b" from packed single-precision (32-bit) floating-point elements in "a", and store the results in "dst".

    sub_ss(v128, v128)

    Subtract the lower single-precision (32-bit) floating-point element in "b" from the lower single-precision (32-bit) floating-point element in "a", store the result in the lower element of "dst", and copy the upper 3 packed elements from "a" to the upper elements of "dst".

    ucomieq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for equality, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    ucomige_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for greater-than-or-equal, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    ucomigt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for greater-than, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    ucomile_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for less-than-or-equal, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    ucomilt_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for less-than, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    ucomineq_ss(v128, v128)

    Compare the lower single-precision (32-bit) floating-point element in "a" and "b" for not-equal, and return the boolean result (0 or 1). This instruction will not signal an exception for QNaNs.

    unpackhi_ps(v128, v128)

    Unpack and interleave single-precision (32-bit) floating-point elements from the high half of "a" and "b", and store the results in "dst".

    unpacklo_ps(v128, v128)

    Unpack and interleave single-precision (32-bit) floating-point elements from the low half of "a" and "b", and store the results in "dst".

    xor_ps(v128, v128)

    Compute the bitwise XOR of packed single-precision (32-bit) floating-point elements in "a" and "b", and store the results in "dst".

    Copyright © 2025 Unity Technologies — Trademarks and terms of use