docs.unity3d.com

    Class BertNormalizer

    Normalizes raw text input for a BERT model.

    Inheritance
    object
    BertNormalizer
    Implements
    INormalizer
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Unity.InferenceEngine.Tokenization.Normalizers
    Assembly: Unity.InferenceEngine.Tokenization.dll
    Syntax
    public class BertNormalizer : INormalizer

    Constructors

    BertNormalizer(bool, bool, bool?, bool)

    Initializes a new instance of the BertNormalizer class.

    Declaration
    public BertNormalizer(bool cleanText = true, bool handleCjkChars = true, bool? stripAccents = null, bool lowerCase = true)
    Parameters
    Type Name Description
    bool cleanText

    If true, removes control characters and replaces every whitespace character with a standard space.

    bool handleCjkChars

    If true, inserts a space before and after each Chinese (CJK) character.

    bool? stripAccents

    If true, strips all accents. If null, uses the value of lowerCase, as in the original BERT implementation.

    bool lowerCase

    If true, converts the input to lowercase.
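To make the four flags concrete, here is a minimal sketch of BERT-style normalization written against plain .NET, with no dependency on the Unity package. It is an approximation for illustration, not the package's implementation: the class `BertSketch`, the method `NormalizeText`, the order of the steps, and the CJK ranges checked (Basic Multilingual Plane only) are all assumptions.

```csharp
using System;
using System.Globalization;
using System.Text;

// Illustration only. BertSketch and its members are hypothetical names,
// not part of Unity.InferenceEngine.Tokenization.
public static class BertSketch
{
    public static string NormalizeText(string text, bool cleanText = true,
        bool handleCjkChars = true, bool? stripAccents = null, bool lowerCase = true)
    {
        // A null stripAccents falls back to the value of lowerCase.
        bool strip = stripAccents ?? lowerCase;

        // Assumed order: clean, pad CJK, strip accents, lowercase.
        if (cleanText) text = CleanText(text);
        if (handleCjkChars) text = PadCjk(text);
        if (strip) text = StripAccents(text);
        if (lowerCase) text = text.ToLowerInvariant();
        return text;
    }

    // Replaces any whitespace with ' ' and drops remaining control characters.
    static string CleanText(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (char.IsWhiteSpace(c)) sb.Append(' ');
            else if (c == '\uFFFD' || char.IsControl(c)) continue;
            else sb.Append(c);
        }
        return sb.ToString();
    }

    // Surrounds CJK ideographs with spaces (BMP ranges only in this sketch).
    static string PadCjk(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            bool cjk = (c >= '\u4E00' && c <= '\u9FFF')
                    || (c >= '\u3400' && c <= '\u4DBF')
                    || (c >= '\uF900' && c <= '\uFAFF');
            if (cjk) sb.Append(' ').Append(c).Append(' ');
            else sb.Append(c);
        }
        return sb.ToString();
    }

    // Decomposes characters (NFD) and drops the combining marks.
    static string StripAccents(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text.Normalize(NormalizationForm.FormD))
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With the defaults, `BertSketch.NormalizeText("H\u00E9llo\tWorld")` yields `"hello world"`: the tab becomes a space, the accent is stripped because `stripAccents` is null and `lowerCase` is true, and the result is lowercased. Passing `lowerCase: false` with `stripAccents` left null keeps both the case and the accent.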

    Methods

    Normalize(SubString)

    Applies transformations to the input string before pre-tokenization.

    Declaration
    public SubString Normalize(SubString input)
    Parameters
    Type Name Description
    SubString input

    The string to transform.

    Returns
    Type Description
    SubString

    The resulting string.
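A possible way to call the normalizer from project code is sketched below. It relies only on the constructor and the `Normalize(SubString)` method documented on this page. The wrapper class `BertNormalizerUsage` is a hypothetical name, and the namespace used for `SubString` is an assumption that may need adjusting for your package version.

```csharp
using Unity.InferenceEngine.Tokenization;             // assumed location of SubString
using Unity.InferenceEngine.Tokenization.Normalizers; // BertNormalizer

// Hypothetical helper, shown for illustration only.
public static class BertNormalizerUsage
{
    // Uncased setup: stripAccents is left null, so it follows lowerCase (true).
    static readonly BertNormalizer s_Uncased = new BertNormalizer(lowerCase: true);

    // Cased setup: keep both case and accents.
    static readonly BertNormalizer s_Cased =
        new BertNormalizer(stripAccents: false, lowerCase: false);

    public static SubString NormalizeUncased(SubString input) => s_Uncased.Normalize(input);

    public static SubString NormalizeCased(SubString input) => s_Cased.Normalize(input);
}
```

The normalizer settings should generally match the ones the model's vocabulary was built with; for example, an uncased checkpoint expects lowercased input.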

    Implements

    INormalizer
    Copyright © 2025 Unity Technologies — Trademarks and terms of use