docs.unity3d.com

    Namespace Unity.InferenceEngine.Tokenization.PreTokenizers

    Classes

    BertPreTokenizer

    Splits the input on spaces and punctuation, discarding the spaces and keeping each punctuation mark as a separate chunk.
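The splitting rule described above can be illustrated with a short Python sketch. This is not the Unity implementation or its C# API: the function name `bert_pre_tokenize` is hypothetical, and treating "punctuation" as any character in a Unicode `P*` category is an assumption made for the example.

```python
import unicodedata


def bert_pre_tokenize(text: str) -> list[str]:
    """Split on whitespace and punctuation (hypothetical helper).

    Whitespace is dropped; every punctuation character becomes its own chunk.
    Assumption: "punctuation" means Unicode general categories starting with "P".
    """
    chunks: list[str] = []
    current: list[str] = []

    def flush() -> None:
        # Emit the word collected so far, if any.
        if current:
            chunks.append("".join(current))
            current.clear()

    for ch in text:
        if ch.isspace():
            flush()                      # whitespace ends a chunk and is discarded
        elif unicodedata.category(ch).startswith("P"):
            flush()
            chunks.append(ch)            # punctuation is kept as a separate chunk
        else:
            current.append(ch)
    flush()
    return chunks


print(bert_pre_tokenize("Hello, world!"))
```

Under these assumptions, `"Hello, world!"` yields the four chunks `Hello`, `,`, `world`, and `!`.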

    ByteLevelPreTokenizer

    Pre-tokenizes the input using ByteLevel rules.
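The page does not define "ByteLevel rules". Assuming they follow the GPT-2-style byte-level scheme used by other tokenizer libraries, the core idea is to encode the text as UTF-8 and map each byte to a visible Unicode character, so that every possible input can be represented by 256 base symbols. The Python sketch below shows only that byte mapping; the names `bytes_to_unicode` and `byte_level_encode` are illustrative, and any additional splitting the Unity class performs is not covered.

```python
def bytes_to_unicode() -> dict[int, str]:
    """Build a reversible byte -> printable character table (GPT-2 style).

    Assumption: this mirrors the commonly used byte-level scheme; the Unity
    class may differ in detail.
    """
    # Bytes that already correspond to printable characters map to themselves.
    printable = (
        list(range(ord("!"), ord("~") + 1))
        + list(range(0xA1, 0xAC + 1))
        + list(range(0xAE, 0xFF + 1))
    )
    mapping = {b: chr(b) for b in printable}
    # Remaining bytes (control characters, space, ...) are shifted past U+00FF.
    offset = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + offset)
            offset += 1
    return mapping


def byte_level_encode(text: str) -> str:
    """Replace every UTF-8 byte of `text` with its mapped character."""
    table = bytes_to_unicode()
    return "".join(table[b] for b in text.encode("utf-8"))


print(byte_level_encode("Hello world"))
```

In this scheme a space (byte 0x20) becomes `Ġ` (U+0120), which is why byte-level vocabularies often show tokens such as `Ġworld`.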

    DefaultPreTokenizer

    Default placeholder implementation of a pre-tokenizer. Does not pre-cut the input.

    RegexSplitPreTokenizer

    Splits the input based on a regular expression.
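As a rough illustration of regex-based splitting, the Python sketch below cuts a string at every match of a pattern. It is not the Unity API: the function `regex_split` and its `keep_matches` flag are hypothetical, and whether the Unity class keeps, drops, or merges the matched text is not stated on this page.

```python
import re


def regex_split(text: str, pattern: str, keep_matches: bool = True) -> list[str]:
    """Split `text` at each match of `pattern` (hypothetical helper).

    Text between matches is always emitted. The matched text itself is
    emitted as its own chunk only when `keep_matches` is True.
    """
    chunks: list[str] = []
    last = 0
    for match in re.finditer(pattern, text):
        if match.start() > last:
            chunks.append(text[last:match.start()])   # text before the match
        if keep_matches and match.end() > match.start():
            chunks.append(match.group())              # the match itself
        last = match.end()
    if last < len(text):
        chunks.append(text[last:])                    # trailing text
    return chunks


print(regex_split("a1b22c", r"\d+"))
print(regex_split("a1b22c", r"\d+", keep_matches=False))
```

Keeping the matches suits patterns that describe tokens (for example runs of digits), while dropping them suits patterns that describe separators.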

    SequencePreTokenizer

    Applies a sequence of pre-tokenizers.
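One plausible way to chain pre-tokenizers is to feed every chunk produced by one stage into the next stage and flatten the results. The Python sketch below assumes that behaviour; the `sequence` combinator and the two example stages are hypothetical and are not taken from the Unity API.

```python
import re
from typing import Callable

# In this sketch a pre-tokenizer is any function from a string to a list of chunks.
PreTokenizer = Callable[[str], list[str]]


def sequence(stages: list[PreTokenizer]) -> PreTokenizer:
    """Combine several pre-tokenizers into one, applied in list order."""
    def run(text: str) -> list[str]:
        chunks = [text]
        for stage in stages:
            # Each stage re-splits every chunk left by the previous stage.
            chunks = [piece for chunk in chunks for piece in stage(chunk)]
        return chunks
    return run


def split_on_whitespace(text: str) -> list[str]:
    return text.split()


def split_digit_runs(text: str) -> list[str]:
    # Separate runs of digits from runs of non-digits.
    return re.findall(r"\d+|\D+", text)


pipeline = sequence([split_on_whitespace, split_digit_runs])
print(pipeline("ab12 c3"))
```

With an empty list of stages, the combined function returns the input as a single chunk, which matches the "does not pre-cut" behaviour described for the default pre-tokenizer.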

    Interfaces

    IPreTokenizer

    Pre-cuts the input string into smaller parts, which are then passed to the IMapper for tokenization.
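The hand-off from pre-tokenizer to mapper can be sketched conceptually in Python. Everything here is an assumption for illustration: the `PreTokenizer` protocol, the two example classes, and the dictionary lookup standing in for the IMapper do not reflect the actual Unity signatures, which this page does not show.

```python
from typing import Protocol


class PreTokenizer(Protocol):
    """Hypothetical stand-in for the pre-tokenizer contract."""

    def pre_tokenize(self, text: str) -> list[str]: ...


class PassThroughPreTokenizer:
    """Does not pre-cut: the whole input is returned as one part."""

    def pre_tokenize(self, text: str) -> list[str]:
        return [text]


class WhitespacePreTokenizer:
    """Cuts the input at whitespace."""

    def pre_tokenize(self, text: str) -> list[str]:
        return text.split()


def tokenize(text: str, pre_tokenizer: PreTokenizer,
             vocab: dict[str, int], unknown_id: int = 0) -> list[int]:
    """Pre-cut the text, then map each part to an id.

    The dictionary lookup is a simplified stand-in for the mapping step.
    """
    return [vocab.get(part, unknown_id) for part in pre_tokenizer.pre_tokenize(text)]


vocab = {"hello": 1, "world": 2}
print(tokenize("hello world", WhitespacePreTokenizer(), vocab))
print(tokenize("hello world", PassThroughPreTokenizer(), vocab))
```

The comparison shows why pre-cutting matters in this simplified setting: without it, the whole string reaches the mapping step as a single part and falls back to the unknown id.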

    Copyright © 2025 Unity Technologies — Trademarks and terms of use