
    Class WhitespaceSplitPreTokenizer

    A pre-tokenizer that splits input text on whitespace characters.

    Inheritance
    object
    WhitespaceSplitPreTokenizer
    Implements
    IPreTokenizer
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Unity.InferenceEngine.Tokenization.PreTokenizers
    Assembly: Unity.InferenceEngine.Tokenization.dll
    Syntax
    public class WhitespaceSplitPreTokenizer : IPreTokenizer
    Remarks

    This pre-tokenizer divides the input string into substrings by splitting at whitespace boundaries. Whitespace characters are not included in the output tokens. A run of consecutive whitespace characters acts as a single delimiter, so no empty strings are added to the output.
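
    The splitting behavior described above can be illustrated with plain .NET strings. The following is only a sketch of the documented semantics, not the class's actual implementation: `WhitespaceSplitSketch` and its `Split` method are hypothetical names, and it assumes that "whitespace" matches what `string.Split` treats as white-space when no separator is given, which this page does not confirm.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Unity.InferenceEngine.
public static class WhitespaceSplitSketch
{
    // Splits on runs of white-space and drops empty entries,
    // mirroring the behavior described in the Remarks.
    public static string[] Split(string input)
    {
        // A null separator array makes string.Split use white-space
        // characters as delimiters. RemoveEmptyEntries discards the empty
        // strings that leading, trailing, or consecutive white-space
        // would otherwise produce.
        return input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

    With this sketch, `WhitespaceSplitSketch.Split("  Hello   world\tfrom\nUnity ")` yields the four parts `Hello`, `world`, `from`, and `Unity`, and an input made only of whitespace yields an empty array.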

    Methods

    PreTokenize(SubString, Output<SubString>)

    Splits the input into smaller parts at whitespace boundaries.

    Declaration
    public void PreTokenize(SubString input, Output<SubString> output)
    Parameters
    Type                 Name      Description
    SubString            input     The source to pre-cut.
    Output<SubString>    output    Target collection of generated pre-tokenized strings.

    Implements

    IPreTokenizer
    Copyright © 2026 Unity Technologies — Trademarks and terms of use