docs.unity3d.com

    Class DigitsPreTokenizer

    A pre-tokenizer that splits input text at digit boundaries, separating digits from non-digit characters during the pre-tokenization phase.

    Inheritance
    object
    DigitsPreTokenizer
    Implements
    IPreTokenizer
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.GetHashCode()
    object.GetType()
    object.MemberwiseClone()
    object.ReferenceEquals(object, object)
    object.ToString()
    Namespace: Unity.InferenceEngine.Tokenization.PreTokenizers
    Assembly: Unity.InferenceEngine.Tokenization.dll
    Syntax
    public class DigitsPreTokenizer : IPreTokenizer
    Remarks

    The pre-tokenizer operates in one of two modes, selected by the individualDigits constructor parameter:

    • Grouped mode: Consecutive digits are kept together as a single token (e.g., "abc123def" → ["abc", "123", "def"]).
    • Individual mode: Each digit is separated into its own token (e.g., "abc123def" → ["abc", "1", "2", "3", "def"]).
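
The two modes can be illustrated on plain strings with a minimal sketch. This is not the package's implementation: `DigitSplitSketch` and `Split` are hypothetical names, and the sketch assumes that only ASCII 0-9 count as digits, which the real class may or may not do.

```csharp
using System;
using System.Collections.Generic;

public static class DigitSplitSketch
{
    // Hypothetical helper, not part of Unity.InferenceEngine.
    // It mimics the documented grouped / individual behavior on plain strings.
    public static List<string> Split(string text, bool individualDigits = false)
    {
        var parts = new List<string>();
        int start = 0;
        for (int i = 1; i <= text.Length; i++)
        {
            bool atEnd = i == text.Length;
            bool prevIsDigit = IsAsciiDigit(text[i - 1]);
            // Cut at the end of the text, when the digit/non-digit class changes,
            // or, in individual mode, after every single digit.
            bool boundary = atEnd
                || prevIsDigit != IsAsciiDigit(text[i])
                || (individualDigits && prevIsDigit);
            if (boundary)
            {
                parts.Add(text.Substring(start, i - start));
                start = i;
            }
        }
        return parts;
    }

    // Assumption: only ASCII 0-9 are digits; the real class may treat
    // other Unicode digit characters differently.
    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';
}
```

Under these assumptions, `DigitSplitSketch.Split("abc123def")` yields the parts abc, 123, def, and `DigitSplitSketch.Split("abc123def", individualDigits: true)` yields abc, 1, 2, 3, def, matching the two examples above.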

    Constructors

    DigitsPreTokenizer(bool)

    Initializes a new instance of the DigitsPreTokenizer class.

    Declaration
    public DigitsPreTokenizer(bool individualDigits = false)
    Parameters
    Type | Name | Description
    bool | individualDigits | If true, each digit is split into its own token; if false, consecutive digits are grouped together as a single token. Default is false.
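
As a brief sketch of choosing a mode at construction time, based only on the declaration above: the wrapper class `DigitsPreTokenizerSetup` and its method names are hypothetical, and the snippet assumes a project that references the Unity.InferenceEngine.Tokenization assembly.

```csharp
using Unity.InferenceEngine.Tokenization.PreTokenizers;

// Hypothetical wrapper used only to show the two constructor calls.
public static class DigitsPreTokenizerSetup
{
    // Grouped mode: relies on the default individualDigits = false.
    public static DigitsPreTokenizer CreateGrouped() => new DigitsPreTokenizer();

    // Individual mode: each digit is split into its own token.
    public static DigitsPreTokenizer CreateIndividual() =>
        new DigitsPreTokenizer(individualDigits: true);
}
```

Naming the argument at the call site makes the chosen mode visible without looking up the parameter order.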

    Methods

    PreTokenize(SubString, Output<SubString>)

    Splits the input into smaller parts at digit boundaries and writes them to output.

    Declaration
    public void PreTokenize(SubString input, Output<SubString> output)
    Parameters
    Type | Name | Description
    SubString | input | The source to pre-cut.
    Output<SubString> | output | Target collection of generated pre-tokenized strings.

    Implements

    IPreTokenizer
    Copyright © 2026 Unity Technologies — Trademarks and terms of use