Class BpeDecoder
Decoder for Byte Pair Encoding (BPE) tokens that removes suffix markers and restores spaces between words.
Implements
IDecoder
Namespace: Unity.InferenceEngine.Tokenization.Decoders
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public class BpeDecoder : IDecoder
Remarks
This decoder processes BPE tokens by replacing the suffix marker with spaces to reconstruct the original text. The last token does not receive a trailing space.
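The behavior described above can be sketched in plain C#. This is a minimal illustration, not the Unity implementation: `BpeDecodeSketch` and its `Decode` helper are hypothetical names, the result is returned as a `List<string>` instead of being written to an `Output<string>`, and the handling of the last token (marker removed, no space added) is an assumption based on the remarks.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for illustration only; not part of
// Unity.InferenceEngine.Tokenization.
public static class BpeDecodeSketch
{
    public static List<string> Decode(IReadOnlyList<string> tokens, string suffix = "</w>")
    {
        if (tokens == null) throw new ArgumentNullException(nameof(tokens));
        if (suffix == null) throw new ArgumentNullException(nameof(suffix));

        var result = new List<string>(tokens.Count);
        for (int i = 0; i < tokens.Count; i++)
        {
            // Every token except the last turns its suffix marker into a space.
            // The last token drops the marker without adding a trailing space
            // (assumed from the remarks).
            string replacement = i == tokens.Count - 1 ? string.Empty : " ";
            result.Add(tokens[i].Replace(suffix, replacement));
        }
        return result;
    }
}
```

Under these assumptions, the tokens `"hel"`, `"lo</w>"`, `"wor"`, `"ld</w>"` become `"hel"`, `"lo "`, `"wor"`, `"ld"`, which concatenate to `hello world`.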
Constructors
BpeDecoder(string)
Initializes a new instance of the BpeDecoder class with the specified suffix.
Declaration
public BpeDecoder(string pattern = "</w>")
Parameters
| Type | Name | Description |
|---|---|---|
| string | pattern | The suffix marker to be replaced during decoding. Default value is "</w>". |
Exceptions
| Type | Condition |
|---|---|
| ArgumentNullException | Thrown when pattern is null. |
Methods
Decode(IReadOnlyList<string>, Output<string>)
Applies the decoder's modifications to the input token strings and writes the results to output.
Declaration
public void Decode(IReadOnlyList<string> tokens, Output<string> output)
Parameters
| Type | Name | Description |
|---|---|---|
| IReadOnlyList<string> | tokens | The string values to modify. |
| Output<string> | output | The recipient of modified strings. |