Class ByteLevelPostProcessor
Byte-level post-processor that only concatenates the two sequences of a pair. The original Hugging Face implementation trims the offsets of tokenized strings, but this implementation does not support offsets.
Implements
IPostProcessor
Namespace: Unity.InferenceEngine.Tokenization.PostProcessors
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public class ByteLevelPostProcessor : IPostProcessor
Constructors
ByteLevelPostProcessor(bool)
Initializes a new instance of the ByteLevelPostProcessor type.
Declaration
public ByteLevelPostProcessor(bool trimOffsets = false)
Parameters
| Type | Name | Description |
|---|---|---|
| bool | trimOffsets | Whether to trim whitespace from the produced offsets. Not yet implemented, so the value currently has no effect. |
Methods
GetNumAddedTokens(bool)
Determines the number of tokens that this IPostProcessor will add to the sequence of tokens.
Declaration
public int GetNumAddedTokens(bool _)
Parameters
| Type | Name | Description |
|---|---|---|
| bool | _ | |
Returns
| Type | Description |
|---|---|
| int | Number of tokens that this IPostProcessor will add to the sequence of tokens |
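Example
The following is a minimal sketch of constructing the post-processor and querying the number of added tokens. It uses only the constructor and method signatures declared on this page. The class name ByteLevelPostProcessorExample and the method name CountAddedTokens are hypothetical, and the snippet assumes a project that references Unity.InferenceEngine.Tokenization.dll. The meaning of the bool argument is not documented here (the parameter is named _), so the value passed is arbitrary.

```csharp
using Unity.InferenceEngine.Tokenization.PostProcessors;

// Hypothetical helper class, for illustration only.
public static class ByteLevelPostProcessorExample
{
    public static int CountAddedTokens()
    {
        // trimOffsets defaults to false; trimming is documented as not yet implemented.
        var postProcessor = new ByteLevelPostProcessor();

        // The bool parameter is declared as `_` on this page; the value
        // passed here is arbitrary because its meaning is undocumented.
        return postProcessor.GetNumAddedTokens(false);
    }
}
```

The returned count can be used to reserve room for the tokens a post-processor adds when a maximum sequence length has to be respected.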
PostProcess(IReadOnlyList<IReadOnlyList<Token>>, IReadOnlyList<IReadOnlyList<Token>>, bool, Output<IEnumerable<IEnumerable<Token>>>)
Processes the sequence of tokens.
Declaration
public void PostProcess(IReadOnlyList<IReadOnlyList<Token>> sequenceA, IReadOnlyList<IReadOnlyList<Token>> sequenceB, bool _, Output<IEnumerable<IEnumerable<Token>>> output)
Parameters
| Type | Name | Description |
|---|---|---|
| IReadOnlyList<IReadOnlyList<Token>> | sequenceA | The single, or first sequence of tokens. |
| IReadOnlyList<IReadOnlyList<Token>> | sequenceB | The second sequence of a pair. |
| bool | _ | |
| Output<IEnumerable<IEnumerable<Token>>> | output | The recipient of processed sequences. |