Class RobertaPostProcessor
Adds the special tokens needed by a RoBERTa model. Surrounds a single sequence with CLS and SEP tokens. For a pair, also surrounds the second sequence with SEP tokens.
Namespace: Unity.InferenceEngine.Tokenization.PostProcessors
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public class RobertaPostProcessor : IPostProcessor
Constructors
RobertaPostProcessor(Token, Token, bool, bool)
Initializes a new instance of the RobertaPostProcessor type.
Declaration
public RobertaPostProcessor(Token sep, Token cls, bool addPrefixSpace = true, bool trimOffsets = true)
Parameters
| Type | Name | Description |
|---|---|---|
| Token | sep | The SEP token definition. |
| Token | cls | The CLS token definition. |
| bool | addPrefixSpace | Must match the add-prefix-space option of the pre-tokenization component. Determines how the offsets are trimmed. Not yet implemented. |
| bool | trimOffsets | Whether to trim whitespace from the produced offsets. Not yet implemented. |
Methods
GetNumAddedTokens(bool)
Determines the number of tokens that this IPostProcessor will add to the sequence of tokens.
Declaration
public int GetNumAddedTokens(bool isPair)
Parameters
| Type | Name | Description |
|---|---|---|
| bool | isPair | `true` to get the number of tokens added to a pair of sequences; `false` to get the number added to a single sequence. |
Returns
| Type | Description |
|---|---|
| int | The number of tokens that this IPostProcessor will add to the sequence of tokens. |
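As a rough illustration of what this count represents, the sketch below assumes the conventional RoBERTa layout, which this page does not spell out: `<s> A </s>` for a single sequence and `<s> A </s></s> B </s>` for a pair. `RobertaCountSketch` and `CountAddedTokens` are hypothetical names and not part of the Unity API; the actual return values of `GetNumAddedTokens` may differ.

```csharp
// Hypothetical helper, NOT part of Unity.InferenceEngine.Tokenization.
// Assumes the conventional RoBERTa layout:
//   single: <s> A </s>             -> 2 added tokens
//   pair:   <s> A </s></s> B </s>  -> 4 added tokens
public static class RobertaCountSketch
{
    public static int CountAddedTokens(bool isPair)
    {
        // One CLS and one SEP around the first sequence,
        // plus two more SEP tokens around the second sequence of a pair.
        return isPair ? 4 : 2;
    }
}
```

A count like this is typically used to reserve room for special tokens before truncating the input sequences to a maximum length.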
PostProcess(IReadOnlyList<IReadOnlyList<Token>>, IReadOnlyList<IReadOnlyList<Token>>, bool, Output<IEnumerable<IEnumerable<Token>>>)
Processes the sequence of tokens.
Declaration
public void PostProcess(IReadOnlyList<IReadOnlyList<Token>> sequenceA, IReadOnlyList<IReadOnlyList<Token>> sequenceB, bool addSpecialTokens, Output<IEnumerable<IEnumerable<Token>>> output)
Parameters
| Type | Name | Description |
|---|---|---|
| IReadOnlyList<IReadOnlyList<Token>> | sequenceA | The single sequence of tokens, or the first sequence of a pair. |
| IReadOnlyList<IReadOnlyList<Token>> | sequenceB | The second sequence of a pair. |
| bool | addSpecialTokens | Whether to add the special tokens to the resulting sequences. |
| Output<IEnumerable<IEnumerable<Token>>> | output | The recipient of processed sequences. |
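The layout described in the class summary can be sketched with plain strings. This is a simplified model under stated assumptions, not the library's implementation: it uses flat `string` lists instead of the nested `Token` lists and `Output` recipient of the real method, and it assumes `<s>` and `</s>` as the CLS and SEP values. `RobertaLayoutSketch` and `Build` are hypothetical names.

```csharp
using System.Collections.Generic;

// Hypothetical model of the RoBERTa layout, NOT the Unity implementation.
// Tokens are plain strings; "<s>" and "</s>" are assumed CLS/SEP values.
public static class RobertaLayoutSketch
{
    public static List<string> Build(
        IReadOnlyList<string> sequenceA,
        IReadOnlyList<string> sequenceB,   // null when there is no second sequence
        bool addSpecialTokens,
        string cls = "<s>",
        string sep = "</s>")
    {
        var result = new List<string>();

        // First (or single) sequence: CLS A SEP
        if (addSpecialTokens) result.Add(cls);
        result.AddRange(sequenceA);
        if (addSpecialTokens) result.Add(sep);

        // Second sequence of a pair: SEP B SEP
        if (sequenceB != null)
        {
            if (addSpecialTokens) result.Add(sep);
            result.AddRange(sequenceB);
            if (addSpecialTokens) result.Add(sep);
        }

        return result;
    }
}
```

Under these assumptions, `RobertaLayoutSketch.Build(new[] { "Hello" }, new[] { "world" }, true)` yields `<s>`, `Hello`, `</s>`, `</s>`, `world`, `</s>`, and passing `false` for `addSpecialTokens` returns the two sequences concatenated without any special tokens.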