Struct UnigramVocabEntry
Represents a single entry in a unigram tokenization vocabulary: a token string paired with its score.
Namespace: Unity.InferenceEngine.Tokenization.Mappers
Assembly: Unity.InferenceEngine.Tokenization.dll
Syntax
public readonly struct UnigramVocabEntry
Constructors
UnigramVocabEntry(string, double)
Initializes a new instance of the UnigramVocabEntry struct with the specified token value and score.
Declaration
public UnigramVocabEntry(string value, double score)
Parameters
| Type | Name | Description |
|---|---|---|
| string | value | The token string value. |
| double | score | The score associated with the token, typically representing its probability or frequency. |
Fields
Score
The score associated with this token, typically representing its probability or frequency in the training corpus. Higher scores generally indicate more common or preferred tokens in the unigram model.
Declaration
public readonly double Score
Field Value
| Type | Description |
|---|---|
| double | The score associated with this token. |
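As an illustration of how `Score` might be used, the following minimal sketch orders a set of entries from highest to lowest score. The `UnigramVocabScoreExample` class and its `SortByScoreDescending` helper are hypothetical names introduced here, not part of the package; only the `UnigramVocabEntry` constructor and its `Score` field come from this page.

```csharp
using System.Collections.Generic;
using System.Linq;
using Unity.InferenceEngine.Tokenization.Mappers;

// Hypothetical helper class, not part of the package.
public static class UnigramVocabScoreExample
{
    // Returns a new list with the entries ordered from highest to lowest Score.
    // The input sequence is not modified.
    public static List<UnigramVocabEntry> SortByScoreDescending(
        IEnumerable<UnigramVocabEntry> entries)
    {
        return entries.OrderByDescending(e => e.Score).ToList();
    }
}
```

Calling `SortByScoreDescending` on entries with scores 0.1, 0.5, and 0.3 would place the 0.5 entry first.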
Value
The token string value for this vocabulary entry.
Declaration
public readonly string Value
Field Value
| Type | Description |
|---|---|
| string | The token string value for this vocabulary entry. |
Methods
Deconstruct(out string, out double)
Deconstructs the vocabulary entry into its token value and score. This lets the entry be used in deconstructing assignments and positional patterns.
Declaration
public void Deconstruct(out string value, out double score)
Parameters
| Type | Name | Description |
|---|---|---|
| string | value | When this method returns, contains the token string value. |
| double | score | When this method returns, contains the score associated with the token. |
Examples
var entry = new UnigramVocabEntry("hello", 0.5);
var (tokenValue, tokenScore) = entry; // Uses Deconstruct method
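Because `Deconstruct` is an instance method, deconstruction should also work directly in a `foreach` loop. The sketch below builds a token-to-score lookup that way. The `UnigramVocabLookupExample` class and its `ToScoreLookup` helper are hypothetical names used only for illustration, and letting the last duplicate token win is an assumption of this sketch, not documented package behavior.

```csharp
using System.Collections.Generic;
using Unity.InferenceEngine.Tokenization.Mappers;

// Hypothetical helper class, not part of the package.
public static class UnigramVocabLookupExample
{
    // Builds a dictionary from token string to score.
    // If a token appears more than once, the last occurrence wins (assumption of this sketch).
    public static Dictionary<string, double> ToScoreLookup(
        IEnumerable<UnigramVocabEntry> entries)
    {
        var lookup = new Dictionary<string, double>();
        foreach (var (value, score) in entries) // Uses Deconstruct for each entry
        {
            lookup[value] = score;
        }
        return lookup;
    }
}
```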