Microsoft.ML.Tokenizers

Tokenizer

Contains TypeScript and C# implementations of a byte pair encoding (BPE) tokenizer for OpenAI LLMs. It is based on the open-source Rust implementation in OpenAI's tiktoken.
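
For readers new to BPE, the core merge loop can be sketched roughly as follows. This is a toy illustration, not this package's API: the `bpeMerge` function, the `MergeRanks` type, and the tiny rank table are all hypothetical, and it works on strings rather than the raw bytes a real tokenizer uses. It assumes a tiktoken-style scheme in which each mergeable sequence has a rank and the adjacent pair with the lowest rank is merged first.

```typescript
// Toy byte pair encoding sketch: illustrative only, not the library's API.
// Assumption: ranks are keyed by the merged string; lower rank merges first.
type MergeRanks = Map<string, number>;

// Hypothetical helper: repeatedly merge the lowest-ranked adjacent pair.
function bpeMerge(symbols: string[], ranks: MergeRanks): string[] {
  const parts = [...symbols];
  while (parts.length > 1) {
    // Find the adjacent pair whose concatenation has the lowest rank.
    let bestIndex = -1;
    let bestRank = Infinity;
    for (let i = 0; i < parts.length - 1; i++) {
      const rank = ranks.get(parts[i] + parts[i + 1]);
      if (rank !== undefined && rank < bestRank) {
        bestRank = rank;
        bestIndex = i;
      }
    }
    if (bestIndex === -1) break; // no mergeable pair remains
    // Replace the two symbols with their merged form.
    parts.splice(bestIndex, 2, parts[bestIndex] + parts[bestIndex + 1]);
  }
  return parts;
}

// Made-up rank table for demonstration purposes.
const ranks: MergeRanks = new Map([["lo", 0], ["low", 1], ["er", 2]]);
const tokens = bpeMerge(["l", "o", "w", "e", "r"], ranks);
console.log(tokens); // [ 'low', 'er' ]
```

A production tokenizer would additionally pre-split the input text with a regular expression, operate on UTF-8 bytes, and map each final piece to its integer token id.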

Details