quaterion_models.encoders.extras.fasttext_encoder module
- class FasttextEncoder(model_path: str, on_disk: bool, aggregations: Optional[List[str]] = None)
Bases: Encoder
Creates a fastText encoder that generates a vector for a list of tokens, based on the given fastText model.
- Parameters:
model_path – Path to model to load
on_disk – If True, use mmap to keep embeddings out of RAM
aggregations – Which aggregations to use to combine multiple token vectors into one. If multiple aggregations are specified, their results are concatenated to form the output.
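The effect of passing several aggregations can be sketched in plain Python. This is only an illustration under stated assumptions: nested lists stand in for tensors, and `REDUCERS` and `combine` are hypothetical names, not part of the library.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Toy column-wise reducers; the keys mirror aggregation_options.
REDUCERS: Dict[str, Callable] = {
    "min": min,
    "max": max,
    "avg": lambda column: sum(column) / len(column),
}


def combine(token_vectors: List[Vector], aggregations: List[str]) -> Vector:
    """Reduce the token vectors once per aggregation and concatenate the results."""
    columns = list(zip(*token_vectors))  # one tuple per embedding dimension
    combined: Vector = []
    for name in aggregations:
        reduce_column = REDUCERS[name]
        combined.extend(reduce_column(column) for column in columns)
    return combined


token_vectors = [[1.0, 4.0], [3.0, 2.0]]  # two tokens, two dimensions
result = combine(token_vectors, ["min", "avg"])
print(result)  # [1.0, 2.0, 2.0, 3.0]
```

In this sketch the output length is the vector dimension multiplied by the number of aggregations, which is the natural consequence of concatenation.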
- classmethod aggregate(embeddings: Tensor, operation: str) → Tensor
Apply aggregation operation to embeddings along the first dimension
- Parameters:
embeddings – embeddings to aggregate
operation – one of aggregation_options
- Returns:
Tensor – aggregated embeddings
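To make "along the first dimension" concrete, here is a minimal sketch that treats the input as a (num_tokens, dim) nested list and collapses the token axis. It is an assumption-laden stand-in for the tensor version: the function below is hypothetical and the error type is chosen for illustration only.

```python
from typing import List

AGGREGATION_OPTIONS = ["min", "max", "avg"]


def aggregate(embeddings: List[List[float]], operation: str) -> List[float]:
    """Collapse a (num_tokens, dim) nested list into one dim-sized vector."""
    if operation not in AGGREGATION_OPTIONS:
        # Assumed behaviour: reject anything outside the supported options.
        raise ValueError(f"Unknown operation: {operation}")
    columns = list(zip(*embeddings))  # iterate over dimensions, not tokens
    if operation == "min":
        return [min(column) for column in columns]
    if operation == "max":
        return [max(column) for column in columns]
    return [sum(column) / len(column) for column in columns]


rows = [[0.0, 6.0], [2.0, 2.0], [4.0, 4.0]]  # three tokens, two dimensions
print(aggregate(rows, "max"))  # [4.0, 6.0]
print(aggregate(rows, "avg"))  # [2.0, 4.0]
```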
- forward(batch: List[List[str]]) → Tensor
Run encoder inference: convert an input batch to embeddings
- Parameters:
batch – processed batch
- Returns:
embeddings – shape: (batch_size, embedding_size)
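The shape contract of forward (one row per record, one column per embedding dimension) can be illustrated with a toy lookup table. Everything here is hypothetical: `TOY_VECTORS` stands in for a loaded model, averaging is just one possible aggregation, and mapping unknown tokens to zeros is a simplification (a real fastText model can build vectors for unseen words from subword n-grams).

```python
from typing import Dict, List

# Hypothetical two-dimensional word vectors standing in for a fastText model.
TOY_VECTORS: Dict[str, List[float]] = {
    "red": [1.0, 0.0],
    "apple": [0.0, 1.0],
    "pie": [1.0, 1.0],
}
DIM = 2


def toy_forward(batch: List[List[str]]) -> List[List[float]]:
    """Average the token vectors of each record; returns one row per record."""
    output = []
    for tokens in batch:
        vectors = [TOY_VECTORS.get(token, [0.0] * DIM) for token in tokens]
        if not vectors:  # empty record: fall back to a single zero vector
            vectors = [[0.0] * DIM]
        output.append([sum(col) / len(col) for col in zip(*vectors)])
    return output


batch = [["red", "apple"], ["pie"]]
embeddings = toy_forward(batch)
print(len(embeddings), len(embeddings[0]))  # 2 2
print(embeddings)  # [[0.5, 0.5], [1.0, 1.0]]
```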
- get_collate_fn() → CollateFnType
Provides a function that converts a raw data batch into suitable model input
- Returns:
CollateFnType – model’s collate function
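As a rough illustration of what a collate function does, the sketch below turns raw strings into the token lists that forward expects. The whitespace tokenization, the lowercasing, and the names `ToyCollateFn`, `whitespace_collate` and `get_toy_collate_fn` are all assumptions made for this example; the library's actual collate function may behave differently.

```python
from typing import Callable, List

# Hypothetical alias; the library's CollateFnType may be defined differently.
ToyCollateFn = Callable[[List[str]], List[List[str]]]


def whitespace_collate(raw_batch: List[str]) -> List[List[str]]:
    """Lowercase each raw record and split it into tokens on whitespace."""
    return [record.lower().split() for record in raw_batch]


def get_toy_collate_fn() -> ToyCollateFn:
    """Return the function itself, so a data loader can call it per batch."""
    return whitespace_collate


collate = get_toy_collate_fn()
print(collate(["Red apple", "Apple pie"]))  # [['red', 'apple'], ['apple', 'pie']]
```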
- classmethod load(input_path: str) → Encoder
Instantiate encoder from saved state.
If no state is required, just call create instead
- Parameters:
input_path – path to load from
- Returns:
Encoder – loaded encoder
- save(output_path: str)
Persist current state to the provided directory
- Parameters:
output_path – path to save model
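The load and save methods form a round trip: state written to a directory by save can be restored by the load classmethod. A minimal sketch of that pattern follows, assuming the state is a small JSON config. `ToyEncoder` and the file name `config.json` are hypothetical and say nothing about the files the library actually writes.

```python
import json
import os
import tempfile
from typing import List


class ToyEncoder:
    """Hypothetical stand-in used only to show the save/load round trip."""

    CONFIG_NAME = "config.json"  # assumed file name

    def __init__(self, aggregations: List[str]):
        self.aggregations = aggregations

    def save(self, output_path: str) -> None:
        """Write the constructor arguments into the target directory."""
        os.makedirs(output_path, exist_ok=True)
        with open(os.path.join(output_path, self.CONFIG_NAME), "w") as handle:
            json.dump({"aggregations": self.aggregations}, handle)

    @classmethod
    def load(cls, input_path: str) -> "ToyEncoder":
        """Read the saved arguments back and rebuild the encoder."""
        with open(os.path.join(input_path, cls.CONFIG_NAME)) as handle:
            config = json.load(handle)
        return cls(**config)


with tempfile.TemporaryDirectory() as directory:
    ToyEncoder(["min", "avg"]).save(directory)
    restored = ToyEncoder.load(directory)

print(restored.aggregations)  # ['min', 'avg']
```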
- aggregation_options = ['min', 'max', 'avg']
- property embedding_size: int
Size of resulting embedding
- property trainable: bool
Defines whether the encoder is trainable.
This flag affects caching and checkpoint saving of the encoder.
- training: bool
- load_fasttext_model(path: str) → Union[FastText, KeyedVectors]
Load a fastText model in a universal way.
Tries to find a workable way of loading the FastText model and loads it.
- Parameters:
path – path to FastText model or vectors
- Returns:
FastText or KeyedVectors – loaded model
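The "try each possible way" behaviour described for load_fasttext_model can be sketched as a generic fallback chain. This is not the library's implementation: `load_with_fallback` and the two toy loaders are hypothetical, and they pick a format from the file extension only to keep the example runnable without any model files.

```python
from typing import Callable, List, Sequence


def load_with_fallback(path: str, loaders: Sequence[Callable[[str], object]]) -> object:
    """Return the result of the first loader that succeeds on the given path."""
    errors: List[str] = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as error:  # broad on purpose: any failure means "try the next one"
            errors.append(f"{loader.__name__}: {error}")
    details = "; ".join(errors)
    raise ValueError(f"Could not load {path!r}; tried: {details}")


# Hypothetical loaders standing in for real format-specific loading functions.
def load_as_full_model(path: str) -> str:
    if not path.endswith(".bin"):
        raise ValueError("not a binary model file")
    return "full model"


def load_as_vectors(path: str) -> str:
    if not path.endswith(".vec"):
        raise ValueError("not a vectors file")
    return "vectors only"


loaders = [load_as_full_model, load_as_vectors]
print(load_with_fallback("embeddings.vec", loaders))  # vectors only
```

Collecting each loader's error message before raising keeps the final failure informative, since the caller sees why every attempt was rejected.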