quaterion_models.encoders.extras.fasttext_encoder module

class FasttextEncoder(model_path: str, on_disk: bool, aggregations: Optional[List[str]] = None)[source]

Creates a fasttext encoder that generates a vector for a list of tokens based on the given fasttext model.

Parameters:

model_path – Path to model to load
on_disk – If True, use mmap to keep embeddings out of RAM
aggregations – Types of aggregation used to combine multiple vectors into one. If several aggregations are specified, their results are concatenated to form the final vector.

classmethod aggregate(embeddings: Tensor, operation: str) → Tensor[source]

Apply an aggregation operation to the embeddings along the first dimension

Parameters:

embeddings – Embeddings to aggregate
operation – Name of the aggregation operation to apply

Returns:

Tensor – aggregated embeddings
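
Aggregating "along the first dimension" means reducing a stack of token vectors component by component into a single vector. The following is a minimal pure-Python sketch of that idea, not the library's implementation: it uses lists instead of tensors, and the operation names "avg", "max" and "min" are assumptions chosen for illustration.

```python
from typing import List

Vector = List[float]


def aggregate(embeddings: List[Vector], operation: str) -> Vector:
    """Reduce a list of token vectors to one vector, component by component."""
    if not embeddings:
        raise ValueError("need at least one embedding")
    # Transpose: one tuple per vector component (the "first dimension" is the token axis).
    columns = list(zip(*embeddings))
    if operation == "avg":  # assumed operation name
        return [sum(col) / len(col) for col in columns]
    if operation == "max":  # assumed operation name
        return [max(col) for col in columns]
    if operation == "min":  # assumed operation name
        return [min(col) for col in columns]
    raise ValueError(f"unknown operation: {operation}")


token_vectors = [[1.0, 4.0], [3.0, 2.0]]
avg_vector = aggregate(token_vectors, "avg")
max_vector = aggregate(token_vectors, "max")
# Several aggregations are combined by concatenating their results.
combined = avg_vector + max_vector
```

With two aggregations over 2-dimensional vectors, the combined vector has 4 components: the output size grows linearly with the number of aggregations requested.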

forward(batch: List[List[str]]) → Tensor[source]

Run encoder inference: convert the input batch to embeddings
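
The batch is a list of token lists, and each token list yields exactly one output row. The sketch below illustrates that shape contract with a hypothetical lookup table (`TOY_VECTORS`) standing in for a fasttext model and plain lists standing in for tensors; the helper names are invented for the example and the real encoder may differ in detail.

```python
from typing import Callable, Dict, List

Vector = List[float]

# Hypothetical 2-dimensional token vectors standing in for a fasttext model.
TOY_VECTORS: Dict[str, Vector] = {
    "red": [1.0, 0.0],
    "apple": [0.0, 1.0],
    "pie": [1.0, 1.0],
}


def column_mean(vectors: List[Vector]) -> Vector:
    return [sum(col) / len(col) for col in zip(*vectors)]


def column_max(vectors: List[Vector]) -> Vector:
    return [max(col) for col in zip(*vectors)]


# Assumed aggregation names, used only for this illustration.
AGGREGATORS: Dict[str, Callable[[List[Vector]], Vector]] = {
    "avg": column_mean,
    "max": column_max,
}


def forward(batch: List[List[str]], aggregations: List[str]) -> List[Vector]:
    """Return one row per token list; each row concatenates all aggregations."""
    rows: List[Vector] = []
    for tokens in batch:
        vectors = [TOY_VECTORS[token] for token in tokens]
        row: Vector = []
        for name in aggregations:
            row.extend(AGGREGATORS[name](vectors))
        rows.append(row)
    return rows


batch = [["red", "apple"], ["pie"]]
embeddings = forward(batch, ["avg", "max"])
```

Token lists of different lengths still produce rows of equal length, because the aggregation removes the token axis before the rows are stacked.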

get_collate_fn() → CollateFnType[source]

Provides a function that converts a raw data batch into suitable model input

classmethod load(input_path: str) → Encoder[source]

Instantiate encoder from saved state.

If no state is required, just call create instead

save(output_path: str)[source]

Persist current state to the provided directory
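
The save and load methods form a round trip: whatever save writes into the directory must be enough for load to rebuild an equivalent encoder. The sketch below shows that contract with a hypothetical `ToyEncoder` that persists only its settings as JSON. The class, the `config.json` file name and the stored fields are assumptions for illustration; the real encoder also has to persist the fasttext model itself.

```python
import json
import os
import tempfile
from typing import List


class ToyEncoder:
    """Hypothetical stand-in for an encoder with persistable settings."""

    CONFIG_NAME = "config.json"  # assumed file name, not taken from the library

    def __init__(self, on_disk: bool, aggregations: List[str]):
        self.on_disk = on_disk
        self.aggregations = aggregations

    def save(self, output_path: str) -> None:
        """Write the settings into the given directory."""
        os.makedirs(output_path, exist_ok=True)
        config = {"on_disk": self.on_disk, "aggregations": self.aggregations}
        with open(os.path.join(output_path, self.CONFIG_NAME), "w") as f:
            json.dump(config, f)

    @classmethod
    def load(cls, input_path: str) -> "ToyEncoder":
        """Rebuild an encoder from a directory written by save()."""
        with open(os.path.join(input_path, cls.CONFIG_NAME)) as f:
            config = json.load(f)
        return cls(**config)


with tempfile.TemporaryDirectory() as directory:
    ToyEncoder(on_disk=True, aggregations=["avg", "max"]).save(directory)
    restored = ToyEncoder.load(directory)
```

Making load a classmethod means the caller does not need an existing instance: the directory alone is enough to restore the encoder.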

property trainable: bool

Defines if encoder is trainable.

This flag affects caching and checkpoint saving of the encoder.

load_fasttext_model(path: str) → Union[FastText, KeyedVectors][source]

Load fasttext model in a universal way

Try to find a possible way of loading the FastText model and load it
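
Which loading strategies the function tries is not stated here. One common way to load "in a universal way" is to attempt several loaders in order and return the first result that succeeds. The sketch below shows that fallback pattern with hypothetical loader functions (`load_native`, `load_binary`) that return strings; in practice the candidates would be real model loaders, which the return type suggests yield FastText or KeyedVectors objects.

```python
from typing import Any, Callable, List, Sequence


def load_with_fallback(path: str, loaders: Sequence[Callable[[str], Any]]) -> Any:
    """Return the result of the first loader that accepts `path`."""
    errors: List[str] = []
    for loader in loaders:
        try:
            return loader(path)
        except Exception as error:  # real code would catch narrower exceptions
            errors.append(f"{loader.__name__}: {error}")
    raise ValueError(f"no loader could read {path!r}: " + "; ".join(errors))


# Hypothetical loaders that only inspect the file extension.
def load_native(path: str) -> str:
    if not path.endswith(".model"):
        raise ValueError("not a native model file")
    return f"native:{path}"


def load_binary(path: str) -> str:
    if not path.endswith(".bin"):
        raise ValueError("not a binary file")
    return f"binary:{path}"


loaded = load_with_fallback("vectors.bin", [load_native, load_binary])
```

Collecting the individual error messages keeps the final failure informative when none of the strategies works.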
