
📦 embeddings

A library for querying sentence embeddings based on the Universal Sentence Encoder. Each embedding is returned as a 512-dimensional array, and you can query multiple text strings at once.

Install

npm install @energetic-ai/embeddings

Usage

If you don't supply a model source, initModel defaults to remoteModelSource, which downloads the model weights from TFHub on first use. This download can take ~2 seconds.

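import { initModel, distance } from "@energetic-ai/embeddings";

(async () => {
  // With no argument, initModel falls back to remoteModelSource (TFHub download).
  const model = await initModel();
  // Passing an array returns one 512-dimensional embedding per input string.
  const embeddings = await model.embed(["hello", "world"]);
  console.log(distance(embeddings[0], embeddings[1]));
})();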
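
A common next step is ranking several strings by how close they are to a query. The sketch below is a minimal illustration and not part of the library: `rankByDistance`, `DistanceFn`, `EmbeddedText`, and `Ranked` are hypothetical names, and it assumes embeddings are plain `number[]` arrays, matching the signature documented for `distance`.

```typescript
// Hypothetical helper types; none of these are exported by @energetic-ai/embeddings.
type DistanceFn = (a: number[], b: number[]) => number;

interface EmbeddedText {
  text: string;
  embedding: number[];
}

interface Ranked {
  text: string;
  score: number;
}

// Returns the items ordered from smallest to largest distance to the query.
// The distance function is passed in, so any metric with this shape works.
function rankByDistance(
  query: number[],
  items: EmbeddedText[],
  distanceFn: DistanceFn
): Ranked[] {
  return items
    .map(({ text, embedding }) => ({ text, score: distanceFn(query, embedding) }))
    .sort((a, b) => a.score - b.score);
}
```

With the library, `query` and each `embedding` would presumably come from `model.embed`, and the exported `distance` would be passed as `distanceFn`.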

You can also bundle the model weights locally by installing the @energetic-ai/model-embeddings-en package. This reduces cold-start inference to ~20 ms at the cost of a larger module size.

import { initModel } from "@energetic-ai/embeddings";
import { modelSource } from "@energetic-ai/model-embeddings-en";

(async () => {
  const model = await initModel(modelSource);
  // ... snip ...
})();

Exports

| Export | Description |
| --- | --- |
| Tokenizer | A class that encodes input strings into sequences of numbers based on a provided vocabulary. |
| Vocabulary | An array of tuples pairing words or tokens with their numeric indices. The Tokenizer class uses it for encoding. |
| EmbeddingsModelData | A type representing the data for an embeddings model: both the vocabulary and the model weights. |
| EmbeddingsModelSource | A type representing a function that returns a promise of EmbeddingsModelData. |
| EmbeddingsModel | A class representing an embeddings model, with a tokenizer and a graph model. |
| EmbeddingsModel.embed(input: string) or EmbeddingsModel.embed(inputs: string[]) | A method of the EmbeddingsModel class that returns a 512-dimensional embedding vector for each input string. If the input is an array, the output is an array of embeddings. If the input is a single string, the output is a single embedding. |
| remoteModelSource | A constant representing an EmbeddingsModelSource that downloads the model weights from TFHub. |
| initModel(source?: EmbeddingsModelSource) | A function that initializes an embeddings model with an optional model source. If no model source is passed, remoteModelSource is used, which downloads the weights from TFHub when called. |
| distance(a: number[], b: number[]) | A function that returns the distance between two vectors using cosine similarity. |
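
The description of `distance` says only that it is based on cosine similarity and does not give the exact formula. One common convention is cosine distance, defined as 1 minus the cosine similarity. The sketch below illustrates that convention under that assumption. `cosineSimilarity` and `cosineDistance` are hypothetical names, and the library's own `distance` may be implemented differently.

```typescript
// Illustrative only: not the library's implementation of `distance`.

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine similarity is undefined for a zero vector");
  }
  return dot / Math.sqrt(normA * normB);
}

// Assumed convention: distance = 1 - similarity, so identical directions
// give 0, orthogonal vectors give 1, and opposite directions give 2.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

Under this convention a smaller value means the two texts are closer in meaning, which is the ordering a nearest-match search would sort by.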