Vectors from LLM

- Citation Author(s): Maksim Pokrovskiy
- Submitted by: Maksim Pokrovskiy
- Last updated:
- DOI: 10.21227/t44r-9011
- Categories:
- Keywords:
Abstract
Here I parsed the literature site https://avidreaders.ru, collecting about 10,000,000 sentences from Russian books, and computed sentence vector embeddings for them using the Mistral open API.
The embeddings were reduced from 1024 to 256 dimensions using the PCA method from the Python scikit-learn library.
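A minimal sketch of this kind of reduction, assuming the embeddings are held in a NumPy array of shape (n_sentences, 1024). The random input, the array size, and the variable names are illustrative stand-ins, not the dataset or the author's actual script.

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in data (assumption): random vectors in place of real
# 1024-dimensional sentence embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(512, 1024))

# Fit PCA on the embeddings and project them onto the first 256 components.
pca = PCA(n_components=256, random_state=0)
reduced = pca.fit_transform(embeddings)

print(reduced.shape)  # (512, 256)

# Fraction of the original variance kept by the 256 components.
explained = float(pca.explained_variance_ratio_.sum())
print(explained)
```

PCA needs at least as many samples as requested components, so the stand-in array uses 512 rows. With a corpus of millions of sentences, fitting on a sample or using scikit-learn's `IncrementalPCA` may be more practical, though the source does not say which approach was used.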
Word embeddings are a way of representing words as vectors in a multi-dimensional space, where the distance and direction between vectors reflect the similarity and relationships among the corresponding words.
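One common way to measure how close two embedding vectors are in direction is cosine similarity. The sketch below uses small made-up vectors purely for illustration; the function name and the values are assumptions, not part of the dataset.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1 = same direction)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 3-dimensional vectors; real embeddings here have 256 dimensions.
v1 = [0.9, 0.1, 0.0]
v2 = [0.8, 0.2, 0.1]
v3 = [0.0, 0.1, 0.9]

sim_close = cosine_similarity(v1, v2)  # vectors pointing in similar directions
sim_far = cosine_similarity(v1, v3)    # vectors pointing in different directions
print(sim_close > sim_far)
```

The same function applies unchanged to the 256-dimensional sentence vectors, since it makes no assumption about vector length beyond the two inputs matching.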
Mistral AI is a French company specializing in artificial intelligence (AI) products. Founded in April 2023 by former employees of Meta Platforms and Google DeepMind,[1] the company has quickly risen to prominence in the AI sector.
Instructions:
All information is in "ReadMe.txt".