Embedding Visualization with LangChain + OpenAI

Generate text embeddings using OpenAI models through LangChain, reduce them to 3D with PCA, and plot them with Matplotlib so you can eyeball the semantic relationships between words.

Features

  • Generate embeddings for arbitrary text using OpenAI's embedding models.
  • Choose between:
    • text-embedding-3-large
    • text-embedding-3-small
    • text-embedding-ada-002 (default)
  • Dimensionality reduction with PCA.
  • 3D scatter plot saved as a high-resolution PNG.
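
As a rough sketch of the embedding step, assuming the `langchain-openai` package and its `OpenAIEmbeddings` class (the repository's `main.py` may import it from elsewhere): `SUPPORTED_MODELS`, `make_embedder`, and `embed_texts` are hypothetical names used for illustration, not taken from the script.

```python
# Models listed under Features; the default mirrors the README.
SUPPORTED_MODELS = (
    "text-embedding-3-large",
    "text-embedding-3-small",
    "text-embedding-ada-002",
)


def make_embedder(model="text-embedding-ada-002"):
    """Build a LangChain embedder; expects OPENAI_API_KEY in the environment."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported embedding model: {model!r}")
    # Imported lazily so the rest of this sketch works without the package.
    from langchain_openai import OpenAIEmbeddings  # assumed package name

    return OpenAIEmbeddings(model=model)


def embed_texts(embedder, texts):
    """Return one vector (a list of floats) per input text."""
    texts = list(texts)
    vectors = embedder.embed_documents(texts)
    if len(vectors) != len(texts):
        raise RuntimeError("embedder returned a different number of vectors")
    return vectors
```

Because `embed_texts` only relies on an `embed_documents` method, any object exposing that method can stand in for the OpenAI client when experimenting offline.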

Requirements

Install dependencies with:

pip install -r requirements.txt

You will also need an OpenAI API key.

Setup

  1. Clone this repository:

    git clone https://github.com/MandalAutomations/Vector-Visualizer-OpenAI.git
    cd Vector-Visualizer-OpenAI
  2. Create a .env file in the project root with your OpenAI API key:

    echo "OPENAI_API_KEY=your_api_key_here" > .env
  3. (Optional) Choose a different embedding model by editing EMBEDDING_MODEL near the top of main.py:

    EMBEDDING_MODEL = "text-embedding-ada-002"
  4. Add the words/phrases you want to embed to words.txt, one per line:

    nfl
    football
    soccer
    basketball
    baseball
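
For illustration, reading the one-entry-per-line format of words.txt could look like the sketch below. `read_words` is a hypothetical helper, and skipping blank lines is an assumption; the actual script may treat them differently.

```python
from pathlib import Path


def read_words(path="words.txt"):
    """Return the non-empty lines of `path`, stripped of surrounding whitespace."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Assumption: blank lines are ignored rather than embedded as empty strings.
    return [line.strip() for line in lines if line.strip()]
```

Phrases with internal spaces stay intact, since only leading and trailing whitespace is stripped.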

Usage

Run the script to generate embeddings and plot them:

python main.py

This will:

  1. Read each line from words.txt and request an embedding for it.
  2. Reduce the embeddings to 3D with PCA.
  3. Save the visualization to 3d_plot_small.png in the project root.
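
The PCA step above can be sketched with NumPy alone. This is an SVD-based stand-in written for illustration; the script itself may use a library implementation such as scikit-learn's `PCA`. `pca_reduce` is a hypothetical name.

```python
import numpy as np


def pca_reduce(vectors, n_components=3):
    """Project row vectors onto their top principal components."""
    X = np.asarray(vectors, dtype=float)
    if X.ndim != 2:
        raise ValueError("expected a 2D array: one row per embedding")
    if n_components > min(X.shape):
        raise ValueError("n_components exceeds min(n_samples, n_features)")
    # PCA operates on mean-centered data.
    centered = X - X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Centered data from n inputs has rank at most n - 1, so with fewer than four words at least one of the three plotted axes carries no variance.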

Project Structure

.
├── main.py              # Main script — embeds words.txt and plots them
├── words.txt            # Words/phrases to embed (one per line)
├── requirements.txt     # Dependencies
├── 3d_plot_small.png    # Example output plot
└── .env                 # API key (not committed)

Customization

  • Edit words.txt to change which words are compared.
  • Change EMBEDDING_MODEL in main.py to swap models.
  • Adjust the dpi argument inside plot_embeddings_3d to control output resolution.
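
To show where a `dpi` setting typically takes effect, here is a minimal Matplotlib sketch of a labelled 3D scatter plot. `plot_points_3d` is a hypothetical stand-in, not the repository's `plot_embeddings_3d`, and the default of 300 is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend: write files without opening a window
import matplotlib.pyplot as plt


def plot_points_3d(points, labels, out_path="3d_plot_small.png", dpi=300):
    """Scatter 3D points, label each one, and save the figure as a PNG."""
    fig = plt.figure(figsize=(8, 8))
    ax = fig.add_subplot(projection="3d")
    xs, ys, zs = zip(*points)
    ax.scatter(xs, ys, zs)
    for (x, y, z), label in zip(points, labels):
        ax.text(x, y, z, label)
    ax.set_xlabel("PC1")
    ax.set_ylabel("PC2")
    ax.set_zlabel("PC3")
    # dpi scales the pixel dimensions of the saved image.
    fig.savefig(out_path, dpi=dpi)
    plt.close(fig)
    return out_path
```

With an 8 x 8 inch figure, `dpi=300` yields a 2400 x 2400 pixel image; halving `dpi` halves each dimension.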

Example Plot

3D embedding plot (see 3d_plot_small.png)
