Building a Knowledge Base with OpenAI, LangChain, OpenSearch, and Unstructured
In this blog post, I’ll discuss a project we designed to help Voiceflow users build a knowledge base using custom APIs.
The project uses OpenAI, LangChain, Redis, OpenSearch, and Unstructured to fetch content from various sources such as URLs, sitemaps, text, PDFs, PowerPoints, Notion docs (Markdown), and even images (via OCR).
These sources of information are then turned into embeddings/vectors and saved in a local OpenSearch database. This knowledge base can then be used to generate context and answer questions, and because it’s an API, you can use it within your Voiceflow Assistant with the help of the API Step.
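To make the idea of "embeddings as context" a bit more concrete, here is a toy sketch of similarity search. It is not the project's code: the three-dimensional vectors are made up by hand, whereas the real project gets its embeddings from OpenAI and leaves the search to OpenSearch. The names `cosine_similarity`, `top_k` and `STORE` are mine, for illustration only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vector, store, k=2):
    """Return the k stored texts whose vectors are closest to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]

# Hand-made vectors standing in for real embeddings (illustration only).
STORE = [
    {"text": "How to reset your password", "vector": [0.9, 0.1, 0.0]},
    {"text": "Pricing and billing FAQ", "vector": [0.1, 0.9, 0.1]},
    {"text": "Changing your account email", "vector": [0.8, 0.2, 0.1]},
]

if __name__ == "__main__":
    # A query vector pointing roughly in the "account" direction.
    print(top_k([1.0, 0.0, 0.0], STORE))
```

The texts returned by a search like this are what gets handed to the language model as context when it answers a question.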
Overview
To get you started, let’s go through a quick overview of the project.
Installation
You’ll need Node.js 18+ to run this code. You can download it from the official Node.js website.
You will also need to have Docker Compose installed.
To get started, copy the example `.env` file and set the required environment variables:
cp .env.example .env
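If you want a quick sanity check after editing `.env`, something like the following sketch could help. It only compares the keys found in `.env.example` with those in `.env` and reports the ones that are absent or empty. It assumes plain `KEY=VALUE` lines, and it cannot tell whether a key is optional, so treat its output as a hint. The helper names `parse_env` and `missing_keys` are mine and not part of the project.

```python
from pathlib import Path

def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def missing_keys(example_text, env_text):
    """Keys listed in the example file that are absent or empty in .env."""
    env = parse_env(env_text)
    return [key for key in parse_env(example_text) if not env.get(key)]

if __name__ == "__main__":
    example, actual = Path(".env.example"), Path(".env")
    if example.exists() and actual.exists():
        for key in missing_keys(example.read_text(), actual.read_text()):
            print(f"Missing or empty: {key}")
    else:
        print("Run this from the project root after copying .env.example to .env")
```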
To create the containers, install the required dependencies, and launch the server, run:
yarn build
This will create the following containers:
- Redis (cache)
- Unstructured (handles images, PPT, text, markdown)
- OpenSearch (search engine)
- OpenSearch-dashboards (search engine dashboard)
The OpenSearch dashboard can be accessed at http://localhost:5601
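The containers can take a little while to start. As a small sketch, the script below checks whether anything is accepting connections on the dashboard port mentioned above. It only tests that the TCP port is open, not that the dashboard is fully ready, and the helper name `is_port_open` is mine.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # 5601 is the OpenSearch dashboard port given in this post.
    status = "reachable" if is_port_open("localhost", 5601) else "not reachable yet"
    print(f"OpenSearch dashboard on port 5601: {status}")
```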
API Documentation
There are several API endpoints available for various tasks:
- Add content to OpenSearch: `POST /api/add`
- Get a response using a live webpage as context: `POST /api/live`
- Get a response using the vector store: `POST /api/question`
- Clear Redis cache: `GET /api/clearcache`
- Delete a collection: `DELETE /api/collection`
You can find more detailed API documentation in the README.md file in our repo.
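The endpoints listed above can be captured in a small lookup table. The sketch below uses Python purely as a neutral HTTP client (the project itself runs on Node.js, and any client will do). The methods and paths come from the list above. `BASE_URL` is a placeholder, since this post does not state the server's host or port, and the request bodies are left to the README. The names `ENDPOINTS` and `build_request` are mine.

```python
import json
import urllib.request

# Placeholder: this post does not state the server's host or port.
BASE_URL = "http://localhost:3000"

# Method and path for each task, as listed in this section.
ENDPOINTS = {
    "add": ("POST", "/api/add"),
    "live": ("POST", "/api/live"),
    "question": ("POST", "/api/question"),
    "clearcache": ("GET", "/api/clearcache"),
    "collection": ("DELETE", "/api/collection"),
}

def build_request(task, payload=None, base_url=BASE_URL):
    """Prepare (but do not send) an HTTP request for one of the tasks."""
    method, path = ENDPOINTS[task]
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        base_url + path, data=data, headers=headers, method=method
    )

if __name__ == "__main__":
    # Dry run: print what would be sent, without contacting the server.
    for task in ENDPOINTS:
        req = build_request(task)
        print(req.get_method(), req.full_url)
```

Once the server is running, a prepared request can be sent with `urllib.request.urlopen(req)`.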
Using live data
You can also use the `/api/live` endpoint to get a response using a live webpage as context without vectorizing the content.
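As a rough sketch, a request to this endpoint might be prepared like this. The body field names `url` and `question` are my assumptions, as is the `BASE_URL` placeholder, so check the README for the real request format before relying on them.

```python
import json
import urllib.request

# Placeholder: this post does not state the server's host or port.
BASE_URL = "http://localhost:3000"

def live_request(page_url, question, base_url=BASE_URL):
    """Build a POST /api/live request. Field names are assumed, not confirmed."""
    body = {"url": page_url, "question": question}  # hypothetical field names
    return urllib.request.Request(
        base_url + "/api/live",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def send(request, timeout=60):
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")

if __name__ == "__main__":
    # Dry run: show the request without contacting the server.
    req = live_request("https://example.com/pricing", "What plans are available?")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # With the server running, you could then call: print(send(req))
```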
Using the Knowledge Base
Once you have added content to OpenSearch, you can use the `/api/question` endpoint to get answers based on your knowledge base.
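Here is a minimal sketch of asking a question from a script, with basic error handling so a stopped server gives a readable message instead of a traceback. The `question` field name and `BASE_URL` are assumptions on my part. The response format is not described in this post, so the sketch simply returns the raw response body.

```python
import json
import urllib.error
import urllib.request

# Placeholder: this post does not state the server's host or port.
BASE_URL = "http://localhost:3000"

def question_request(question, base_url=BASE_URL):
    """Build a POST /api/question request. The field name is assumed."""
    body = {"question": question}  # hypothetical field name
    return urllib.request.Request(
        base_url + "/api/question",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(question, base_url=BASE_URL, timeout=30):
    """Send the question; return (ok, raw response text or error message)."""
    try:
        request = question_request(question, base_url)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return True, response.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status.
        return False, f"HTTP {err.code}: {err.reason}"
    except OSError as err:
        # Connection refused, timeout, DNS failure, and similar.
        return False, f"Could not reach the server: {err}"

if __name__ == "__main__":
    ok, result = ask("How do I reset my password?")
    print(result if ok else f"Request failed. {result}")
```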
Now What?
You can now set up your knowledge base and use it to answer your users' questions through the API Step in your Voiceflow Assistants.