Projects
Discord LLM Bot
Retrieval-Augmented Generation (RAG) powered Discord bot that runs seamlessly on CPU. Powered by LanceDB and Llama.cpp. The bot is designed to help answer questions based on a knowledge base (vector database).
graph LR
A((User Query)) --> B((Convert to Embedding))
B --> C((Find Similar Document<br>from Vector Database))
C --> D((Use Retrieved Document<br>as Context to Answer Question<br>using Mistral 7B LLM))
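The flow in the graph can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the bot's actual implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory `DOCUMENTS` list stands in for the LanceDB vector database, and `build_prompt` only assembles the text that would be passed to the LLM. All names here are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base; the real bot stores documents in a vector database.
DOCUMENTS = [
    "LanceDB is a vector database for storing and searching embeddings.",
    "Llama.cpp runs large language models efficiently on CPU.",
    "Paris is the capital of France.",
]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str]) -> str:
    """Return the document whose embedding is most similar to the query embedding."""
    query_vec = embed(query)
    return max(documents, key=lambda doc: cosine_similarity(query_vec, embed(doc)))


def build_prompt(query: str, context: str) -> str:
    """Combine the retrieved document and the question into a prompt for the LLM."""
    return f"Context: {context}\nQuestion: {query}\nAnswer:"


query = "What is the capital of France?"
context = retrieve(query, DOCUMENTS)
prompt = build_prompt(query, context)
print(prompt)
```

In a real deployment the similarity search would be delegated to the vector database and the prompt would be sent to the language model; the sketch stops at prompt construction so it runs without any model weights.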
LLM Inference
Large Language Model (LLM) Inference API and Chatbot 🦙
Build and run an LLM chatbot in under 7 GB of GPU memory with 5 lines of code.
from llm_chain import LitGPTConversationChain, LitGPTLLM
from llm_inference import prepare_weights

# Note: llama2_prompt_template is assumed to be imported or defined elsewhere.
# Prepare the model weights and get the checkpoint directory
path = str(prepare_weights("meta-llama/Llama-2-7b-chat-hf"))
llm = LitGPTLLM(checkpoint_dir=path, quantize="bnb.nf4")  # 7GB GPU memory
bot = LitGPTConversationChain.from_llm(llm=llm, prompt=llama2_prompt_template)

print(bot.send("hi, what is the capital of France?"))
Gradsflow
An open-source AutoML Library based on PyTorch
Chitra
A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.