The newly released Jetson Copilot has attracted a lot of attention. In this review, based on the Jetson Orin 64GB platform, we take a comprehensive look at Jetson Copilot's features, its performance, and its potential in practical applications. We walk through every step from installation to startup, try out its interaction with the Llama3 8B model, and show how to use pre-built indexes for efficient question answering.
To start using Jetson Copilot, you first need to clone its code repository from GitHub:
Bash
git clone https://github.com/NVIDIA-AI-IOT/jetson-copilot/
cd jetson-copilot
./setup_environment.sh      # one-time environment setup (Docker and prerequisites)
./launch_jetson_copilot.sh  # start the Ollama server and Streamlit app in Docker
After executing the above commands, Jetson Copilot starts the Ollama server and the Streamlit application inside a Docker container. You can access the web application hosted on the Jetson through the URL printed to the console.
On the Jetson itself, open the local URL (http://localhost:8501) in a web browser to reach the application. From a PC on the same network as the Jetson, you can also reach it through the network URL.
Jetson Copilot currently supports only the Llama3 8B model. The first response is slow because the model has to be loaded; after that, generation runs at about 13 tokens/s.
Figure: Demo of Llama3 8B
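Because Jetson Copilot runs an Ollama server inside the container, you can also query the same model directly over Ollama's REST API and check the generation rate yourself. A minimal sketch, assuming Ollama's default port 11434 is reachable and the model is registered under the tag llama3:
Python
import requests

# Ask the Llama3 8B model a question through the Ollama REST API.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "What is the Jetson Orin?", "stream": False},
    timeout=300,
)
data = resp.json()
print(data["response"])

# Ollama reports eval_count (generated tokens) and eval_duration (nanoseconds),
# so the generation rate can be computed from the response metadata.
print(f"{data['eval_count'] / (data['eval_duration'] / 1e9):.1f} tokens/s")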
Ask Copilot relevant questions using a pre-built index
Copilot's built-in example uses a Jetson Orin operation document as the index. From the demonstration video, Copilot takes about 26 seconds to retrieve from the index document and generate a response.
Figure: Demo of Copilot
Create your own index based on your documents and ask questions
Use the LattePanda Mu product page content from the DFRobot online store as the index document:
In addition, Jetson Copilot currently supports only the mxbai-embed-large embedding model. mxbai-embed-large is a state-of-the-art embedding model that, as of March 2024, achieved the best performance on MTEB (Massive Text Embedding Benchmark) among models of BERT-large size. It is trained with contrastive learning and fine-tuned with the AnglE loss function, which helps it generalize across a wide range of topics and domains, making it suitable for many practical applications and retrieval-augmented generation (RAG) use cases.
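To see what this embedding model actually produces, you can call Ollama's embeddings endpoint directly. A minimal sketch, assuming mxbai-embed-large has already been pulled into the local Ollama server on its default port:
Python
import requests

# Request an embedding vector for a piece of text from the local Ollama server.
resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "mxbai-embed-large", "prompt": "LattePanda Mu is an x86 compute module."},
    timeout=60,
)
vector = resp.json()["embedding"]
print(len(vector))  # mxbai-embed-large returns 1024-dimensional vectors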
When building an index, Jetson Copilot splits the source documents into small blocks according to the chunk size, and uses the chunk overlap to keep a certain amount of shared text between adjacent blocks, reducing edge effects at chunk boundaries.
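To make the two parameters concrete, here is a minimal, dependency-free sketch of character-level chunking with overlap. Copilot itself splits text through its indexing library, so the numbers below are illustrative, not its defaults:
Python
def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 64) -> list[str]:
    """Split text into windows of chunk_size characters,
    each sharing chunk_overlap characters with the previous one."""
    step = chunk_size - chunk_overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("some long document text ... " * 100)
# Adjacent chunks repeat 64 characters, so a sentence cut at a chunk
# boundary still appears intact in at least one of the two chunks.
print(len(chunks), chunks[0][-64:] == chunks[1][:64])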
The generated index is stored under the jetson-copilot/index folder:
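Jetson Copilot builds on LlamaIndex, so an index persisted here can, in principle, be reloaded and queried outside the UI as well. A hedged sketch of that flow; the directory name "_LattePanda_Mu" is hypothetical, and Copilot's exact storage layout may differ:
Python
from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# Point LlamaIndex at the same local Ollama models Copilot uses.
Settings.llm = Ollama(model="llama3", request_timeout=300.0)
Settings.embed_model = OllamaEmbedding(model_name="mxbai-embed-large")

# Reload the persisted index from disk (directory name is hypothetical).
storage = StorageContext.from_defaults(persist_dir="jetson-copilot/index/_LattePanda_Mu")
index = load_index_from_storage(storage)

print(index.as_query_engine().query("What CPU does the LattePanda Mu use?"))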
Test that multiple URLs can also be used to generate an index document (a sketch of the equivalent flow follows the figure below):
Figure: Demo
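Under the hood, indexing multiple URLs amounts to fetching each page, chunking, embedding, and persisting a vector index. A rough LlamaIndex sketch of the same flow; the URLs and output directory are illustrative, and Copilot's own pipeline may differ in details:
Python
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.readers.web import SimpleWebPageReader

# Embed locally with Ollama and split documents with explicit chunk settings.
Settings.embed_model = OllamaEmbedding(model_name="mxbai-embed-large")
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=64)

# Fetch several pages (hypothetical URLs) and convert the HTML to plain text.
docs = SimpleWebPageReader(html_to_text=True).load_data([
    "https://www.dfrobot.com/product-page-1.html",
    "https://www.dfrobot.com/product-page-2.html",
])

# Build one vector index over all pages and persist it to disk.
index = VectorStoreIndex.from_documents(docs)
index.storage_context.persist(persist_dir="jetson-copilot/index/my_multi_url_index")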
You can also choose to use OpenAI's embedding model to generate an index file:
Figure: OpenAI's embedding model
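When OpenAI is selected, the embedding vectors come from OpenAI's API instead of the local Ollama server. A minimal sketch with the official Python client; it requires an OPENAI_API_KEY, and the model name text-embedding-3-small is one common choice, not necessarily the one Copilot uses:
Python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Embed one piece of text with an OpenAI embedding model.
resp = client.embeddings.create(
    model="text-embedding-3-small",
    input="LattePanda Mu is an x86 compute module.",
)
print(len(resp.data[0].embedding))  # 1536 dimensions for this model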
Jetson Copilot, an advanced tool built on the NVIDIA Jetson Orin, offers a simple command line startup.
Llama3 exploration scenario:
It is currently optimized for the Llama3 8B model, delivering a smooth conversation experience at about 13 tokens per second.
RAG application built with Llama3:
In addition, it supports efficient index creation with the mxbai-embed-large model. For data processing, users can flexibly adjust the chunk size and chunk overlap to optimize segmentation and reduce information loss at chunk boundaries. Jetson Copilot also lets users build index files with OpenAI's embedding model, further extending its functionality. Retrieving and generating content from an indexed document takes about 26 seconds, with an output rate of roughly 13 tokens/s. Overall, Jetson Copilot is a comprehensive, easy-to-use tool well suited to a range of practical scenarios and retrieval-augmented generation (RAG) tasks.
Comparison of performance of different frameworks
Performance also varies across large language models on Jetson Orin depending on the framework. Under the MLC/TVM framework, Llama3 8B (int4) on the Jetson AGX Orin reaches a text generation rate of about 40 tokens/s.
Figure: SLM text generation rate
1. Code: https://github.com/NVIDIA-AI-IOT/jetson-copilot/
2. Jetson Copilot TODO:
Troubleshooting
1. Error: unable to open localhost. Solution: give the current user Docker permissions:
Bash
sudo usermod -aG docker $USER  # add the current user to the docker group
sudo reboot                    # reboot (or log out and back in) so the group change takes effect
2. Network error. Solution: reconnect to the network and restart.
Figure: Network error and solution