GPT local

26 November, 2023 07:32 AM IST | Mumbai | Jaison Lewis

There is a way to get a Large Language Model (LLM), comparable to ChatGPT, running directly on your PC. We show you how



A new project called Text Generation WebUI, or OobaBooga as it is rather funnily known thanks to the name of its creator on GitHub, is the latest tool in town. Whatever the name, it aims to simplify how LLMs are installed and run on a local computer, similar to what Automatic1111 did for Stable Diffusion, truly democratising the technology. So, let's get right into it.

Requirements

>> A computer with an Nvidia graphics card; the more RAM on the card, the better the LLM will perform (the sketch after this list shows one quick way to check your card and its memory). You can technically run it on your processor as well, but be prepared to wait a few hours for every response.

>> Lots of storage space. LLM models are at least a couple of GBs each, and if you want a choice of a few LLMs, be prepared to fill up your hard drive.

>> Access to the Internet to download the models and the software. Once installed, however, the LLM will not need the Internet to function.
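If you are not sure what card you have or how much memory it carries, a quick check is possible from Python. This is a rough sketch, not part of the OobaBooga setup itself; it assumes the Nvidia driver (which bundles the nvidia-smi tool) is already installed.

```python
# Rough check of the graphics card and its memory before installing.
# Assumes the Nvidia driver is installed, which provides the nvidia-smi tool.
import shutil
import subprocess

if shutil.which("nvidia-smi") is None:
    print("nvidia-smi not found: no Nvidia driver, expect slow CPU-only responses.")
else:
    # Ask the driver for the card name and total video memory, in plain CSV.
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    )
    print(out.stdout.strip())  # e.g. "NVIDIA GeForce RTX 3060, 12288 MiB"
```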

Download and install

You will need to go to OobaBooga's GitHub (https://bit.ly/smdooba), click on the green Code button and then click on Download ZIP. The entire file is around 20MB. Copy this zip file to the place where you plan to use it, and make sure there is a lot of space on the destination drive. Unzip the file and go into the directory called text-generation-webui-main.

There should be one more directory with the same name; go into that directory as well. If you are using Windows, double-click on the start_windows.bat file. This should start the installation process. The program will ask you what graphics card you are running. If you are running an Nvidia card, select ‘A'. AMD GPUs are only supported on Linux installations. If you don't have a GPU, select ‘N'.

If you have selected ‘A' for Nvidia, you will have another choice to make: whether to use CUDA 11.8 instead of 12.1. If you have an RTX or a GTX card, select ‘N'. If you have an older GPU, select ‘Y'. If you are unsure, select ‘N'. The installer will then start downloading everything you need to run the program. Brace yourself: this is going to take a while. Once done, though, you can immediately start up the web interface. It should say, "Running on local URL: http://127.0.0.1:7860". Type that URL into your browser.
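If the page refuses to load, you can confirm the interface is actually listening before blaming your browser. Here is a minimal sketch using only Python's standard library; it assumes the default address printed by the start script.

```python
# Check that the local web interface is up, then open it in the default browser.
# Assumes the default address printed by the start script: http://127.0.0.1:7860
import urllib.request
import webbrowser

URL = "http://127.0.0.1:7860"

try:
    with urllib.request.urlopen(URL, timeout=5) as resp:
        print("Interface is up, HTTP status:", resp.status)
    webbrowser.open(URL)
except OSError as err:
    print("Could not reach the interface yet:", err)
```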

Getting the model

The interface is useless without a model. There are several models to choose from, but Mistral 7B is currently my favourite. Let's see how to install it. First, go to TheBloke's Hugging Face page (https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GPTQ) and scroll down to "How to download, including from branches."

Copy the text "TheBloke/Mistral-7B-Instruct-v0.1-GPTQ". Open your Text-generation-webui window and select the ‘Model' tab at the top. Go to the ‘Download model or LoRA' section on the right, paste the text and click on Download.

This will take a while because the files run into a couple of GBs. You can use the same process to download different models, including Llama, Qwen, Orca or Gorilla. Read about the models that suit your specific use case; it's a bit of a rabbit hole, but if you want something that just works well, stick to Mistral 7B Instruct.
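The Download button in the Model tab is the easiest route, but if you prefer scripting your downloads, the same files can be fetched with the huggingface_hub library. This is a hedged sketch: it assumes you have run pip install huggingface_hub and that your models folder is the default one inside text-generation-webui-main; adjust the path if yours differs.

```python
# Download the same GPTQ model straight into the web UI's models folder.
# Assumes: pip install huggingface_hub, and the default models/ directory.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="TheBloke/Mistral-7B-Instruct-v0.1-GPTQ",
    local_dir="text-generation-webui-main/models/Mistral-7B-Instruct-v0.1-GPTQ",
)
print("Done. Refresh the Model tab and the model should appear in the list.")
```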

Loading the model

Once downloaded, go to the top-left section of the Model tab, select the model you downloaded and press Load. If the model doesn't show up in the list, close the Command window running in the background, start the program up again by running the "start_windows.bat" file and try loading the model again. Once it is loaded, head to the Chat tab. There, you can ask all sorts of questions, except those with illegal or dangerous answers: just like ChatGPT, it won't give you instructions for making a bomb, and it can't tell you about current affairs either.
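Chatting in the browser is the intended workflow, but the project can also expose an OpenAI-style API for scripts. The sketch below is an assumption-heavy example: it presumes you started the program with the --api flag, that the API is listening on the default port 5000, that a model is already loaded, and that the requests library is installed; check the project's documentation for the exact flags in your version.

```python
# Send one chat message to a locally running text-generation-webui instance.
# Assumes the program was started with the --api flag (OpenAI-compatible
# endpoint on port 5000) and that a model is already loaded in the Model tab.
import requests

resp = requests.post(
    "http://127.0.0.1:5000/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Explain what a GPTQ model is in two sentences."}
        ],
        "max_tokens": 200,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```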

Maintenance

Text-generation-webui is pretty self-sufficient, and you can easily upgrade it to the newest version by running the "update_windows.bat" file in the same directory as the start file. The update should be run every few weeks, as new and exciting additions keep landing in the program. This, however, upgrades only the main program. For a newer model, you must explore TheBloke's Hugging Face page and download it from there. There are usually good descriptions for each model, but feel free to explore the parent page of a model to see if it is a good fit for you.
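If you would rather not browse Hugging Face by hand every time, the same huggingface_hub library can list TheBloke's recent uploads. Again, a hedged sketch: the author, search term and sort order are only examples, and the exact fields returned can vary between library versions.

```python
# List a handful of TheBloke's recently updated Mistral repositories.
# Assumes: pip install huggingface_hub. Adjust the search term to taste.
from huggingface_hub import HfApi

api = HfApi()
models = api.list_models(
    author="TheBloke",
    search="Mistral",
    sort="lastModified",
    direction=-1,   # newest first
    limit=10,
)
for m in models:
    print(m.id)  # repo id to paste into the 'Download model or LoRA' box
```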

"Exciting news! Mid-day is now on WhatsApp Channels Subscribe today by clicking the link and stay updated with the latest news!" Click here!
life and style tech news mumbai Lifestyle news culture news india
Related Stories