9 Best Local/Offline LLMs You Can Try Right Now

With quantized LLMs now available on Hugging Face and AI ecosystems like H2O, Text Gen, and GPT4All allowing you to load LLM weights on your own computer, you now have an option for free, flexible, and secure AI. Here are the 9 best local/offline LLMs you can try right now!

Table of Contents

1. Hermes 2 Pro GPTQ
2. Zephyr 7B Beta
3. Falcon Instruct GPTQ
4. GPT4ALL-J Groovy
5. DeepSeek Coder V2 Instruct
6. Mixtral-8x7B
7. Wizard Vicuna Uncensored-GPTQ
8. Orca Mini-GPTQ
9. Llama 2 13B Chat GPTQ

1. Hermes 2 Pro GPTQ

Hermes 2 Pro is a state-of-the-art language model fine-tuned by Nous Research. It uses an updated and cleaned version of the OpenHermes 2.5 dataset, along with newly introduced Function Calling and JSON datasets developed in-house. The model is based on the Mistral 7B architecture and has been trained on 1,000,000 instructions/conversations of GPT-4 quality or better, most of them synthetic.

Model	Hermes 2 Pro GPTQ
Model size	7.26 GB
Parameters	7 billion
Quantization	4-bit
Type	Mistral
License	Apache 2.0

Hermes 2 Pro on Mistral 7B is the new flagship 7B Hermes model, offering improved performance across a variety of benchmarks, including AGIEval, BigBench Reasoning, GPT4All, and TruthfulQA. Its enhanced capabilities make it suitable for a wide range of natural language processing (NLP) tasks, such as code generation, content creation, and conversational AI applications.

2. Zephyr 7B Beta

Zephyr is a series of language models trained to act as helpful assistants. Zephyr-7B-Beta is the second model in the series, fine-tuned from Mistral-7B-v0.1 using Direct Preference Optimization (DPO) on a mix of publicly available synthetic datasets.

Model	Zephyr 7B Beta
Model size	7.26 GB
Parameters	7 billion
Quantization	4-bit
Type	Mistral
License	Apache 2.0

By removing the built-in alignment of the training datasets, Zephyr-7B-Beta demonstrates improved performance on benchmarks like MT-Bench, increasing its usefulness for a variety of tasks. However, this adjustment can lead to problematic text generation when prompted in certain ways.
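As a rough illustration of the Direct Preference Optimization (DPO) objective mentioned above, the sketch below computes the standard DPO loss for a single preference pair in plain Python. The function name, the argument names, and the beta value of 0.1 are assumptions chosen for illustration; they are not taken from the Zephyr training code.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    Each argument is the summed log-probability of a full response
    under either the model being trained (policy) or the frozen
    reference model. beta is an assumed illustrative value.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    x = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(x)) = log(1 + exp(-x)), written in a
    # numerically stable form so large |x| does not overflow.
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# The loss shrinks when the policy prefers the chosen response
# more strongly than the reference model does.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
print(dpo_loss(-5.0, -1.0, -2.0, -2.0))
```

When the policy and the reference agree exactly, the loss equals log 2; training pushes it below that by raising the relative likelihood of preferred responses.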

3. Falcon Instruct GPTQ

This quantized version of Falcon is a decoder-only model fine-tuned from TII's base Falcon-7B model. The base Falcon model was trained on 1.5 trillion tokens sourced from the public Internet. As an Apache 2.0-licensed, instruction-tuned decoder-only model, Falcon Instruct is a good fit for small businesses looking for a model to use for language translation and data ingestion.

Model	Falcon-7B-Instruct
Model size	7.58 GB
Parameters	7 billion
Quantization	4-bit
Type	Falcon
License	Apache 2.0

However, this version of Falcon is not ideal for fine-tuning and is only intended for inference. If you want to fine-tune Falcon, you will need to use the base model, which may require access to enterprise-grade training hardware like NVIDIA DGX or AMD Instinct AI accelerators.

4. GPT4ALL-J Groovy

GPT4All-J Groovy is a decoder-only model tuned by Nomic AI and licensed under Apache 2.0. It is based on the original GPT-J model, which is known to be great at generating text from prompts. GPT4All-J Groovy has been tuned into a conversational model, which suits fast and creative text generation applications. This makes it ideal for content creators who want help with their writing and composition, whether it is poetry, music, or stories.

Model	GPT4ALL-J Groovy
Model size	3.53 GB
Parameters	6 billion
Quantization	4-bit
Type	GPT-J
License	Apache 2.0

Unfortunately, the baseline GPT-J model was trained on an English-only dataset, which means that even this fine-tuned GPT4All-J model can only converse and generate text in English.

5. DeepSeek Coder V2 Instruct

DeepSeek Coder V2 is an advanced code language model aimed at programming and mathematical reasoning. It supports a large number of programming languages and provides an extended context length, making it a versatile tool for developers.

Model	DeepSeek Coder V2 Instruct
Model size	13 GB
Parameters	33 billion
Quantization	4-bit
Type	DeepSeek
License	Apache 2.0

Compared to its predecessor, DeepSeek Coder V2 shows significant improvements in coding, reasoning, and general performance. It expands support for programming languages from 86 to 338 and extends the context length from 16K to 128K tokens. It outperforms models such as GPT-4 Turbo, Claude 3 Opus, and Gemini 1.5 Pro on coding and mathematical benchmarks.

6. Mixtral-8x7B

Mixtral-8x7B is a mixture-of-experts (MoE) model developed by Mistral AI. It has 8 experts per MLP layer, totaling about 45 billion parameters. However, only two experts are activated per token during inference, making it computationally efficient, with speed and cost comparable to a 12 billion parameter model.

Model	Mixtral-8x7B
Model size	12 GB
Parameters	45 billion (8 experts)
Quantization	4-bit
Type	Mistral MoE
License	Apache 2.0

Mixtral supports context lengths of 32k tokens and outperforms Llama 2 70B on most benchmarks, matching or exceeding GPT-3.5 performance. It is fluent in multiple languages, including English, French, German, Spanish, and Italian, making it a versatile choice for a variety of NLP tasks.
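The idea of activating only two of the eight experts per token can be sketched in a few lines of plain Python. This is a toy illustration, not Mixtral's implementation: the function names, the router scores, and the stand-in experts (simple scaling functions rather than real MLPs) are all assumptions made up for the example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top2_moe(token, router_logits, experts):
    """Route one token vector through the two highest-scoring experts."""
    # Indices of the two experts with the largest router scores.
    chosen = sorted(range(len(experts)),
                    key=lambda i: router_logits[i], reverse=True)[:2]
    # Normalize only the selected scores so the two weights sum to 1.
    weights = softmax([router_logits[i] for i in chosen])
    output = [0.0] * len(token)
    for w, i in zip(weights, chosen):
        expert_out = experts[i](token)  # the other experts are never run
        output = [o + w * e for o, e in zip(output, expert_out)]
    return output, chosen


# Eight toy "experts": expert i multiplies the token by (i + 1).
experts = [lambda t, k=i + 1: [k * v for v in t] for i in range(8)]
# Hypothetical router scores for a single token.
router_logits = [0.1, -1.0, 0.3, 2.0, 0.0, 2.0, -0.5, 0.2]

output, chosen = top2_moe([1.0, 2.0], router_logits, experts)
print(chosen, output)
```

Because only the two selected experts are evaluated, the per-token compute stays close to that of a much smaller dense model even though all eight experts must still be held in memory.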

7. Wizard Vicuna Uncensored-GPTQ

Wizard-Vicuna GPTQ is the quantized version of Wizard Vicuna, which is based on the LLaMA model. Unlike most LLMs released to the public, Wizard-Vicuna is an uncensored model with its alignment removed. This means that the model does not have the same safety and ethical guardrails as most other models.

Model	Wizard-Vicuna-30B-Uncensored-GPTQ
Model size	16.94 GB
Parameters	30 billion
Quantization	4-bit
Type	LLaMA
License	GPL 3

While this can make AI alignment harder to control, having an uncensored LLM also brings out the best in the model by allowing it to respond without any constraints. It also allows users to add their own custom alignment, defining how the AI should act or respond to a given prompt.

8. Orca Mini-GPTQ

Looking to test a model trained using a unique learning approach? Orca Mini is an unofficial implementation of Microsoft's Orca research papers. The model is trained using a teacher-student learning approach, where the dataset is filled with explanations rather than just prompts and responses. This should theoretically make the student smarter, as the model can learn the reasoning behind a problem rather than just memorizing input and output pairs as a typical LLM would.

9. Llama 2 13B Chat GPTQ

Llama 2 is the successor to the original Llama LLM, offering improved performance and flexibility. The 13B Chat GPTQ variant is tuned for conversational AI applications optimized for English dialogue.

Some of the models listed above come in multiple spec versions. Generally, higher spec versions will produce better results but require more powerful hardware, while lower spec versions will produce lower quality results but can run on lower-end hardware. If you're not sure whether your PC can run a model, try the lower spec version first, then move up one version at a time until the drop in speed is no longer acceptable.
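To get a rough idea of which version might fit on your machine before downloading anything, you can estimate the size of the weights from the parameter count and the quantization bit-width. The sketch below is only a back-of-the-envelope lower bound: the function names and the 1.2 overhead factor are assumptions for illustration, and real files and runtime memory are larger because they also hold quantization metadata, the context cache, and runtime buffers.

```python
def weight_size_gb(num_params, bits_per_weight):
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_in_memory(num_params, bits_per_weight, available_gb, overhead=1.2):
    """Rough check that a model fits in the given memory.

    overhead is an assumed safety factor, not a measured value.
    """
    return weight_size_gb(num_params, bits_per_weight) * overhead <= available_gb


# A 7-billion-parameter model at 4 bits per weight.
print(weight_size_gb(7e9, 4))        # 3.5
# The same model at 16 bits per weight.
print(weight_size_gb(7e9, 16))       # 14.0
# Would a 30-billion-parameter 4-bit model fit in 16 GB?
print(fits_in_memory(30e9, 4, 16.0))
```

If the estimate is already close to your available RAM or VRAM, start with a smaller or more heavily quantized version.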
