A Personal Collection of Commonly Used LLM Models, Tools, and Websites

Online LLM Platforms

Claude

https://claude.ai/

Claude is a next generation AI assistant built for work and trained to be safe, accurate, and secure.

Poe

https://poe.com/

Poe is a large platform where people can ask questions, get instant answers, and converse with a variety of AI bots. It is available on iOS, Android, macOS, Windows, and the web.

The bots Poe currently supports include OpenAI's ChatGPT, GPT-4, and DALL-E 3; Anthropic's Claude Instant, Claude 2, and Claude 3; Stability AI's StableDiffusionXL; Google's PaLM and Gemini-Pro; Meta's Llama 2; Playground's Playground-v2; Mistral's Mistral-Medium; and a large number of bots created by community users. The goal is a platform where people can explore together the possibilities opened up by new AI models.

Online Tools Built on LLMs

ChatPDF

https://www.chatpdf.com/

Uses AI to understand PDF documents such as research papers.

Join millions of students, researchers and professionals to instantly answer questions and understand research with AI.

LLM Models

LLaMA

Meta’s Paper: LLaMA: Open and Efficient Foundation Language Models

Tools for Local LLM Deployment

llama.cpp

https://github.com/ggerganov/llama.cpp

The main goal of llama.cpp is to enable LLM inference with minimal setup and state-of-the-art performance on a wide variety of hardware - locally and in the cloud.

- Plain C/C++ implementation without any dependencies
- Apple silicon is a first-class citizen - optimized via ARM NEON, Accelerate and Metal frameworks
- AVX, AVX2 and AVX512 support for x86 architectures
- 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer quantization for faster inference and reduced memory use
- Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD GPUs via HIP)
- Vulkan, SYCL, and (partial) OpenCL backend support
- CPU+GPU hybrid inference to partially accelerate models larger than the total VRAM capacity

Since its inception, the project has improved significantly thanks to many contributions. It is the main playground for developing new features for the ggml library.
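To make the quantization point concrete, here is a minimal Python sketch of block-wise 4-bit integer quantization, the basic idea behind llama.cpp's quantized formats. This is a deliberate simplification: the real ggml quantization types (e.g. Q4_K) use more elaborate block layouts and scale encodings.

```python
# Sketch of block-wise 4-bit quantization: each block of weights is stored
# as 4-bit integers (0..15) plus a per-block offset and scale.
# Hypothetical simplification of llama.cpp's actual quantization schemes.

def quantize_q4(values, block_size=32):
    """Quantize floats to 4-bit ints with one (offset, scale) per block."""
    blocks = []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        lo, hi = min(block), max(block)
        scale = (hi - lo) / 15 or 1.0  # 15 = max 4-bit value
        quants = [round((v - lo) / scale) for v in block]
        blocks.append((lo, scale, quants))
    return blocks

def dequantize_q4(blocks):
    """Reconstruct approximate floats from the quantized blocks."""
    return [lo + q * scale for lo, scale, quants in blocks for q in quants]

weights = [0.01 * i for i in range(64)]
restored = dequantize_q4(quantize_q4(weights))
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The memory saving is the point: each weight shrinks from 32 bits to 4 bits plus a small per-block overhead, at the cost of the bounded rounding error measured by `max_err`.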

Ollama

https://ollama.com/

Run Llama 3, Phi 3, Mistral, Gemma, and other models. Customize and create your own.

An LLM server implemented in Go, with llama.cpp as the inference backend.
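Because Ollama exposes a local REST API, any language can talk to it. A minimal Python sketch using only the standard library, assuming an Ollama server is listening on its default port 11434 (`/api/generate` is Ollama's documented generation endpoint; the model name "llama3" is just an example):

```python
# Sketch of a non-streaming generation call against a local Ollama server.
# Assumes `ollama serve` is running on the default port 11434.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model, prompt):
    """Build the JSON payload for a single non-streaming generation call."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt):
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("llama3", "Why is the sky blue?")  # requires a running server
```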

RAG

Langchain

https://www.langchain.com/

https://github.com/langchain-ai

FastGPT

FastGPT is a knowledge-base platform built on LLMs. It offers a comprehensive suite of out-of-the-box capabilities such as data processing, RAG retrieval, and visual AI workflow orchestration, letting you easily develop and deploy complex question-answering systems without extensive setup or configuration.

https://github.com/labring/FastGPT

RAGFlow

RAGFlow is an open-source RAG (Retrieval-Augmented Generation) engine based on deep document understanding.

https://github.com/infiniflow/ragflow

Dify.AI

Dify is an open-source LLM app development platform. Dify’s intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.

https://dify.ai/zh

https://github.com/langgenius/dify

anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.

https://github.com/Mintplex-Labs/anything-llm

MaxKB

🚀 A knowledge-base question-answering system built on large language models (LLMs). Works out of the box, supports quick embedding into third-party business systems, and is officially produced by the 1Panel team.

https://github.com/1Panel-dev/MaxKB/
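The tools above all share the same retrieve-then-generate loop: find the documents most relevant to the question, then hand them to the LLM as context. A minimal sketch of the retrieval step using plain keyword-overlap scoring, a hypothetical stand-in for the embedding-based vector search these platforms actually use:

```python
# Toy RAG retrieval: rank documents by word overlap with the query,
# then splice the best match into the prompt. Real systems use
# embeddings and a vector index instead of this keyword overlap.

def tokenize(text):
    return set(text.lower().split())

def retrieve(query, documents, top_k=1):
    """Return the top_k documents with the most query-word overlap."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)),
                    reverse=True)
    return ranked[:top_k]

docs = [
    "llama.cpp runs quantized models on local hardware",
    "Ollama wraps llama.cpp behind a simple server",
    "RAG augments prompts with retrieved context",
]
context = retrieve("how does RAG add context to a prompt", docs)
prompt = f"Answer using this context: {context[0]}\n\nQuestion: ..."
```

The final `prompt` string is what would be sent to the LLM; everything these platforms add on top (chunking, embeddings, re-ranking, workflow orchestration) is refinement of this basic loop.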

Books

LLMs-from-scratch

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

https://github.com/rasbt/LLMs-from-scratch