OpenPipe is a company developing tools to convert large language model (LLM) prompts into fine-tuned models. The platform uses a software development kit (SDK) to abstract away fine-tuning custom models. OpenPipe captures users' existing prompt-completion pairs in the background, using them to create a new model that aims to be faster, cheaper, and often more accurate than the original prompt. Rather than relying on a single large model designed for many tasks, OpenPipe plans to allow users to build small models fine-tuned for a specific prompt. These models can be particularly good at data extraction and classification.
OpenPipe built infrastructure that simplifies the process of fine-tuning models. The process can be divided into a series of steps:
- Develop a prototype of a feature using an LLM (GPT-3.5 or GPT-4, for example).
- Collect prompts and completions over time using OpenPipe's reporting SDK.
- With a few hundred to a few thousand completions recorded, start a training job in OpenPipe's UI.
- After a few hours, a new model will be ready that can either be downloaded or hosted on the OpenPipe platform.
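The collection step above can be illustrated with a minimal sketch. The wrapper class and storage below are hypothetical stand-ins for illustration, not OpenPipe's actual SDK: a thin wrapper records each prompt-completion pair as the application runs, building up the dataset that a later training job consumes.

```python
import json
from datetime import datetime, timezone

class CompletionRecorder:
    """Hypothetical stand-in for a reporting SDK: wraps an LLM call and
    logs every prompt-completion pair for later fine-tuning."""

    def __init__(self, llm_fn):
        self.llm_fn = llm_fn   # the underlying model call (e.g. GPT-3.5)
        self.records = []      # captured prompt-completion pairs

    def complete(self, prompt, **tags):
        completion = self.llm_fn(prompt)
        self.records.append({
            "prompt": prompt,
            "completion": completion,
            "tags": tags,      # optional metadata for filtering logs later
            "logged_at": datetime.now(timezone.utc).isoformat(),
        })
        return completion

    def export_jsonl(self):
        """Serialize captured pairs, one JSON object per line."""
        return "\n".join(json.dumps(r) for r in self.records)

# Usage: wrap any model call; records accumulate in the background.
recorder = CompletionRecorder(lambda p: p.upper())  # toy "model"
recorder.complete("classify: great product", task="sentiment")
print(len(recorder.records))  # 1
```

The application code keeps calling the model as before; the recorded pairs are what a training job would later be started from.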
OpenPipe allows users to evaluate model and prompt combinations in its playground environment, querying past requests and exporting optimized training data. Features include the following:
- Bulk-test a wide variety of scenarios using code templating
- Translate prompts across different model APIs
- Tap into autogenerated scenarios for fresh test perspectives
- Integrate with OpenPipe's SDK in both Python and JS
- Query logs using built-in filters
- Export data in multiple training formats, including Alpaca and ChatGPT, with deduplication
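The export and deduplication features can be sketched as follows. The exact schemas OpenPipe emits are not specified here, so the Alpaca-style and ChatGPT-style row shapes below are assumptions based on those formats' common conventions:

```python
def dedupe(pairs):
    """Drop exact duplicate prompt-completion pairs, keeping the first occurrence."""
    seen, unique = set(), []
    for p in pairs:
        key = (p["prompt"], p["completion"])
        if key not in seen:
            seen.add(key)
            unique.append(p)
    return unique

def to_chat_format(pairs):
    """ChatGPT-style fine-tuning rows: a list of role-tagged messages."""
    return [
        {"messages": [
            {"role": "user", "content": p["prompt"]},
            {"role": "assistant", "content": p["completion"]},
        ]}
        for p in pairs
    ]

def to_alpaca_format(pairs):
    """Alpaca-style rows: instruction/input/output fields."""
    return [
        {"instruction": p["prompt"], "input": "", "output": p["completion"]}
        for p in pairs
    ]

pairs = [
    {"prompt": "classify: spam?", "completion": "yes"},
    {"prompt": "classify: spam?", "completion": "yes"},  # exact duplicate
    {"prompt": "extract: launch date", "completion": "2023-08-28"},
]
unique = dedupe(pairs)
print(len(unique))  # 2
```

Deduplicating before export matters because repeated pairs would otherwise be overweighted in the fine-tuning data.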
OpenPipe was founded by Kyle Corbitt and David Corbitt. Kyle had previously founded Emberall and was an engineer at Google and Y Combinator, where he led the Startup School team. OpenPipe was part of Y Combinator's S23 Batch and officially launched on August 28, 2023. At launch, OpenPipe was in beta, and users could join a waiting list to gain access. The company is based in San Francisco.
The founders began working on OpenPipe after running into limitations using GPT-3.5 and GPT-4. David was building an app that needed to search Reddit and classify the results based on user-specific dimensions. However, because the set of possible results GPT-3.5 had to classify was so large, each search cost multiple dollars. Kyle had built and sold a startup that involved translating official documents. Using GPT-3.5 led to issues getting the model to comply with the requirements of official translations, and GPT-4 was too slow to provide a good user experience. After speaking with many other companies, they saw common issues related to cost and latency blocking production deployment of LLM-backed functionality.
OpenPipe supports a number of popular LLMs:
- OpenAI—GPT-3.5 Turbo, GPT-3.5 Turbo 16k, GPT-4
- Llama2—7b chat, 13b chat, 70b chat
- Llama2 Fine-Tunes—Open-Orca/OpenOrcaxOpenChat-Preview2-13B, Open-Orca/OpenOrca-Platypus2-13B, NousResearch/Nous-Hermes-Llama2-13b, jondurbin/airoboros-l2-13b-gpt4-2.0, lmsys/vicuna-13b-v1.5, Gryphe/MythoMax-L2-13b, NousResearch/Nous-Hermes-llama-2-7b
- Anthropic—Claude 1 Instant, Claude 2
OpenPipe's pricing is based on tokens (basic units of text or code used to process and generate language via LLMs). Users can also access dozens of models in OpenPipe's experimentation playground at cost. Example pricing for Llama 2 models is shown below:
Llama 2 pricing (per 1K tokens; the original listed three rates per model, which correspond to training, input, and output, respectively):

| Model | Training | Input | Output |
| --- | --- | --- | --- |
| Llama 2 13b | $0.0080 / 1K tokens | $0.0024 / 1K tokens | $0.0032 / 1K tokens |
| Llama 2 7b | $0.0040 / 1K tokens | $0.0012 / 1K tokens | $0.0016 / 1K tokens |
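Token-based pricing makes per-request cost a simple calculation: each rate applies per 1,000 tokens processed. The sketch below uses the Llama 2 7b figures from the list above; which figure serves as the input rate versus the output rate is an assumption for illustration.

```python
def request_cost(prompt_tokens, completion_tokens, input_rate, output_rate):
    """Cost in dollars of one request, given per-1K-token rates."""
    return (prompt_tokens / 1000) * input_rate + (completion_tokens / 1000) * output_rate

# Example: a Llama 2 7b request with a 1,500-token prompt and a
# 500-token completion (rate assignment assumed for illustration).
cost = request_cost(
    prompt_tokens=1500,
    completion_tokens=500,
    input_rate=0.0012,
    output_rate=0.0016,
)
print(f"${cost:.4f}")  # $0.0026
```

Fractions of a cent per request is the scale at which a small fine-tuned model can undercut calls to a larger general-purpose model.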
OpenPipe has provided a series of sample experiments showing how the technology works. Users are free to fork these experiments. OpenPipe sample experiments include those below: