LlamaIndex (previously called GPT Index) is an open-source data framework for connecting custom datasets with large language models (LLMs) to build applications. Because LLMs are pre-trained on publicly available data, they have no built-in knowledge of a user's private data; LlamaIndex provides a framework for augmenting them with it. The framework includes the following tools:
- Data connectors to ingest existing data sources and data formats (APIs, PDFs, docs, SQL, etc.)
- Methods of structuring data (indices, graphs) so that it can be easily used with LLMs
- A retrieval/query interface over user data: given an LLM input prompt, it returns retrieved context and a knowledge-augmented output
- Integration with the surrounding application stack, such as LangChain, Flask, Docker, ChatGPT, and more
LlamaIndex aims to simplify the development of knowledge-intensive LLM applications, such as document Q&A, data-augmented chatbots, autonomous knowledge agents, and structured analytics. More broadly, it aims to link systems of intelligence (LLMs) with systems of record (data sources). The framework sits between foundation models and private user data to enhance LLM-powered enterprise applications.

Representation of the AI application tech stack, created by LlamaIndex investor Greylock.
LlamaIndex tools are targeted at both beginners and advanced users. The high-level API allows beginners to ingest and query their data in five lines of code. The lower-level APIs allow advanced users to customize and extend modules (data connectors, indices, retrievers, query engines, reranking modules) to fit their needs. LlamaIndex's main third-party package requirements are tiktoken, openai, and langchain.
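The division of labor among the modules named above (data connector, index, retriever, query engine) can be illustrated with a toy sketch in plain Python. It does not use LlamaIndex or reproduce its API: every name here (`Document`, `ToyIndex`, `ToyQueryEngine`, `echo_llm`) is hypothetical, the index is a simple bag-of-words model rather than an embedding-based one, and the "LLM" is a placeholder that returns the augmented prompt so the retrieved context is visible.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Document:
    """A unit of ingested text (stand-in for a loaded file or API record)."""
    doc_id: str
    text: str


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    """Structures documents as word-count vectors so they can be retrieved."""

    def __init__(self, documents: Iterable[Document]):
        self.documents = list(documents)
        self.vectors = [Counter(tokenize(d.text)) for d in self.documents]

    def retrieve(self, query: str, top_k: int = 1) -> List[Document]:
        """Return up to top_k documents with non-zero similarity to the query."""
        q = Counter(tokenize(query))
        scored = [(cosine(q, v), d) for v, d in zip(self.vectors, self.documents)]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [d for score, d in scored[:top_k] if score > 0]


def build_prompt(question: str, context_docs: List[Document]) -> str:
    """Prepend the retrieved context to the user's question."""
    context = "\n".join(f"- {d.text}" for d in context_docs)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


def echo_llm(prompt: str) -> str:
    """Placeholder for a real LLM call: returns the prompt unchanged."""
    return prompt


class ToyQueryEngine:
    """Ties retrieval and the LLM call together behind one query method."""

    def __init__(self, index: ToyIndex, llm: Callable[[str], str], top_k: int = 1):
        self.index = index
        self.llm = llm
        self.top_k = top_k

    def query(self, question: str) -> str:
        docs = self.index.retrieve(question, self.top_k)
        return self.llm(build_prompt(question, docs))


# Usage: ingest two private "documents", then ask a question about one of them.
docs = [
    Document("a", "The refund policy allows returns within 30 days."),
    Document("b", "Our office is located in Berlin."),
]
engine = ToyQueryEngine(ToyIndex(docs), llm=echo_llm)
print(engine.query("What is the refund policy?"))
```

Because the retriever and the LLM are passed in as separate pieces, either can be replaced without touching the other, which is the kind of module-level customization the lower-level APIs are described as offering.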
LlamaIndex has developed a wider Llama ecosystem with LlamaHub, a library of over 100 data loaders for connecting custom data sources via LlamaIndex or LangChain, and Llama Lab, a repository for projects built using LlamaIndex. These projects include the following:
- llama_agi: a project inspired by BabyAGI and AutoGPT that creates, plans, and solves tasks
- auto_llama: a project inspired by AutoGPT that searches, downloads, and queries the internet to solve user-specified tasks
- Conversational agents: a simulator of conversations between different agents
Llama Lab also contains references to external subprojects using LlamaIndex, such as Insight, an autonomous AI for medical research.
LlamaIndex was created by Jerry Liu after he experimented with GPT-3 and looked for ways to mitigate the model's limitations when working with his own personal data. Liu open-sourced the project and first released it publicly on November 9, 2022. LlamaIndex began as an experimental project to organize and retrieve information using LLMs, and its reception showed that many others were experiencing the same issues. After two months of working on the project, Liu teamed up with his former colleague at Uber, Simon Suo, to build the product and community and develop a comprehensive framework for connecting user data with LLMs. Within six months, the project had grown to 16K GitHub stars, 20K Twitter followers, 200K monthly downloads, and 6K active Discord users. Companies like Instabase, Front, and Uber also started experimenting with LlamaIndex on top of their data.
On June 6, 2023, Liu (CEO) and Suo (CTO) announced they had started a company around LlamaIndex and raised $8.5M in seed funding. The round was led by Greylock with participation from angel investors Jack Altman (CEO of Lattice), Lenny Rachitsky (Lenny’s Newsletter), Mathilde Collin (CEO of Front), Raquel Urtasun (CEO of Waabi), Joey Gonzalez (Berkeley), and others. Liu stated that the funding would be used to build an enterprise solution on top of the open-source LlamaIndex project, which the company plans to launch in late 2023. The solution would let customers use "production-grade" data connectors to parse and transport significant volumes of data and index domain-specific data.