LMQL (Language Model Query Language) is a programming language for large language model interaction with a focus on multi-part prompting to enable novel forms of LM interaction. LMQL is a research project by the Secure, Reliable, and Intelligent Systems Lab at ETH Zurich. The language is a superset of Python and provides full Python support. It is designed to work with language models such as OpenAI's GPT models and offers functionality such as multi-variable templates, conditional distributions, constraints, datatype constraints, and control flow. LMQL also offers a Playground IDE where users can experiment with the language.
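To illustrate the basic shape of a query, the following is a minimal sketch in LMQL's standalone syntax: a decoder keyword, a prompt containing a template variable, a model clause, and a constraint clause. The prompt text and model identifier are illustrative and depend on which back end is configured.

    argmax
        # the prompt; [ANSWER] is a template variable filled in by the model
        "Q: What is the capital of France?\n"
        "A: [ANSWER]"
    from
        "openai/text-davinci-003"
    where
        # bound the answer's length and stop at the first newline
        len(ANSWER) < 40 and STOPS_AT(ANSWER, "\n")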
The researchers who developed LMQL have used the query language to enable Language Model Programming (LMP), which generalizes language model prompting from pure text prompts to a combination of text prompting and scripting. LMQL leverages a query's constraints and control flow to derive an inference procedure; the constraints are translated into token-level masks using evaluation semantics enforced at generation time.
LMQL was also introduced to avoid the costs associated with re-querying and validating generated text, helping queries produce text close to the desired output on the first attempt rather than through repeated iterations. In addition, the constraints in the LMQL system allow users to guide or steer text generation according to their specifications, such as following grammatical or syntactic rules or avoiding specific words or phrases.
LMQL allows for scripted prompting: prompts that are not just static text but dynamic constructs incorporating control flow such as loops, conditions, and function calls. This enables LMQL queries to respond to model output and is achieved by combining prompt templates, control flow, and output constraining, as sketched below.
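One way scripted prompting can look is the following sketch, modeled on examples from the LMQL documentation. The loop issues one model call per list item and collects the results in an ordinary Python list; the prompt text and model identifier are illustrative.

    sample(temperature=0.8)
        "A list of things not to forget when going to the sea:\n"
        items = []
        # ordinary Python control flow inside the query: each
        # iteration appends one model-generated list entry
        for i in range(4):
            "-[THING]\n"
            items.append(THING)
    from
        "openai/text-ada-001"
    where
        STOPS_AT(THING, "\n")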
LMQL allows a user to specify constraints on the language model output. This is valuable for ensuring that the output stops at the desired point and for guiding the model during decoding. The constraints are evaluated on each generated token, so a provided constraint is either satisfied by directly guiding the model during generation, or validation fails early, saving the cost of generating an invalid output. Furthermore, the constraints are high-level and operate at the text level rather than the token level, allowing users to specify constraints without having to consider how each individual phrase is tokenized.
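The sketch below combines two kinds of constraints: a choice constraint that restricts a variable to a fixed set of strings, and a datatype constraint (INT) that forces a variable to decode as an integer. Both are written over text, not tokens; the prompt and model identifier are illustrative.

    argmax
        "Is Paris the capital of France? Answer yes or no.\n"
        "A: [YN]\n"
        "How many letters are in the word 'Paris'? [N]"
    from
        "openai/text-davinci-003"
    where
        # YN restricted to a fixed set of strings; N must parse as an integer
        YN in ["yes", "no"] and INT(N)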
LMQL offers support for various decoding algorithms, which are used to generate text from the token distribution of a language model. The decoding algorithm is specified at the beginning of a query, and LMQL provides a library for array-based decoding that can be used to implement custom decoders. LMQL also ships with several built-in decoding algorithms, which are documented in its repository. In general, all LMQL decoding algorithms are model-agnostic and can be used with any LMQL-supported inference back end.
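Because the decoder is named by the first keyword of a query, switching algorithms is a one-line change. The sketch below uses beam search; replacing the first line with argmax (greedy decoding) or sample(temperature=0.8) would leave the rest of the query untouched. The decoder parameters and model identifier are illustrative.

    # beam search with 4 beams; could also be `argmax` or `sample(temperature=0.8)`
    beam(n=4)
        "Translate 'Bonjour le monde' to English: [TRANSLATION]"
    from
        "openai/text-davinci-003"
    where
        STOPS_AT(TRANSLATION, "\n")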
LMQL is designed to be modular, allowing users to add support for new models or back ends. This means that LMQL is not specific to any particular text generation model; it supports a wide range of text generation models on the back end, including OpenAI models, and can be used with self-hosted models through the Hugging Face Transformers library.
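Since the back end is selected by the model identifier in the from clause, the same query can be pointed at different back ends. In recent LMQL versions a "local:" prefix routes a query to a self-hosted Transformers model, while an OpenAI identifier targets the OpenAI API; the exact identifiers below are illustrative and version-dependent.

    argmax
        "Say hello to the user:[GREETING]"
    from
        # self-hosted GPT-2 via the Transformers back end;
        # "openai/text-davinci-003" would target the OpenAI API instead
        "local:gpt2"
    where
        STOPS_AT(GREETING, "\n")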

