BloombergGPT is a 50-billion-parameter, decoder-only causal large language model (LLM) developed by Bloomberg and trained on a wide range of financial data to improve performance on financial natural language processing (NLP) tasks. Introduced on March 30, 2023, the model is detailed in a paper by Bloomberg researchers. BloombergGPT is intended to assist with a range of financial NLP tasks, including sentiment analysis, named entity recognition, news classification, and question answering, and to open new opportunities for applying the data available on the Bloomberg Terminal.
BloombergGPT was developed jointly by Bloomberg's ML Product and Research group and its AI Engineering team. Researchers at the company combined financial data with general-purpose datasets to train the model, drawing on Bloomberg's existing data creation, collection, and curation resources and its extensive financial archives to build a 363-billion-token dataset of English financial documents. This was augmented with a 345-billion-token public dataset, yielding a training corpus of over 700 billion tokens. The resulting model, trained on a portion of this corpus, was validated on existing finance-specific NLP benchmarks, Bloomberg-internal benchmarks, and broader categories of general NLP tasks. Bloomberg reports that the model outperforms existing open models of a similar size on financial tasks while performing as well as or better on general NLP benchmarks.
"FinPile," BloombergGPT's training dataset, consists of a range of English financial documents, including news, filings, press releases, web-scraped financial documents, and social media drawn from the Bloomberg archives. These documents are augmented by public data widely used to train LLMs. The file dataset is roughly half finance-specific and half general-purpose. To improve the quality of the data, researchers de-duplicate each dataset. The table below shows a breakdown of the full training set used for BloombergGPT, including:
- The number of documents, expressed in units of 10^4
- The characters per document, "C/D"
- The number of characters, expressed in units of 10^8
- The number of characters per token, "C/T"
- The number of tokens, expressed in units of 10^8
- The percentage of overall tokens, "T%"
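The columns are related in a simple way: "C/D" is total characters divided by documents, "C/T" is total characters divided by tokens, and "T%" is each dataset's share of the overall token count. The short Python sketch below makes these relationships explicit; the figures in it are hypothetical placeholders, not the actual BloombergGPT dataset statistics.

```python
# Hypothetical illustration of how the table's columns relate; the figures
# below are placeholders, not the actual BloombergGPT dataset statistics.
docs = 500_000            # number of documents (the table reports this in units of 10^4)
chars = 2_000_000_000     # total characters (reported in units of 10^8)
tokens = 450_000_000      # total tokens (reported in units of 10^8)
corpus_tokens = 700e9     # approximate size of the full training corpus in tokens

chars_per_doc = chars / docs                 # "C/D" column
chars_per_token = chars / tokens             # "C/T" column
token_share = 100 * tokens / corpus_tokens   # "T%" column

print(f"C/D = {chars_per_doc:.0f}, C/T = {chars_per_token:.2f}, T% = {token_share:.3f}%")
```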
Financial data was sourced from the Bloomberg Terminal between March 1, 2007, and July 31, 2022. Generally, the quality and quantity of the data increase over this time range.
Three widely known and available public datasets were used:
- The Pile—an 825 GB open-source language-modeling dataset from EleutherAI. The Pile has been used to train multiple LLMs, including GPT-Neo, GPT-J, and GPT-NeoX. It was chosen for BloombergGPT because of its track record in training models, the significant cleaning and pre-processing already performed on it, and its mix of multiple domains and diverse data. De-duplication reduced the overall size of The Pile significantly (a simplified illustration of de-duplication appears after this list).
- C4—the Colossal Clean Crawled Corpus, a common dataset used to train LLMs. C4 and The Pile have some overlap, but C4 was cleaned and processed differently.
- Wikipedia—while both The Pile and C4 include older copies of Wikipedia, Bloomberg researchers chose to include a more recent dump of Wikipedia pages from July 1, 2022.
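The de-duplication mentioned above can be done in many ways, and the exact procedure used for BloombergGPT is not reproduced here. As a rough, simplified sketch only, the following Python snippet drops exact duplicates by hashing lightly normalized document text:

```python
import hashlib

def deduplicate(documents):
    """Drop exact-duplicate documents by hashing lightly normalized text."""
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:          # keep only the first copy of each document
            seen.add(digest)
            unique.append(doc)
    return unique

corpus = ["Stocks rallied on Friday.", "stocks rallied on Friday. ", "Bonds fell."]
print(deduplicate(corpus))  # the first two entries collapse into one
```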
BloombergGPT is a decoder-only causal language model based on BLOOM. It contains seventy layers of transformer decoder blocks. Attention with Linear Biases (ALiBi) positional encoding is applied through additive biases at the self-attention component of the transformer network. The input token embeddings are tied to the linear mapping before the final softmax (the last activation function of the network), and an additional layer normalization is applied directly after the token embeddings.
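These architectural details can be made concrete with a small, self-contained PyTorch sketch. It is illustrative only: the class names and dimensions are invented for this example, and BloombergGPT itself stacks seventy full decoder blocks at a far larger scale. The sketch shows the three points mentioned above: an ALiBi additive bias on the self-attention scores, input embeddings tied to the final linear mapping before the softmax, and a layer normalization applied directly after the token embeddings.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def alibi_bias(n_heads: int, seq_len: int) -> torch.Tensor:
    """Per-head additive bias that penalizes attention to distant past tokens."""
    # Geometric slope schedule from the ALiBi paper (assumes n_heads is a power of two).
    slopes = torch.tensor([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = torch.arange(seq_len)
    distance = pos[None, :] - pos[:, None]                # (j - i): <= 0 for keys in the past
    return slopes[:, None, None] * distance[None, :, :]   # shape: (heads, seq, seq)

class ALiBiSelfAttention(nn.Module):
    """Causal self-attention with the ALiBi bias added to the attention scores."""
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2) for z in (q, k, v))
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)
        scores = scores + alibi_bias(self.n_heads, t)        # additive positional bias
        mask = torch.triu(torch.ones(t, t, dtype=torch.bool), diagonal=1)
        scores = scores.masked_fill(mask, float("-inf"))     # causal: no attending to the future
        attn = F.softmax(scores, dim=-1)
        return self.out((attn @ v).transpose(1, 2).reshape(b, t, d))

class TinyCausalLM(nn.Module):
    """Toy decoder-only LM: embedding layer norm, one decoder block, tied output weights."""
    def __init__(self, vocab_size: int = 1000, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.embed_norm = nn.LayerNorm(d_model)              # extra layer norm after embeddings
        self.block = ALiBiSelfAttention(d_model, n_heads)    # stand-in for seventy decoder blocks
        self.lm_head = nn.Linear(d_model, vocab_size, bias=False)
        self.lm_head.weight = self.embed.weight              # tie embeddings to the pre-softmax mapping

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.embed_norm(self.embed(tokens))
        return self.lm_head(self.block(h))                   # logits over the vocabulary

logits = TinyCausalLM()(torch.randint(0, 1000, (1, 8)))      # (batch=1, seq=8, vocab=1000)
```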
Upon release, BloombergGPT was evaluated on two broad categories of tasks: finance-specific and general-purpose. The finance-specific evaluation included a range of NLP tasks as well as model performance on Bloomberg tasks of interest drawn from the company's internal evaluation sets for sentiment analysis and named entity recognition. General-purpose tasks were drawn from existing benchmarks, with results grouped into BIG-bench Hard, Knowledge Assessments, Reading Comprehension, and Linguistic Tasks.
The paper compares BloombergGPT to three similar models based on size, type of training data, performance, and access:
- GPT-NeoX
- OPT-66B
- BLOOM-176B
Results (shown in the table below) found that BloombergGPT outperformed the other three models on finance-specific tasks and performed on par with them on general-purpose tasks. The model does not perform as well as GPT-3 on general-purpose tasks.