If you wanted to raise the profile of your big tech company and had $10 million to spend, how would you spend it? On a Super Bowl ad? An F1 sponsorship?
You could spend it training a generative AI model. While not marketing in the traditional sense, generative models grab attention – and increasingly funnel users toward vendors' bread-and-butter products and services.
Enter DBRX, a new generative AI model from Databricks announced today, akin to OpenAI's GPT series and Google's Gemini. Available on GitHub and the AI development platform Hugging Face for both research and commercial use, base (DBRX Base) and fine-tuned (DBRX Instruct) versions of DBRX can be run and tuned on public, custom, or otherwise proprietary data.
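For a sense of what that looks like in practice, here is a minimal sketch of loading DBRX Instruct through Hugging Face's transformers library. The databricks/dbrx-instruct model ID reflects the Hugging Face listing; treat the exact loading flags as assumptions that may vary by transformers version, and note the hardware requirements discussed below.

```python
# Minimal sketch: loading DBRX Instruct via Hugging Face transformers.
# Assumes the databricks/dbrx-instruct listing and sufficient GPU memory
# (roughly four H100s for the standard configuration, per Databricks).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("databricks/dbrx-instruct")
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,  # 16-bit weights to halve memory use
    device_map="auto",           # shard the model across available GPUs
)

messages = [{"role": "user", "content": "What is Databricks?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```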
“DBRX was trained to be helpful and provide information on a wide range of topics,” Naveen Rao, vice president of generative AI at Databricks, told TechCrunch in an interview. “DBRX has been optimized and tuned for English use, but is capable of conversing in and translating into a wide range of languages, such as French, Spanish and German.”
Databricks describes DBRX as “open source,” in a similar vein to “open source” models like Meta’s Llama 2 and AI startup Mistral’s models. (Whether these models actually meet the definition of open source is the subject of vigorous debate.)
Databricks says it spent roughly $10 million and eight months training DBRX, which it claims (quoting from a press release) “outperform[s] all existing open source models on standard benchmarks.”
But – and here's the marketing rub – DBRX is extremely difficult to use unless you're a Databricks customer.
That's because, in order to run DBRX in its standard configuration, you need a server or PC with at least four Nvidia H100 GPUs. A single H100 sells for thousands of dollars – quite possibly more. That might be pocket change for the average enterprise, but for many developers and solopreneurs, it's well out of reach.
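A rough back-of-the-envelope calculation shows why four GPUs is the floor. (The 132-billion-parameter figure comes from Databricks' release materials, not this interview, so treat it as an outside assumption.)

```python
# Rough memory arithmetic for DBRX's multi-GPU requirement.
# Assumes ~132B total parameters (per Databricks' release materials)
# stored as 16-bit weights; activations and KV cache need extra room.
import math

params = 132e9                # total parameters
bytes_per_param = 2           # bfloat16/float16
weights_gb = params * bytes_per_param / 1e9  # ~264 GB of weights alone
h100_memory_gb = 80           # memory on a single H100

print(math.ceil(weights_gb / h100_memory_gb))  # -> 4 GPUs minimum
```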
And there's fine print to boot. Databricks says that companies with more than 700 million active users will face “certain restrictions” comparable to Meta's for Llama 2, and that all users will have to agree to terms ensuring that they use DBRX “responsibly.” (Databricks hadn't volunteered the specifics of those terms as of publication time.)
Databricks pitches its Mosaic AI Foundation Model product as the managed answer to these roadblocks, which, in addition to running DBRX and other models, provides a training stack for fine-tuning DBRX on custom data. Customers can privately host DBRX using Databricks' Model Serving offering, Rao suggested, or they can work with Databricks to deploy DBRX on the hardware of their choosing.
Rao added:
We're focused on making the Databricks platform the best choice for customized model building, so ultimately the benefit to Databricks is more users on our platform. DBRX is a demonstration of our best-in-class pre-training and fine-tuning platform, which customers can use to build their own models from scratch. It's an easy way for customers to get started with the Databricks Mosaic AI generative AI tools. And DBRX is highly capable out of the box and can be tuned for excellent performance on specific tasks at better economics than large, closed models.
Databricks claims that DBRX runs up to 2x faster than Llama 2, thanks in part to its mixture-of-experts (MoE) architecture. MoE – which DBRX shares with Mistral's newer models and Google's recently announced Gemini 1.5 Pro – essentially breaks data processing tasks down into multiple subtasks and then delegates those subtasks to smaller, specialized “expert” models.
Most MoE models have eight experts. DBRX has 16, which Databricks says improves quality.
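To make the idea concrete, here is a minimal sketch of top-k expert routing, the general technique MoE architectures use – an illustration of the concept, not Databricks' actual implementation. The layer sizes and the choice of four active experts per token are assumptions for the example.

```python
# Illustrative top-k mixture-of-experts layer (a sketch of the general
# technique, not Databricks' code). A router scores every expert per
# token; only the top-k experts actually run, so compute per token stays
# low even as the total parameter count grows.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, dim: int, n_experts: int = 16, top_k: int = 4):
        super().__init__()
        # Each "expert" is a small feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(dim, n_experts)  # per-token expert scores
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (tokens, dim)
        scores = F.softmax(self.router(x), dim=-1)
        top_w, top_i = scores.topk(self.top_k, dim=-1)
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)  # renormalize weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for k in range(self.top_k):
                mask = top_i[:, k] == e  # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += top_w[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

# Quick smoke test: 10 tokens of width 64 through a 16-expert layer.
layer = MoELayer(dim=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```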
But quality is relative.
While Databricks claims that DBRX outperforms Llama 2 and Mistral's models on some benchmarks of language understanding, programming, math, and logic, DBRX falls short of OpenAI's GPT-4, the leading generative AI model, in most areas outside of niche use cases such as database programming language generation.
Rao admits that DBRX has other limitations as well, namely that it – like all other generative AI models – can fall victim to “hallucinating” answers to queries, despite Databricks' work in safety testing and red teaming. Because the model was simply trained to associate words or phrases with certain concepts, if those associations aren't totally accurate, its responses won't always be accurate.
DBRX is also not multimodal, unlike some more recent flagship generative AI models, including Gemini. (It can only process and generate text, not images.) And we don't know exactly what data sources were used to train it; Rao would only reveal that no Databricks customer data was used in training DBRX.
“We trained DBRX on a large set of data from a variety of sources,” he added. “We used open datasets that the community knows, loves, and uses every day.”
I asked Rao whether any of the DBRX training datasets were copyrighted or licensed, or showed obvious signs of bias (for example, racial bias). He didn't answer directly, saying only that Databricks was “careful about the data used” and conducted red-teaming exercises to improve the model's weak points. Generative AI models tend to regurgitate training data, a major concern for commercial users of models trained on unlicensed, copyrighted, or very clearly biased data. In the worst case, a user could end up in ethical and legal trouble for unwittingly incorporating IP-infringing or biased work from a model into their projects.
Some companies that train and release generative AI models offer policies covering the legal fees arising from possible infringement claims. Databricks doesn't at present – Rao says the company is “exploring scenarios” under which it might.
Given this and the other ways in which DBRX misses the mark, the model seems like a tough sell to anyone but current or would-be Databricks customers. Databricks' rivals in generative AI, including OpenAI, offer equally if not more compelling technologies at very competitive pricing. And plenty of generative AI models come closer to the commonly understood definition of open source than DBRX.
Rao promises that Databricks will continue to refine DBRX and release new versions as the company's Mosaic Labs R&D team – the team behind DBRX – investigates new generative AI avenues.
“DBRX is pushing the open source modeling space forward and challenging future models to be built more efficiently,” he said. “We will release variants as we apply techniques to improve the quality of the output in terms of reliability, safety and bias… We see the open model as a platform on which our customers can build custom capabilities using our tools.”
Judging by where DBRX currently stands relative to its peers, it's an exceptionally long road ahead.