You’d struggle to find a single person now who hasn’t seen, read or had a conversation about the “AI utopia” seemingly represented by OpenAI’s ChatGPT.
The world is buzzing with the possibilities of this and other Large Language Models (LLMs). ChatGPT, which is based on the GPT-3 engine, can rapidly generate human-like text, such as writing stories, articles, poetry, and more. It does this by generating statistically likely text predictions based on your input and the patterns it saw in its training data.
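That “statistically likely” mechanism can be illustrated with a toy sketch. The bigram model below is a drastic simplification, not how GPT-3 works internally (real LLMs use large neural networks over tokens), but it captures the core idea: the next word is sampled in proportion to how often it followed the previous word in training.

```python
import random
from collections import defaultdict

# Toy corpus standing in for web-scale training data.
corpus = "the cat sat on the mat and the cat ran".split()

# Count bigram frequencies: for each word, how often each next word follows it.
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    followers = counts[word]
    words = list(followers)
    weights = [followers[w] for w in words]
    return random.choices(words, weights=weights)[0]

# In this corpus, "the" is followed by "cat" twice and "mat" once, so the
# model predicts "cat" about two-thirds of the time: a statistical
# prediction, not a judgement about which word is correct.
print(predict_next("the"))
```

The same property explains the failure mode discussed below: a prediction can be fluent and plausible while having no grounding in whether it is actually true.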
Global interest and confidence in LLMs are increasing, and not entirely without foundation: users now receive relevant, comprehensive responses that can be indistinguishable from human-generated content, even from minimal input. The press coverage surrounding ChatGPT’s launch has certainly been enough to reboot interest in the chatbot phenomenon that had fizzled out over the last five years.
LLMs are a powerful content creation tool and will augment the work of many content-oriented roles to deliver value. However, when we think about the relevance and value of LLMs in an organisation’s data and automation strategy, things can get a little blurry.
The limitations of LLM technology
LLMs have limitations. For example, they cannot perform mathematical or computational analysis, and there are numerous technical barriers to aligning any given LLM with an enterprise’s specific domain.
LLMs are trained on big data, which itself contains massive amounts of bias, and user bias comes into play whenever they are used. These biases are incredibly difficult to identify and mitigate, and because LLMs produce statistical predictions rather than actual judgements, the output can be exceptionally convincing and still be incorrect nonsense. In fact, the potential for incorrect output and inadvertent disinformation puts the enterprise at enormous risk: financial, regulatory and reputational.
ChatGPT’s underlying model is roughly ten times the size of its predecessor, and more useful for it. However, all LLMs are still based on machine learning, which makes them inherently ‘black box’: it is impossible to see how their predictions are made, and they cannot provide a rationale for their answers. They can therefore be wrong and look right, or be right and look wrong. Unless you know the domain, you cannot necessarily tell the difference.
This methodology makes it difficult to build tools that align with what is important to a business, and the lack of transparency amplifies this risk.
LLMs are not designed to consider fairness and safety in their predictions, and are therefore unsuitable for decision-making processes where interpretability is important, such as lending, hiring or medical decisions.
The most common category of business decision-making is complex and contextual. Despite LLMs’ impressive ability to create content, the idea of delegating any critical decision process to one is misplaced. LLMs can be effective co-workers in some domains, but like all machine-learning systems they are agents of ‘prediction’ and should not be left to make ‘judgements’.
So, where is the utility?
So, what can LLMs effectively and safely be used for?
Will they give birth to a new creative era? Well…maybe, but not because of their ability to write original content so much as their utility in augmenting human hybrid workers.
The potential efficiency gains in content creation are impressive, but as things stand they will always require significant human participation: editing, checking and moderating the LLM’s output. An LLM can happily draft code for a software engineer, but the engineer will still need to make adjustments.
LLMs can be fantastic research tools, but their output needs to be fact-checked. They can generate original content but cannot move beyond the scope of their training, so they are inherently conservative. They can also be effective training and support aids, helping someone learn a topic or work out how to fix a problem without taking up another human worker’s time.
As a side-tool, LLMs are pretty nifty. Gains in efficiency around content generation are reason enough for them to be deployed in the enterprise, but that is the limit. On their own, they’re the next generation of assistive technology – able to augment hybrid workers in existing processes. The technology is not ready for full delegation, and due to the lack of interpretability may never be.
Improving strategy with LLMs
If we are to fully reap the benefits of LLMs, we need to make them core to our automation strategy. There are two main areas where, in combination with other technologies, these benefits are already proving most apparent.
Enhancing trust and accountability in AI systems
The first is the significant role LLMs can play in helping human experts create more transparent decision-automation technology. It is possible for domain experts to create ‘digital brains’ from visual models of their expertise. These can automate complex, contextual operational decisions that cannot be automated using LLMs alone, because of their tendency to occasionally churn out convincingly wrong answers and their lack of interpretability.
By using an LLM to translate natural language into structured knowledge representations, we can empower domain experts to build data-led decisioning engines that include rules describing the nuance of what’s important – all with minimal technical expertise. In the same way that LLMs can assist content-creation workers, they can support domain experts in the building of “glass box” models.
This approach has multiple advantages, enabling the expert to quickly convey their intentions for a system that can then automatically provide a rationale for every decision made, all without humans losing control over the outcome. Such systems are unaffected by bias and enable human expertise locked up in organisations to be scaled. Imagine even the most complex and contextual operational decision-making being over 100x faster, more accurate and free of human noise and bias.
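As a concrete sketch of how the two pieces fit together: an LLM translates an expert’s plain-English policy into a structured rule, and a simple deterministic engine then applies that rule and records exactly why each decision was made. Everything below is a hypothetical illustration – the policy, the rule format and the `call_llm` stand-in are invented for this example, not a real product or API.

```python
import json

def call_llm(prompt):
    """Stand-in for a real LLM API call. Returns what a well-prompted
    model might plausibly produce for this policy; purely illustrative."""
    return ('{"field": "missed_payments", "operator": ">", '
            '"value": 2, "outcome": "decline"}')

# Step 1: the domain expert states the policy in natural language, and
# the LLM translates it into a structured, inspectable representation.
policy = "Decline any applicant with more than two missed payments."
rule = json.loads(call_llm(
    "Translate this policy into a JSON rule with keys "
    f"field, operator, value and outcome:\n{policy}"
))

# Step 2: a deterministic 'glass box' engine applies the rule, keeping a
# human-readable rationale for every decision it makes.
def decide(applicant, rule):
    value = applicant[rule["field"]]
    if value > rule["value"]:  # the ">" operator from the rule
        return {
            "outcome": rule["outcome"],
            "rationale": f'{rule["field"]} = {value} exceeds '
                         f'threshold {rule["value"]}',
        }
    return {"outcome": "approve", "rationale": "no decline rule triggered"}

print(decide({"missed_payments": 4}, rule))
print(decide({"missed_payments": 0}, rule))
```

The rationale attached to each decision is precisely what a black-box LLM cannot supply: the rule that fired, and why, is visible and auditable after the fact.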
The engine behind OpenAI’s ChatGPT tool is GPT-3, which took roughly 350 GPU-years of compute to train. Even most large enterprises can’t stretch to that, which makes owning such a model and deploying it into sensitive or critical areas impossible. Using the ‘glass box’ strategy, any enterprise can safely leverage existing models like GPT-3 in the building of transparent decision-automation tools, with their own data and without a huge investment.
There are many reasons why enterprises shouldn’t wait to embrace LLMs and incorporate them into their automation strategy. LLMs are powerful tools for the prediction of content, and there are many enterprise applications for that today, not least in marketing.
However, when it comes to high-value transactional decision-making, the limitations of LLMs mean we must look to more transparent forms of AI if we are to scale judgement.
Leveraging LLMs in the building of complex, interpretable AI will enable organisations to strike gold and make lasting strides in transforming their business, beyond ‘getting high’ on the short-term sugar rush of ChatGPT.