What is LLM “Hallucination”? Can It Be Mitigated?


First Published on 18 June 2023.

Introduction:

Imagine asking an LLM about an enigmatic poet. With utmost confidence, it conjures up detailed answers, even offering a poem and translation supposedly penned by that poet. However, the truth reveals itself: the poem is an exquisite fabrication concocted by the LLM, leaving us astounded by its creative prowess. But how does this hallucination occur?

The Intricacies of LLM Training:

At the core of LLMs, such as ChatGPT, lies a neural network trained on colossal amounts of textual data. It’s a statistical behemoth, learning intricate patterns and relationships from the vast corpus it consumes. During training, these models are exposed to a wide range of text sources, from scientific articles to captivating works of fiction. Their objective? To predict the next word in a sentence based on the context provided by the preceding words.
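To make that objective concrete, here is a minimal, hypothetical sketch in Python. It uses a toy bigram model over a handful of words rather than a neural network, and the tiny corpus, function names, and example are all invented for illustration; the point is only that "prediction" means choosing the statistically most likely continuation, not checking a fact.

```python
# Toy illustration of the next-word objective: count which word follows which
# in a tiny corpus, then "predict" the most likely continuation. Real LLMs do
# this with deep neural networks and billions of parameters, but the training
# objective is the same in spirit.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Bigram counts: for each word, how often does each other word come next?
next_counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the statistically most likely next word given the context."""
    candidates = next_counts[word]
    return candidates.most_common(1)[0][0] if candidates else "."

print(predict_next("the"))  # 'cat' — plausible and pattern-based, not fact-checked
```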

This is where it gets interesting. LLMs prioritize generating text that is coherent and contextually appropriate, rather than being factually accurate. Their training data, while a rich tapestry of human knowledge, harbors inaccuracies and even fictional content. Regrettably, the model lacks the ability to distinguish between fact and fiction. Consequently, it may weave intricate narratives and eloquent arguments that align with the patterns it has learned but have no grounding in reality.

Tangled in the Web: The Quest to Mitigate Hallucination

How can this hallucination be stopped? Enter reinforcement learning from human feedback (RLHF), a remarkable technique offering hope to guide LLMs towards the path of truth. Ilya Sutskever, the Chief Scientist at OpenAI, suggests that by refining RLHF it may be possible to train LLMs to refrain from producing hallucinations. This technique can be likened to a beacon of guidance amidst a complex maze of illusions.

This technique empowers LLMs to become models of accuracy. Just as we learn to make decisions in an ever-changing environment, LLMs undergo a transformative journey. They generate text as actions, while human evaluators assume the role of wise mentors. These evaluators assess the quality of the generated text, scrutinizing coherence, relevance, and above all, truthfulness. Their insightful feedback becomes the compass that navigates the LLMs toward the realms of accuracy, providing the necessary rewards and nudges along the way.
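To show the shape of that feedback loop, here is a deliberately oversimplified, hypothetical sketch in Python. The canned answers, the hand-written feedback function, and the crude reward-weighted update are all stand-ins invented for illustration; real RLHF trains a separate reward model on human preference comparisons and optimises the LLM with an algorithm such as PPO.

```python
# Hypothetical sketch of the RLHF feedback loop: a generator samples answers,
# a "human evaluator" assigns rewards, and the generator's preferences are
# nudged toward higher-reward (truthful) answers.
import random

# Candidate answers and their current preference weights (illustrative only).
answers = {
    "The poem was written by the poet in 1952.": 1.0,           # fluent but fabricated
    "I could not find a reliable source for that poem.": 1.0,   # honest
}

def generate() -> str:
    """Sample an answer in proportion to its current preference weight."""
    options, weights = zip(*answers.items())
    return random.choices(options, weights=weights, k=1)[0]

def human_feedback(answer: str) -> float:
    """Stand-in for a human evaluator: reward honesty, penalise invention."""
    return 1.0 if "could not find" in answer else -1.0

learning_rate = 0.5
for _ in range(20):
    answer = generate()
    reward = human_feedback(answer)
    # Reward-weighted update: rewarded answers become more likely over time.
    answers[answer] = max(0.01, answers[answer] + learning_rate * reward)

print(max(answers, key=answers.get))  # the honest answer ends up preferred
```

Run repeatedly, the toy loop shifts probability toward the answer the evaluator rewards, which is the essential mechanism RLHF relies on at vastly larger scale.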
