What is Generative AI

GenAI

Generative AI (or GenAI) refers to artificial intelligence systems designed to create new content that resembles human-made data. The data could be text, images, audio, or code.

These models, like GPT (for text) or DALL-E (for images), are trained on large datasets and use patterns learned from this data to generate new output.

A diagram showing the process of Generative AI

Generative AI is widely used in applications such as chatbots, content creation, image synthesis, and code generation.

GenAI

Generative AI models are not "intelligent" in the way humans are:

They do not understand or comprehend the content they generate.
They rely on statistical patterns and correlations learned from their training data.

While Generative AI models can produce coherent and contextually relevant outputs, they lack understanding.

Large Language Models (LLMs)

This course will focus on text-generating models, specifically Large Language Models (LLMs).

LLMs are a type of generative AI model designed to process and generate human-like text.

These models are trained on vast amounts of text data and can perform various tasks, including answering questions, summarizing data, and analyzing text.

LLM Responses

The response generated by an LLM is a probabilistic continuation of the instructions it receives.

The LLM provides the most likely response based on the patterns it has learned from its training data.

If presented with the instruction:

"Continue this sequence - A B C"

An LLM could respond:

"D E F"
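The idea of a "most likely continuation" can be illustrated with a toy model. The sketch below is not how a real LLM works internally; it assumes a tiny, made-up training sequence and simply counts which token most often follows each token, then repeatedly picks the most frequent follower. The function names and the training text are hypothetical and exist only for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram_counts(tokens):
    """Count how often each token is followed by each other token."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts


def continue_sequence(counts, prompt_tokens, length):
    """Greedily append the most frequent next token, up to `length` times."""
    output = []
    current = prompt_tokens[-1]
    for _ in range(length):
        if current not in counts:
            break  # this token was never followed by anything in training
        current = counts[current].most_common(1)[0][0]
        output.append(current)
    return output


# Made-up "training data": the same sequence seen twice.
training_tokens = "A B C D E F G A B C D E F G".split()
counts = train_bigram_counts(training_tokens)

print(continue_sequence(counts, ["A", "B", "C"], 3))  # ['D', 'E', 'F']
```

A real LLM conditions on the whole prompt rather than only the last token, and usually samples from a probability distribution instead of always taking the top choice, but the principle of continuing from learned patterns is the same.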

Prompts

To get an LLM to perform a task, you provide a prompt.

A prompt should specify your requirements and provide clear instructions on how to respond.

A user asks an LLM the question 'What is an LLM? Give the response using simple language avoiding jargon.' and the LLM responds with a simple definition of an LLM.
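A prompt that states both the task and the requirements can be assembled as a plain string. The following is a minimal sketch; `build_prompt` is a hypothetical helper, not part of any library, and sending the resulting text to a model is left out.

```python
def build_prompt(task, requirements):
    """Combine a task with a list of explicit response requirements."""
    lines = [task]
    if requirements:
        lines.append("Requirements:")
        lines.extend(f"- {requirement}" for requirement in requirements)
    return "\n".join(lines)


prompt = build_prompt(
    "What is an LLM?",
    ["Use simple language.", "Avoid jargon."],
)
print(prompt)
```

Keeping the requirements as a separate list makes it easy to adjust how the model should respond without rewriting the task itself.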

Caution

While GenAI and LLMs provide a lot of potential, you should also be cautious.

At their core, LLMs are highly complex predictive text machines. LLMs don’t know or understand the information they output; they simply predict the next word in a sequence.

The predicted words are based on patterns and relationships found in the text of the training data.

An LLM as a black box, responding to the question 'How did you determine that answer?' with 'I don’t know.'

Access to Data

The sources for this training data are often the internet, books, and other publicly available text. The data could be of questionable quality or even incorrect.

Training happens at a point in time, so the data is static. It may not reflect the current state of the world, and it is unlikely to include any private information.

Access to Data

When prompted about new information, or about data that is not in the training set, the LLM may provide a response that is not accurate.

A diagram of an LLM returning out-of-date information.

Accuracy

LLMs are designed to create human-like text and are often fine-tuned to be as helpful as possible, even if that means occasionally generating misleading or baseless content, a phenomenon known as hallucination.

For example, when asked to "Describe the moon", an LLM may respond with "The moon is made of cheese". While this is a common saying, it is not true.

A diagram of a confused LLM with a question mark thinking about the moon and cheese.

While LLMs can represent the essence of words and phrases, they don’t possess a genuine understanding or ethical judgment of the content.

Improving LLM responses

You can improve the accuracy of responses from LLMs by providing context in your prompts.

The context could include relevant information, data, or details that help the model generate more accurate and relevant responses.

Avoiding hallucination

Providing context can help minimize hallucinations by anchoring the model’s response to the facts and details you supply.

If you ask a model to summarize a company’s performance, the model is more likely to produce an accurate summary if you include a relevant stock market report in your prompt.

A diagram of an LLM being passed a stock market report and being asked to summarize a company’s performance.
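Anchoring a response to supplied facts usually means placing the context in the prompt and telling the model to stay within it. The sketch below assumes a hypothetical `build_context_prompt` helper and uses a placeholder where the real report text would go; the exact wording of the instruction is an illustrative choice, not a prescribed format.

```python
def build_context_prompt(question, context):
    """Wrap a question with supplied context and an instruction to stay within it."""
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context.strip()}\n\n"
        f"Question: {question}"
    )


# Placeholder only; a real prompt would contain the actual report text.
report = "<text of the relevant stock market report>"

prompt = build_context_prompt("Summarize the company's performance.", report)
print(prompt)
```

Asking the model to say when the context does not contain the answer gives it an alternative to inventing one.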

Access to data

LLMs have a fixed knowledge cutoff and cannot access real-time or proprietary data unless it is provided in the prompt.

If you need the model to answer questions about recent events or organization-specific information, you must supply that data as part of your prompt. This ensures that the model’s responses are up-to-date and relevant to your particular use case.

You could also provide statistics or data points in the prompt to help the model include useful facts in its response.

A diagram of an LLM being passed a stock market report and the annual results, being asked to summarize a company’s performance. The response includes a specific profit figure from the annual results.
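Specific data points can be passed in the same way, as labelled values the model can quote. In this sketch the helper names are hypothetical and every figure in `annual_results` is invented purely for illustration; none of it comes from a real company.

```python
def format_facts(facts):
    """Render a mapping of labels to values as one 'label: value' line each."""
    return "\n".join(f"{label}: {value}" for label, value in facts.items())


def build_prompt_with_facts(question, facts):
    """Place labelled data points ahead of the question."""
    return (
        "Use the following data points in your answer "
        "and do not invent figures.\n\n"
        f"Data:\n{format_facts(facts)}\n\n"
        f"Question: {question}"
    )


# Hypothetical values for illustration only.
annual_results = {
    "Fiscal year": "2024",
    "Revenue": "10.0M USD",
    "Net profit": "1.2M USD",
}

prompt = build_prompt_with_facts(
    "Summarize the company's performance.", annual_results
)
print(prompt)
```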

Supplying context

Supplying context in your prompts helps LLMs generate more accurate, relevant, and trustworthy responses by reducing hallucinations and compensating for the lack of access to data.

Lesson Summary

In this lesson, you learned about Generative AI models, their capabilities, constraints, and how providing context in your prompts can help improve the accuracy of LLM responses.

In the next lesson, you will learn about Retrieval-Augmented Generation (RAG), GraphRAG, and how they can be used to provide context to LLMs.

Neo4j and Generative AI Workshop
