Prompt Engineering for Named Entity Recognition (NER)
Prompt Engineering Tips & Tricks for Named Entity Recognition (NER). Learn how to effectively extract entities from text using Large Language Models with practical techniques and examples.
For many organizations, 2023 was the year of exploration, testing, proof-of-concepts, or deployment of smaller LLM-powered workflows and use cases, while 2024 will likely be the year where we see even more production systems leveraging LLMs. Compared to a traditional ML system, where data (examples, labels), the model, and its weights are some of the main artifacts, prompts are instead the main artifacts. Prompts and prompt engineering are fundamental in driving a certain behavior of an assistant or agent for your use case.
Therefore, many of the large players as well as academia have provided guides on how to prompt LLMs effectively:
- 💻 OpenAI Prompt Engineering
- 💻 Guide to Anthropic's prompt engineering resources
- 💻 Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
- 💻 Prompt Engineering Guide
Many of these guides are quite generic and can work for many different use cases. However, there are very few guides mentioning best practices for constructing prompts for Named Entity Recognition (NER). OpenAI has, for instance, a cookbook for NER, and there are also two papers which both suggest a method called Prompt-NER.
In this article, we will discuss some techniques that might be helpful when using LLMs for NER use cases. To set the stage, we will first define what Prompt Engineering and Named Entity Recognition are.
Prompt Engineering
Prompt Engineering sometimes feels more like an art than a science, but more and more best practices are showing up (see some of the references in the previous section).
One definition of Prompt Engineering is shown below:
Definition: Prompt Engineering
Prompt engineering is a relatively new discipline for developing and optimizing prompts to efficiently use language models (LMs) for a wide variety of applications and research topics. Prompt engineering skills help to better understand the capabilities and limitations of large language models (LLMs).
Prompt engineering is not just about designing and developing prompts. It encompasses a wide range of skills and techniques that are useful for interacting and developing with LLMs. It’s an important skill to interface, build with, and understand the capabilities of LLMs. You can use prompt engineering to improve the safety of LLMs and build new capabilities like augmenting LLMs with domain knowledge and external tools.
— Prompt Engineering Guide
Like any other artifact, prompts may become "outdated" or "drift", which is why it is important to have systems in place for:
- Experiment tracking of prompts
- Evaluating your prompts (either via the “golden dataset” approach, LLM-based evals or both)
- Observability of how your prompts are being used
- Versioning of your prompts.
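As a minimal sketch (the names here are hypothetical, not from a specific tool), prompts can be versioned alongside metadata so that experiments and evaluations always reference an exact prompt version:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class PromptVersion:
    """A single, immutable version of a prompt template."""
    name: str
    version: int
    template: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# A tiny in-memory registry; in practice this would live in a database
# or a dedicated prompt-management tool.
registry: dict[tuple[str, int], PromptVersion] = {}

def register_prompt(name: str, template: str) -> PromptVersion:
    """Store a new version of a prompt and return it."""
    version = 1 + max((v for (n, v) in registry if n == name), default=0)
    prompt = PromptVersion(name=name, version=version, template=template)
    registry[(name, version)] = prompt
    return prompt

v1 = register_prompt("ner-food", "Extract all food-related entities from: {recipe}")
print(v1.name, v1.version)  # ner-food 1
```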
Named Entity Recognition (NER)
Extracting entities or tags from data can be very useful for many different domains and businesses, and can be used for many different things such as classification, knowledge retrieval, search, etc.
See one definition of Named Entity Recognition below:
Definition: Named Entity Recognition
Named Entity Recognition (NER) is a task of Natural Language Processing (NLP) that involves identifying and classifying named entities in a text into predefined categories such as person names, organizations, locations, and others. The goal of NER is to extract structured information from unstructured text data and represent it in a machine-readable format. Approaches typically use BIO notation, which differentiates the beginning (B) and the inside (I) of entities. O is used for non-entity tokens.
— Papers with Code
Below is an example of the BIO notation, which is a common format used for NER:
Mac [B-PER]
Doe [I-PER]
ate [O]
a [O]
hamburger [O]
at [O]
Mcdonalds [B-LOC]
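In code, BIO-tagged text is often represented as (token, tag) pairs. A minimal sketch (plain Python, no particular library) of collecting the entity spans from the example above:

```python
# BIO-tagged tokens from the example above as (token, tag) pairs.
tagged = [
    ("Mac", "B-PER"),        # beginning of a person entity
    ("Doe", "I-PER"),        # inside the same person entity
    ("ate", "O"),
    ("a", "O"),
    ("hamburger", "O"),
    ("at", "O"),
    ("Mcdonalds", "B-LOC"),  # beginning of a location entity
]

# Collect contiguous entity spans from the BIO tags.
entities, current = [], []
for token, tag in tagged:
    if tag.startswith("B-"):
        if current:
            entities.append(" ".join(current))
        current = [token]
    elif tag.startswith("I-"):
        current.append(token)
    elif current:  # an "O" tag closes any open entity
        entities.append(" ".join(current))
        current = []
if current:
    entities.append(" ".join(current))

print(entities)  # ['Mac Doe', 'Mcdonalds']
```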
However, the BIO notation does not make sense for all use cases. If you are, say, interested in extracting food entities, then hamburger above might be a key entity or tag that you want to predict.
Usually, a “typical” NER pipeline comprises the following steps:
- tokenizer: Turn text into tokens
- tagger: Assign part-of-speech tags
- parser: Assign dependency labels
- ner: Detect and label named entities
NER has previously been solved using:
- Rule-based systems
- Statistical & ML systems
- Deep-Learning systems
- A mix of the above.
Here, spaCy and Transformer architectures from e.g. Hugging Face, such as BERT and XLM-RoBERTa, have been the go-to methods or architectures.
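For reference, a minimal spaCy sketch (assuming the en_core_web_sm model has been downloaded) that runs such a pipeline and prints the detected entities:

```python
import spacy

# Load a small English pipeline (tokenizer, tagger, parser, ner).
# Install the model first with: python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

doc = nlp("Mac Doe ate a hamburger at Mcdonalds")
for ent in doc.ents:
    print(ent.text, ent.label_)
```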
Before we start with prompt engineering let’s set the stage with an example use case.
Food Entities from recipes
In the following section we will assume the following:
- We are a food tech startup that creates custom smart purchase lists from online food recipes and sells them to retailers.
- We want to extract the following types of entities: FOOD, QUANTITY, UNIT, PHYSICAL_QUALITY, COLOR
- When we have the entities, we want to populate smart purchase lists with recommendations on where to get the food.
The example recipe that we will be using:
### Chashu pork (for ramen and more)
Chashu pork is a classic way to prepare pork belly for Japanese dishes such as ramen.
While it takes a little time, it's relatively hands-off and easy, and the result is delicious.
### Ingredients
2 lb pork belly or a little more/less
2 green onions spring onions, or 3 if small
1 in fresh ginger (a chunk that will give around 4 - 6 slices)
2 cloves garlic
⅔ cup sake
⅔ cup soy sauce
¼ cup mirin
½ cup sugar
2 cups water or a little more as needed
### Instructions
Have some string/kitchen twine ready and a pair of scissors before you prepare the pork.
If needed, trim excess fat from the outside of the pork but you still want a layer over the
...
Technique #1 - Use temperature=0.0
LLMs are non-deterministic by nature, and different generations using e.g. the chat completions API from OpenAI will return different responses. However, one way to mitigate this is to use the temperature parameter often provided in these types of APIs. In short, the lower the temperature, the more deterministic the results, in the sense that the most probable next token is always picked. Note that this is still not a guarantee of deterministic output, but rather a best-effort attempt to always select the most likely token as model output.
This is of course useful in an NER case where you would like the same or similar input to produce the same or similar output.
Tip 1: Temperature Control
Use temperature=0.0 to get more deterministic output:
- Lower temperature = more predictable responses
- Higher temperature = more creative/random responses
- For NER tasks, you want consistency, so use 0.0
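As a minimal sketch using the OpenAI Python SDK (the model name and prompt are illustrative), you can compare how outputs vary across temperatures:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = "Extract all food-related entities from: '2 lb pork belly, 2 green onions'"

# The same prompt at temperature 0.0 should produce (near-)identical
# responses across runs, while higher temperatures vary more.
for temperature in (0.0, 1.0):
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=temperature,
        messages=[{"role": "user", "content": prompt}],
    )
    print(temperature, response.choices[0].message.content)
```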
Technique #2 - Set seed for your runs
Recently OpenAI added the functionality to set a seed to make runs more reproducible. If all other parameters are set to the same values (e.g. temperature=0), the seed is set, and the system_fingerprint is the same, then the output will be mostly deterministic.
Tip 2: Seed Control
Set the seed parameter for reproducible results:
import openai

response = openai.chat.completions.create(
model="gpt-4",
temperature=0.0,
seed=42, # Same seed = same output
messages=[...]
)
Note: Only works when combined with temperature=0 and the same system_fingerprint.
Both the previous section and this section were more focused on model parameters. The following sections will instead focus on what Prompt Engineering techniques we can use to extract named entities.
Technique #3 - Use clear instructions
The next step is to give clear instructions to the assistant. Typically, the:
- System prompt is used for instructions
- User prompt is used to provide context and data
- (Assistant) prompt is used to provide examples
Starting with a prompt like the below:
System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition.
User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...
We get the following output:
#### extract_food_entities
I will now proceed to extract all the food-related entities from the given recipe.
#### extract_food_entities
The food-related entities present in the recipe are as follows:
* Pork belly
* Green onions
* Fresh ginger
* Garlic
* Sake
* Soy sauce
* Mirin
* Sugar
* Water
* Rice
These entities cover the main ingredients used for the Chashu pork recipe.
You can also use the Assistant message to prompt with some examples:
Assistant:
Example:
```json
{
  "food": "minced meat",
  "quantity": 500,
  "unit": "grams (g)",
  "physicalQuality": "minced",
  "color": "brown"
}
…
```
Tip 3
Use clear instructions to guide the LLM effectively:
- Use the System prompt for instructions
- Use the User prompt to provide context and data
- Use the Assistant prompt to provide examples
This structured approach helps the model understand exactly what you want it to do.
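A minimal sketch of how these roles map onto the OpenAI chat completions API (the recipe string and the assistant example here are placeholders based on this article):

```python
from openai import OpenAI

client = OpenAI()

recipe = "..."  # the Chashu pork recipe from above

messages = [
    # System prompt: instructions.
    {
        "role": "system",
        "content": "You are a food AI assistant who is an expert in natural "
                   "language processing and especially named entity recognition.",
    },
    # User prompt: context and data.
    {
        "role": "user",
        "content": f"Extract all food-related entities from the recipe below:\n{recipe}",
    },
    # Assistant prompt: an example of the desired output format.
    {
        "role": "assistant",
        "content": '{"food": "minced meat", "quantity": 500, "unit": "grams (g)", '
                   '"physicalQuality": "minced", "color": "brown"}',
    },
]

response = client.chat.completions.create(model="gpt-4", temperature=0.0, messages=messages)
print(response.choices[0].message.content)
```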
Technique #4 - Use functions or tools
When we use functions or tools, we prompt the model to provide input arguments for an actual function in a downstream manner. This is similar to what is mentioned in section 4.2 here. You can use these arguments as they are (they will be valid JSON) or do some further processing by performing the function call in your own logic. One example could be that we want to trigger certain actions or do certain formatting based on the function arguments.
The functions will also be part of the system prompt. Many of the latest models have been fine-tuned to work with function calling and thus produce valid JSON output in that way. To define a function, we define a jsonSchema as below:
{
"name": "extract_food_entities",
"description": "You are a food AI assistant. Your task is to extract food entities from a recipe based on the JSON schema. You are to return the output as valid JSON.",
"parameters": {
"type": "object",
"properties": {
"food-metadata": {
"type": "object",
"properties": {
"food": {
"type": "string",
"description": "The name of the food item"
},
"quantity": {
"type": "string",
"description": "The quantity of the food item"
},
"unit": {
"type": "string",
"description": "The unit of the food item"
},
"physicalQuality": {
"type": "string",
"description": "The physical quality of the food item"
},
"color": {
"type": "string",
"description": "The color of the food item"
}
},
"required": [
"food",
"quantity",
"unit",
"physicalQuality",
"color"
]
}
},
"required": [
"food-metadata"
]
}
}
The example output of using extract_food_entities is shown below:
{
"food-metadata": {
"food": "Pork Belly",
"quantity": "2",
"unit": "lb",
"physicalQuality": "-",
"color": "-"
}
}
Tip 4
Use tools or function calling with a jsonSchema to extract the wanted metadata.
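A minimal sketch of passing the schema above via the tools parameter of the OpenAI chat completions API and parsing the returned function arguments (the schema is abbreviated here for brevity):

```python
import json
from openai import OpenAI

client = OpenAI()

# The extract_food_entities schema from above, abbreviated here.
extract_food_entities = {
    "name": "extract_food_entities",
    "description": "Extract food entities from a recipe and return valid JSON.",
    "parameters": {
        "type": "object",
        "properties": {
            "food-metadata": {
                "type": "object",
                "properties": {
                    "food": {"type": "string", "description": "The name of the food item"},
                    "quantity": {"type": "string", "description": "The quantity of the food item"},
                    # ... unit, physicalQuality and color as in the full schema
                },
                "required": ["food", "quantity"],
            }
        },
        "required": ["food-metadata"],
    },
}

response = client.chat.completions.create(
    model="gpt-4",
    temperature=0.0,
    tools=[{"type": "function", "function": extract_food_entities}],
    # Force the model to call our function rather than answer in free text.
    tool_choice={"type": "function", "function": {"name": "extract_food_entities"}},
    messages=[{"role": "user", "content": "Extract food entities from: '2 lb pork belly'"}],
)

# The model returns its answer as function arguments (a valid JSON string).
tool_call = response.choices[0].message.tool_calls[0]
print(json.loads(tool_call.function.arguments))
```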
Technique #5 - Use domain prompts
As seen above, using the jsonSchema gives us metadata in a structured format that we can use for downstream processing. However, there are limitations on the number of characters you can put in the description of each property in the jsonSchema. One way to give further instructions to the LLM is to add domain-specific instructions to e.g. the system prompt:
System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".
See further instructions below for each entity:
"food": This can be both liquid and solid food such as meat, vegetables, alcohol, etc.
"quantity": The exact quantity or amount of the food that should be used in the recipe. Answer in both full units such as 1,2,3, etc but also fractions e.g. 1/3, 2/4, etc.
"unit": The unit being used e.g. grams, milliliters, pounds, etc. The unit must always be returned.
"physicalQuality": The characteristic of the ingredient (e.g. boneless for chicken breast, frozen for
spinach, fresh or dried for basil, powdered for sugar).
"color": The color of the food e.g. green, black, white. If no color is identified respond with colorless.
User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...
Example output with this updated prompt is shown below:
{
"food-metadata": {
"food": "pork belly",
"quantity": "2 lb",
"unit": "pound",
"physicalQuality": "raw",
"color": "colorless"
}
}
{
"food-metadata": {
"food": "green onions",
"quantity": "2",
"unit": "pieces",
"physicalQuality": "fresh",
"color": "green"
}
}
Tip 5
Incorporate domain knowledge to help the LLM with extracting the entities you are looking for.
Technique #6 - Use Chain-of-Thought
Chain-of-Thought (CoT)
Chain-of-Thought (CoT) is a prompting technique where each input question is followed by an intermediate reasoning step that leads to the final answer. This has been shown to improve the output from LLMs. There is also a slight variation of CoT called Zero-Shot Chain-of-Thought, where you introduce "Let's think step by step" to guide the LLM's reasoning.
An update to the prompt now using Zero-Shot Chain-of-Thought would be:
System:
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".
See further instructions below for each entity:
"food": This can be both liquid and solid food such as meat, vegetables, alcohol, etc.
"quantity": The exact quantity or amount of the food that should be used in the recipe. Answer in both full units such as 1,2,3, etc but also fractions e.g. 1/3, 2/4, etc.
"unit": The unit being used e.g. grams, milliliters, pounds, etc. The unit must always be returned.
"physicalQuality": The characteristic of the ingredient (e.g. boneless for chicken breast, frozen for
spinach, fresh or dried for basil, powdered for sugar).
"color": The color of the food e.g. green, black, white. If no color is identified respond with colorless.
Let's think step-by-step.
User:
Extract all food-related entities from the recipe below in backticks:
```{recipe}```
...
By adding “Let’s think step by step” we can see some slight improvements for the extraction:
{
"food-metadata": {
"food": "pork belly", "quantity": "2 lb",
"unit": "ounce",
"physicalQuality": "trimmed",
"color": "colorless"
}
}
Trimmed is likely a better physicalQuality to describe the pork belly than raw, as the pork belly is sliced and used as a topping, e.g. in a bowl of ramen.
Tip 6
Chain-of-thought may improve performance, especially for fields that need some calculation or reasoning steps.
Technique #7 - Use Prompt Chaining
Prompt Chaining
To improve the reliability and performance of LLMs, one of the important prompting engineering techniques is to break tasks into subtasks. Once those subtasks have been identified, the LLM is prompted with a subtask and then its response is used as input to another prompt. This is what’s referred to as prompt chaining where a task is split into subtasks with the idea to create a chain of prompt operations.
— Prompt Engineering Guide
Prompt Chaining is somewhat similar to CoT in that the idea is to break the problem down into sub-problems or tasks. However, the main difference is that the response from one prompt is passed on to the next prompt as refined context.
In our recipe example, we might construct a prompt looking like the below:
CONTEXT_PROMPT = """
You are a food AI assistant who is an expert in natural language processing
and especially named entity recognition. The entities we are interested in are: "food", "quantity", "unit", "physicalQuality" and "color".
Your task is to extract the entity, the value of the entity and the reasoning,
for why the extracted value is the correct value.
If you cannot extract the entity add "N/A".
Make sure to output the results according to the output format below.
<Food-Metadata>:
"<metadata>":
- Entity: // the actual entity i.e. pork, onions etc
- Value: // Value of the metadata field
- Reasoning: // Reasoning for why the entity is correct.
Let's think step-by-step.
"""
The output from the first prompt is:
"pork":
- Entity: food
- Value: pork belly
- Reasoning: The text mentions preparing pork belly for Japanese dishes such as ramen, and the recipe specifically calls for 2 lb of pork belly.
"green onions":
- Entity: food
- Value: green onions
- Reasoning: The text mentions using 2 green onions and specifies that 3 should be used if they are small.
"ginger":
- Entity: food
- Value: ginger
- Reasoning: The recipe specifies using 1 inch of fresh ginger.
"garlic":
- Entity: food
- Value: garlic
- Reasoning: The recipe specifies using 2 cloves of garlic.
"sake":
- Entity: food
- Value: sake
- Reasoning: The recipe calls for â…” cup of sake.
"soy sauce":
- Entity: food
- Value: soy sauce
- Reasoning: The recipe calls for â…” cup of soy sauce.
"mirin":
- Entity: food
- Value: mirin
- Reasoning: The recipe calls for ÂĽ cup of mirin.
"sugar":
- Entity: food
- Value: sugar
- Reasoning: The recipe calls for ½ cup of sugar.
"water":
- Entity: food
- Value: water
- Reasoning: The recipe specifies using 2 cups of water.
We then use the output from the prompt above as input to our extract_food_entities prompt from before. This approach may be helpful when you have entities that need to be calculated with some reasoning around them, or that may not be in the exact format of your JSON schema.
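A minimal sketch of this two-step chain (the complete helper below is a hypothetical convenience; CONTEXT_PROMPT is the prompt defined above):

```python
from openai import OpenAI

client = OpenAI()

def complete(system_prompt: str, user_prompt: str) -> str:
    """A hypothetical helper: one chat completion call, returning the text."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0.0,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content

recipe = "..."  # the Chashu pork recipe from above

# Step 1: extract entities, values and reasoning using CONTEXT_PROMPT (defined above).
reasoning = complete(CONTEXT_PROMPT, f"Extract the entities from the recipe:\n{recipe}")

# Step 2: feed the refined context into the structured extraction prompt.
entities = complete(
    "You are a food AI assistant. Map the pre-extracted entities below onto "
    "the extract_food_entities JSON schema and return valid JSON.",
    reasoning,
)
print(entities)
```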
Tip 7
Prompt chaining can help as an important pre-processing step to provide more relevant context.
Closing Remarks
In this post, we have walked through some useful prompt-engineering techniques that might be helpful when you tackle Named Entity Recognition (NER) using LLMs such as OpenAI's models.
Key Takeaways
Depending on your use case, one or several of these techniques may help improve your NER solution. In general, writing clear instructions and using CoT and/or prompt chaining together with tools or functions tend to improve the NER extraction.
Was this helpful?
Let me know what you think!