Detecting Hallucinations in LLMs
In the context of LLMs, “hallucination” refers to the phenomenon where the model generates text that is factually incorrect, nonsensical, or untethered from reality. Bias, by contrast, refers to systematic skews the model learnt from its training data.
Types of Hallucinations
- Factual Inaccuracies: the LLM produces a statement that is factually incorrect.
- Unsupported Claims: the LLM generates a response that has no basis in the input or context.
- Nonsensical Statements: the LLM produces a response that makes no sense or is unrelated to the context.
- Improbable Scenarios: the LLM describes an implausible or highly unlikely event.
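To make these four types concrete, here is a small set of illustrative (prompt, hallucinated answer) pairs. These are hand-written, hypothetical examples, not real model outputs:

# Hypothetical illustrations of each hallucination type
# (hand-written examples, not actual model runs).
hallucination_examples = {
    "factual_inaccuracy": {
        "prompt": "When did Apollo 11 land on the Moon?",
        "hallucination": "Apollo 11 landed on the Moon in 1972.",  # actually 1969
    },
    "unsupported_claim": {
        "prompt": "Summarize: 'The meeting was postponed to Friday.'",
        "hallucination": "The meeting was postponed to Friday because the CEO resigned.",
    },
    "nonsensical_statement": {
        "prompt": "What is the capital of France?",
        "hallucination": "The capital of France is a purple integer.",
    },
    "improbable_scenario": {
        "prompt": "Describe a typical morning commute.",
        "hallucination": "Most commuters ride pet dolphins across the highway.",
    },
}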
Measuring Hallucinations
How can we detect hallucinations?
Measure with NER
Check whether the generated text contains a named entity (NE) that does not appear in the ground-truth (source) text.
Simple code:
import nltk

nltk.download('averaged_perceptron_tagger')
nltk.download('punkt')

def check_hallucination_by_NE(source_text, generated_text, verbose=False,
                              pattern='NP: {<DT>?<JJ>*<NN>}', acceptable_ne=['NNP']):
    # Chunk parser applied to the POS-tagged sentence.
    cp = nltk.RegexpParser(pattern)

    def ne(sent):
        # Tokenize, POS-tag, then chunk the sentence.
        sent = nltk.word_tokenize(sent)
        sent = nltk.pos_tag(sent)
        return cp.parse(sent)

    source_ne = ne(source_text)
    generated_ne = ne(generated_text)

    def filter_ne(ne_tree):
        # Keep only (token, tag) leaves whose tag is acceptable (NNP by default).
        ret = []
        for _ne in ne_tree:
            if (len(_ne) == 2) and (_ne[1] in acceptable_ne):
                ret.append(_ne[0])
        return ret

    source_ne = filter_ne(source_ne)
    generated_ne = filter_ne(generated_ne)

    if len(generated_ne) <= 0:
        # No named entity in the generated text: nothing to score.
        return -1

    # Fraction of generated NEs that also appear in the source;
    # the lower the score, the more new (potentially hallucinated) entities.
    score = 0
    for _ne in generated_ne:
        if verbose:
            print(_ne)
        if _ne in source_ne:
            score += 1
    return score / len(generated_ne)
Now try it:
check_hallucination_by_NE("President Biden was born USA", "President Biden was born France", verbose=True)
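Assuming the default tagger labels each proper noun as NNP, the source entities are {President, Biden, USA} and the generated entities are {President, Biden, France}, so the call should return 2/3 ≈ 0.67, with "France" as the unsupported entity. A score of 1.0 means every generated entity is grounded in the source; lower scores flag potential hallucinations.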
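The regex chunker above is deliberately crude. As an alternative, a minimal sketch of the same overlap check using spaCy's statistical NER could look like this. It assumes spaCy and its small English model en_core_web_sm are installed, neither of which is used in the original code:

import spacy

# Load spaCy's small English pipeline (assumed installed via
# `python -m spacy download en_core_web_sm`).
nlp = spacy.load("en_core_web_sm")

def check_hallucination_by_spacy_ne(source_text, generated_text):
    # Extract the set of named-entity strings from each text.
    source_ents = {ent.text for ent in nlp(source_text).ents}
    generated_ents = {ent.text for ent in nlp(generated_text).ents}
    if not generated_ents:
        # No entities to score, mirroring the -1 sentinel of the NLTK version.
        return -1
    # Fraction of generated entities also present in the source.
    grounded = sum(1 for ent in generated_ents if ent in source_ents)
    return grounded / len(generated_ents)

check_hallucination_by_spacy_ne("President Biden was born USA", "President Biden was born France")

A learned NER model tends to be more robust than a POS-pattern chunker on multi-word entities (e.g. "New York"), at the cost of an extra dependency.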