Named Entity Recognition
NER is a sequence tagging problem: given an input document, the task is to identify the span boundaries of each entity and to classify each span into the correct entity type.
Galileo supports NER for various tagging schemas, including BIO, BIOES, and BILOU. Additionally, you can use Galileo for other span classification tasks that follow similar schemas. Here's an example:
input = "Galileo was an Italian astronomer born in Pisa, and he discovered the moons of planet Jupiter"
# start and end are token indices (splitting the input on whitespace), with end exclusive
output = [{"span_text": "Galileo", "start": 0, "end": 1, "label": "PERSON"},
          {"span_text": "Italian", "start": 3, "end": 4, "label": "MISCELLANEOUS"},
          {"span_text": "Pisa", "start": 7, "end": 8, "label": "LOCATION"},
          {"span_text": "Jupiter", "start": 15, "end": 16, "label": "LOCATION"}]
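To make the relationship between per-token tags and spans concrete, here is a minimal sketch of how BIO tags could be converted into span dictionaries of the shape shown above. It is not part of Galileo's API: the function name bio_tags_to_spans, the example sentence, and its tags are hypothetical. It assumes that start and end are token indices and that end is exclusive.

```python
from typing import Dict, List, Optional, Union

Span = Dict[str, Union[str, int]]


def bio_tags_to_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Convert per-token BIO tags into token-level spans (end is exclusive).

    Hypothetical helper for illustration only; not a Galileo function.
    """
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Span] = []
    start: Optional[int] = None  # first token index of the open span, if any
    label: Optional[str] = None  # entity type of the open span, if any

    def close(end: int) -> None:
        """Emit the currently open span (if there is one) ending before `end`."""
        nonlocal start, label
        if start is not None and label is not None:
            spans.append({
                "span_text": " ".join(tokens[start:end]),
                "start": start,
                "end": end,
                "label": label,
            })
        start, label = None, None

    for i, tag in enumerate(tags):
        if tag == "O":
            close(i)
        elif tag.startswith("B-"):
            # B- always begins a new span, closing any span that was open.
            close(i)
            start, label = i, tag[2:]
        elif tag.startswith("I-"):
            # Lenient choice: an I- tag that does not continue the open span
            # (no open span, or a different label) starts a new span.
            if start is None or label != tag[2:]:
                close(i)
                start, label = i, tag[2:]
        else:
            raise ValueError(f"unexpected tag: {tag!r}")

    close(len(tokens))  # flush a span that runs to the end of the input
    return spans


# Hypothetical example sentence with a two-token PERSON entity.
example_tokens = ["Galileo", "Galilei", "was", "born", "in", "Pisa"]
example_tags = ["B-PERSON", "I-PERSON", "O", "O", "O", "B-LOCATION"]
example_spans = bio_tags_to_spans(example_tokens, example_tags)
print(example_spans)
```

For this input the helper returns two spans: "Galileo Galilei" with start 0 and end 2 labeled PERSON, and "Pisa" with start 5 and end 6 labeled LOCATION. BIOES and BILOU add explicit tags for the last token of a span and for single-token spans, so a converter for those schemas would need to handle the extra prefixes.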