Making an NER API with FastAPI and spaCy

Named Entity Recognition (NER) is an interesting NLP feature that is made very easy thanks to spaCy. If you want to expose your NER model to the world, you can easily build an API with FastAPI.

FastAPI is a new API framework that was released just a couple of months ago. It makes API development both fast and convenient.

As for spaCy, in case you don’t know it yet, it’s a great open-source framework for NLP, and especially NER.

Code example

We want to build an API endpoint that will return entities from a simple sentence: “John Doe is a Go Developer at Google”.

The following code mostly comes from the great “spacy-api-docker” repo by jgontrum (thanks!): https://github.com/jgontrum/spacy-api-docker/, and more specifically from this file: https://github.com/jgontrum/spacy-api-docker/blob/master/displacy_service/parse.py.

The API will return each entity along with its position in the sentence.

{
  "extractions": [
    {
      "first_index": 0,
      "last_index": 8,
      "name": "PERSON",
      "content": "John Doe"
    },
    {
      "first_index": 14,
      "last_index": 26,
      "name": "POSITION",
      "content": "Go Developer"
    },
    {
      "first_index": 30,
      "last_index": 36,
      "name": "ORG",
      "content": "Google"
    }
  ]
}

Here is the code:

import spacy
from fastapi import FastAPI
from pydantic import BaseModel
from typing import List

en_core_web_lg = spacy.load("en_core_web_lg")

api = FastAPI()

class Input(BaseModel):
    sentence: str

class Extraction(BaseModel):
    first_index: int
    last_index: int
    name: str
    content: str

class Output(BaseModel):
    extractions: List[Extraction]

@api.post("/extractions", response_model=Output)
def extractions(input: Input):
    # Run the spaCy pipeline on the submitted sentence.
    document = en_core_web_lg(input.sentence)

    # Build one dict per detected entity, matching the Extraction model.
    extractions = []
    for entity in document.ents:
        extraction = {}
        extraction["first_index"] = entity.start_char
        extraction["last_index"] = entity.end_char
        extraction["name"] = entity.label_
        extraction["content"] = entity.text
        extractions.append(extraction)

    return {"extractions": extractions}
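Before walking through the code, here is a minimal sketch of how a client could call this endpoint using only the Python standard library. The URL is an assumption (it supposes the app is served locally on port 8000, for example with uvicorn), and build_request and fetch_extractions are hypothetical helper names, not part of FastAPI or spaCy.

```python
import json
import urllib.request

# Assumption: the API is served locally on port 8000 (adjust to your setup).
API_URL = "http://127.0.0.1:8000/extractions"


def build_request(sentence: str, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request whose JSON body matches the Input model."""
    payload = json.dumps({"sentence": sentence}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_extractions(sentence: str, url: str = API_URL) -> list:
    """Send the sentence to the API and return the list of extractions."""
    with urllib.request.urlopen(build_request(sentence, url)) as response:
        body = json.loads(response.read().decode("utf-8"))
    return body["extractions"]


# Usage (requires the API to be running):
# for extraction in fetch_extractions("John Doe is a Go Developer at Google"):
#     print(extraction["name"], extraction["content"])
```

The request body is a JSON object with a single "sentence" key because that is what the Input model declares.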

First we load the spaCy model:

en_core_web_lg = spacy.load("en_core_web_lg")

Then we perform NER:

document = en_core_web_lg(input.sentence)
# [...]
document.ents
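Each entity in document.ents exposes start_char and end_char, which are character offsets into the original text, with the end offset being exclusive. The sketch below illustrates that convention in plain Python, without loading spaCy: find_offsets and entity_text are hypothetical helpers written for this illustration only.

```python
sentence = "John Doe is a Go Developer at Google"


def find_offsets(text: str, phrase: str) -> tuple:
    """Return (start, end) character offsets of phrase in text, end exclusive."""
    start = text.find(phrase)
    if start == -1:
        raise ValueError(f"{phrase!r} not found in text")
    return start, start + len(phrase)


def entity_text(text: str, extraction: dict) -> str:
    """Recover the entity text from the offsets returned by the API."""
    return text[extraction["first_index"]:extraction["last_index"]]


first_index, last_index = find_offsets(sentence, "John Doe")
extraction = {"first_index": first_index, "last_index": last_index}

# Slicing the sentence with the two offsets gives back the entity text.
print(first_index, last_index)            # 0 8
print(entity_text(sentence, extraction))  # John Doe
```

Because the end offset is exclusive, a client can slice the sentence directly with first_index and last_index, for example to highlight entities in a UI.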

Data validation

Thanks to FastAPI, which relies on Pydantic models, it is easy to validate both input and output data:

class Extraction(BaseModel):
    first_index: int
    last_index: int
    name: str
    content: str

class Output(BaseModel):
    extractions: List[Extraction]
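To see what this validation does, here is a small sketch that exercises the same models directly with Pydantic, outside of FastAPI. The is_valid_output helper and the two sample payloads are made up for illustration. Inside the API, FastAPI runs equivalent checks for us and answers an invalid request body with a 422 error.

```python
from typing import List

from pydantic import BaseModel, ValidationError


class Extraction(BaseModel):
    first_index: int
    last_index: int
    name: str
    content: str


class Output(BaseModel):
    extractions: List[Extraction]


def is_valid_output(payload: dict) -> bool:
    """Return True if payload satisfies the Output model, False otherwise."""
    try:
        Output(**payload)
    except ValidationError:
        return False
    return True


# Well-formed payload: every field is present with the expected type.
good = {"extractions": [
    {"first_index": 0, "last_index": 8, "name": "PERSON", "content": "John Doe"}
]}

# Malformed payload: non-numeric index and missing "content" field.
bad = {"extractions": [
    {"first_index": "zero", "last_index": 8, "name": "PERSON"}
]}

print(is_valid_output(good))  # True
print(is_valid_output(bad))   # False
```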

Conclusion

Thanks to spaCy and FastAPI, building an entity extraction API has never been so easy. I hope it will help you with your next project!

If you have questions, please let me know!

March 10, 2019
