Making an NER API with FastAPI and spaCy


Named Entity Recognition (NER) is an NLP task that spaCy makes very easy. If you want to expose your NER model to the world, you can build an API around it with FastAPI.

FastAPI is a new web framework for building APIs that was released a couple of months ago. It makes API development both fast and convenient.

As for spaCy, in case you don’t know it yet, it’s a great open-source framework for NLP, and especially NER.

Code example

We want to build an API endpoint that will return entities from a simple sentence: “John Doe is a Go Developer at Google”.

The following code comes mostly from the great “spacy-api-docker” repo by jgontrum (thanks!): https://github.com/jgontrum/spacy-api-docker/, and more specifically from this file: https://github.com/jgontrum/spacy-api-docker/blob/master/displacy_service/parse.py.

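The API will return each entity along with its position in the text, given as character offsets (the end offset is exclusive). Note that POSITION is not one of the labels shipped with the stock en_core_web_lg model, so the second entity below illustrates what a custom-trained model could return.

{
  "entities": [
    {
      "end": 8,
      "start": 0,
      "text": "John Doe",
      "type": "PERSON"
    },
    {
      "end": 26,
      "start": 14,
      "text": "Go Developer",
      "type": "POSITION"
    },
    {
      "end": 36,
      "start": 30,
      "text": "Google",
      "type": "ORG"
    }
  ]
}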
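To make the offsets concrete: `start` and `end` are character positions in the input string, with `end` exclusive, so slicing the sentence with them gives back the entity text. The short check below uses only the standard library. The helper name `char_span` is hypothetical; it relies on plain string search rather than a spaCy model, and simply mirrors what spaCy's `start_char` and `end_char` attributes report for a span.

```python
SENTENCE = "John Doe is a Go Developer at Google"


def char_span(text: str, entity: str) -> dict:
    """Return the character offsets of the first occurrence of `entity` in `text`.

    Hypothetical helper for illustration only: it uses plain string search,
    not a spaCy model, to show how start/end offsets map onto the text.
    """
    start = text.find(entity)
    if start == -1:
        raise ValueError(f"{entity!r} not found in text")
    # `end` is exclusive, so text[start:end] == entity.
    return {"start": start, "end": start + len(entity), "text": entity}


for name in ("John Doe", "Go Developer", "Google"):
    span = char_span(SENTENCE, name)
    # Slicing with the offsets recovers the entity text.
    assert SENTENCE[span["start"]:span["end"]] == name
    print(span)
```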

Here is the code:

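from typing import List

import spacy
from fastapi import FastAPI
from pydantic import BaseModel

# Load the model once at startup rather than on every request.
model = spacy.load("en_core_web_lg")

app = FastAPI()

# Request body: the text to analyze.
class UserRequestIn(BaseModel):
    text: str

# One extracted entity, with its character offsets in the input text.
class EntityOut(BaseModel):
    start: int
    end: int
    type: str
    text: str

# Response body: the list of extracted entities.
class EntitiesOut(BaseModel):
    entities: List[EntityOut]

@app.post("/entities", response_model=EntitiesOut)
def read_entities(user_request_in: UserRequestIn):
    # Run the spaCy pipeline on the submitted text.
    doc = model(user_request_in.text)

    # Convert each spaCy entity span into a plain dictionary.
    return {
        "entities": [
            {
                "start": ent.start_char,
                "end": ent.end_char,
                "type": ent.label_,
                "text": ent.text,
            } for ent in doc.ents
        ]
    }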
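Once the app is served (for example with an ASGI server such as uvicorn), the endpoint can be called with a JSON body containing a `text` field. The sketch below uses only the standard library. The base URL `http://127.0.0.1:8000` and the helper names `build_entities_request` and `fetch_entities` are assumptions made for illustration; adjust them to wherever the API actually runs.

```python
import json
import urllib.request

# Assumed address of a locally running server; not part of the original article.
BASE_URL = "http://127.0.0.1:8000"


def build_entities_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /entities endpoint (hypothetical helper)."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/entities",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_entities(text: str, base_url: str = BASE_URL) -> list:
    """Send the request and return the list stored under the "entities" key."""
    request = build_entities_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload["entities"]
```

With the server running, `fetch_entities("John Doe is a Go Developer at Google")` should return the list of entity dictionaries produced by the endpoint.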

First we load the spaCy model:

model = spacy.load("en_core_web_lg")

Then we perform NER:

doc = model(user_request_in.text)
# [...]
doc.ents

Data validation

FastAPI makes input and output data validation easy: the request body and the response are checked against pydantic models such as these:

class EntityOut(BaseModel):
    start: int
    end: int
    type: str
    text: str

class EntitiesOut(BaseModel):
    entities: List[EntityOut]
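To see what this validation does in practice, the models can be exercised directly with pydantic, outside of any request. The sketch below re-declares two of the models from the article and adds a hypothetical helper, `is_valid`, that reports whether a payload passes validation. When the same failure happens on an incoming request, FastAPI answers with a 422 error response instead of calling the endpoint function.

```python
from pydantic import BaseModel, ValidationError


class UserRequestIn(BaseModel):
    text: str


class EntityOut(BaseModel):
    start: int
    end: int
    type: str
    text: str


def is_valid(model_cls, payload: dict) -> bool:
    """Hypothetical helper: True if `payload` satisfies `model_cls`."""
    try:
        model_cls(**payload)
    except ValidationError:
        return False
    return True


# A well-formed request body passes.
print(is_valid(UserRequestIn, {"text": "John Doe is a Go Developer at Google"}))
# A body without the required "text" field is rejected.
print(is_valid(UserRequestIn, {}))
# A non-numeric offset cannot be parsed as an int, so it is rejected too.
print(is_valid(EntityOut, {"start": "zero", "end": 8, "type": "PERSON", "text": "John Doe"}))
```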

Conclusion

Thanks to spaCy and FastAPI, building an entity extraction API has never been easier. I hope this helps you with your next project!

If you have questions, please let me know!
