Storing Stripe Payment Data in the Database

Reading time ~4 minutes

It’s hard to know whether Stripe payment data should be stored in the local database or not. Many developers are wondering which kind of Stripe data they should save in their local DB. They might sometimes even be tempted not to store any data locally and only rely on Stripe API calls.

Let me show you how we’re dealing with this problem at NLP Cloud, and why.

NLP Cloud is a Natural Language Processing API based on spaCy and HuggingFace transformers in order to propose Named Entity Recognition (NER), sentiment analysis, text classification, summarization, text generation, and much more. Customers are charged monthly and their payment is processed by Stripe. It is important to us that the API and the user dashboard are lightning fast, so we want to rely on the Stripe API the less we can. We also don’t want to depend too much on Stripe in case of a data loss on their end.

Typical Scenario

Here is a standard scenario for a subscription based service like we have at NLP Cloud:

  1. A customer registers to your website
  2. You save the customer in DB
  3. You create the customer in Stripe through the Stripe API
  4. You save the Stripe customer ID in the local DB

You cannot really do better than this.

Now the fun begins.

For example you might want to keep track of your customer’s subscription in order to grant him access to some paid features, to more API requests, or, if he’s a free user, disable some features (for example). Sometimes, you are going to update a subscription by yourself, but sometimes Stripe will (for example if a payment fails multiple times, Stripe will mark the subscription as canceled). When a subscription is updated by Stripe, they will let you know through a webhook call. If you are using Stripe Portal, everything is going to be handled on the Stripe end, and any change is going to be sent to you through webhooks.

So dealing with Stripe is a bidirectional thing: sometimes you initiate a change, sometimes they do. It is easy to get out of sync!

Speed Considerations

One might be tempted to delegate as much information as possible to Stripe so Stripe is the single source of truth. In such a situation, you would need to make a call to the Stripe API very often. This is a bad idea.

For example, if your customer subscription data is in Stripe only, you will first need to call Stripe before allowing a customer to access or not a specific paid feature. It adds critical milliseconds to your website’s response time, which is not good. And if Stripe is temporarily lagging, your website is lagging too. In case of an API, this is out of the question: you cannot slow down an API call because you’re waiting for Stripe to return.

Disaster Recovery Considerations

Delegating information to Stripe without local data is risky. Even if Stripe is a solid player, you can never be sure that they’re not going to lose your data.

From a safety standpoint, it is paramount to store the customers’ data locally so you can start again you service somewhere else in case of a disaster, without losing any customer subscription (which would be terrible).

Caching Everything Locally

The strategy we follow at NLP Cloud is to cache everything related to Stripe customers and Stripe subscription locally. It is simpler than it might sound thanks to the fact that modern databases like PostgreSQL can store JSON data seamlessly with almost no performance tradeoffs.

Basically, what you should do - if you want to follow this strategy - is the following:

  1. When you create a Stripe customer with their API, save the Stripe JSON response in a JSON DB field (with PostgreSQL, use the JSONB type)
  2. When you create a Stripe subscription for this customer, do the same
  3. Whenever you need to access Stripe customer or subscription information, just query the customer or the subscription DB fields

Here is an example of data INSERT in a PostgreSQL JSONB field:

CREATE TABLE customers (  
  id serial NOT NULL,
  stripe_customer jsonb
  stripe_subscription jsonb
);
INSERT INTO customers VALUES (1, '{id:1, ...}', '{id:1, ...}');

And here is how you could retrieve the Stripe subscription id for example:

SELECT stripe_subscription->'id' AS id FROM customers;  

2 fields in the DB and that’s it! No need to create a bunch of new fields for every customer fields and subscription fields.

Staying in Sync

In order to make sure that your local cache is perfectly in sync with Stripe you should properly listen to Stripe webhooks.

Every time you get a Stripe webhook about a customer or a subscription, update the customer field or the subscription field in DB.

If you really want to be safe, you should also be prepared for potential Stripe webhooks failures. In that case, the best strategy would be to proactively poll Stripe customers and subscription on a regular basis, in order to make sure you never end up out of sync.

Conclusion

As you can see, it is quite easy to create both a simple and robust Stripe local cache. This strategy saves a lot of development time, it is fast, safe in case of Stripe failure, and you no longer have to wonder which Stripe fields you need to store locally or not.

I hope you found this useful. If you have feedbacks, or if you think of a better strategy, please let me know!

Existe aussi en français

API Rate Limiting With Traefik, Docker, Go, and Caching

Limiting API usage based on advanced rate limiting rule is not so easy. In order to achieve this behind the NLP Cloud API, we're using a combination of Docker, Traefik (as a reverse proxy) and local caching within a Go script. When done correctly, you can considerably improve the performance of your rate limiting and properly throttle API requests without sacrificing speed of the requests. Continue reading