BERT models with millisecond inference — The Pattern project

The challenge:

  1. Convert and pre-load BERT models on each shard of the Redis Cluster (code).
  2. Pre-tokenise all potential answers and distribute them to each shard of the Redis Cluster using RedisGears (code for the batch and for the event-based RedisGears function).
  3. Amend the calling API to direct the question query to the shard with the most likely answers (code). The call uses graph-based ranking and zrangebyscore to find the highest-ranked sentences in response to the question, then gets the relevant hashtag from the sentence key.
  4. Tokenise the question (code). Tokenisation happens on the shard and uses the RedisGears and RedisAI integration via `import redisAI`.
  5. Concatenate the user question and the pre-tokenised potential answers (code).
  6. Run inference using RedisAI (code). The model runs in async mode without blocking the main Redis thread, so the shard can still serve users.
  7. Select the answer using the max score and convert tokens to words (code).
  8. Cache the answer using Redis (code); the next hit on the API with the same question returns the answer in nanoseconds. This function uses the `keymiss` event.
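
Steps 5 to 7 can be illustrated without Redis at all. The following is a minimal sketch in plain Python, not the project's code: the vocabulary, the token ids, and the start/end scores are made up for illustration, and the special-token ids 101 and 102 are assumed to be the [CLS] and [SEP] ids of an uncased BERT vocabulary. The span search shown here maximises the sum of start and end scores with start <= end; the project may instead take the two maxima independently.

```python
from typing import Dict, List, Sequence, Tuple


def build_input(question_ids: Sequence[int], answer_ids: Sequence[int],
                cls_id: int = 101, sep_id: int = 102) -> List[int]:
    """Step 5: [CLS] question [SEP] pre-tokenised answer [SEP]."""
    return [cls_id, *question_ids, sep_id, *answer_ids, sep_id]


def best_span(start_scores: Sequence[float], end_scores: Sequence[float],
              first_answer_pos: int, max_len: int = 30) -> Tuple[int, int]:
    """Step 7: pick (start, end) inside the answer with the highest summed score."""
    last = len(start_scores) - 1  # exclusive bound: skips the final [SEP]
    best, best_score = (first_answer_pos, first_answer_pos), float("-inf")
    for s in range(first_answer_pos, last):
        for e in range(s, min(s + max_len, last)):
            score = start_scores[s] + end_scores[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


def tokens_to_words(tokens: Sequence[str]) -> str:
    """Step 7: glue WordPiece continuations ('##xyz') onto the previous token."""
    words: List[str] = []
    for tok in tokens:
        if tok.startswith("##") and words:
            words[-1] += tok[2:]
        else:
            words.append(tok)
    return " ".join(words)


# Toy vocabulary and ids, invented for this example only.
vocab: Dict[int, str] = {101: "[CLS]", 102: "[SEP]", 1: "what", 2: "causes",
                         3: "fever", 4: "infection", 5: "##s", 6: "often",
                         7: "cause"}
question_ids = [1, 2, 3]        # "what causes fever"
answer_ids = [4, 5, 6, 7, 3]    # "infection ##s often cause fever"

input_ids = build_input(question_ids, answer_ids)

# Stand-ins for the two score vectors a question-answering model would return
# in step 6, one score per input position.
start_scores = [0.1, 0.0, 0.0, 0.0, 0.0, 4.0, 0.5, 0.2, 0.1, 0.3, 0.0]
end_scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.2, 3.5, 0.1, 0.3, 1.0, 0.0]

first_answer_pos = len(question_ids) + 2  # skip [CLS], the question and [SEP]
span = best_span(start_scores, end_scores, first_answer_pos)
answer_text = tokens_to_words([vocab[i] for i in input_ids[span[0]:span[1] + 1]])
print(answer_text)
```

Restricting the search to positions after the first [SEP] keeps the selected span inside the candidate answer rather than the question.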

Hardware used: Clevo laptop with Intel(R) Core(TM) i7-10875H CPU @ 2.30GHz, 64 GB RAM, SSD
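
Step 3 of the list above depends on how Redis Cluster maps keys to shards: only the part of a key inside the first `{...}` (the hash tag) is hashed, using CRC16 modulo 16384, so keys sharing a hash tag live on the same shard. The sketch below reproduces that mapping in plain Python so the routing idea can be checked offline. The key names are hypothetical and are not taken from the project; `binascii.crc_hqx` with an initial value of 0 is used here as the CRC16 variant that Redis Cluster specifies.

```python
import binascii

NUM_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def hashed_part(key: str) -> str:
    """Return the substring Redis Cluster hashes: the hash tag if one exists."""
    start = key.find("{")
    if start != -1:
        end = key.find("}", start + 1)
        # An empty tag "{}" does not count; the whole key is hashed instead.
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Hash slot for a key: CRC16 of the hashed part, modulo 16384."""
    return binascii.crc_hqx(hashed_part(key).encode("utf-8"), 0) % NUM_SLOTS


# Hypothetical key names: a sentence key and a cached answer sharing one tag.
sentence_key = "sentence:{shard-a}:42"
answer_key = "answer:{shard-a}:what causes fever"

print(key_slot(sentence_key), key_slot(answer_key))  # same slot, same shard
```

A cluster-aware client performs this calculation itself; computing it by hand is mainly useful for checking that a question and its candidate answers will land on the same shard.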

Alex Mikhalev
