You got probability and likelihood calculation wrong; this is why you can’t build a proper recommender.

I finally finished reading a long-overdue essay in Russian, “300 лет в искаженной реальности” (“300 Years in a Distorted Reality”), which references the Nature publications “The ergodicity problem in economics” and “Time to move beyond average thinking”.

Those pieces are critical to understand if you want to build a practically useful intelligent system today. This development exposes how economists and risk managers treat probability, and the implications can be seen in GDP or the performance of pension funds. Still, the one example that is easier to relate to is the Amazon…
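
The gap between the ensemble average and the time average, which is at the heart of the ergodicity argument, can be illustrated with a multiplicative coin-toss gamble. This is a minimal sketch and not taken from the essay: the +50% / -40% payoffs and all function names are assumptions chosen for illustration.

```python
import math

# Hypothetical multiplicative gamble: wealth x1.5 on heads, x0.6 on tails.
UP, DOWN = 1.5, 0.6


def ensemble_growth() -> float:
    """Expected per-round multiplier, averaged over many parallel players."""
    return 0.5 * UP + 0.5 * DOWN


def time_growth() -> float:
    """Per-round multiplier a single player sees in the long run (geometric mean)."""
    return math.exp(0.5 * math.log(UP) + 0.5 * math.log(DOWN))


def wealth_after(heads: int, tails: int, start: float = 1.0) -> float:
    """Wealth of one player after a given number of heads and tails."""
    return start * UP ** heads * DOWN ** tails


print("ensemble average per round:", ensemble_growth())  # above 1: looks like a winning bet
print("time average per round:", time_growth())          # below 1: one player loses over time
print("typical wealth after 100 rounds:", wealth_after(50, 50))
```

The expected value grows every round, yet a player who sees heads and tails equally often ends up with a small fraction of the starting wealth. A system that optimises the first number while its users live through the second is the kind of mismatch the papers describe.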


A newsletter from NVIDIA about running the BERT model in 1.2 milliseconds on the latest hardware reminded me that I ran BERT QA (bert-large-uncased-whole-word-masking-finetuned-squad) in 1.46 ms on a laptop with an Intel i7-10875H on RedisAI during the RedisLabs RedisConf 2021 hackathon.

The challenge:

1) BERT QA requires user input (the question), hence it’s not possible to pre-calculate the response on the server as in the case of summarization.

2) At the same time, NLP-based machine learning relies on the tokenisation step, converting plain text into numbers before running inference, so the most common pipeline is: tokenise, run inference, select the most relevant answer. …
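
The last stage of that pipeline, selecting the most relevant answer, can be sketched without a model. This is a minimal sketch, assuming the model has already produced one start score and one end score per token; the tokens, the scores, and the `select_answer_span` helper are made up for illustration and are not the RedisAI implementation.

```python
from typing import List, Tuple


def select_answer_span(
    start_logits: List[float],
    end_logits: List[float],
    max_answer_len: int = 30,
) -> Tuple[int, int]:
    """Return the (start, end) token indices with the highest combined score.

    Only spans with start <= end and at most max_answer_len tokens are considered.
    """
    best_span = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best_score = score
                best_span = (start, end)
    return best_span


# Hypothetical tokens and scores standing in for real model output.
tokens = ["redis", "ai", "runs", "models", "inside", "redis"]
start_logits = [0.1, 0.2, 0.1, 3.0, 0.3, 0.2]
end_logits = [0.0, 0.1, 0.2, 0.5, 0.4, 2.5]

start, end = select_answer_span(start_logits, end_logits)
answer = " ".join(tokens[start:end + 1])
print(answer)
```

The exhaustive search is quadratic in the worst case, which is acceptable for a few hundred tokens; the length cap keeps the search from pairing a strong start with an unrelated end far away.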


Being green for IT first and foremost means effective use of resources.

It’s quite topical to be “Green” today, and while it’s difficult to differentiate “green-washing” from real action, personally I believe we can all contribute to “being green” by effective use of available resources.

You may need a large RAM server or a dedicated high-performance GPU for ML training, but do you need it 24x7 for 365 days per year?

Do you need a desktop with a 1 kW or even 2 kW PSU, or will a laptop do just fine for most of your tasks? …


It’s brilliant: Microsoft releases GitHub Copilot, which is trained on open-source code on GitHub.

Hurray! Developer productivity may increase occasionally.

Hold on a second, have you checked the licenses on those repositories you used for training?

As MS is an open-source contributor, it probably knows about open-source licenses, their differences, and that a lot of them have a “derivative work” clause: any code derived from the work shall also be open-sourced and attributed accordingly.

There is no legal precedent or practice regarding this, and I don’t believe the MS FAQ on “fair use” of open data will hold water if challenged.

I would argue that since MS used code with GPL-type licenses to train the Copilot algorithm, it should release the Copilot model in its entirety.

See this and this discussion.


Update: a very simple example of working data fusion in every phone is panorama photos. Stitching images is a hard image-processing problem; adding gyro measurements made it way simpler.
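
Why a second sensor helps can be shown with the simplest fusion rule: inverse-variance weighting of two independent estimates of the same quantity. This is a sketch under stated assumptions: the camera and gyro readings and their variances are made-up numbers, `fuse` is a hypothetical helper, and real panorama stitching is far more involved.

```python
from typing import Tuple


def fuse(x1: float, var1: float, x2: float, var2: float) -> Tuple[float, float]:
    """Combine two independent noisy estimates, weighting each by 1/variance."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    fused_value = (w1 * x1 + w2 * x2) / (w1 + w2)
    fused_variance = 1.0 / (w1 + w2)
    return fused_value, fused_variance


# Hypothetical rotation between two frames, in degrees.
camera_angle, camera_var = 10.4, 4.0  # estimate from image matching alone
gyro_angle, gyro_var = 9.8, 1.0       # estimate from the gyroscope

angle, variance = fuse(camera_angle, camera_var, gyro_angle, gyro_var)
print(angle, variance)
```

The fused variance is always smaller than the smaller of the two inputs, so even a noisy extra sensor improves the estimate as long as its errors are independent.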

A newsletter from Comet-ML arrived in my mailbox and nearly made me spill my coffee. The passage which caught my eye:

From CVPR 2021: Tesla’s Andrej Karpathy on a Vision-Based Approach to Autonomous Vehicles

Tesla is doubling down on its vision-first approach to self-driving cars and will stop using radar sensors altogether in future releases.

From its inception, Tesla has taken a different approach to most other companies developing a self-driving car that…


One of the things I noticed in my hobby project working with Redis is that I started overthinking and over-optimising:

Consider the old code here: it makes two calls to RedisGraph, one to fetch edges and another to turn node ids into a list of dictionaries `{id:node_id,name:node_name}`. Both queries are trivial:

# fetch edges
WITH $ids AS ids MATCH (e:entity)-[r]->(t:entity) WHERE e.id IN ids RETURN DISTINCT e.id, t.id, max(r.rank) ORDER BY r.rank
# fetch nodes
WITH $ids AS ids MATCH (e:entity) WHERE e.id IN ids RETURN DISTINCT e.id, e.name, max(e.rank)
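
The second step, turning the node rows into a list of dictionaries, can be sketched in plain Python. This assumes each row of the node query arrives as `[id, name, rank]`; the sample rows and the `rows_to_nodes` helper are hypothetical, not the project's code.

```python
from typing import Any, Dict, List, Sequence


def rows_to_nodes(rows: Sequence[Sequence[Any]]) -> List[Dict[str, Any]]:
    """Map [id, name, rank] rows to the {id, name} dictionaries described above."""
    return [{"id": row[0], "name": row[1]} for row in rows]


# Made-up rows standing in for a RedisGraph result set.
sample_rows = [
    ["C0001", "entity one", 3],
    ["C0002", "entity two", 1],
]

nodes = rows_to_nodes(sample_rows)
print(nodes)
```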

But when I added years to node properties I decided to “optimise” and fetch node names…


Prior to starting at my previous employer, Shopitize Ltd, I was a researcher for about 8 years and I didn’t manage anyone but myself. Now I find myself in charge of 35 people spread across 4 different countries. What do you do to get help? Call your mate.

So I picked up the phone and called Maxim Dorofeev, who has since gone on to write 2 books and become a well-known public speaker. My question to him was: for the last 8 years I was a researcher and didn’t manage anyone but myself, now I have a team of 35…


Context

Let’s have a small organisation running a “proper” strategic loop: OODA + PDCA

OODA

Observe, Orient, Decide, Act: this loop runs on the strategic-level “surface” of the organisation, observing external trends (technology, market, global politics), then orienting the organisation via artefacts like strategy.

PDCA

Plan, Do, Check, Act: the loop inside the organisation, pretty much an extension of Act. Once you have decided to act, you start your plan.

What I would like to have: a system, or a combination of collaborative systems, that allows retaining knowledge throughout the whole lifecycle and both loops, so that questions like “can we select…


I came across SeaweedFS, which I think is perfectly architected:

  • Acknowledgement of the existence of other technologies, so master metadata can be stored in Redis (Cluster), Cassandra, or etcd
  • Any use case I could think of is covered: FUSE mount, HDFS, S3 API gateway, WebDAV, async backup into the cloud
  • Tiered Storage.
  • Cross-Datacentre replication (I don’t need it at home, but the number of databases/storages supporting cross DC replication out of the box can be counted on one hand)

Quickstart with a single master on one RPI4 server and a volume server on another took 10 minutes. Why did I spend so much…


  • Reading a non-fiction book: you are not reading the book, you are extracting knowledge from the book, whether an electronic or a printed medium
  • Watching a learning video: extracting knowledge from the video medium
  • Reading a blog/article: extracting knowledge from the blog or article

Why is this framing important?

  • It’s [mental] work (even if it’s an enjoyable activity)
  • Work requires [enabling capabilities]: infrastructure, office, team (or lack of distraction), tools. Popcorn would not do if you are going to take notes; pen and paper may be better. Or the YiNote extension.
  • Goal. Why do you spend time on it? Knowledge is a perishable item, how are you going to apply it?

So there are two parts:

  1. We extract knowledge
  2. We apply knowledge

Stay tuned for more on the topic

Alex Mikhalev

I am a systems thinker with a deep understanding of technology and a methodological approach to innovation
