Michael Larionov, PhD

Data Scientist
Published in Towards Data Science

·Dec 26, 2021

Bhattacharyya Kernels And Machine Learning on Sets of Data

Using the probability distribution of the elements of the set to extract set-specific features — The problem of Machine Learning on sets arises when one of the features is defined not as a single number, but as a set of objects. A good example is when you are trying to infer something about a customer based on their order history. …

Machine Learning

5 min read
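To make the idea in the teaser above more concrete, here is a minimal sketch (my own illustration, not code from the article): fit a Gaussian to each set of item feature vectors and use the Bhattacharyya coefficient between the fitted densities as the kernel value for the two sets. The diagonal-covariance simplification, the function name, and the toy data are assumptions.

```python
import numpy as np

def bhattacharyya_kernel(set_a, set_b, eps=1e-6):
    """Kernel between two sets of vectors: fit a diagonal Gaussian to each set,
    then return the Bhattacharyya coefficient between the two fitted densities.
    A simplified sketch; the article discusses the general construction."""
    mu_a, var_a = set_a.mean(axis=0), set_a.var(axis=0) + eps
    mu_b, var_b = set_b.mean(axis=0), set_b.var(axis=0) + eps
    var_m = 0.5 * (var_a + var_b)                      # averaged (diagonal) covariance
    # Bhattacharyya distance between two Gaussians with diagonal covariances
    d = (0.125 * np.sum((mu_a - mu_b) ** 2 / var_m)
         + 0.5 * np.sum(np.log(var_m / np.sqrt(var_a * var_b))))
    return np.exp(-d)                                  # Bhattacharyya coefficient in (0, 1]

# Example: two "order histories", each a set of 3-dimensional item feature vectors
rng = np.random.default_rng(0)
customer_1 = rng.normal(loc=0.0, size=(50, 3))
customer_2 = rng.normal(loc=0.5, size=(80, 3))
print(bhattacharyya_kernel(customer_1, customer_2))
```

The resulting value lies in (0, 1] and can be plugged into any kernel method, for example an SVM, as a similarity between two customers.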


Published in Towards Data Science

·Dec 20, 2021

On Bayesian Geometry

Geometric interpretation of probability distributions — Bayesian inference is based on the fact that we often don’t know the underlying distribution of the data, so we need to build a model and then iteratively adjust it as we get more data. In parametric Bayesian inference you start by picking a general form of the probability distribution f(x;θ)…

Bayesian Statistics

4 min read
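For reference, the parametric setup mentioned in the teaser above boils down to the standard Bayes update (a textbook statement, not a quote from the article): given the family f(x;θ) and a prior π(θ), observing x₁,…,xₙ yields

```latex
p(\theta \mid x_1,\dots,x_n)
  = \frac{\pi(\theta)\,\prod_{i=1}^{n} f(x_i;\theta)}
         {\int \pi(\theta')\,\prod_{i=1}^{n} f(x_i;\theta')\,d\theta'}
```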


Published in Towards Data Science

·Jul 30, 2021

Causal Framework for Model Robustness

Using causal modeling for feature selection and creation — Model Robustness: We would all like our Machine Learning models to generalize to unseen data, but we often find that model performance drops when the new data do not look like the old data, that is, when they have a different distribution. For example, a medical diagnostics system we trained on the data from…

Machine Learning

6 min read


Published in Towards Data Science

·May 24, 2021

Role of Data in Implicit Regularization

Using Legendre polynomials as features can significantly improve convergence — TL;DR: In this article, we revisit an apparent implicit regularization effect of Gradient Descent-based optimization algorithms. We consider it on a textbook example of overfitting that you can find on the scikit-learn website. …

Machine Learning

5 min read
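A minimal sketch of the kind of feature construction the teaser above refers to, using NumPy’s Legendre utilities; the toy data, the degree, and the comparison via condition numbers are my choices, not necessarily the article’s exact experiment.

```python
import numpy as np

# Toy 1-D regression problem, similar in spirit to the scikit-learn overfitting example
rng = np.random.default_rng(0)
x = np.sort(rng.uniform(-1.0, 1.0, size=30))
y = np.cos(1.5 * np.pi * x) + rng.normal(scale=0.1, size=x.shape)

degree = 15

# Plain polynomial features: columns 1, x, x^2, ..., x^degree (badly conditioned)
X_plain = np.vander(x, N=degree + 1, increasing=True)

# Legendre features: P_0(x), P_1(x), ..., P_degree(x), orthogonal on [-1, 1]
X_legendre = np.polynomial.legendre.legvander(x, degree)

# Condition numbers hint at why gradient descent behaves better with Legendre features
print("cond(plain):   ", np.linalg.cond(X_plain))
print("cond(Legendre):", np.linalg.cond(X_legendre))
```

Because Legendre polynomials are orthogonal on [-1, 1], the resulting design matrix is far better conditioned than the plain monomial one, which is what helps gradient-based optimization converge.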


Published in CodeX

·Mar 27, 2021

How to control audio volume in Google Meet

TL;DR Use Zoom to control Google Meet volume

Google Meet

2 min read

Published in Towards Data Science

·Feb 17, 2021

Measuring Distance Using Convolutional Neural Network

A tutorial on deep signal processing — In signal processing, it is sometimes necessary to measure horizontal distance between some features of the signal, for example, the peaks. A good example of this could be interpreting an electrocardiogram (ECG), which relies on measuring distances for much of its interpretation. …

Machine Learning

5 min read
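As a rough sketch of the kind of model such a tutorial might build (my assumption, not the article’s actual architecture): a small 1-D convolutional network that maps a fixed-length signal to a single regressed number, for example the distance in samples between two peaks.

```python
import tensorflow as tf

SIGNAL_LENGTH = 512  # assumed fixed-length 1-D signal, e.g. one ECG window

# A small 1-D CNN that regresses one number (the horizontal distance) from the signal
model = tf.keras.Sequential([
    tf.keras.Input(shape=(SIGNAL_LENGTH, 1)),
    tf.keras.layers.Conv1D(16, kernel_size=7, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(32, kernel_size=7, activation="relu", padding="same"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1),  # predicted distance, in samples
])
model.compile(optimizer="adam", loss="mae")
model.summary()
```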


Published in Towards Data Science

·Jan 29, 2021

What is Data Condensation?

A small synthetic data set can be enough for training a model — Data-efficient learning is an important topic in Data Science and an active area of research. Training large models on big data can take a lot of time and resources, so the question is: can we replace a large data set with a smaller one that will nevertheless…

Data Science

7 min read


Published in Towards Data Science

·Nov 23, 2020

Bayesian probability mass estimation using TensorFlow

When all you have are categorical variables — I have spent some time studying data with categorical variables, exploring many ways to encode them into numeric features. But what if all your variables are categorical? One of the mechanisms to describe this scenario is the contingency table. Contingency tables are, in essence, (potentially multidimensional) tables where…

Data Science

7 min read
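A minimal sketch of conjugate Bayesian estimation for a contingency table using TensorFlow Probability; the symmetric Dirichlet prior, the toy counts, and the variable names are assumptions, and the article’s actual model may differ.

```python
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Toy 2x3 contingency table: observed counts for two categorical variables
counts = tf.constant([[12., 7., 3.],
                      [ 5., 9., 4.]])

# Flatten the table and place a symmetric Dirichlet(1, ..., 1) prior on the cell probabilities
flat_counts = tf.reshape(counts, [-1])
posterior = tfd.Dirichlet(concentration=tf.ones_like(flat_counts) + flat_counts)

# Posterior mean of the joint probability mass function, reshaped back into the table
posterior_pmf = tf.reshape(posterior.mean(), counts.shape)
print(posterior_pmf)

# Posterior samples quantify the uncertainty about each cell probability
samples = posterior.sample(1000)
```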


Published in Towards Data Science

·Sep 13, 2020

Set Attention Models for Time Series Classification

A deep learning algorithm for real world time series data — As a data scientist working primarily with business data (sometimes also called “tabular data”), I’m always looking for the latest developments in data science that help with more realistic data. One of these areas addresses the fact that business data are rarely “tabular”, but usually are…

ICML 2020

7 min read
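For flavor, here is a generic scaled dot-product attention pooling over an unordered set of event vectors. It is a simplified illustration of set attention in general, not the specific architecture from the ICML paper the article reviews; all names and shapes are assumptions.

```python
import numpy as np

def attention_pool(events, query, scale=None):
    """Pool an unordered set of event vectors into one fixed-size vector
    using scaled dot-product attention against a reference query vector."""
    d = events.shape[-1]
    scale = scale or np.sqrt(d)
    scores = events @ query / scale              # one score per event, order-independent
    weights = np.exp(scores - scores.max())      # softmax over the set
    weights /= weights.sum()
    return weights @ events                      # weighted average of the set

# A "time series" viewed as a variable-length set of event embeddings
rng = np.random.default_rng(0)
events = rng.normal(size=(17, 8))                # 17 events, 8-dimensional embeddings
query = rng.normal(size=8)
print(attention_pool(events, query).shape)       # -> (8,)
```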


Published in Towards Data Science

·Jul 4, 2020

NGBoost algorithm: solving probabilistic prediction problems

Predict a distribution of the target variable, not just a point estimate — While looking through the ICML 2020 accepted papers, I found an interesting one: “NGBoost: Natural Gradient Boosting for Probabilistic Prediction” (arxiv.org): “We present Natural Gradient Boosting (NGBoost), an algorithm for generic probabilistic prediction via gradient…” You may ask, how many more papers do we need on Gradient Boosting? But in fact, the GBT family of algorithms works really well on tabular data, consistently taking top places on Kaggle leaderboards. …

ICML 2020

6 min read
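A short usage sketch assuming the open-source ngboost package with its scikit-learn-style interface; the dataset and hyperparameters are placeholders, not taken from the article.

```python
from ngboost import NGBRegressor
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# By default NGBoost fits a Normal distribution per target via natural gradient boosting
ngb = NGBRegressor(n_estimators=300, random_state=0)
ngb.fit(X_train, y_train)

point_predictions = ngb.predict(X_test)      # point estimates
predicted_dists = ngb.pred_dist(X_test)      # full predictive distributions

# Assuming the default Normal output exposes its parameters via .params
print(predicted_dists.params["loc"][:5])     # per-example predicted mean
print(predicted_dists.params["scale"][:5])   # per-example predicted standard deviation
```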
