190906

September 6th, 2019

Today’s work

  1. Self-study for Gaussian Process
    • Multivariate Normal Distribution
    • Cholesky Decomposition - Generating Multivariate Normal Random Number
    • Bayesian Optimization (fully understanding this is the final goal)

Since I am going to tune the hyperparameters of GBM models like LightGBM and CatBoost for the Kaggle competition I am currently participating in, I'd like to fully understand how Bayesian Optimization works. While studying this, I have been learning more about the details of matrix decomposition and the Gaussian distribution, and I've come to realize how important they are in machine learning.
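
As a note on the Cholesky item above, here is a minimal sketch (my own illustration in NumPy, not from any of the posts) of how a Cholesky factor turns independent standard normals into correlated multivariate normal samples: if $\Sigma = LL^{\top}$ and $z \sim \mathcal{N}(0, I)$, then $\mu + Lz \sim \mathcal{N}(\mu, \Sigma)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target mean and covariance (covariance must be symmetric positive definite)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Cholesky factor: Sigma = L @ L.T with L lower triangular
L = np.linalg.cholesky(Sigma)

# Independent standard normals, transformed row-wise as mu + L z
z = rng.standard_normal(size=(10_000, 2))
samples = mu + z @ L.T

print(samples.mean(axis=0))  # approximately mu
print(np.cov(samples.T))     # approximately Sigma
```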

  2. New Posts
    • Schur Complement
    • Cholesky Decomposition

The hard part about these posts was that I wasn't sure how deeply I should study the topics. Both the Schur complement and the Cholesky decomposition are widely and heavily used in statistics.
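
For reference, the standard definition I was working from (not specific to either post): for a block matrix $M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$, the Schur complement of $A$ in $M$ is $M/A = D - C A^{-1} B$. One place it shows up in statistics: partitioning the covariance $\Sigma$ of a multivariate normal the same way, the conditional covariance of the second block given the first is exactly the Schur complement $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$, and the block form of the Cholesky/LDL factorization is built from the same quantity.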

September 6th, 2019

Today's work

  1. Gaussian Process self-study
    • Multivariate Normal Distribution
    • Cholesky Decomposition - Generating Multivariate Normal Random Number
    • Bayesian Optimization (fully understanding this is the final goal)

What led me to study Gaussian Processes is that I plan to use the LightGBM or CatBoost algorithms in the Kaggle competition I am currently participating in, and I want to do the hyperparameter tuning with Bayesian Optimization rather than Random Search or Grid Search, so I started studying it.
As I studied, I found myself learning more about the most fundamental pieces such as matrix decomposition and the Gaussian distribution, and I learned how often they are used in machine learning and how important they are.

  2. New posts
    • Schur Complement
    • Cholesky Decomposition

The hard part of these posts was deciding how deeply to study the concepts, since the Schur complement and the Cholesky decomposition are very important and heavily used in any analysis or research that deals with matrices. These concepts will come up again in whatever I study next, so I'd like to go deeper into them again then.

190828

August 28th, 2019

Today’s work

  1. Convex Optimization Course - Duality;
    weak and strong duality
  2. IEEE Kaggle - Data Organization;
    convert the data according to the Data Description - link
  3. Study LightGBM with the video presented by Mateusz Susik from McKinsey - link

XGBoost by Tianqi Chen - link

Today's work

  1. Convex Optimization course - Duality; weak and strong duality
  2. IEEE Kaggle competition - Data organization; transform the variables according to the Data Description - link
  3. Study how LightGBM works - presentation by Mateusz Susik from McKinsey - link

XGBoost -> LightGBM

Update:

Duality -

  1. Form the Lagrangian

    $\mathcal{L}(x,\lambda,\nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x)$

  2. Set the gradient with respect to $x$ equal to zero to minimize $\mathcal{L}$

    $\nabla_x \mathcal{L} = 0$

  3. Plug it back into $\mathcal{L}$ to get the Lagrangian dual function:

    $g(\lambda, \nu) = \inf_{x\in D} \mathcal{L}(x, \lambda, \nu)$

    which is concave and can be $-\infty$ for some $\lambda, \nu$.

    The Lagrangian dual function is concave because, for each fixed $x$, the Lagrangian is an affine function of $(\lambda, \nu)$, and the pointwise infimum of any family of affine functions is concave.

    For $\lambda \succeq 0$, $g(\lambda, \nu)$ gives a lower bound on the optimal value, and we want to maximize this (concave) lower bound to get the best possible bound; maximizing a concave function is a convex optimization problem, which is the dual problem.
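
    As a concrete check of these three steps, here is the standard worked example of the equality-constrained least-norm problem (my choice of example, not from the lecture notes): minimize $x^{\top}x$ subject to $Ax = b$.

    $\mathcal{L}(x,\nu) = x^{\top}x + \nu^{\top}(Ax - b)$

    $\nabla_x \mathcal{L} = 2x + A^{\top}\nu = 0 \;\Rightarrow\; x = -\tfrac{1}{2}A^{\top}\nu$

    $g(\nu) = \mathcal{L}\!\left(-\tfrac{1}{2}A^{\top}\nu, \nu\right) = -\tfrac{1}{4}\nu^{\top}AA^{\top}\nu - b^{\top}\nu$

    which is concave in $\nu$ (its Hessian $-\tfrac{1}{2}AA^{\top}$ is negative semidefinite) and gives a lower bound on the optimal value for every $\nu$.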

LightGBM vs. XGBoost (a histogram implementation is available in both)

One of the methods used in LightGBM is Gradient-based One-Side Sampling (GOSS), which is its biggest advantage. The idea is to concentrate on the data points with large gradients and mostly ignore the data points with small gradients (a small gradient means the point is already well fit, i.e., its residual/loss is close to a minimum), as sketched below.
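
A minimal sketch of the GOSS idea (my own illustration, not LightGBM's actual implementation; the function name and parameters are hypothetical): keep the top `a` fraction of instances by gradient magnitude, randomly sample a `b` fraction of the rest, and up-weight the sampled small-gradient instances by $(1-a)/b$ so that the gradient statistics stay approximately unbiased.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, rng=None):
    """Illustrative Gradient-based One-Side Sampling (GOSS).

    gradients : 1-D array of per-instance gradients from the current boosting step
    a         : fraction of instances with the largest |gradient| that are always kept
    b         : fraction of the remaining instances sampled uniformly at random
    Returns (selected_indices, weights); the sampled small-gradient instances are
    re-weighted by (1 - a) / b so gradient sums stay roughly unbiased.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(gradients)
    order = np.argsort(np.abs(gradients))[::-1]  # indices sorted by |gradient|, descending

    top_k = int(a * n)
    rand_k = int(b * n)

    top_idx = order[:top_k]                      # large-gradient instances: always kept
    sampled_idx = rng.choice(order[top_k:], size=rand_k, replace=False)

    selected = np.concatenate([top_idx, sampled_idx])
    weights = np.ones(len(selected))
    weights[top_k:] = (1.0 - a) / b              # compensate for the discarded instances
    return selected, weights

# Example with fake gradients
grads = np.random.randn(1_000)
idx, w = goss_sample(grads)
print(len(idx), w.min(), w.max())  # 300 selected, weights 1.0 and 8.0
```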

LightGBM vs. XGBoost (a histogram implementation is available in both)

The biggest advantage is a method called Gradient-based One-Side Sampling: it concentrates on data points with large gradients and ignores those with small gradients (a small gradient means being close to a local minimum, which in turn means the residual or loss is already small), so training is extremely fast.
