Chaobin Tang (唐超斌)

A probabilistic approach in pattern recognition and Bayes' theorem

In supervised learning, the data provided to us can be considered evidence. For example, in a text classification system, we may have a collection of texts (a corpus) that can be perceived as evidence of how language is used in the real world, giving us insight into the text genre, author gender, text sentiment, etc. Based on this evidence, we can form a better opinion on how to classify a new text. 17/09/2017

Understand and build a neural network for digit recognition

The multilayer perceptron (MLP) is not only a very powerful statistical model for classification in its own right; it is now also a building block of some of the deep networks that made recent headlines. An understanding of the MLP can build on that of the logistic regression previously investigated, and will be essential to understanding many 2016-era deep neural networks. In this post, in addition to the mathematical reasoning that I accepted as being necessary, I aim to provide an intuition that helped me a lot in grasping the back propagation algorithm. In the end, a classifier using everything in this post is built and tested on the task of recognizing handwritten digits from images. 07/06/2016

Improving access to restricted network services with a local DNS and a SOCKS5 proxy server

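The Raspberry Pi A+ I bought a few years ago had been sitting idle, yet its power draw of under 1 W makes it suitable for many useful small services. After mounting an evdev-based Xbox360 Controller Accepter on it, I recently deployed several more services on the Pi to improve the experience of accessing network services outside of China. I have clearly noticed that the improvement brought by this new setup far exceeds that of the several setups I used over the past few years. 30/05/2016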

Logistic regression on income prediction

The non-linear sigmoid function allows us to interpret the mapped result as the posterior probability of a category given data x. This has important applications. Also, starting with a basic likelihood function, we derive a cost function, sometimes called the cross entropy function, that quantifies the quality of the model's prediction. Once again, this cost function can be used with gradient descent to find an optimal set of parameters that best predicts the category given x. In this post, we develop a classification model that is trained and used to predict the income category using an online archive of income data. 13/03/2016
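
The two ingredients named in this summary can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the code from the post: the function names `sigmoid` and `cross_entropy` are hypothetical, labels are assumed to be 0 or 1, and predicted probabilities are assumed to lie strictly between 0 and 1.

```python
import math


def sigmoid(z):
    """Map a real-valued score z to a value in (0, 1), read as P(category = 1 | x)."""
    # Note: math.exp can overflow for very large negative z; fine for a sketch.
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(y_true, p_pred):
    """Average negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Each term is -log(p) when y == 1 and -log(1 - p) when y == 0.
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Confident, correct predictions should cost less than confident, wrong ones.
good_cost = cross_entropy([1, 0], [0.9, 0.1])
bad_cost = cross_entropy([1, 0], [0.1, 0.9])
```

The cost is small when the predicted probabilities agree with the labels and grows quickly as they disagree, which is what makes it usable as an objective for gradient descent.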

Linear regression with multiple features

In trying to understand gradient descent, I built a linear regression model with one input; now I am taking that same model and generalizing it to use multiple inputs. An immediate question in constructing this model is which inputs, or features, to use. It turns out this is a general question in machine learning. Deciding the inputs for a model involves not only domain knowledge, such as knowledge of credit when building a credit risk model, but also many techniques for learning useful information from the training data. 09/03/2016

Gradient descent intuitively understood

Gradient Descent is one of many widely used optimization algorithms. It’s built on measuring the change of a function with respect to its parameter. There are variants that extend the vanilla version of Gradient Descent and perform better than it, but a good understanding of the vanilla version is important to begin with. 08/03/2016
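
As a rough illustration of the vanilla version, here is a minimal Python sketch. It is not the code from the post: the helper `gradient_descent`, the example objective f(x) = (x - 3)^2, and the learning rate and step count are all assumptions chosen for the example.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move in the direction that decreases the function.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical objective f(x) = (x - 3)^2, whose derivative is 2 * (x - 3).
def grad_f(x):
    return 2.0 * (x - 3.0)


# Starting from 0, the iterates approach the minimizer x = 3.
x_min = gradient_descent(grad_f, x0=0.0)
```

With this objective each step shrinks the distance to the minimum by a constant factor, so the choice of learning rate directly controls how fast (or whether) the iterates converge.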

A working understanding of SSL/TLS and HTTPS using Python

SSL is designed to defend against man-in-the-middle attacks. Security is no easy thing: SSL can ensure a secure connection if it is correctly implemented. Right now, possibly the most popular implementation is OpenSSL, and the ssl module in Python's stdlib is essentially a wrapper around it. It provides a small set of very high-level operations. To make use of it, a basic understanding of SSL is important. 22/07/2015
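
To give a feel for how high-level those operations are, here is a minimal Python 3 sketch of a client-side setup. It is an assumption of mine rather than the code from the post: the helper names `make_client_context` and `open_tls_connection` are hypothetical, while `ssl.create_default_context`, `SSLContext.wrap_socket`, and `socket.create_connection` are standard library calls.

```python
import socket
import ssl


def make_client_context():
    """Build a client context that verifies the server certificate and hostname."""
    # The default context requires a valid certificate chain and enables
    # hostname checking, which is what defeats a man-in-the-middle.
    return ssl.create_default_context()


def open_tls_connection(hostname, port=443, timeout=5.0):
    """Connect to hostname:port and return (protocol version, peer certificate)."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        # server_hostname is used for SNI and for the hostname check.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            return tls.version(), tls.getpeercert()


context = make_client_context()
```

Calling `open_tls_connection` with a real hostname needs network access; if the certificate does not validate or does not match the hostname, the handshake raises an `ssl.SSLError` instead of silently continuing.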

Understand the import system of Python

In some rare situations, you need to work with Python's import system to get customized behaviour. There are several ways of doing it. The specification of how the import system works has changed over time, and more so between Python 2 and Python 3. Here I give a detailed breakdown that is hopefully still easy to understand. 22/06/2015

Inter-operate with C/C++ in Python

It was in Python's early days that its ability to inter-operate with lower-level languages made it promising. Later, many solutions emerged to improve this ability in some way. 13/05/2015