4 votes
453 reads

Cherry Picking and Global Warming

The recent extreme cold event across North America has once again opened up discussion of the impacts of global warming in the media and on many blogs. Some scientists claim that Arctic warming (a result of global warming) has weakened the westerly winds in the upper atmosphere, known as the jet stream, causing unusual waviness in the stream. This waviness brings cold Arctic air southward and causes the frigid temperatures and record wind chills over many parts of North America, especially the continental US.

3 votes
452 reads

Relationships between Probability Distributions

Probably the best-known relationship between two probability distributions is that a random variable Y has a log-normal distribution if log(Y) is normally distributed. In fact, there are many such interconnections between different probability distributions, as shown in the following figure:
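As a quick numerical check of the log-normal relationship (separate from the figure), here is a minimal Python sketch. The parameter values, the sample size, and the helper name `sample_lognormal` are assumptions chosen for illustration. It draws normal values, exponentiates them, and confirms that the logs recover the normal parameters and that the sample mean of Y is close to the known value exp(mu + sigma^2 / 2).

```python
import math
import random
import statistics


def sample_lognormal(mu, sigma, n, seed=0):
    """Draw n log-normal samples by exponentiating normal draws (hypothetical helper)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(mu, sigma)) for _ in range(n)]


# Illustrative parameters, not taken from the original post.
mu, sigma = 0.5, 0.75
ys = sample_lognormal(mu, sigma, 50_000)

# If Y is log-normal, log(Y) should look normal with mean mu and std sigma.
logs = [math.log(y) for y in ys]
log_mean = statistics.fmean(logs)
log_std = statistics.stdev(logs)

# The mean of a log-normal variable is exp(mu + sigma^2 / 2).
sample_mean = statistics.fmean(ys)
theoretical_mean = math.exp(mu + sigma ** 2 / 2)

print(f"mean of log(Y): {log_mean:.3f} (target {mu})")
print(f"std of log(Y):  {log_std:.3f} (target {sigma})")
print(f"mean of Y:      {sample_mean:.3f} (theory {theoretical_mean:.3f})")
```

The same pattern (transform samples from one distribution and compare summary statistics against the target) can be used to spot-check other relationships, such as a sum of squared standard normals being chi-squared.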

3 votes
479 reads

How to build a machine learning document classification system from scratch using R

Timothy D'Auria shows how to build a machine learning document classification system from scratch in less than 30 minutes using R. He uses a text mining approach to identify the speaker of unmarked presidential campaign speeches. Other applications of this work include brand management, auditing, fraud detection, and electronic medical records.

3 votes
1705 reads

Speeding up Computations in MATLAB using GPU

Graphics Processing Units (GPUs) are increasingly used to speed up computations as a means of parallel programming. They were initially designed to render fast, smooth graphics, but in recent years they have also become a tool for general-purpose parallel computation. The benefit of GPUs over CPUs is that a regular computer might have 4 CPU cores but around 100 GPU cores.

2 votes
1081 reads

Statistics Done Wrong

“Statistics Done Wrong” is an interesting guide by Alex Reinhart to the most common statistical errors committed in science.

2 votes
382 reads

Infinite Mixture Models with Nonparametric Bayes and the Dirichlet Process

Edwin Chen has a brief yet complete introduction to nonparametric Bayes and, in particular, to the Dirichlet Process (DP). He explains the concept of the Dirichlet Process as well as three different representations of it: the Chinese Restaurant Process, the Polya Urn Model, and the Stick-Breaking Process. Finally, he gives an interesting example of using a Dirichlet Process Gaussian Mixture Model to cluster the items on the McDonald's menu. You can find the article here:
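As a rough illustration of two of the representations named above (not taken from the article), the following Python sketch simulates the Chinese Restaurant Process and a truncated Stick-Breaking Process. The function names, the concentration parameter `alpha`, and the truncation level `n_sticks` are assumptions made for this example. The truncation is an approximation, since the true stick-breaking construction has infinitely many pieces.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Seat customers one at a time; return table assignments and table sizes."""
    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for i in range(n_customers):
        # i customers are already seated, so the total weight is i + alpha:
        # an existing table is chosen with weight equal to its size,
        # a new table with weight alpha.
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for table, size in enumerate(table_sizes):
            cumulative += size
            if r < cumulative:
                table_sizes[table] += 1
                assignments.append(table)
                break
        else:
            # No existing table was chosen: open a new one.
            table_sizes.append(1)
            assignments.append(len(table_sizes) - 1)
    return assignments, table_sizes


def stick_breaking_weights(alpha, n_sticks, seed=0):
    """Break a unit stick n_sticks times; return the pieces and the leftover length."""
    rng = random.Random(seed)
    remaining = 1.0
    weights = []
    for _ in range(n_sticks):
        # Each break takes a Beta(1, alpha) fraction of what is left.
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    return weights, remaining


assignments, table_sizes = chinese_restaurant_process(100, alpha=1.0)
weights, leftover = stick_breaking_weights(alpha=1.0, n_sticks=20)
print("number of tables:", len(table_sizes))
print("largest stick weights:", sorted(weights, reverse=True)[:3])
```

Larger values of `alpha` tend to produce more tables in the first function and more evenly spread weights in the second, which is the sense in which `alpha` controls how many clusters a Dirichlet Process mixture is likely to use.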

4 votes
724 reads

Modern Bayesian Nonparametrics - NIPS 2011

An interesting talk on "Modern Bayesian Nonparametrics" by P. Orbanz and Y.W. Teh.

1 vote
509 reads

Bayesian nonparametrics in document and language modeling

A nice talk on "Bayesian nonparametrics in document and language modeling" by Yee Whye Teh. It starts with a brief introduction to Dirichlet Processes and Hierarchical Dirichlet Processes, then applies Hierarchical Dirichlet Processes to document and language modeling.

2 votes
363 reads

Film on "Water in the Anthropocene"

"Water in the Anthropocene", a 3-minute film, tries to visualize the impact of humanity on the global water cycle. "As datasets build upon one another, the film charts Earth's changing global water cycle, why it is changing, and what this means for the future."

1 vote
850 reads

Step by Step tutorial on building R Hadoop System - Yanchang Zhao

In an interesting blog post, Yanchang Zhao provides a step-by-step tutorial on how he set up his first R Hadoop system. He summarizes the steps as:
1. Install Hadoop
2. Run Hadoop
3. Install R
4. Install RHadoop