Posts

Statistical Analysis of Facebook Network of Friends

“The Future of Economics Uses the Science of Real-Life Social Networks” - Paul Ormerod

The goal of this project is not to produce a report, literature review, or synthesis, but rather to get some hands-on experience working with graphs and network data, based on some classical datasets and problems as well as original (own) ones. It will involve both some theoretical understanding and programming. The outcome would be to get comfortable with this type of data and maybe build the ground for some future research.
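To give a flavour of what “hands-on” means here, below is a minimal R sketch of the kind of exploration involved, using igraph. The file name facebook_combined.txt (the classic SNAP ego-Facebook edge list) is an assumption about the dataset used.

```r
# Minimal sketch of exploring a friendship network (assumes the classic
# SNAP ego-Facebook edge list, here named "facebook_combined.txt").
library(igraph)

# Read an undirected edge list: one "from to" pair per line
edges <- read.table("facebook_combined.txt")
g <- graph_from_data_frame(edges, directed = FALSE)

# A few classical summary statistics of the friendship graph
vcount(g)                         # number of nodes (users)
ecount(g)                         # number of edges (friendships)
mean(degree(g))                   # average number of friends
transitivity(g, type = "global")  # global clustering coefficient
diameter(g)                       # longest shortest path
```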

King County Homes Challenge. Exploratory Data Analysis

The King County home prices prediction challenge is an excellent dataset for trying out and experimenting with various regression models. As we’ll see in the following post on Moscow flats, the modeler faces similar challenges: skewed data and outliers, highly correlated variables (predictors), heteroskedasticity, and a geographical correlation structure. Ignoring any one of these may lead to underperforming models, so in this post we’re going to explore the dataset carefully, which should inform the choice of modeling strategy.
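As a preview, here is a quick R sketch of the checks these issues call for. The file and column names follow the Kaggle version of the dataset (kc_house_data.csv) and are assumptions about the exact data used in the post.

```r
# Quick diagnostics for the issues mentioned above (assumes the Kaggle file
# "kc_house_data.csv" with columns such as price, sqft_living, lat, long).
library(ggplot2)

homes <- read.csv("kc_house_data.csv")

# Skewed target with outliers: compare raw vs. log-transformed price
ggplot(homes, aes(price)) + geom_histogram(bins = 50)
ggplot(homes, aes(log(price))) + geom_histogram(bins = 50)

# Highly correlated predictors
cor(homes[, c("sqft_living", "sqft_above", "sqft_lot",
              "bedrooms", "bathrooms")])

# A hint of the geographical correlation structure
ggplot(homes, aes(long, lat, colour = log(price))) +
  geom_point(alpha = 0.3)
```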

Understanding the Carathéodory Extension Theorem

In order to build adequate models of economic and other complex phenomena, we have to take into account their inherent stochastic nature. Data is just the appearance, an external manifestation of latent processes (seen as random mechanisms). Even though we won’t know the exact outcome for sure, we can model the general regularities and relationships that emerge at the large scale of these phenomena. For more ideas, see (Ruxanda 2011).
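For reference, a standard formulation of the theorem in the title (stated here from the textbook literature, not quoted from the post): a premeasure on an algebra of sets extends to a measure on the generated σ-algebra,

$$
\mu_0 : \mathcal{A} \to [0, \infty] \ \text{a premeasure on an algebra } \mathcal{A} \subseteq 2^{\Omega}
\;\Longrightarrow\;
\exists\, \mu : \sigma(\mathcal{A}) \to [0, \infty] \ \text{a measure with } \mu\big|_{\mathcal{A}} = \mu_0,
$$

and the extension is unique whenever $\mu_0$ is $\sigma$-finite. This is what licenses building full probability measures (the “random mechanisms” above) from simple set functions.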

CAPM and Eugene Fama's devastating critique

I was impressed by the down-to-earth debate between Eugene Fama and Richard Thaler. Their discussion was very insightful for making sense of what’s going on with the Efficient Market Hypothesis, CAPM, the Fama-French 3-Factor Model, Markowitz, and where the field is moving. This will be my last blog post on economics for a while, so expect lots of Machine Learning and Statistics topics next. This post is a continuation, meant to add some missing pieces to the analysis done in Part I and Part II.
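For reference, the standard textbook forms of the two models under discussion (my summary, not quoted from the debate). CAPM prices an asset by its market beta alone,

$$
E[R_i] = R_f + \beta_i \left( E[R_m] - R_f \right),
$$

while the Fama-French 3-Factor Model adds size and value factors:

$$
R_i - R_f = \alpha_i + \beta_i (R_m - R_f) + s_i \, SMB + h_i \, HML + \varepsilon_i.
$$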

Tiny Steps in Prospect Theory and Investment Decisions Part II

Last time we went through a rigorous process of eliciting prior beliefs about 5 stocks, exploratory data analysis, and quite advanced descriptive stats. The last part of the assignment has the goal of drawing connections to behavioral economics principles. A lesson learned so far is that there are many pitfalls even in the most innocent-looking questions.

Part IV. Portfolio Construction by Simulation

Before we dig in, I would like to suggest the following reading: “Please, not another bias!” by Jason Collins.
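To make “construction by simulation” concrete, here is a minimal, self-contained R sketch: random long-only weights over 5 stocks, evaluated on stand-in return data. The simulated returns and the number of trials are illustrative assumptions, not the post’s actual data.

```r
# A minimal sketch of portfolio construction by simulation (illustrative only;
# the return matrix stands in for the historical returns of the 5 stocks).
set.seed(42)

n_stocks <- 5
n_days   <- 250
returns  <- matrix(rnorm(n_days * n_stocks, mean = 0.0004, sd = 0.01),
                   ncol = n_stocks)

n_portfolios <- 10000
results <- replicate(n_portfolios, {
  w <- runif(n_stocks)
  w <- w / sum(w)                      # random long-only weights summing to 1
  port <- returns %*% w                # daily portfolio returns
  c(mean = mean(port), sd = sd(port))  # risk/return of this candidate
})

# Risk/return cloud: each point is one simulated portfolio
plot(results["sd", ], results["mean", ],
     xlab = "Daily volatility", ylab = "Mean daily return", pch = ".")
```

Each point in the resulting cloud is one candidate portfolio; the upper-left edge of the cloud approximates the efficient frontier.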

Tiny Steps in Prospect Theory and Investment Decisions Part I

This is an assignment for the Behavioral Economics class in the Quantitative Economics Master’s programme, taught by prof. dr. Anamaria Aldea. The subject is refreshing in the sense that it brings the real world back into the classroom, with a “show me the evidence / data” attitude.

Nonetheless, this is hard to deliver as experimental data is scarce and classroom experiments involving a small sample of people with neoclassical training are hardly representative.

On Google's AlphaZero beating the best chess engine

Chess has taught me how to better deal with Dynamic Complexity, even though I did not realize this at the time. What I saw in these games by AlphaZero was a Deep Reinforcement Learning algorithm that captured the essence of chess. It’s not the fact that Stockfish got crushed, it’s the fashion in which it happened that brought so much fascination and excitement. [Thinking] What if I could put those rusty chess skills to good use again?

Reproducing The Economist Chart

While searching for solutions to fine-tune ggplot2 visualizations, I stumbled upon a nice challenge: to reproduce The Economist’s plot showing the correlation between the Corruption Perception Index and the Human Development Index.

The conclusion from this exercise is that even a simple task like reproducing a chart involves numerous analysis and modeling decisions. The only way to reduce the number of mistakes is to make the analysis reproducible.

Following the data trails, there are two sources of data: the Corruption Perception Index and the Human Development Index releases.
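Once the two sources are merged into a single data frame, the skeleton of the reproduction looks roughly like this. The data frame `dat` and its columns CPI, HDI, and region are hypothetical names for the cleaned data, and the log-linear trend line is one common choice for this particular chart:

```r
# Skeleton of the reproduction (`dat` and its column names are assumptions
# about the merged, cleaned data).
library(ggplot2)

ggplot(dat, aes(x = CPI, y = HDI, colour = region)) +
  geom_point(shape = 1, size = 3) +
  geom_smooth(method = "lm",
              formula = y ~ log(x),   # linear fit of HDI against log(CPI)
              se = FALSE, colour = "red") +
  labs(x = "Corruption Perceptions Index, 2011 (10 = least corrupt)",
       y = "Human Development Index, 2011 (1 = best)",
       title = "Corruption and human development")
```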

R Tutorials for Behavioral Economics Class

In the last blog post I took a bird’s-eye (and personal) view of R programming and suggested not being discouraged by early encounters with this seemingly weird language. The conclusion was that, following “the right tool for the right job” principle, R is a great language for statistical research: its ecosystem of packages improves the data analysis workflow and gives the modeler more tools to extract insights from data.
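For a taste of what that workflow looks like in practice, here is a typical pipeline in the dplyr style; the `grades` data frame and its columns are made up purely for illustration:

```r
# A taste of the "right tool for the right job" workflow: a typical dplyr
# pipeline (the `grades` data frame and its columns are hypothetical).
library(dplyr)

grades %>%
  filter(!is.na(score)) %>%                          # drop missing answers
  group_by(student, topic) %>%
  summarise(avg = mean(score), .groups = "drop") %>% # one row per student/topic
  arrange(desc(avg))
```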

R Programming. The big picture

I wish somebody had shown me the real power of R earlier and explained the big picture

This is not a usual tutorial on R; my goal is to make you aware of and curious about various topics related to Data Analysis in R, which I learned the hard way during a year of nearly daily use. It is not supposed to be easy or to have a particular application in mind, but rather to suggest many possibilities.