The “Master of the Robots” on machine learning in finance

Marcos Lopez de Prado

The Master of the Robots

Marcos Lopez de Prado, who was named Quant of the Year for 2019 by the Journal of Portfolio Management, is widely regarded as one of the leading quantitative mathematicians in today’s financial world. He currently ranks #1 among authors in the economics field on the SSRN research network, as measured by downloaded articles within the past 12 months.

In a Bloomberg article titled “The Master of Robots Left AQR. Now He’s Coming for Wall Street,” Lopez de Prado explains why he established his own company, True Positive Technologies, to provide machine learning algorithms and expertise: “There is tremendous hype and very few people have a track record. … It’s not helpful.”

Recent returns suggest that many machine learning or AI-based investment funds have been only marginally successful. The Eurekahedge AI index registered a -4.31% return in 2018 and a 3.35% return in 2019 (as of 9 Oct), compared with -4.01% and 5.73%, respectively, for the overall equal-weighted Eurekahedge index, and with -6.37% and 16.49%, respectively, for an S&P500 index fund.

According to the Bloomberg article, Lopez de Prado’s diagnosis is that “Fund managers are routinely throwing data at a robot without forming a theory … Without this theory-ML interplay, investors are placing their trust on either toy models or high-tech horoscopes.”

How machine learning differs from traditional regression and big data

Lopez de Prado has also posted a new paper to the SSRN site, “Q&A on Financial Machine Learning,” in which he explains how machine learning differs from the traditional regression analysis that has long been the mainstay of economics and finance. He illustrates the difference using data on the Titanic’s passengers:

Consider the following example: A researcher wishes to estimate the survival probability of a passenger on the Titanic, based on a number of variables, such as gender, ticket class, age, etc. A typical regression approach would be to fit a logit model to a binary variable, where 1 means survivor and 0 means deceased, using gender, ticket class and age as regressors. It turns out that, even though these regressors are correct, a logit (or probit) regression model fails to make good predictions. The reason is that logit models do not recognize that this dataset embeds a hierarchical (tree-like) structure, with complex interactions. For example, adult males in second class died at a much higher rate than each of these attributes taken independently. In contrast, a simple “classification tree” algorithm performs substantially better, because the algorithm learns the hierarchical nature of the dataset (and associated complex interactions).
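The contrast described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it does not use the actual Titanic data, but a synthetic toy dataset in which the outcome depends purely on the interaction of two binary attributes (with 10% of labels flipped at random), and it assumes scikit-learn is available. The feature layout, sample size and noise level are all made up for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic toy data (NOT the Titanic dataset): three binary attributes.
rng = np.random.default_rng(0)
n = 4000
X = rng.integers(0, 2, size=(n, 3))

# The outcome depends only on the interaction of the first two attributes
# (1 when exactly one of them is set); the third attribute is irrelevant.
y = X[:, 0] ^ X[:, 1]

# Flip 10% of the labels at random so the problem is not perfectly clean.
flip = rng.random(n) < 0.10
y = np.where(flip, 1 - y, y)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A logit model with main effects only (no hand-built interaction terms).
logit = LogisticRegression().fit(X_train, y_train)

# A shallow classification tree; depth 3 is enough to isolate every
# combination of three binary attributes.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

logit_acc = logit.score(X_test, y_test)
tree_acc = tree.score(X_test, y_test)
print(f"logit accuracy: {logit_acc:.3f}")
print(f"tree accuracy:  {tree_acc:.3f}")
```

In this deliberately extreme setup the tree should recover the interaction while the main-effects logit cannot. A logit model could be rescued here by adding an explicit interaction term, but that requires the researcher to know the structure in advance, which is the point of the example.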

Lopez de Prado also emphasizes that machine learning is more than merely “big data”: practitioners must analyze the data properly to be effective, and must use more sophisticated models than those traditionally employed in economics. As he explains:

The bad news is that these datasets are beyond the grasp of econometrics, and pose multiple challenges to the study of economics. To cite just a few: (a) some of the most interesting datasets are unstructured. They can also be non-numerical and non-categorical, like news articles, voice recordings or satellite images; (b) these datasets are high-dimensional (e.g., credit card transactions.) The number of variables involved often greatly exceeds the number of observations, making it very difficult to apply linear algebra solutions; (c) many of these datasets are extremely sparse. For instance, samples may contain a large proportion of zeros, where standard notions such as correlation do not work well; and (d) embedded within these datasets is critical information regarding networks of agents, incentives and aggregate behavior of groups of people. ML techniques are designed for analyzing big data, which is why they are often cited together.
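Point (b) in the passage above can be made concrete with a small numerical sketch. Assuming NumPy and purely random, made-up data (5 observations, 20 variables), the code below shows why ordinary least squares breaks down when variables outnumber observations: the matrix X'X is rank-deficient, so very different coefficient vectors reproduce the observations equally well. The variable names are illustrative only.

```python
import numpy as np

# Random illustrative data: far more variables than observations.
rng = np.random.default_rng(0)
n_obs, n_vars = 5, 20
X = rng.normal(size=(n_obs, n_vars))
y = rng.normal(size=n_obs)

# The 20x20 matrix X'X can have rank at most n_obs, so it is singular
# and the usual OLS normal equations have no unique solution.
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)
print(f"rank of X'X: {rank} (matrix is {n_vars}x{n_vars})")

# lstsq returns the minimum-norm solution among infinitely many.
beta_min_norm, *_ = np.linalg.lstsq(X, y, rcond=None)

# The last right-singular vector of X lies in its null space, so adding
# any multiple of it changes the coefficients but not the fitted values.
_, _, vt = np.linalg.svd(X)
null_dir = vt[-1]
beta_other = beta_min_norm + 10.0 * null_dir

fit_a = X @ beta_min_norm
fit_b = X @ beta_other
print("same fitted values:", np.allclose(fit_a, fit_b))
print("same coefficients: ", np.allclose(beta_min_norm, beta_other))
```

Regularized or tree-based methods sidestep this by imposing extra structure on the solution rather than inverting X'X directly.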

Less of a casino and more of a utility

Lopez de Prado’s long-term vision is that finance will eventually be regarded as straightforward, well understood and perhaps even a tad boring, much like a visit to the doctor, where there is an established protocol for each particular medical problem or need. “That is my hope: eventually we make finance more scientific and as a result it becomes less of a casino and more of a utility.”

For additional details, see the Bloomberg article and the SSRN paper.
