A field day for machine learning and artificial intelligence PhDs
In a 27 March 2019 Bloomberg op-ed, Stony Brook University professor Noah Smith describes the quest by many technology and finance companies to hire top-tier PhD graduates, particularly in machine learning (ML) and artificial intelligence (AI). A 2017 Paysa study found that 35% of listed jobs in ML and AI required a PhD.
All of the major tech firms are aggressively expanding their staffs in the ML and AI arenas. Google has more than tripled its number of machine learning researchers in the past few years. Amazon is hiring people not only to support its popular Echo product, but also in its core online shopping and enterprise cloud operations. Apple is redoubling its efforts to support and upgrade its Siri voice-recognition system, and recently began offering an advanced face recognition feature on its iPhones and iPads, complete with a special-purpose “neural engine” in its processors to support this function. Facebook is aggressively hiring to improve its ability to connect users with the content they prefer, while at the same time blocking fake news and other objectionable content. Microsoft is pursuing a broad range of research and development in ML, AI and deep learning.
Salaries for those in ML and AI are quite generous: Paysa found that average U.S. salaries in the field ranged from $102,000 in the Midwest to $157,000 on the West Coast. These are 2017 salaries; current salaries are significantly higher.
Machine learning and artificial intelligence PhDs in finance
Leading quant firms are aggressively recruiting top ML/AI researchers in the finance field. Some persons in the field earn more than $1 million per year.
There is no shortage of tasks for people with ML and AI training in finance. One obvious area is quantitative finance: using ML and AI to automatically find statistically significant phenomena in market data, and in data from other sources, that can lead to a profitable trading strategy. It is worth noting here that highly mathematical, big-data-oriented quant firms and trading operations are, to a steadily increasing degree, the only ones that consistently earn better-than-market-average returns. See this previous Mathematical Investor blog for more details.
One good reference here is the recently published book Advances in Financial Machine Learning by Marcos Lopez de Prado. It explores commonly used data structures in finance, modeling techniques, backtesting techniques (and ways to avoid backtest overfitting), and other more advanced techniques based on a machine learning approach.
But there are numerous other opportunities for ML and AI in finance. According to a Bloomberg report, some specific areas that are prime for automation include:
- Sell side credit markets: Natural-language processing, data collection and machine learning are being applied to automate subjective human decisions.
- Sell side foreign exchange: Big data and machine learning are being used to anticipate variations in client demand and the resulting price swings.
- Sell side commodities: Trader and salesperson conversations are being catalogued to create profiles of clients.
- Sell side equities: Artificial intelligence is being applied to order execution.
- Buy side equities: Predictive analytics is being applied to time stock purchases and assess risk based on market liquidity.
- Buy side credit: Computer programs are being trained to scan and understand bond covenants, legal documents and court rulings.
- Buy side macroeconomics: Natural-language processing is being used to analyze central bank commentary for clues on monetary policy. Other software is analyzing data such as oil-tanker shipments and satellite images (e.g., Chinese industrial sites, Walmart parking lots and more) to spot trends in the economy.
Other potential applications for ML, AI and big data in finance are highlighted in two previous Mathematical Investor blogs: Blog A and Blog B.
What is the best training for finance PhDs?
In his Bloomberg column, Noah Smith wonders whether PhDs are really the best choice for industry in general, or, at the least, whether the current training for finance PhDs in particular is the best preparation for careers in the field. He suggests forming academic tracks that guide students to employment in the industry, possibly including apprentice-like research done in conjunction with advisers in the private sector, with dissertation research possibly done in team efforts rather than alone.
Marcos Lopez de Prado has also expressed concern about the typical preparation of researchers in finance careers. He notes, for example, that econometric models often employ statistical practices, such as multiple testing, that are not only considered ineffective but also downright unethical in other scientific research fields. And while most mathematical training for such persons is in areas such as linear algebra and calculus, topics such as graph theory, topology, discrete mathematics, information theory and signal processing are rising in importance.
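To see why uncorrected multiple testing is so hazardous, consider a standard textbook calculation (not taken from Lopez de Prado's commentary). Assume, purely for illustration, that a researcher runs N independent tests, each at significance level α, on data containing no real effect. The chance of at least one spurious "discovery" is then:

```latex
\[
  P(\text{at least one false positive}) = 1 - (1 - \alpha)^{N}
\]
% Illustrative values (assumed): \alpha = 0.05, N = 20
\[
  1 - (0.95)^{20} \approx 0.64
\]
```

Under these assumed values, roughly two out of three such research efforts would report a "significant" finding even though nothing is there. Real backtests are rarely independent, so the exact figure will differ, but the qualitative point holds.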
In an Institutional Investor commentary, Lopez de Prado goes further, arguing that “The presence of financial academia is fading, something that was unthinkable 10 years ago.” He adds, “The [leading] edge is not yet another reincarnation of the capital asset pricing model,” but instead it is in analyzing heretofore untapped data sources. Technologies such as FinTech, big data, ML and quantum computing are likely to render traditional academic education in finance even more irrelevant. Compounding the problem is that many academic journals in the finance field are mostly geared as “tenure-track vehicles” for aspiring professors, rather than venues for state-of-the-art research by practitioners. Similarly, books in the field are written by authors who, in many cases, have not actually attempted to field their techniques. As Lopez de Prado explains, “They contain extremely elegant mathematics that describe a world that does not exist.”
As David H. Bailey and Lopez de Prado further argued in a Forbes interview conducted by Brett Steenbarger, rigorous training in statistics is typically not given appropriate emphasis for prospective finance professionals, PhD or not. As a result, the finance field is replete with backtest overfitting and multiple-testing errors. Even more significantly, many in the field fail to appreciate how deeply these difficulties pervade modern finance, and the extent to which customers and the public are potentially misled by inaccurate claims.
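Backtest overfitting can be illustrated with a small simulation. The sketch below is not drawn from the Forbes piece; it is a minimal toy example in which every "strategy" is pure zero-mean noise, so none has genuine skill. The function names, the number of strategies, the number of periods and the volatility are all hypothetical choices made for illustration.

```python
import random
import statistics


def sharpe(returns):
    """Per-period Sharpe ratio (mean / sample std dev), not annualized."""
    sd = statistics.stdev(returns)
    return statistics.mean(returns) / sd if sd > 0 else 0.0


def best_of_n_backtest(n_strategies=200, n_periods=250, seed=0):
    """Pick the best in-sample strategy among pure-noise candidates.

    Returns (best in-sample Sharpe, that strategy's out-of-sample Sharpe,
    list of all (in-sample, out-of-sample) Sharpe pairs).
    """
    rng = random.Random(seed)
    results = []
    for _ in range(n_strategies):
        # Assumed return model: i.i.d. Gaussian noise with zero mean,
        # so any apparent edge is due to chance alone.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_periods)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_periods)]
        results.append((sharpe(in_sample), sharpe(out_sample)))
    # Selecting on in-sample performance is the multiple-testing step.
    best_in, best_out = max(results, key=lambda pair: pair[0])
    return best_in, best_out, results


if __name__ == "__main__":
    best_in, best_out, results = best_of_n_backtest()
    avg_out = statistics.mean(pair[1] for pair in results)
    print(f"best in-sample Sharpe:        {best_in:.3f}")
    print(f"same strategy, out-of-sample: {best_out:.3f}")
    print(f"average out-of-sample Sharpe: {avg_out:.3f}")
```

Because the winner is chosen after looking at many candidates, its in-sample Sharpe ratio looks attractive even though it was generated by noise, while its out-of-sample figure is just another draw centered on zero. Reporting only the winner, without disclosing how many strategies were tried, is the error the authors warn about.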
Industry-university partnerships
What is the answer? Like Noah Smith, Lopez de Prado believes that universities need to create industry-university apprenticeships. He envisions students pursuing finance degrees at institutions such as the University of Chicago or the Wharton School of Business “dirty[ing] their hands in the Wall Street weeds,” so that they can better compete with highly trained mathematicians and computer scientists graduating from places like MIT and the California Institute of Technology.
Indeed, finance is rapidly becoming a full-fledged data-driven discipline, on a par with scientific disciplines such as physics (high-energy accelerator data), astronomy (large star databases and remotely monitored telescopes), cosmology (cosmic microwave background data), environmental monitoring (real-time earth-observing satellite data) and even biology (DNA sequence data). Programs for training researchers in the finance field need to reflect this new reality.