Transcript: Jon McAuliffe

The transcript from this week’s MiB: Jon McAuliffe, the Voleon Group, is below.

You can stream and download our full conversation, including any podcast extras, on iTunes, Spotify, Google, YouTube, and Bloomberg. All of our earlier podcasts on your favorite pod hosts can be found here.

~~~

ANNOUNCER: This is Masters in Business with Barry Ritholtz on Bloomberg Radio.

BARRY RITHOLTZ, HOST, MASTERS IN BUSINESS: This week on the podcast, strap yourself in. I have another extra special guest. Jon McAuliffe is co-founder and chief investment officer at the Voleon Group. They’re a $5 billion hedge fund and one of the earliest shops to ever use machine learning as it applies to trading and investment management decisions. It is a fully systematic approach, using computer horsepower, databases, machine learning, and their own predictive engine to make investments and trades, and it’s managed to put together quite a track record.

Previously, Jon was at D. E. Shaw where he ran statistical arbitrage. He is one of the people who worked on the Amazon recommendation engine, and he is currently a professor of statistics at Berkeley.

I don’t even know where to begin other than to say, if you’re interested in AI or machine learning or quantitative strategies, this is just a master class in how it’s done by one of the first people in the space to not only do this sort of machine learning and apply it to investing, but one of the best. I think this is a fascinating conversation, and I believe you will find it to be so.

So, with no further ado, my discussion with Voleon Group’s Jon McAuliffe.

Jon McAuliffe, welcome to Bloomberg.

JON MCAULIFFE, CO-FOUNDER AND CHIEF INVESTMENT OFFICER, THE VOLEON GROUP: Thanks, Barry. I’m really happy to be here.

RITHOLTZ: So let’s talk a little bit about your academic background first. You start out undergrad computer science and applied mathematics at Harvard. Before you go on to get a PhD from California Berkeley, what led to a career in data analysis? How early did you know that’s what you wanted to do?

MCAULIFFE: Well, it was a winding path, actually. I was very interested in international relations and foreign languages when I was finishing high school. I spent the last year of high school as an exchange student in Germany. And so when I got to college, I was expecting to major in government and go on to maybe work in the foreign service, something like that.

RITHOLTZ: Really? So this is a big shift from your original expectations.

MCAULIFFE: Yeah. It took about one semester for me to realize that none of the questions that were being asked in my classes had definitive and correct answers.

RITHOLTZ: Did that frustrate you a little bit?

MCAULIFFE: It did frustrate me. Yeah.

And so I didn’t go home; I stayed at college over winter break to try to sort out what the heck I was going to do, because I could see that my plan was in disarray. And I’d always been interested in computers, had played around with computers, never done anything very serious, but I thought I might as well give it a shot. And so in the spring semester, I took my first computer science course. And when you write software, everything has a right answer. It either does what you want it to do or it doesn’t.

RITHOLTZ: Does not compile.

MCAULIFFE: Exactly.

RITHOLTZ: So that’s really quite fascinating. So what led you from Berkeley to D. E. Shaw? They’re one of the first quant shops. How did you get there? What sort of research did you do?

MCAULIFFE: Yeah, I actually, I spent time at D. E. Shaw in between my undergrad and my PhD program. So it was after Harvard that I went to D. E. Shaw.

RITHOLTZ: So did that light an interest in using machine learning and computers applied to finance or what was that experience like?

MCAULIFFE: Yeah, it made me really interested in and excited about using statistical thinking and data analysis to sort of understand the dynamics of securities prices.

Machine learning did not really play a role at that time, I think not just at D. E. Shaw, but probably anywhere. It was too immature a field in the ’90s. But I had already been curious and interested in using these kinds of statistical tools in trading and in investing when I was finishing college. And then at D. E. Shaw, I had brilliant colleagues and we were working on hard problems. So I really got a lot out of it.

RITHOLTZ: Still one of the top performing hedge funds, one of the earliest quant hedge funds, a great place to cut your teeth at.

MCAULIFFE: Absolutely.

RITHOLTZ: So was it Harvard, D. E. Shaw, and then Berkeley?

MCAULIFFE: Yeah, that’s right.

RITHOLTZ: And then from Berkeley, how did you end up at Amazon? I guess I should correct myself. There was a year at Amazon after D. E. Shaw, but before Berkeley. And am I reading this correctly? The recommendation engine that Amazon uses, you helped develop?

MCAULIFFE: I would say I worked on it.

RITHOLTZ: Okay.

MCAULIFFE: It was already in place when I got there. And the things that are familiar about the recommendation engine had already been built by my manager and his colleagues.

But I did research on improvements and different ways of forming recommendations. It was funny because at the time, the entire database of purchase history for all of Amazon fit in one 20 gigabyte file on a disk, so I could just load it onto my computer and run analyses right on it.

RITHOLTZ: I don’t think we could do that anymore.

MCAULIFFE: We could not.

RITHOLTZ: So thank goodness for Amazon Web Services, so you could put, what is it, 25 years and hundreds of billions of dollars of transactions?

MCAULIFFE: Yes.

RITHOLTZ: So my assumption is products like that are highly iterative. The first version is all right, it does a half decent job and then it gets better and then it starts to get almost spookily good. It’s like, “Oh, how much of that is just the size of the database and how much of that is just a clever algorithm?”

MCAULIFFE: Well, that’s a great question because the two are inextricably linked. The way that you make algorithms great is by making them more powerful, more expressive, able to describe lots of different kinds of patterns and relationships. But those kinds of approaches need huge amounts of data in order to correctly sort out what’s signal and what’s noise.

The more expressive a tool like that is, like a recommender system, the more prone it is to mistake one-time noise for persistent signal. And that is a recurring theme in statistical prediction. It is really the central problem in statistical prediction.

So you have it in recommender systems, you have it in predicting price action in the problems that we solve and elsewhere.

RITHOLTZ: There was a pretty infamous New York Times article a couple of years ago about Target using its own recommender system and sending out maternity promotions to people. A dad sees the maternity coupons addressed to his young teenage daughter, says, “What is this?” and goes in to yell at them, and it turns out she was pregnant and the system had pieced it together.

How far of a leap is it from these systems to much more sophisticated machine learning and even large language models?

MCAULIFFE: The answer, it turns out, is that it’s a question of scale, and that wasn’t at all obvious before GPT-3 and ChatGPT. GPT, for example, is built from a database of sentences in English, and that database has a trillion words in it.

RITHOLTZ: Wow.

MCAULIFFE: And when you take a trillion words and you use it to fit a model that has 175 billion parameters, there is apparently a kind of transition where things become, you know, frankly astounding. I don’t think that anybody who isn’t astounded is telling the truth.

RITHOLTZ: Right, it’s eerie in terms of how sophisticated it is, but it’s also kind of surprising in terms of, I guess what the programmers like to call hallucinations. I guess if you’re using the internet as your base model, hey, there’s one or two things on the internet that are wrong. So of course, that’s going to show up in something like ChatGPT.

MCAULIFFE: Yeah. Underlying it, there’s this tool, GPT-3. That’s really the engine that powers ChatGPT. And that tool has one goal. It’s a simple goal. You show it the beginning of a sentence, and it predicts the next word in the sentence. And that’s all it is trained to do. I mean, it really is actually that simple.
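
To make that next-word objective concrete, here is a minimal sketch using a toy bigram counting model; the corpus is invented, and counting stands in for GPT-3’s 175 billion learned parameters, but the goal, predicting the next word from context, is the same.

```python
from collections import Counter, defaultdict

# Toy next-word predictor: a bigram model fit by counting co-occurrences.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1  # how often `nxt` follows `prev`

def predict_next(word):
    """Return the most likely next word and its estimated probability."""
    following = counts[word]
    best, n = following.most_common(1)[0]
    return best, n / sum(following.values())

# The model has no notion of true or false; it only predicts likely words.
print(predict_next("the"))  # ('cat', 0.25) on this tiny corpus
```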

RITHOLTZ: It’s a dumb program that looks smart.

MCAULIFFE: If you like. But the thing about predicting the next word in a sentence is that whether the sequence of words being output leads to something true or false is irrelevant. The only thing it is trained to do is make highly accurate predictions of next words.

RITHOLTZ: So when I said dumb, it’s really very sophisticated. It just, we tend to call this artificial intelligence, but I’ve read a number of people said, “Hey, this really isn’t AI. This is something a little more rudimentary.”

MCAULIFFE: Yeah, I think a critic would say that artificial intelligence is a complete misnomer. There’s sort of nothing remotely intelligent in the colloquial sense about these systems. And then a common defense in AI research is that artificial intelligence is a moving target. As soon as you build a system that does something quasi magical that was the old yardstick of intelligence, then the goalposts get moved by the people who are supplying the evaluations.

And I guess I would sit somewhere in between. I think the language is unfortunate because it’s so easily misconstrued. I wouldn’t call the system dumb and I wouldn’t call it smart. Those are not characteristics of these systems.

RITHOLTZ: But it’s complex and sophisticated.

MCAULIFFE: It certainly is. It has 175 billion parameters. If that doesn’t fit your definition of complex, I don’t know what would.

RITHOLTZ: Yeah, that works for me. So in your career timeline, where does Affymetrix fit in, and what was that work like?

MCAULIFFE: Yeah, so that was work I did as a summer research intern during my PhD. And that work was about a problem called genotype calling.

So–

RITHOLTZ: Genotype calling.

MCAULIFFE: I’ll explain, Barry. Do you have an identical twin?

RITHOLTZ: I do not.

MCAULIFFE: Okay, so I can safely say your genome is unique in the world. There’s no one else who has exactly your genome. On the other hand, if you were to lay your genome and mine alongside each other, lined up, they would be 99.9% identical. About one position in a thousand is different. But those differences are what cause you to be you and me to be me. They’re obviously of intense scientific and applied interest.

And so it’s very important to be able to take a sample of your DNA and quickly produce a profile of all the places that have variability, what your particular values are. And that problem is the genotyping problem.
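
As a toy illustration of what a genotyper ultimately produces, here is a sketch that compares a sample sequence against a reference and reports the positions that differ; the 12-letter strings are invented and vastly shorter than a real genome.

```python
# Hypothetical mini-example: report positions where a sample differs from
# a reference sequence. Between two real human genomes, roughly 1 position
# in 1,000 differs; here the strings are tiny and made up.
reference = "ACGTACGTACGT"
sample    = "ACGTACCTACGA"

variants = [(i, ref, alt)
            for i, (ref, alt) in enumerate(zip(reference, sample))
            if ref != alt]
print(variants)  # [(6, 'G', 'C'), (11, 'T', 'A')]
```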

RITHOLTZ: And this used to be a very expensive, very complex problem to solve that we spent billions of dollars figuring out. Now a lot faster, a lot cheaper.

MCAULIFFE: A lot faster. In fact, even the technology I worked on in 2005, 2004 is multiple generations old and not really what’s used anymore.

RITHOLTZ: So let’s talk about what you did at Efficient Frontier. Explain what real-time click prediction rules are and how they work for a keyword search.

MCAULIFFE: Sure. The revenue engine that drives Google is search keyword ads. So every time you do a search, at the top you see ad, ad, ad. So how do those ads get there? Well, actually, it’s surprising, maybe, if you don’t know about it, but every single time you type in a search term on Google and hit return, a very fast auction takes place. And a whole bunch of companies running software bid electronically to place their ads at the top of your search results. And, more or less, the results that are shown on the page are in order of how much they bid.

It’s not quite true, but you could think of it as true.

RITHOLTZ: A rough outline. So the first three sponsored results on a Google page go through that auction process. And I think at this point, everybody knows what PageRank is for the rest of the results.

MCAULIFFE: Yeah, that’s right.

RITHOLTZ: And that seemed to be Google’s secret sauce early on, right?

MCAULIFFE: Well, to talk about the ad placement, the people who are supplying the ads, who are participating in these auctions, they have a problem, which is how much to bid, right?

And so how would you decide how much to bid? Well, you want to know basically the probability that somebody is going to click on your ad, right? And then you would multiply that by how much money you make eventually if they click. And that’s kind of an expectation of how much money you’ll make.

And so then you gear your bid price to make sure that it’s going to be profitable for you. And then, so really you have to make a decision about what this click-through rate is going to be. You have to predict the click-through probability. And that was the problem I worked on.
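
A rough sketch of that expected-value bidding logic; in practice the click probability comes from a trained prediction rule, and every number here is invented for illustration.

```python
# Turn a predicted click-through rate into a keyword bid.
p_click = 0.03            # predicted probability the ad gets clicked
revenue_if_click = 2.00   # expected eventual revenue per click, in dollars

expected_value = p_click * revenue_if_click   # expected revenue per impression
margin = 0.7                                  # bid below EV to stay profitable
bid = expected_value * margin

print(f"expected value ${expected_value:.4f}, bid ${bid:.4f}")
```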

RITHOLTZ: So I was going to say, this sounds like it’s a very sophisticated application of computer science, probability, and statistics. And if you do it right, you make money. And if you do it wrong, your ad budget is a money loser.

MCAULIFFE: That’s right.

RITHOLTZ: So tell us a little bit about your doctorate, what you wrote about for your PhD at Berkeley?

MCAULIFFE: Yeah. So we’re back to genomes, actually. It was around the time I was in my first year of my PhD program that the human genome was published in “Nature”. So it was really the beginning of the explosion of work on high throughput, large scale genetics research. And one really important question, after you’ve sequenced a genome, is, well, what are all the bits of it doing? You can look at a string of DNA. It’s just made up of these four letters. But you don’t want to just know the four letters. They’re a code. And some parts of the DNA represent useful stuff that is being turned by your cells into proteins, et cetera. And other parts of the DNA don’t appear to have any function at all. It’s really important to know which is which as a biology researcher.

For a long time before high throughput sequencing, biologists would be in the lab and would very laboriously look at very tiny segments of DNA and establish what their function was. But now we have the whole human genome sitting on disk, and we would like to be able to just run an analysis on it and have the computer spit out everything that is functional and not functional, right?

And so that’s the problem I worked on. And a really important insight is that you can take advantage of the ideas of natural selection and evolution to help you. And the way you do that is you take the human genome, you sequence a bunch of primate genomes, nearby relatives of the human, and you lay all those genomes on top of each other. And then you look for places where all of the genomes agree, right? There hasn’t been variation creeping in through mutations.

And why hasn’t there been? Well, the biggest force that throws out variation is natural selection. If you get a mutation in a part of your genome that really matters, then you’re kind of unfit and you won’t have progeny and that’ll get stamped out.

So natural selection is this very strong force that’s causing DNA not to change. And so when you make these primate alignments, you can really leverage that fact and look for conservation and use that as a big signal that something is functional.
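
In code, the conservation idea can be caricatured in a few lines: stack the aligned genomes and flag columns where every sequence agrees, treating conserved stretches as candidate functional regions. The alignment below is fabricated.

```python
# Toy conservation scan over a fabricated primate alignment.
alignment = [
    "ACGTTAGCTAACGT",  # human
    "ACGTTAGCTTACGT",  # chimp
    "ACGTTAGCTAACGA",  # gorilla
]

# A column is conserved when all genomes show the same letter there.
conserved = [len(set(column)) == 1 for column in zip(*alignment)]

print("".join("*" if c else "." for c in conserved))  # conservation map
print(f"fraction conserved: {sum(conserved) / len(conserved):.2f}")
```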

RITHOLTZ: Really, really interesting. You mentioned our DNA is 99.9% the same.

MCAULIFFE: Yeah.

RITHOLTZ: I don’t know how many places to the right of the decimal point you would want to go, but very similar. How similar or different are we from, let’s say a chimpanzee? I’ve always–

MCAULIFFE: Great question.

RITHOLTZ: There’s an urban legend that they’re practically the same. It always seems like it’s overstated.

MCAULIFFE: 98%.

RITHOLTZ: 98%, so it’s a 2% difference.

So you and I are 0.1% different; me and the average chimp, it’s 2.0% different.

MCAULIFFE: That’s exactly right, yeah. So chimps are essentially our closest non-human primate relatives.

RITHOLTZ: Really, really quite fascinating.

So let’s talk a little bit about the firm. You guys were one of the earliest pioneers of machine learning research. Explain a little bit what the firm does.

MCAULIFFE: Sure, so we run trading strategies, investment strategies that are fully automated. So we call them fully systematic. And that means that we have software systems that run every day during market hours. And they take in information about the characteristics of the securities we’re trading, think of stocks, right?

And then they make predictions of how the price of each security is going to change over time. And then they decide on changes in our inventory, changes in held positions, based on those predictions. And then those desired changes are sent into an execution system, which automatically carries them out. Okay?

RITHOLTZ: So fully automated, is there human supervision or it’s kind of running on its own with a couple of checks?

MCAULIFFE: There’s lots of human diagnostic supervision, right? So there are people who are watching screens full of instrumentation and telemetry about what the systems are doing, but those people are not taking any actions, unless there’s a problem, and then they do.

RITHOLTZ: So let’s talk a little bit about how machines learn to identify signals. I’m assuming you start with a giant database that is the history of stock prices, volume, et cetera, and then bring in a lot of additional things to bear, what’s the process like developing a particular trading strategy?

MCAULIFFE: Yeah. So as you’re saying, we begin with a very large historical data set of prices and volumes, market data of that kind, but importantly, all kinds of other information about securities. So financial statement data, textual data, analyst data.

RITHOLTZ: So it’s everything from prices to fundamentals, everything from earnings to revenue to sales, et cetera. I’m assuming the change, and the delta of the change, is going to be very significant in that.

What about macroeconomic data, what some people call noise, but one would imagine some of it is signal: everything from inflation to interest rates to GDP to consumer spending?

MCAULIFFE: Sure.

RITHOLTZ: Are those inputs worthwhile or how do you think about those?

MCAULIFFE: So we don’t hold portfolios that are exposed to those things. So it’s really a business decision on our part. We are working with institutional investors who already have as much exposure as they want to things like the market or to well-recognized econometric risk factors like value.

RITHOLTZ: Right.

MCAULIFFE: So they don’t need our help to be exposed to those things. They are very well equipped to handle that part of their investment process. What we’re trying to provide is the most diversification possible. So we want to give them a new return stream, which has good and stable returns, but on top of that, importantly, is also not correlated with any of the other return streams that they already have.

RITHOLTZ: That’s interesting. So can I assume that you’re applying your machine learning methodology across different asset classes or is it strictly equities?

MCAULIFFE: Oh no, we apply it to equities, to credit, to corporate bonds, and we trade futures contracts. And in the fullness of time, we hope that we will be trading kind of every security in the world.

RITHOLTZ: So, currently, stocks, bonds, when you say futures, I assume commodities?

MCAULIFFE: All kinds of futures contracts.

RITHOLTZ: Really, really interesting. So, it could be anything from interest rate swaps to commodities to the full gamut.

So how different is this approach from what other quant shops do that really focus on equities?

MCAULIFFE: I think it’s kind of the same question as asking, “Well, what do we mean when we say we use machine learning or that, you know, our principles are machine learning principles?” And so how does that make us different than the kind of standard approach in quantitative trading?

And the answer to the question really comes back to this idea we mentioned a little while ago of how powerful the tools are that you’re using to form predictions, right? So in our business, the thing that we build is called a prediction rule, okay? That’s our widget. And what a prediction rule does is it takes in a bunch of input, a bunch of information about a stock at a moment in time, and it hands you a guess about how that stock’s price is going to change over some future period of time, okay?

And so there’s one most important question about prediction rules, which is how complex are they? How much complexity do they have?

Complexity is a colloquial term. It’s, you know, unfortunately another example of a place where things can be vague or ambiguous because a general purpose word has been borrowed in a technical setting. But when you use the word complexity in statistical prediction, there’s a very specific meaning.

It means how much expressive power does this prediction rule have? How good a job can it do of approximating what’s going on in the data you show it? Remember, we have these giant historical data sets, and every entry in the data set looks like this: what was going on with the stock at a certain moment in time, its price action, its financials, analyst information, and then what did its price do in the subsequent 24 hours or the subsequent 15 minutes or whatever, okay?

And so when you talk about the amount of complexity that a prediction rule has, that means how well is it able to capture the relationship between the things that you can show it when you ask it for a prediction and what actually happens to the price.

And naturally, you kind of want to use high complexity rules because they have a lot of approximating power. They do a good job of describing anything that’s going on. But there are two disadvantages to high complexity. One is it needs a lot of data. Otherwise it gets fooled into thinking that randomness is actually signal.

And the other is that it’s hard to reason about what’s going on under the hood, right? When you have very simple prediction rules, you can sort of summarize everything that they’re doing in a sentence, right? You can look inside them and get a complete understanding of how they behave. And that’s not possible with high complexity prediction rules.

RITHOLTZ: So I’m glad you brought up the concept of how easily, or how frequently, you can fool an algorithm or a complex rule, because sometimes the results are just random. And it reminds me of the issue of backtesting. No one ever shows you a bad backtest. How do you deal with the issue of overfitting, and backtesting that is just geared towards what already happened and not what might happen in the future?

MCAULIFFE: Yeah, that is, you know, if you like, the million dollar question in statistical prediction, okay? And you might find it surprising that relatively straightforward ideas go a long way here. And so let me just describe a little scenario of how you can deal with this.

We agree we have this big historical data set. One thing you could do is just start analyzing the heck out of that data set and find a complicated prediction rule. But you’ve already started doing it wrong. The first thing you do before you even look at the data is you randomly pick out half of the data and you lock it in a drawer. And that leaves you with the other half of the data that you haven’t locked away.

On this half, you get to go hog wild. You build every kind of prediction rule, simple rules, enormously complicated rules, everything in between, right? And now you can check how accurate all of these prediction rules that you’ve built are on the data that they have been looking at. And the answer will always be the same. The most complex rules will look the best. Of course, they have the most expressive power. So naturally they do the best job of describing what you’ve shown them.

The big problem is that what you showed them is a mix of signal and noise, and there’s no way you can tell to what extent a complex rule has found the signal versus the noise. All you know is that it’s perfectly described the data you showed it.

You certainly suspect it must be overfitting if it’s doing that well, right?

Okay, so now you freeze all those prediction rules. You’re not allowed to change them in any way anymore. And now you unlock the drawer and you pull out all that data that you’ve never looked at. You can’t overfit data that you never fit. And so you take that data and you run it through each of these frozen prediction rules that you built. And now it is not the case at all that the most complex rules look the best. Instead, you’ll see a kind of U-shaped behavior where the very simple rules are too simple. They’ve missed signal. They left signal on the table. The too-complex rules are also doing badly because they’ve captured all the signal, but also lots of noise.

And then somewhere in the middle is a sweet spot where you’ve struck the right trade-off between how much expressive power the prediction rule has and how good a job it is doing of avoiding the mistaking of noise for signal.
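
Here is a minimal sketch of that lock-half-in-a-drawer procedure, with polynomial fits standing in for prediction rules of increasing complexity; the data and degrees are arbitrary. Training error falls as complexity grows, while error on the held-out half typically traces the U shape he describes.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 200)
y = np.sin(3 * x) + rng.normal(0, 0.3, 200)   # fixed signal plus noise

# Half the data goes "in the drawer" (held out); the rest is for fitting.
train_x, test_x = x[:100], x[100:]
train_y, test_y = y[:100], y[100:]

for degree in (1, 3, 5, 9, 15):               # increasingly complex rules
    coeffs = np.polyfit(train_x, train_y, degree)
    train_mse = np.mean((np.polyval(coeffs, train_x) - train_y) ** 2)
    test_mse = np.mean((np.polyval(coeffs, test_x) - test_y) ** 2)
    print(f"degree {degree:2d}: train MSE {train_mse:.3f}, "
          f"held-out MSE {test_mse:.3f}")
```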

RITHOLTZ: Really, really intriguing. So you’ve built one of the largest specialized machine learning research and development teams in finance. How do you assemble a team like that, and how do you get the brain trust to do the sort of work that’s applicable to managing assets?

MCAULIFFE: Well, the short answer is we spend a huge amount of energy on recruiting and identifying the premier people in the field of machine learning, both academics and practitioners. And we exhibit a lot of patience. We wait a really long time to be able to find the people who are really the best. And that matters enormously to us, both from the standpoint of the success of the firm and also because it’s something that we value extremely highly, just having great colleagues, brilliant colleagues. I want to work in a place where I can learn from all the people around me.

And, you know, when my co-founder, Michael Kharitonov, and I were talking about starting Voleon, one of the reasons that was on our minds is we wanted to be in control of who we worked with. You know, we really wanted to be able to assemble a group of people who were, you know, as brilliant as we could find, but also, you know, good people, people that we like, people that we were excited to collaborate with.

RITHOLTZ: So let’s talk about some of the fundamental principles Voleon is built on. You reference a prediction-based approach from a paper Leo Breiman wrote called “Two Cultures”.

MCAULIFFE: Yeah.

RITHOLTZ: Tell us a little bit about what “Two Cultures” actually is.

MCAULIFFE: Yeah. So this paper was written about 20 years ago. Leo Breiman was one of the great probabilists and statisticians of his generation, a Berkeley professor, need I say.

And Leo had been a practitioner in statistical consulting, actually, for quite some time in between a UCLA tenured job and returning to academia at Berkeley. And he learned a lot in that time about actually solving prediction problems instead of hypothetically solving them in the academic context.

And so all of his insights about the difference really culminated in this paper that he wrote in 2001.

RITHOLTZ: The difference between practical use versus academic theory.

MCAULIFFE: If you like, yeah. And so he identified two schools of thought about solving prediction problems, right? And one school is sort of model-based. The idea is there’s some stuff you’re going to get to observe, stock characteristics, let’s say. There’s a thing you wish you knew, future price change, let’s say. And there’s a box in nature that turns those inputs into the output.

And in the model-based school of thought, you try to open that box, reason about how it must work, make theories. In our case, these would be sort of econometric theories, financial economics theories. And then those theories have knobs, not many, and you use data to set the knobs, but otherwise you believe the model, right?

And he contrasts that with the machine learning school of thought, which also has the idea of nature’s box. The inputs go in, the thing you wish you knew comes out. But in machine learning, you don’t try to open the box. You just try to approximate what the box is doing. And your measure of success is predictive accuracy and is only predictive accuracy.

If you build a gadget and that gadget produces predictions that are really accurate, they turn out to look like the thing that nature produces, then that is success. And at the time he wrote the paper, his assessment was that 98% of statistics was taking the model-based approach and 2% was taking the machine learning approach.

RITHOLTZ: Are those statistics still valid today or have we shifted quite a bit?

MCAULIFFE: We’ve shifted quite a bit. And different arenas of prediction problems have different mixes these days. But even in finance, I would say it’s probably more like 50/50.

RITHOLTZ: Really? That much? That’s amazing.

MCAULIFFE: I think, you know, the logical extreme is natural language modeling, which was done for decades and decades in the model-based approach where you kind of reasoned about linguistic characteristics of how people kind of do dialogue, and those models had some parameters and you fit them with data.

And then instead, you have, as we said, a database of a trillion words and a tool with 175 billion parameters, and you run that, and there is no hope of completely understanding what is going on inside of GPT-3, but nobody complains about that because the results are astounding. The thing that you get is incredible.

And so that is by analogy, the way that we reason about running systematic investment strategies.

At the end of the day, predictive accuracy is what creates returns for investors. Being able to give complete descriptions of exactly how the predictions arise does not in itself create returns for investors.

Now, I’m not against interpretability and simplicity. All else equal, I love interpretability and simplicity, but all else is not equal.

If you want the most accurate predictions, you are going to have to sacrifice some amount of simplicity. In fact, this truth is so widespread that Leo gave it a name in his paper. He called it Occam’s Dilemma. So Occam’s Razor is the philosophical idea that you should choose the simplest explanation that fits the facts.

Occam’s dilemma is the point that in statistical prediction, the simplest approach, even though you wish you could choose it, is not the most accurate approach. If you care about predictive accuracy, if you’re putting predictive accuracy first, then you have to embrace a certain amount of complexity and lack of interpretability.

RITHOLTZ: That’s really quite fascinating.

So let’s talk a little bit about artificial intelligence and large language models. Following D. E. Shaw, you played in e-commerce and biotech; it seems like this approach of using statistics, probability, and computer science is applicable to so many different fields.

MCAULIFFE: It is, yeah. I think you’re talking about prediction problems ultimately. So in recommender systems, you can think of the question as being, well, if I had to predict what thing I could show a person that would be most likely to change their behavior and cause them to buy it, that’s the kind of prediction problem that motivates recommendations.

In biotechnology, very often we are trying to make predictions about whether someone, let’s say, does or doesn’t have a condition, a disease, based on lots of information we can gather from high throughput diagnostic techniques.

These days, the keyword in biology and in medicine and biotechnology is high throughput. You’re running analyses on an individual that are producing hundreds of thousands of numbers. And you want to be able to take all of that kind of wealth of data and turn it into diagnostic information.

RITHOLTZ: And we’ve seen AI get applied to pharmaceutical development in ways that people just never really could have imagined just a few short years ago. Is there a field that AI and large language models are not going to touch, or is this just the future of everything?

MCAULIFFE: The kinds of fields where you would expect uptake to be slow are where it is hard to assemble large data sets of systematically gathered data. And so any field where it’s relatively easy to, at large scale, let’s say, produce the same kinds of information that experts are using to make their decisions, you should expect that field to be impacted by these tools if it hasn’t been already.

RITHOLTZ: So you’re kind of answering my next question, which is, what led you back to investment management? But it seems if there’s any field that just generates endless amounts of data, it’s the markets.

MCAULIFFE: That’s true. I’ve been really interested in the problems of systematic investment strategies since my time working at D. E. Shaw. And so my co-founder, Michael Kharitonov, and I, we were both in the Bay Area in 2004, 2005. He was there because of a firm that he had founded, and I was there finishing my PhD. And we started to talk about the idea of using contemporary machine learning methods to build strategies that would be really different from strategies that result from classical techniques.

And we had met at D. E. Shaw in the ’90s and been less excited about this idea because the methods were pretty immature. There wasn’t actually a giant diversity of data back in the ’90s in financial markets, not like there was in 2005. And compute was really still quite expensive in the ’90s, whereas in 2005, it had been dropping in the usual Moore’s Law way, and this was even before GPUs.

RITHOLTZ: Right.

MCAULIFFE: And so when we looked at the problem in 2005, it felt like there was a very live opportunity to do something with a lot of promise that would be really different. And we had the sense that not a lot of people were of the same opinion. And so it seemed like something that we should try.

RITHOLTZ: There was a void, and there’s nothing the market hates more than a vacuum, even in an intellectual approach.

So you mentioned the diversity of various data sources.

What don’t you consider? Like how far off of price and volume do you go in the net you’re casting for inputs into your systems?

MCAULIFFE: Well, I think, as a research principle, we’re prepared to consider any data that has some bearing on price formation, some plausible bearing on how prices are formed. Now, of course, we’re a relatively small group of people with a lot of ideas, and so we have to prioritize. So in the event, we end up pursuing data that makes a lot of sense. We don’t try…

RITHOLTZ: I mean, can you go as far as politics or the weather? Like how far off of prices can you look?

MCAULIFFE: So an example would be the weather. For most securities, you’re not going to be very interested in the weather, but for commodities futures, you might be. So that’s the kind of reasoning you would apply.

RITHOLTZ: Really, really interesting.

So let’s talk about some of the strategies you guys are running.

Short and mid-horizon US equities, European equities, Asian equities, mid-horizon US credit, and then cross-asset. I might assume all of these are machine learning based. How similar or different is the approach for each of those asset classes?

MCAULIFFE: Yeah, they’re all machine learning based. The kind of principles that I’ve described of using as much complexity as you need to maximize predictive accuracy, et cetera, those principles underlie all the systems. But of course, trading corporate bonds is very different from trading equities. And so the implementations reflect that reality.

RITHOLTZ: So let’s talk a little bit about the four-step process that you bring to the systematic approach. And this is off of your site: it’s data, prediction engine, portfolio construction, and execution. I’m assuming that is heavily computer and machine learning based at each step along the way. Is that fair?

MCAULIFFE: I think that’s fair. I mean, to different degrees. The data gathering, that’s largely a software and kind of operations and infrastructure job.

RITHOLTZ: Do you guys have to spend a lot of time cleaning up that data and making sure it all matches? Because between CRSP and S&P and Bloomberg, sometimes you’ll pull something up and they’re all off a little bit from each other, because they all bring a very different approach to data assembly. How do you make sure everything is consistent and there are no errors in the inputs throughout?

MCAULIFFE: Yeah, through a lot of effort, essentially.

We have an entire group of people who focus on data operations, both for gathering of historical data and for the management of the ongoing live data feeds. There’s no way around that. I mean, that’s just work that you have to do.

RITHOLTZ: You just have to brute force your way through that.

MCAULIFFE: Yeah.

RITHOLTZ: And then the prediction engine sounds like that’s the single most important part of the machine learning process, if I’m understanding you correctly. That’s where all the meat of the technology is.

MCAULIFFE: Yeah, I understand the sentiment. I mean, it’s worth emphasizing that you do not get to a successful systematic strategy without all the ingredients. You have to have clean data because of the garbage in, garbage out principle. You have to have accurate predictions, but predictions don’t automatically translate into returns for investors.

Those predictions are kind of the power that drives the portfolio holding part of the system.

RITHOLTZ: So let’s talk about that portfolio construction, given that you have a prediction engine and good data going into it, so you’re fairly confident as to the output. How do you then take that output and say, “Here’s how I’m going to build a portfolio based on what this generates”?

MCAULIFFE: Yeah, so there are three big ingredients in the portfolio construction. First, the predictions. Second, what is usually called a risk model in this business, which means some understanding of how volatile prices are across all the securities you’re trading, how correlated they are, and, if they have a big movement, how big that movement will be. That’s all the risk model.

And then the final ingredient is what’s usually called a market impact model. And that means an understanding of how much you are going to push prices away from you when you try to trade. This is a reality of all trading.

If you buy a lot of a security, you push the price up. You push it away from you in the unfavorable direction. And in the systems that we run, the predictions that we’re trying to capture are about the same size as the effect that we have on the markets when we trade.

And so you cannot neglect that impact effect when you’re thinking about what portfolios to hold.
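
A stylized sketch of how those three ingredients might combine in a one-period optimization; the numbers, the quadratic impact model, and the naive gradient-ascent solver are all illustrative assumptions, not a description of Voleon’s actual machinery.

```python
import numpy as np

alpha = np.array([0.002, -0.001, 0.0015])  # predictions: expected returns
cov = np.diag([0.04, 0.09, 0.05])          # risk model: toy covariance matrix
w0 = np.zeros(3)                           # current holdings
gamma, c = 5.0, 0.5                        # risk aversion, impact coefficient

# Maximize alpha.w - (gamma/2) w'Cw - c ||w - w0||^2, where the last term
# is a quadratic market-impact cost on the trade needed to reach w.
def grad(w):
    return alpha - gamma * cov @ w - 2 * c * (w - w0)

w = w0.copy()
for _ in range(2000):        # gradient ascent stands in for a real optimizer
    w += 0.05 * grad(w)

print(np.round(w, 4))        # target holdings balancing all three ingredients
```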

RITHOLTZ: So execution becomes really important. If you’re not executing well, you are moving prices away from your profit.

MCAULIFFE: That’s right. And it is probably the single thing that undoes quantitative hedge funds most often: they misunderstand how much they’re moving prices, they get too big, they start trading too much, and they sort of blow themselves up.

RITHOLTZ: It’s funny that you say that, because as you were describing that, the first name that popped into my head was Long-Term Capital Management, which was trading these really thinly traded, obscure fixed income products.

MCAULIFFE: Yeah.

RITHOLTZ: And everything they bought, they sent higher, because there just wasn’t any volume in it. And when they needed liquidity, there was none to be had. And that plus no risk management, 100X leverage equals a kaboom.

MCAULIFFE: Yes. Barry, they made a number of mistakes. The book is good. So “When Genius Failed.”

RITHOLTZ: Oh, absolutely.

I love that book.

MCAULIFFE: Really fascinating.

RITHOLTZ: So when you’re reading a book like that, somewhere in the back of your head, are you thinking, hey, this is like a what not to do when you’re setting up a machine learning fund? How influential is something like that?

MCAULIFFE: Well, 100%. I mean, look, I think the most important adage I’ve ever heard in my professional life is, good judgment comes from experience, experience comes from bad judgment.

So the extent to which you can get good judgment from other people’s experience, that is like a free lunch.

RITHOLTZ: Cheap tuition.

MCAULIFFE: Yeah, absolutely.

RITHOLTZ: That is like a free lunch.

MCAULIFFE: And so we talk a lot about all the mistakes that other people have made. And we do not congratulate ourselves on having avoided mistakes. We think those people were smart. I mean, look, you read about these events and none of these people were dummies. They were sophisticated.

RITHOLTZ: Nobel laureates, right? They just didn’t have a guidebook on what not to do, which you guys do.

MCAULIFFE: We don’t, no, I don’t think we do, apart from reading about it, right? But everybody is undone by a failure that they didn’t think of or didn’t know about yet. And we’re extremely cognizant of that.

RITHOLTZ: That has to be somewhat humbling to constantly being on the lookout for that blind spot that could disrupt everything.

MCAULIFFE: Yes, yeah, humility is the key ingredient in running these systems.

RITHOLTZ: Really quite amazing. So let’s talk a little bit about how academically focused Voleon is. You guys have a pretty deep R&D team internally. You teach at Berkeley. What does it mean for a hedge fund to be academically focused?

MCAULIFFE: What I would say probably is kind of evidence-based rather than academically focused. Saying academically focused gives the impression that papers would be the goal or the desired output, and that’s not the case at all. We have a very specific applied problem that we are trying to solve.

RITHOLTZ: Papers are a means to an end.

MCAULIFFE: Papers are, you know, we don’t write papers for external consumption. We do lots of writing internally, and that’s to make sure that, you know, we’re keeping track of our own kind of scientific process.

RITHOLTZ: But you’re fairly widely published in statistics and machine learning.

MCAULIFFE: Yes.

RITHOLTZ: What purpose does that serve, other than as a calling card for the fund, as well as, hey, I have this idea and I want to see what the rest of my peers think of it? When you put stuff out into the world, what sort of feedback or pushback do you get?

MCAULIFFE: I guess I would have to say I do that as kind of a double life of non-financial research. It’s just something that I really enjoy.

Principally, what it means is that I get to work with PhD students, and we have really outstanding PhD students at Berkeley in statistics. And so it’s an opportunity for me to do a kind of intellectual work, namely writing a paper, laying out an argument for public consumption, et cetera, that is kind of closed off as far as Voleon is concerned.

RITHOLTZ: So not adjacent to what you guys are doing at Voleon?

MCAULIFFE: Generally no. No.

RITHOLTZ: That’s really interesting. I had always assumed that was part of your process for developing new models to apply machine learning to new assets. Take us through the process. How do you go about saying, hey, this is an asset class we don’t have exposure to, let’s see how to apply what we already know to that specific area?

MCAULIFFE: It’s a great question. So we’re trying as much as possible to get the problem for a new asset class into a familiar setup, as standard a setup as we can.

And so we know what these systems look like in the world of equities.

And so if you’re trying to build the same kind of system for corporate bonds, and you start off by saying, “Well, okay, I need to know closing prices or intraday prices for all the bonds,” already you have a very big problem in corporate bonds, because there is no live price feed showing you a bid-offer quote the way there is in equities.

And so before you can even get started thinking about predicting how a price is going to change, it would be nice to know what the price currently is. And that is already a problem you have to solve in corporate bonds, as opposed to being just an input that you have access to.

RITHOLTZ: The old joke was trading by appointment only.

MCAULIFFE: Yeah.

RITHOLTZ: And that seems to be a bit of an issue. And there are so many more bond issuers than there are equities.

MCAULIFFE: Absolutely.

RITHOLTZ: Is this just a database challenge or how do you work around it?

MCAULIFFE: It’s a statistics problem, but it’s a different kind of statistics problem. In this case, we’re not yet trying to predict the future of any quantity. We’re trying to say, I wish I knew what the fair value of this CUSIP was. I can’t see that exactly, because there’s no live order book with a bid and an offer and lots of liquidity that lets me figure out the fair value. But I do have …

RITHOLTZ: At best, you have a recent price or maybe not even so recent.

MCAULIFFE: I have lots of related information. Maybe this bond didn’t trade today, but it traded a few times yesterday, and I know where it traded. I’m in touch with bond dealers, so I know where they’ve quoted this bond, maybe only on one side, over the last few days. I have some information about the company that issued this bond, et cetera.

So I have lots of stuff that’s related to the number that I want to know. I just don’t know that number. And so what I want to try to do is kind of fill in and do what in statistics or in control we would call a now-casting problem.

And an analogy, actually, is to automatically controlling an airplane, surprisingly. If software is trying to fly an airplane, there are six things that it absolutely has to know. It has to know the XYZ of where the plane is and the XYZ of its velocity, where it’s headed.

Those are the six most important numbers.

Now nature does not just supply those numbers to you. You cannot know those numbers with perfect exactitude, but there’s lots of instruments on the plane and there’s GPS and all sorts of information that is very closely related to the numbers you wish you knew.

And you can use statistics to go from all that stuff that’s adjacent to a guess and infill of the thing you wish you knew. And the same goes with the current price of a corporate bond.
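
A minimal sketch of that in-filling, now-casting idea as a one-dimensional Kalman-style filter: fuse noisy, indirect observations (stale trades, one-sided dealer quotes) into a running estimate of an unobserved fair value. The random-walk model and every number are illustrative assumptions.

```python
# Scalar Kalman filter tracking an unobserved bond fair value.
fair_value, est_var = 100.0, 4.0   # prior mean and variance
process_var = 0.05                 # how much fair value drifts per step

observations = [
    (100.4, 0.5),  # (observed price, noise variance), e.g. a recent trade
    (99.8, 2.0),   # a noisier source, e.g. a one-sided dealer quote
    (100.1, 0.8),
]

for obs, obs_var in observations:
    est_var += process_var                   # predict: value may have drifted
    gain = est_var / (est_var + obs_var)     # weight new evidence vs prior
    fair_value += gain * (obs - fair_value)  # update toward the observation
    est_var *= 1 - gain
    print(f"estimate {fair_value:.3f}, variance {est_var:.3f}")
```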

RITHOLTZ: That’s really kind of interesting. So I’m curious as to how often you start working your way into one particular asset or a particular strategy for that asset and just suddenly realize, “Oh, this is wildly different than we previously expected.” And suddenly you’re down a rabbit hole to just wildly unexpected areas. It sounds like that isn’t all that uncommon.

MCAULIFFE: It is not uncommon at all.

There’s this kind of wishful thinking that, oh, we figured it out in one asset class, in the sense that we have a system that’s kind of stable and performing reasonably well, that we have a feel for. And now we want to take that system and somehow replicate it in a different situation.

And we’re going to standardize the new situation to make it look like the old situation; that’s the principle. That principle quickly goes out the window when you start to make contact with the reality of how the new asset class actually behaves.

RITHOLTZ: So stocks are different than credit, are different than bonds, are different than commodities. It’s all like starting over fresh. What are some of the more surprising things you’ve learned as you’ve applied machine learning to totally different asset classes?

MCAULIFFE: Well I think corporate bonds provide a lot of examples of this. I mean the fact that you don’t actually really know a good live price or a good live bid offer seems, you know…

RITHOLTZ: It seems crazy.

MCAULIFFE: It’s surprising. I mean, this fact has started to change. Over the years, there’s been an accelerating electronification of corporate bond trading. And that’s been a big advantage for us, actually, because we were kind of first movers. And so we’ve really benefited from that.

So the problem is diminished relative to how it was six, seven years ago when we started.

RITHOLTZ: But it’s still essentially there.

MCAULIFFE: Relative to equities, it’s absolutely there. Yeah.

RITHOLTZ: So in other words, if I’m looking at a bond mutual fund or even a bond ETF that’s trading during the day, that price is somebody’s best approximation of the value of all the bonds inside. But really, you don’t know the NAV, do you? You’re just kind of guessing.

MCAULIFFE: Barry, don’t even get me started on bond ETFs. (LAUGHTER)

RITHOLTZ: Really? Because it seems like that would be the first place that would show up, “Hey, bond ETFs sound like throughout the day they’re going to be mispriced a little bit or wildly mispriced.”

MCAULIFFE: Well, with bond ETFs, there’s a sense, if you’re a market purist, in which they can’t be mispriced, because their price is set by supply and demand in the ETF market, and that’s a super liquid market.

And so there may be a difference between the market price of the ETF and the NAV of the underlying portfolio.

RITHOLTZ: Right.

MCAULIFFE: Except in many cases with bond ETFs, there’s not even a crisply defined underlying portfolio. It turns out that the authorized participants in those ETF markets can negotiate with the fund manager about exactly what the constituents of the create-redeem baskets are.

And so it’s not even at all clear what you mean when you say that the NAV is this or that relative to the price of the ETF.

RITHOLTZ: So when I asked about what’s surprising when you work your way down a rabbit hole: “Hey, we don’t know what the hell’s in this bond ETF. Trust us, it’s all good.” That’s pretty surprising, and I’m only exaggerating a little bit, but it seems kind of shocking.

MCAULIFFE: It is surprising when you find out about it, but you quickly come to understand if you trade single name bonds as we do, you quickly come to understand why bond ETFs work that way.

RITHOLTZ: I recall a couple of years ago there was a big Wall Street Journal article on the GLD ETF. And from that article, I learned that GLD was formed because gold dealers had just excess gold piling up in their warehouses and they needed a way to move it. So that was kind of shocking about that ETF.

Any other space that led to a sort of big surprise as you worked your way into it?

MCAULIFFE: Well, I think ETFs are kind of a good source of these examples. The volatility ETFs, the ETFs that are based on the VIX or that are short the VIX, you may remember several years ago.

RITHOLTZ: I was going to say the ones that haven’t blown up.

MCAULIFFE: Yeah, right. There was this event called Volmageddon.

RITHOLTZ: Right.

MCAULIFFE: Where …

RITHOLTZ: Those were exchange-traded notes, weren’t they? The volatility notes.

MCAULIFFE: Yeah, the ETFs, ETNs, right. So there were these, essentially, investment products that were short the VIX, and the VIX went through a spike that caused them to have to liquidate. I mean, the people who designed the exchange-traded note understood that this was a possibility, so they had descriptions in their contract of what it would mean.

But yeah, always surprising to watch something suddenly go out of business.

RITHOLTZ: We seem to get a thousand year flood every couple of years. Maybe we shouldn’t be calling these things thousand year floods, right? That’s a big misnomer.

MCAULIFFE: As statisticians, we tell people: if you think that you’ve experienced a six-sigma event, the problem is that you have underestimated sigma.
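
The arithmetic behind the quip: under a normal model, a genuine six-sigma daily move should essentially never be observed, so seeing them repeatedly means sigma itself was estimated too small.

```python
import math

# One-sided probability of exceeding six standard deviations under a
# normal distribution: P(Z > 6).
p = 0.5 * math.erfc(6 / math.sqrt(2))
print(p)            # ~9.9e-10
print(1 / p / 252)  # expected wait, in years of 252 trading days: ~4 million
```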

RITHOLTZ: That’s really interesting. So given the gap in the world between computer science and investment management, how long is it going to be before that narrows and we start seeing a whole lot more of the sort of work you’re doing applied across the board to the world of investing?

MCAULIFFE: Well, I think it’s happening; it’s been happening for quite a long time. For example, all of modern portfolio theory really began in the ’50s with, you know, first of all, Markowitz and other people thinking about what it means to benefit from diversification, and the idea that diversification is the only free lunch in finance.

So I would say that the idea of thinking in a systematic and scientific way about how to manage and grow wealth, not even just for institutions, but also for individuals, is an example of a way that these ideas have kind of had profound effects.

RITHOLTZ: I know I only have you for a little while longer, so let’s jump to our favorite questions that we ask all of our guests, starting with, tell us what you’re streaming these days. What are you either listening to or watching to keep yourself entertained?

MCAULIFFE: A few things I’ve been watching recently. “The Bear”, I don’t know if you’ve heard of it.

RITHOLTZ: So great.

MCAULIFFE: So great, right?

RITHOLTZ: Right.

MCAULIFFE: And set in Chicago, I know we were just talking about being in Chicago.

RITHOLTZ: You’re from Chicago originally, yeah.

MCAULIFFE: So.

RITHOLTZ: And there are parts of that show that are kind of a love letter to Chicago.

MCAULIFFE: Absolutely, yeah.

RITHOLTZ: As you get deeper into the series, because it starts out kind of gritty and you’re seeing the underside, and then as we progress, it really becomes like a lovely postcard.

MCAULIFFE: Yeah, yeah.

RITHOLTZ: Such an amazing show.

MCAULIFFE: Really, really love that show. I was late to “Better Call Saul” but I’m finishing up. I think it’s as good as “Breaking Bad”. Maybe one you haven’t heard of: there’s a show called “Mr Inbetween”, which is —

RITHOLTZ: “Mr Inbetween”.

MCAULIFFE: Yeah, it’s on Hulu; it’s from Australia. It’s about a guy who’s a doting father living his life. He’s also essentially a muscle man and hit man for local criminals in his part of Australia. But it’s a half-hour dark comedy.

RITHOLTZ: Right, so not quite “Barry” and not quite “Sopranos”, somewhere in between.

MCAULIFFE: No, yeah, exactly.

RITHOLTZ: Sounds really interesting. Tell us about your early mentors who helped shape your career.

MCAULIFFE: Well, Barry, I’ve been lucky to have a lot of people who were both really smart and talented and willing to take the time to help me learn and understand things.

So actually my co-founder, Michael Kharitonov, he was kind of my first mentor in finance. He had been at D. E. Shaw for several years when I got there and he really taught me kind of the ins and outs of market microstructure.

I worked with a couple of people who managed me at D. E. Shaw, Yossi Friedman and Kapil Mathur, who have gone on to hugely successful careers in quantitative finance, and they taught me a lot too. When I did my PhD, my advisor was Mike Jordan, a world-famous machine learning researcher, and I learned enormously from him.

And there’s another professor of statistics who sadly passed away about 15 years ago, named David Freedman. He was really just an intellectual giant of the 20th century in probability and statistics. He was both one of the most brilliant probabilists and also an applied statistician. And this is like a pink diamond kind of combination. It’s that rare to find someone who has that kind of technical capability but also understands the pragmatics of actually doing data analysis.

He spent a lot of time as an expert witness. He was the lead statistical consultant for the case on census adjustment that went to the Supreme Court. In fact, in the end, the people against adjustment won in a unanimous Supreme Court decision. And David Freedman told me, he said, “All that work and we only convinced nine people.”

RITHOLTZ: That’s great. Nine people that kind of matter.

MCAULIFFE: Yeah, exactly. So it was just, it was a real, it was kind of a once in a lifetime privilege to get to spend time with someone of that intellectual caliber. And there were others too. I mean, I’ve been very fortunate.

RITHOLTZ: That’s quite a list to begin with. Let’s talk about books. What are some of your favorites and what are you reading right now?

MCAULIFFE: Well, I’m a big book reader, so I have a long list. But probably one of my–

RITHOLTZ: By the way, this is everybody’s favorite section of the podcast. People are always looking for good book recommendations and if they like what you said earlier, they’re going to love your book recommendations. So fire away.

MCAULIFFE: So I’m a big fan of kind of modernist dystopian fiction.

RITHOLTZ: Okay.

MCAULIFFE: So a couple of examples of that would be “Infinite Jest” by David Foster Wallace and “The Wind-Up Bird Chronicle” by Haruki Murakami. Those are two of my all-time favorite books. There’s a, I think, much less well-known but beautiful novel, a kind of academic coming-of-age novel, called “Stoner” by John Williams. Really moving, just a tremendous book. More on the dystopia side would be “White Noise” by Don DeLillo, and then the classics that everybody knows, “1984” and “Brave New World.” Those are two more of my favorites.

RITHOLTZ: It’s funny, when you mention “The Bear,” I’m in the middle of reading a book that I would swear the writers of “The Bear” leaned on, called “Unreasonable Hospitality,” by somebody who worked for Danny Meyer’s hospitality group, Eleven Madison Park and Gramercy Tavern and all these famous New York haunts. And there’s the scene in “The Bear” where they overhear a couple say, “Oh, we visited Chicago, and we never had deep dish.”

So they send the guy out to get deep dish. There’s a part of the book where, at Eleven Madison Park, people actually showed up with suitcases; it was the last thing they would eat before heading to the airport. And they said, “Oh, we ate at all these great places in New York, but we never had a New York hot dog.” And what do they do? They send someone out to get a hot dog. They plate it and use all the condiments to make it very special.

MCAULIFFE: I see.

RITHOLTZ: And it looks like it was ripped right out of “The Bear,” or vice versa. But if you’re interested in just, hey, how can we disrupt the restaurant business and make it not just about the celebrity chef in the kitchen but about the whole experience, it’s a fascinating kind of nonfiction book.

MCAULIFFE: That does sound really interesting.

RITHOLTZ: Yeah, really. You mentioned “The Bear” and it just popped into my head.

Any other books you want to mention? That’s a good list to start with.

MCAULIFFE: Yeah, my other kind of big interest is science fiction, speculative fiction.

RITHOLTZ: I knew you were going to go there.

MCAULIFFE: Unsurprisingly, right, sorry.

RITHOLTZ: Let’s go.

MCAULIFFE: Sorry. But there are some classics that I think everybody should read. Ursula Le Guin is just amazing. So “The Dispossessed” and “The Left Hand of Darkness.” Those are just two of the best books I’ve ever read, period.

RITHOLTZ: “Left Hand of Darkness” stays with you for a long time.

MCAULIFFE: Yeah, yeah, really, really amazing books. I’m rereading “Cryptonomicon” by Neal Stephenson right now. And one other thing I try to do: I have very big gaps in my reading. For example, I’ve never read Updike, so I started reading the Rabbit series.

RITHOLTZ: Right, “The World According to Garp,” and they’re very much of an era.

MCAULIFFE: Yeah, that’s right.

RITHOLTZ: What else? Give us more.

MCAULIFFE: Wow, okay. Let’s see, George Saunders. Oh wow, I think you’d love him. His real strength is short fiction. He’s written great novels too, but “Tenth of December” is his best collection of fiction. And this is more of that modern dystopian, kind of comic dystopian stuff.

RITHOLTZ: You keep coming back to dystopia. I’m fascinated by that.

MCAULIFFE: I find it’s very different from my day-to-day reality. So I think it’s a great change of pace for me to be able to read this stuff.

So, some science writing: probably the best science book I ever read is “The Selfish Gene” by Richard Dawkins. You have a kind of intuitive understanding of genetics and natural selection from Darwin, but the language Dawkins uses really makes you appreciate just how much the genes are in charge and how little we are. He calls organisms “survival machines” that the genes have built, and exist inside, in order to ensure their own propagation.

And the whole point of view in that book is really eye-opening. It makes you think about natural selection and evolution and genetics in a completely different way, even though it’s all based on the same facts that you already know.

RITHOLTZ: Right. It’s just the framing and the context.

MCAULIFFE: It’s the framing and the perspective that really kind of blow your mind. So it’s a great book to read.

RITHOLTZ: Huh, that’s a hell of a list. You’ve given people a lot of things to start with. And now down to our last two questions. What advice would you give to a recent college grad who is interested in a career in either investment management or machine learning?

MCAULIFFE: Yeah, so I mean, I work in a very specialized subdomain of finance, so there are a lot of people interested in investing and finance that I couldn’t give any specific advice to. But I have general advice that I think is useful, both for finance and even more broadly. This is really top-of-Maslow’s-pyramid advice, though. If you’re trying to write your novel and pay the rent while you get it done, I can’t really help you with that.

But if what you care about is building this kind of career, then my number one piece of advice is: work with incredible people. Far and away more important than the particular field, or the details of what you’re working on, is the caliber of the people you do it with, both in terms of your own satisfaction and how much you learn.

I think you’ll learn and benefit hugely on a personal level from working with incredible people. And if you don’t work with people like that, then you’re probably going to have a lot of professional unhappiness. So it’s kind of either/or.

RITHOLTZ: That’s a really intriguing answer.

So final question: what do you know about the world of investing, machine learning, large language models, just the application of technology to the field of investing, that you wish you knew 25 years or so ago, when you were really first ramping up?

MCAULIFFE: I think one of the most important lessons I had to learn the hard way, actually running these systems, comes back to the point you made earlier about the primacy of prediction rules. It may be true that prediction quality is the most important thing, but there are lots of other necessary ingredients, and I would put risk management at the top of that list.

So I think it’s easy to neglect risk management to a certain extent and focus all of your attention on predictive accuracy. But it really does turn out that if you don’t have high-quality risk management to go along with that predictive accuracy, you won’t succeed.

And I guess I wish I had appreciated that in a really deep way 25 years ago.
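To make that point concrete, here is a minimal, purely illustrative sketch in Python of one way a return prediction can be coupled to a risk control. The function, parameter values, and volatility-targeting scheme are hypothetical illustrations of the general idea, not a description of Voleon’s actual methods.

import numpy as np

def size_position(pred_return, trailing_returns, target_vol=0.10, max_weight=0.05):
    """Toy position sizing: scale a predicted return by estimated risk.

    All parameters are hypothetical: target_vol is an annualized risk
    budget, max_weight a hard cap on any single position.
    """
    # Estimate annualized volatility from recent daily returns.
    est_vol = np.std(trailing_returns, ddof=1) * np.sqrt(252)

    # Shrink the raw signal when estimated risk is high...
    raw_weight = pred_return * (target_vol / max(est_vol, 1e-6))

    # ...and never let one prediction dominate the book, however
    # confident the model is.
    return float(np.clip(raw_weight, -max_weight, max_weight))

# The same bullish forecast gets trimmed when recent volatility is elevated.
rng = np.random.default_rng(0)
calm = rng.normal(0.0, 0.01, 60)    # roughly 16% annualized vol
choppy = rng.normal(0.0, 0.03, 60)  # roughly 48% annualized vol
print(size_position(0.02, calm))    # larger position
print(size_position(0.02, choppy))  # smaller position

The design choice the sketch illustrates is exactly the lesson above: the forecast sets the direction, but the risk layer, not the forecast, decides how much capital that forecast is allowed to command.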

RITHOLTZ: Jon, this has been absolutely fascinating. I don’t even know where to begin other than to say thank you for being so generous with your time and your expertise.

We have been speaking with Jon McAuliffe. He is the co-founder and chief investment officer at the $5 billion hedge fund Voleon Group.

If you enjoy this conversation, well, be sure and check out any of the previous 500 we’ve done over the past nine years. You can find those at iTunes, Spotify, YouTube, or wherever you find your favorite podcasts. Sign up for my daily reading list at ritholtz.com. Follow me on Twitter @Barry_Ritholtz until I get my hacked account @Ritholtz back.

I say that because the process of dealing with the 17 people left at what was once Twitter, now X, is unbelievably frustrating and annoying. Follow all of the fine family of Bloomberg podcasts on Twitter @podcast.

I would be remiss if I did not thank the crack team that helps put these conversations together each week. Paris Wald is my producer. Atika Valbrun is my project manager. Sean Russo is my director of research. I’m Barry Ritholtz. You’ve been listening to Masters in Business on Bloomberg Radio.

END

 

~~~

 
