Today's computational power, far greater than in the past, and the ability to work with huge amounts of data (*big data*) have been opening up new possibilities for analysis and reshaping the paradigms we previously used to model economic and financial phenomena. The process is not linear and is often confusing. The purpose of this article, therefore, is to shed a little more light on the subject without intending to exhaust it, not least because space here is very limited. We live in a frenzy over startups, Artificial Intelligence, Big Data and other tools, a diffuse mixture in which optimism, misconceptions and naivety come together. It is essential to separate the credible and realistic from the fantasy, which is sometimes deliberately fraudulent.

To separate the wheat from the chaff, our strategy is to understand the main elements and put them in historical perspective, which shows that much of what we believe to be revolutionary has actually been around for over 30 years. The difference now lies in the popularization of computational processing capacity and data storage. The supposed revolution is simply that these things were previously not computationally feasible or scalable. That is, almost all of the theory already existed, along with its mathematical frameworks, but we did not have sufficient data or computing power.

Largely because of this computational constraint, we have preferred simpler mathematical and econometric instruments, which in turn rest on oversimplified premises, in a complex world where rare events, such as those seen in March 2020 or the "Black Monday" of 1987, occur more often than historically presumed. The limitation of the more traditional mathematics applied to this topic, whose framework is largely based on simpler and more analytically tractable statistical distributions, shows in the absurd conclusions it produces: that the event observed in March 2020 corresponds to something observable at a frequency of 1 in 20 million; that the Dow Jones falling 7.7% in a single day would be a one in 50 billion event; or even that the Black Monday event would have odds of 1 in 10^50. These calculations are borrowed from one of the most recent books by Mandelbrot, one of the leading names in this area of knowledge. Clearly, given how often catastrophic events actually occur, the widely held assumptions must be wrong, even though the tools to quantify such possibilities adequately have existed since the 1960s.

In credit policy, even more naive assumptions still shape how we see default risk. This matters especially as collateralized (guaranteed) instruments advance, such as home equity loans, payroll loans and the discounting of bills. Failing to see how variables such as labor lawsuits and social security debts affect the probability that real guarantees turn out to be void or unavailable seems surreal, indeed an alarming mistake. This diverse information needs to be cross-referenced, possibly using artificial intelligence, to obtain more realistic estimates of default probabilities.

In fact, in times of digital bookkeeping and open banking, among other developments, ignoring this entire informational ecosystem means disregarding the chance of unfavorable outcomes. And this is precisely what promises to be a disruptive factor for large, traditional financial institutions. It is no longer necessary to own a huge computing infrastructure. And even when it is needed, depending on the solution and the problem, it is just a click away at very low cost on the many computing platforms available.

This is the challenge we face daily with our partners and customers: how to create and scale databases and computational capacity? How to build econometric models that preserve intelligibility and simplicity as much as possible? These are the key business premises when *machine learning* and *big data* are applied to finance. Unlike other sectors, such as medicine, where explaining why a patient has cancer is not the immediate concern (the urgency lies in diagnosing, as accurately as possible, whether or not the patient has the disease), finance and economics require us to understand the reason for things and to correct the trajectory of decisions when necessary. All this comes with the considerable challenge that datasets in finance are much scarcer than in other areas, especially for equities, derivatives and interest rates. We have only one history for each company; a single history of Brazil's monetary policies; a single Twin Towers event, and so on.

The Herculean challenge is to employ the most computationally and mathematically sophisticated techniques while preserving the intelligibility of the algorithms' outputs, given the idiosyncrasies of financial data. But it is not magic or rocket science. It is computational power combined with tooling that is already scientifically well established. We are talking about Markov chains, the Gibbs sampler, wavelets and fractional Brownian motion, among others, subjects that have existed and been studied since the middle of the last century. It is worth remembering that the Medallion fund of legendary manager Jim Simons has operated these tools for decades.

Finally, we take the opportunity to paraphrase Benoit Mandelbrot, one of our quantitative references alongside Simons and Bishop, on risk, return and ruin. Identifying opportunities first involves analyzing the most likely value that an opportunity will generate. The typical problem lies in risk analysis and, perhaps to a lesser extent but no less importantly, in the events of investor ruin, generally highly improbable, that determine how attractive these opportunities are. Most of the simpler models allow a reasonably good analysis of the most probable events but tend to hugely underestimate the unlikely ones, leading agents to take much greater risks than they should or could. Portfolio managers operating with excessive leverage, and credit policies that are too loose in certain situations, are classic examples.
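
The gap between simple models and heavy-tailed reality can be illustrated with a minimal Python sketch. It compares the probability of a one-day drop of k standard deviations under a normal distribution with the same probability under a Student-t distribution with 3 degrees of freedom, rescaled to the same variance. The Student-t choice is only an illustrative stand-in for a heavy-tailed model; it is not the model used by Mandelbrot or by the authors. The function names and the figure of 252 trading days per year are likewise assumptions made for this example.

```python
import math

# Assumption: a common market convention, not a figure taken from the article.
TRADING_DAYS_PER_YEAR = 252


def normal_tail(k: float) -> float:
    """P(X <= -k) for a standard normal X, i.e. a drop of k standard deviations."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


def student_t3_tail(k: float) -> float:
    """P(X <= -k) for a Student-t with 3 degrees of freedom, rescaled to unit variance.

    A t-distribution with 3 degrees of freedom has variance 3, so a drop of
    k standard deviations corresponds to t = -k * sqrt(3). Substituting this
    into the closed-form CDF for 3 degrees of freedom leaves an expression in k.
    """
    return 0.5 - (math.atan(k) + k / (1.0 + k * k)) / math.pi


def years_between_events(daily_probability: float) -> float:
    """Average waiting time, in years, for an event with the given daily probability."""
    return 1.0 / (daily_probability * TRADING_DAYS_PER_YEAR)


# Compare both models for moderate, large and extreme daily drops.
for k in (3, 5, 10):
    p_normal = normal_tail(k)
    p_heavy = student_t3_tail(k)
    print(
        f"{k:>2} sigma drop | normal: 1 in {1 / p_normal:.3g} days "
        f"(~{years_between_events(p_normal):.3g} years) | "
        f"heavy-tailed: 1 in {1 / p_heavy:.3g} days "
        f"(~{years_between_events(p_heavy):.3g} years)"
    )
```

Under these assumptions the two models roughly agree on small moves but diverge by many orders of magnitude as k grows, which is the mechanism behind the implausible odds that a normal-based model assigns to market crashes.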