We’re happy to share the latest in AltsTech’s series profiling how investment managers are using AI, tech, and analytics to generate alpha. We’re fortunate to interview Jordan Thaeler, Cofounder, Simplectica.
David Teten: Please give us an overview of your firm.
Simplectica is a long/short systematic fund based on proprietary mathematics in the area of Brownian motion.
We only look at the most highly traded equity instruments (stocks and ETFs) and require that a name has been in the top [X] percent by daily traded volume consistently for one quarter. This means we avoid IPOs, meme stocks, and microcaps.
For each of these instruments, we use our proprietary Simplectica Covariance Matrix math to produce a broad range of predictive features (in the thousands) which we then feed to a state-of-the-art machine learning algorithm. The end result is a battery of predictors for returns, volatility, and other aspects of the price-volume process, on a time horizon of 1 day to 1 month.
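To give a flavor of that pipeline, here is a minimal, generic sketch of a feature-to-predictor step. The feature matrix is a stand-in (Simplectica's actual features come from its proprietary covariance-matrix math, which isn't reproduced here), and the model choice is only a placeholder for whatever "state-of-the-art" learner they use.

```python
# A generic sketch, not Simplectica's method: features and model are placeholders.
import numpy as np
from sklearn.ensemble import HistGradientBoostingRegressor

def fit_return_predictor(features: np.ndarray, forward_returns: np.ndarray):
    """features: (n_samples, n_features) matrix of per-name predictive features.
    forward_returns: realized returns over the chosen horizon (1 day to 1 month)."""
    model = HistGradientBoostingRegressor(max_iter=300, learning_rate=0.05)
    model.fit(features, forward_returns)
    return model

# In practice you would fit one such predictor per target (returns, volatility,
# other aspects of the price-volume process) and per time horizon.
```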
We monetize our predictors through a long/short investing (not market-making) strategy. Our strategy constructs its portfolio quite simply: a long position in the N stocks predicted to have the best returns/volatility profile over the prediction horizon, and a short position in the N stocks predicted to have the best returns/volatility profile on the short side. Depending on investor requirements, we can construct the portfolio to be dollar neutral and/or beta neutral relative to the broad market and/or any sector ETF.
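As a rough illustration of that construction (the scoring input and equal-weight sizing are illustrative assumptions, not Simplectica's actual sizing), a dollar-neutral top-N/bottom-N book can be built like this:

```python
import pandas as pd

def build_book(scores: pd.Series, n: int = 25, gross: float = 1.0) -> pd.Series:
    """scores: predicted returns/volatility profile per ticker (higher = better long).
    Returns dollar-neutral weights: +gross/2 spread over the top n names,
    -gross/2 over the bottom n."""
    ranked = scores.sort_values(ascending=False)
    longs = ranked.head(n).index
    shorts = ranked.tail(n).index
    weights = pd.Series(0.0, index=scores.index)
    weights[longs] = (gross / 2) / n
    weights[shorts] = -(gross / 2) / n
    return weights
```

Beta neutrality against the broad market or a sector ETF would then be a further overlay on these weights, per investor requirements.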
David Teten: Who are your peers/competitors, and how do you differ?
There are thousands of hedge funds, and 90% of them don’t beat their own benchmark (e.g., SPY).
We currently trade a strategy with:
- A net Sharpe of 3.4 (fewer than 1% of the ~6,000 listed funds on BarclayHedge achieve this Sharpe ratio);
- 217% CAGR over a one-year period;
- Turnover of at most once daily;
- US equities capacity in the billions of dollars.
David Teten: What’s your background? How and why are you in your role today?
The founders are multi-time entrepreneurs with particular experience in data science, machine learning, fintech, and quantitative finance. Most of life is just figuring things out: failing, learning, iterating, and repeating until you succeed or give up.
One founder started in investment banking and then moved into startup operations, founding three different fintech startups.
Another founder has worked in senior roles at quantitative funds in addition to senior data science roles at unicorn startups.
David Teten: What are the tools you’re using for your front office: sourcing, LP relations, investing analysis, etc.? What are the strengths and weaknesses of these providers?
We use very basic tools here. For cold outbound to institutional funds of funds and allocators, we use Apollo.io – they offer 1,000 free monthly leads, and we find their contact coverage to be better than competitors'. That said, we use Apollo.io only sparingly; we otherwise don't do mass outbound. Our initial interest was inbound, and from there the known allocators make for a short list. We also socialize capital raising with prime brokers, who are often a great first step.
Our CRM is simple: a Google spreadsheet. Because our efforts are focused, nothing falls through the cracks, and we don't need more sophisticated cadencing tools for follow-ups since our net isn't cast too wide.
We’ve been prop trading our own money and only recently received inbound interest, so the tools will probably change/become more formal and expensive if we have to manage outside capital.
David Teten: What tools do you find helpful for expediting due diligence on potential LPs?
When doing due diligence on potential investors, the most accurate answer is simply talking with the other managers they've backed. How has the investor added value? Do they subtract value? Are they reasonable people? Are they culturally aligned? As a systematic fund, we've heard horror stories about fundamental LPs becoming excited about quantitative investment opportunities, only to discover that their ignorance forced them into drastic action when a systematic investment had a drawdown.
David Teten: What are the tools you’re using for your middle office: tracking, risk management, etc.? What are the strengths and weaknesses of these providers?
We have our own proprietary risk models that, we believe, drastically outperform the risk models available on the market (e.g., Barra, Axioma). Put quantitatively, Barra, at best, explains 50% of market volatility because of its aggregated factor approach.
Because we have solved unpublished math, we are able to avoid aggregation and can explain volatility in a full-rank manner, capturing more than 99% of it.
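As a rough illustration of the explained-volatility comparison (this PCA-style calculation is neither Barra's nor Simplectica's actual methodology; it only shows the low-rank versus full-rank distinction), one can measure how much of a return covariance matrix a k-factor approximation captures:

```python
import numpy as np

def variance_explained(returns: np.ndarray, k: int) -> float:
    """Share of total variance captured by the top-k principal factors.
    A full-rank model (k = number of assets) captures essentially 100%;
    a low-dimensional aggregated-factor model captures far less."""
    cov = np.cov(returns, rowvar=False)          # asset-by-asset covariance
    eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
    return eigvals[:k].sum() / eigvals.sum()
```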
The challenge of using our own risk tool is that investors are ingrained in using household-name factor models, even though they are quantitatively inferior. It's the classic trope: "nobody gets fired for choosing IBM."
Our other reporting tools are provided by our prime broker, Interactive Brokers. Because we are prop, we don’t need additional tools yet. Interactive Brokers is a good “startup” brokerage but lacks the more sophisticated capabilities and network of the larger prime brokerage providers. For example, Interactive Brokers has much larger slippage costs. We fortunately don’t trade all that often, trading no more than 1x daily, but this would be problematic for different strategies.
David Teten: What are the tools you’re using for your back office: settlements, records maintenance, accounting, human resources, etc.? What are the strengths and weaknesses of these providers?
Interactive Brokers bleeds into some of this. We take all of our trade information and push it through to QuickBooks for taxation purposes. Again, this is admittedly lightweight given that we don't have outside capital.
David Teten: A huge amount of valuable data flows through your pipes. What are you doing to capture that data and mine it? Can you share any patterns you have identified?
We've built our own data protocols (we call them Velo internally) since traditional data science tools (e.g., Pandas, Parquet) do not handle large, real-time data sets gracefully. We estimate that our compute costs and processing speeds beat traditionally accepted tools by more than an order of magnitude.
For example, Velo's file format is more than 70% smaller than Parquet, depending on fill ratios. We've enabled conventional compression algorithms like gzip and zstd to achieve higher compression ratios by favoring repetitiveness in the file structure over the more complex layouts of formats like Parquet.
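Velo itself is proprietary, but the mechanism can be sanity-checked with a stand-in: compress the same column as a flat, repetitive byte layout versus a Parquet file using the same codec. The sparse column below and the exact sizes are illustrative only; results vary with the data and fill ratio.

```python
import gzip, os
import numpy as np
import pyarrow as pa
import pyarrow.parquet as pq

rng = np.random.default_rng(0)
# A sparsely filled column (low fill ratio): mostly zeros, occasional values.
col = np.where(rng.random(1_000_000) < 0.05, rng.standard_normal(1_000_000), 0.0)

# Flat, repetitive layout: raw column bytes run through a conventional codec.
flat_size = len(gzip.compress(col.tobytes(), compresslevel=6))

# The same data in Parquet with the same codec, for comparison.
pq.write_table(pa.table({"x": col}), "col.parquet", compression="gzip")
parquet_size = os.path.getsize("col.parquet")

print(flat_size, parquet_size)
```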
Velo automatically generates performant low-level C++ code, wrapped into user-friendly high-level C++ and Python (other languages are technically possible). A Velo-encoded file parses at > 5GB/s.
Streaming mode allows Velo data to be processed while it is being read and decoded, dramatically decreasing both output latency and memory footprint. In streaming mode, Velo loads the file one small chunk at a time, so memory usage stays low and throughput improves. This is particularly effective at reducing the cost of a serverless architecture like AWS Lambda, which explicitly charges for RAM usage. Streaming mode also lets Velo bring the benefits of modern multicore CPUs to inherently single-core technologies like Pandas: each chunk of the Velo file is handed to Pandas on a separate core, saving time on serverless or money on a virtual machine by keeping RAM usage tight.
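Velo's streaming API isn't public, but the general pattern it describes, reading one small chunk at a time and handing each chunk to Pandas on its own core, looks roughly like the sketch below. The CSV source and the "symbol"/"notional" columns are hypothetical stand-ins for a Velo-encoded file.

```python
from concurrent.futures import ProcessPoolExecutor
import pandas as pd

def summarize(chunk: pd.DataFrame) -> pd.Series:
    # Per-chunk work runs in its own process, so single-core Pandas still ends up
    # using every core, and peak memory stays near one chunk per worker.
    return chunk.groupby("symbol")["notional"].sum()

def streamed_aggregate(path: str, chunksize: int = 250_000) -> pd.Series:
    reader = pd.read_csv(path, chunksize=chunksize)  # one small chunk at a time
    with ProcessPoolExecutor() as pool:
        partials = list(pool.map(summarize, reader))
    # Combine the per-chunk partial results into the final aggregate.
    return pd.concat(partials).groupby(level=0).sum()
```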
We otherwise use Python and C++, as do most quantitative funds. There are various frameworks we use, such as NumPy, but we are not overly reliant on a set way of doing things.
David Teten: Do you see any room to use AI to exploit your dataset? If so, what are you doing to move that forward?
Before the AI craze, statistics was simply called statistics, or perhaps data science. Everything we do could be considered "AI", but we exist at the foundational levels of mathematics. Gradient boosting and neural nets are great for marketing, but they often underperform good mathematical hygiene.
The misfortune of today's business cycle is that AI is extolled as a panacea, so people are actually penalized for thinking correctly about a problem and instead default to using "AI" to solve whatever it is.
Not only does this almost always end with inferior results, it's also more costly to stand up.
David Teten: What are the most creative or unusual ways you’re using AI & analytics in your organization?
We've invented machine learning approaches that, as far as the literature tells us, don't exist. We view them as proprietary and are thus reluctant to share much more 🙂
David Teten: What are your unmet technology needs? Places in your firm where you’re seeking a solution and haven’t found an appropriate one?
We’re probably the outliers here since we often build our own tooling to wrangle data.
David Teten: What processes are you focused on improving?
More resources for research improvements would be great. We’d have to scale AUM to justify it.