Searching for a Black Cat in a 2000-Dimensional Dark Room
I ran a stress test for machine learning algorithms.
Most benchmarks are boring. They use simple datasets like MNIST or Titanic. I wanted to push models to their breaking point.
I pitted 21 algorithms against each other. This included:
- Traditional models: Linear Regression, k-NN, SVR.
- Tree ensembles: Random Forest, ExtraTrees.
- Boosting heavyweights: XGBoost, LightGBM, CatBoost, HistGradientBoosting.
- Neural Networks: Multi-layer perceptrons and TabNet.
- The underdog: Polyharmonic Cascade.
The task looked simple. I asked models to learn a complex 3D surface. But then I added two massive hurdles:
- Dimensionality noise: I gave them 2,000 features. Only two carried signal. The other 1,998 were pure noise. This mimics real-world data like genomics or sensor readings.
- Coordinate rotation: I rotated the entire feature space. The useful signal was no longer aligned with any single column. It was smeared across all 2,000 dimensions.
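Here is a minimal sketch of that setup in NumPy. It is not the benchmark's actual code: the function name `make_rotated_dataset`, the sample count, and the target surface `sin(3*x0) * cos(3*x1)` are assumptions chosen for illustration. The rotation is a random orthogonal matrix taken from a QR decomposition, which is one common way to rotate a feature space without changing distances.

```python
import numpy as np

def make_rotated_dataset(n_samples=200, n_features=2000, seed=0):
    """Two informative features, the rest noise, then a random rotation."""
    rng = np.random.default_rng(seed)

    # All features are drawn the same way; only columns 0 and 1 feed the target.
    X = rng.uniform(-1.0, 1.0, size=(n_samples, n_features))

    # Hypothetical 3D surface z = f(x0, x1); the original surface is not specified here.
    y = np.sin(3.0 * X[:, 0]) * np.cos(3.0 * X[:, 1])

    # Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
    A = rng.standard_normal((n_features, n_features))
    Q, R = np.linalg.qr(A)
    Q = Q * np.sign(np.diag(R))  # sign fix so Q is uniformly distributed

    # After this step every column is a mix of all original columns.
    X_rot = X @ Q
    return X, X_rot, y, Q

X, X_rot, y, Q = make_rotated_dataset()
print(X.shape, X_rot.shape, y.shape)
```

Because `Q` is orthogonal, `X_rot @ Q.T` recovers `X` exactly (up to floating-point error). No information is lost in the rotation; it is only moved away from the column axes.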
The results were shocking.
Tree-based models like XGBoost and LightGBM are kings of tabular data. They split on one column at a time, so they win when the signal aligns with columns. But when I rotated the space, they collapsed. No single column carried enough signal for a split to find it in the noise.
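To see why rotation hurts axis-aligned splits, here is a small NumPy-only sketch. It is an illustration under assumptions, not part of the benchmark: the helper `best_stump_gain`, the 50-feature toy size, and the target `sin(3*x0) + x1**2` are all hypothetical. It measures how much of the target's variance the single best threshold split on a single column can remove, before and after a random rotation.

```python
import numpy as np

def best_stump_gain(X, y):
    """Largest fraction of Var(y) removed by one threshold split on one column."""
    n = len(y)
    total_sse = np.sum((y - y.mean()) ** 2)
    k = np.arange(1, n)  # possible sizes of the left side of a split
    best = 0.0
    for j in range(X.shape[1]):
        ys = y[np.argsort(X[:, j])]  # target sorted by this column
        csum, csum2 = np.cumsum(ys), np.cumsum(ys ** 2)
        # Sum of squared errors on each side, for every split position at once.
        left_sse = csum2[:-1] - csum[:-1] ** 2 / k
        right_sse = (csum2[-1] - csum2[:-1]) - (csum[-1] - csum[:-1]) ** 2 / (n - k)
        gain = 1.0 - (left_sse + right_sse).min() / total_sse
        best = max(best, gain)
    return best

rng = np.random.default_rng(0)
n, d = 400, 50  # toy sizes so the loop runs quickly
X = rng.uniform(-1.0, 1.0, size=(n, d))
y = np.sin(3.0 * X[:, 0]) + X[:, 1] ** 2  # hypothetical target using two columns

Q, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal matrix
X_rot = X @ Q

gain_aligned = best_stump_gain(X, y)
gain_rotated = best_stump_gain(X_rot, y)
print(f"aligned: {gain_aligned:.2f}  rotated: {gain_rotated:.2f}")
```

On the aligned data, one split on the first column removes most of the variance. After rotation, every column is only weakly correlated with the signal, so the best single split removes far less. A real tree ensemble stacks many such splits, but each one faces the same handicap.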
Neural networks survived the rotation, but they struggled with the high dimensionality. They became slow and lost accuracy as noise increased.
Then there was the Polyharmonic Cascade.
This model does not use standard gradient descent. It uses pure mathematics based on random function theory. While the heavyweights failed, the Cascade thrived. It handled the rotation and the 2,000 features with ease. It outperformed almost every other participant in the hardest rounds.
The lesson is clear: modern tabular ML is often axis-dependent. It works great as long as the signal lines up with your columns. If you work with complex, rotated, or highly noisy data, your standard tools might fail you.
You can find the full code and results on GitHub. I invite you to replicate this experiment.
