
The Paradox of Enterprise AI: Evaluating Popularity versus Importance

2 min read · May 3, 2024

I propose that the most prevalent enterprise AI applications today aren’t necessarily those addressing the most critical issues or generating the highest revenue. Instead, they tend to be the ones easiest to assess.

Let’s examine some typical enterprise AI use cases: recommender systems, fraud detection, code generation, and LLM-powered classification.

Recommender Systems: Success is typically gauged by metrics such as increased engagement or purchase rates.
Fraud Detection: Evaluation revolves around the amount of money saved by preventing fraud.
Coding with LLMs: Unlike many text generation tasks, generated code can be assessed functionally. It is considered correct if it compiles and produces the expected outputs.
Classification Tasks: Despite the open-ended nature of LLMs, approximately a third of observed applications are closed-ended, such as intent classification. Closed-ended tasks are more straightforward to evaluate than open-ended ones, because each prediction can be compared directly against a known label.
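
To make the last two points concrete, here is a minimal Python sketch of what "easy to assess" can look like in practice. It is an illustration under stated assumptions, not a description of any particular evaluation harness: the helper names `functionally_correct` and `accuracy`, the sample `add` function, and the intent labels are all hypothetical. Note that running generated code with `exec` is only acceptable in a sandbox; real evaluations isolate this step.

```python
from typing import Any


def functionally_correct(source: str, func_name: str,
                         cases: list[tuple[tuple, Any]]) -> bool:
    """Return True if `source` compiles, defines `func_name`,
    and that function returns the expected output for every case."""
    namespace: dict[str, Any] = {}
    try:
        # Step 1: does the generated code compile and load?
        exec(compile(source, "<generated>", "exec"), namespace)
        func = namespace[func_name]
        # Step 2: does it produce the expected outputs?
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


def accuracy(predicted: list[str], gold: list[str]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if not gold or len(predicted) != len(gold):
        raise ValueError("predicted and gold must be non-empty and of equal length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical generated code and test cases (assumptions for illustration).
generated = "def add(a, b):\n    return a + b\n"
test_cases = [((1, 2), 3), ((0, 0), 0)]
print(functionally_correct(generated, "add", test_cases))

# Hypothetical intent-classification predictions versus gold labels.
print(accuracy(["refund", "cancel", "refund"], ["refund", "cancel", "billing"]))
```

Both checks reduce to a pass/fail comparison or a single number, which is the property the argument above relies on. An open-ended task such as summarization has no equally direct test.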

From a business standpoint, this emphasis on measurable outcomes is understandable, as companies seek tangible returns on investment. However, if this hypothesis holds true, it leads to two significant implications.


Written by Amber Ivanna Trujillo
