Reinventing Data & AI: Unlocking the Potential of Generative AI

By Subhashis Nath, AVP and Head of Analytics, Infosys.

3 months ago Posted in AI Tech & Trends

No one contests the enormous potential that generative AI holds for business transformation, and even for data transformation. And yet, when one probes deeper for real-world examples of deployment at scale, the silence one is met with is broken only by concerns around data availability and reliability, regulations, and trust issues with AI and the underlying data.

Allow me to share a telling example from the Consumer-Packaged Goods (CPG) industry. Consider the task of preparing data for an AI model designed to predict market share based on product attributes. In this sector, product attributes are frequently represented by abbreviated, cryptic descriptions, such as:

· APLG NTR EGG CHCK SSG RED PR ON SPNC FRT KP FRZN SNGL CRS TRY IN BOX 8.4 OZ.

Generative AI, when appropriately trained and guided by prompts that confine its interpretations to food brands, products, and properties, can decode this into “Applegate Natural Egg Chicken Sausage Red Pepper Onion Spinach Frittata Keep Frozen”. Similarly, “GD FD MD SMP EGG WHT PT KP FRZN SNGL CRS BAG 10 OZ” is translated to “Good Food Made Simple Egg-White Patty Keep Frozen”.

The complexities start to emerge with two-letter terms, as in “EVL KP FRZN BRK SSG UNC BCN PT EG BLPR CHD PRM CH SC SNGL CRS TRY IN BOX 7.5 OZ”, which the AI translates to “Evol. Keep Frozen Breakfast Sausage Uncured Bacon Patty Egg Bell Pepper Cheddar Parmesan Cheese Sauce”. That, however, would be incorrect: the PT should have been translated to Potato instead of Patty.

But an AI agent, unlike a seasoned human agent, wouldn’t know that. One option would be to train the AI with samples that humans have worked on, but no sample set covers all possible abbreviations. Looking up a code (UPC) table, which also may not comprehensively cover all attributes, is not a solution either.

A more robust approach leverages the AI's ability to accurately translate longer, less ambiguous segments within the description. For instance, “Evol. Frozen Breakfast Sausage Uncured Bacon Bell Pepper Cheddar Parmesan” can be reliably extracted. This partial name serves as a query for an AI agent to search web sources, including online grocers and brand websites, for matching product descriptions or images. By triangulating the initial translation with the web-sourced description, the AI can identify and correct errors. In this case, it clarifies that "PT" should be "Potato," not "Patty," within the product's context.
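The triangulation step can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `CANDIDATES` table, the `resolve` function, and the `web_description` string, which merely stands in for text an agent might retrieve from a grocer or brand website. Matching is done by naive substring search; a real pipeline would likely need fuzzier comparison.

```python
# Hypothetical expansion table: each abbreviation maps to its candidate readings.
CANDIDATES = {
    "PT": ["Patty", "Potato"],
    "EG": ["Egg"],
    "BLPR": ["Bell Pepper"],
}


def resolve(abbreviation, reference_text, candidates=CANDIDATES):
    """Pick the candidate expansion that occurs in the reference description.

    Returns None when no candidate, or more than one, matches, so the item
    can be routed to a human reviewer instead of being guessed.
    """
    options = candidates.get(abbreviation, [])
    if len(options) == 1:
        # Unambiguous abbreviation: nothing to triangulate.
        return options[0]
    reference = reference_text.lower()
    matches = [option for option in options if option.lower() in reference]
    return matches[0] if len(matches) == 1 else None


# Illustrative stand-in for a web-sourced description (not a real listing).
web_description = "Uncured Bacon, Potato, Egg, Bell Pepper, Cheddar and Parmesan Cheese Sauce"

print(resolve("PT", web_description))
```

Returning `None` rather than a best guess keeps ambiguous cases visible, which matters when no sample set or lookup table covers every abbreviation.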

Thus, translating data and getting it ready for AI is a vital pre-task.

When faced with enterprise-scale data preparation, leveraging the existing Business Intelligence (BI) landscape offers a significant advantage. Consider, for example, the BI-defined equation “revenue = invoiced amount - value of returns”, which can serve as a reliable reference for AI agents needing to source revenue data. This eliminates the ambiguity arising from multiple potential data sources, such as sales tables, online business tables, and after-sales service tables. BI translations, acting as a curated data layer, provide reusable definitions and data elements for key organisational metrics, accelerating AI's data utilisation. The BI layer effectively acts as a distilled, trustworthy layer above the raw data.
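As a rough sketch of how an agent might consume such a definition, the Python snippet below keeps the revenue equation in a small metric registry. The names `MetricDefinition`, `METRICS`, and `evaluate_metric`, along with the field names `invoiced_amount` and `value_of_returns`, are hypothetical and do not refer to any particular BI product.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class MetricDefinition:
    """A curated BI definition that agents reuse instead of guessing at sources."""
    name: str
    formula: str  # human-readable form of the definition
    compute: Callable[[Dict[str, float]], float]


# Hypothetical registry holding the single definition quoted in the text.
METRICS = {
    "revenue": MetricDefinition(
        name="revenue",
        formula="invoiced_amount - value_of_returns",
        compute=lambda row: row["invoiced_amount"] - row["value_of_returns"],
    ),
}


def evaluate_metric(metric_name, row):
    """Evaluate a metric strictly through its curated definition."""
    definition = METRICS.get(metric_name)
    if definition is None:
        raise KeyError(f"No curated BI definition for metric: {metric_name}")
    return definition.compute(row)


print(evaluate_metric("revenue", {"invoiced_amount": 1000.0, "value_of_returns": 150.0}))
```

An agent asked for a metric with no curated definition fails loudly here, rather than falling back on whichever raw table it happens to find first.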

Curating data for AI can be significantly expedited by repurposing the BI landscape.

Providing adequate access control for data in an agentic environment is key. Maintaining the "command signature," which tracks the request's origin throughout the agentic flow, is vital for upholding defined access privileges. For example, branch managers may be restricted from accessing sensitive real estate deal data, while procurement managers require such access; the AI assistants for these two personas must mirror these distinctions. Achieving this fidelity in a multifaceted, multi-agent system necessitates precise traceability of each request. However, this is not always straightforward. For example, pricing used in some deals for product A may not be available to a manager of product B when querying their AI assistant.

However, the AI assistant, in the interest of organisational efficiency, can make available to product B’s manager a price range suggestion such as ‘ideal price range for product B would be between $55 and $60; competing products have seen success when price rises are between 10% and 20% of the current price.’ This, however, implies that the command signature needs to specify both origin and purpose.
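One way to picture a command signature that carries both origin and purpose is the Python sketch below. The `CommandSignature` class, the `POLICY` table, the role and purpose labels, and the deal prices are all hypothetical, chosen only to mirror the product A and product B scenario; they do not describe any specific access-control framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommandSignature:
    """Travels unchanged with a request through every agent in the flow."""
    origin_role: str  # who initiated the request
    purpose: str      # why the data is being requested


# Hypothetical policy: (resource, origin role, purpose) -> level of detail allowed.
POLICY = {
    ("product_a_deal_pricing", "product_a_manager", "deal_review"): "raw",
    ("product_a_deal_pricing", "product_b_manager", "price_benchmarking"): "aggregate",
}

# Illustrative deal prices only; not real data.
DEAL_PRICES = [52.0, 55.0, 58.0]


def access_level(resource, signature):
    """Deny by default unless origin and purpose together match a policy entry."""
    return POLICY.get((resource, signature.origin_role, signature.purpose), "deny")


def fetch_pricing(signature):
    """Return raw prices, a (min, max) range, or refuse, based on the signature."""
    level = access_level("product_a_deal_pricing", signature)
    if level == "raw":
        return list(DEAL_PRICES)
    if level == "aggregate":
        return (min(DEAL_PRICES), max(DEAL_PRICES))
    raise PermissionError(
        f"{signature.origin_role} may not access deal pricing for {signature.purpose}"
    )


print(fetch_pricing(CommandSignature("product_b_manager", "price_benchmarking")))
```

Because the policy key includes the purpose, the same manager receives a range for benchmarking but is refused when the stated purpose is one reserved for raw deal data.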

Data access control has several layers of complexity that need to be navigated before AI gets to the data.

Translation, curation, and access management are essential for successful AI implementation. To unlock gen AI’s potential, significant effort must first be expended to get data ready for AI, ready for business.
