In every data warehouse there’s a goldmine of customer and product data just waiting to be enriched with AI.
Think customer feedback classified as positive or negative. Product descriptions enhanced with competitor insights. Customer records enriched with external context.
With the ChatGPT API and Fabric, this enrichment is now both possible and practical.
To demonstrate this, I’ll walk through a simple example.
I have a lakehouse in Fabric with a customer table containing the top 25 companies in Denmark.
Using that, we’ll enrich the data by looking up their reported revenue for 2020 through the ChatGPT API.
Of course, if you’re building a production-grade data pipeline, it makes far more sense to fetch actual revenue data directly from official sources — such as the CVR API (virk.dk), which provides structured financial and registration data on all Danish companies.
However, the purpose of this example is not accuracy, but simplicity.
This demonstrates just how easy it is to call the ChatGPT API and bring external knowledge into your lakehouse — with just a few lines of Python.
The reason for using 2020 is that the API is reluctant to give specific, up-to-date turnover figures, so in this example it is simply easier to work with 2020 data.
To start, my table looks like this:

As of now, I have an empty column called Turnover2020, and that is the one I will enrich with data from ChatGPT.
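If you want to inspect the table from a notebook, a quick query like the sketch below does it. The table name customers and the column names are assumptions for this illustration; use the names in your own lakehouse.

```python
# Peek at the customer table; Turnover2020 is still empty at this point.
# "customers", "CompanyName" and "Turnover2020" are assumed names - adjust to your lakehouse.
display(spark.sql("SELECT CompanyName, Turnover2020 FROM customers"))
```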
To get started, you will need an OpenAI account and, after that, an OpenAI API key.
To get the key, go to https://platform.openai.com/
If you don’t have an account, click ‘Sign up’. To use what I show here, you will also need credit on your account, so add that as well.
Create a new project, then go to ‘User Settings’ > ‘API keys’. Click ‘Create new secret key’ as shown below.

In our example we will call it DWHEnrichment

When it has been created, copy the key.
When this is done, we create a notebook on our lakehouse and enter the code below. I have added comments in the code to explain it.

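A minimal sketch of what that notebook cell can look like is shown here. It assumes the lakehouse table is named customers, that it has a CompanyName column and a text column Turnover2020, and that the openai Python package (v1 or later) is available in the session; adjust the names, key handling and model to your own setup.

```python
# Install the OpenAI SDK in the notebook session if it is not already there
# %pip install openai

from openai import OpenAI

# Assumptions: replace the key (ideally loaded from a secret store, not hard-coded)
# and the table name with your own.
client = OpenAI(api_key="<YOUR_OPENAI_API_KEY>")
TABLE_NAME = "customers"          # assumed lakehouse table
MODEL = "gpt-3.5-turbo"           # any chat completions model will do

def get_turnover(company_name: str) -> str:
    """Ask the model for the company's 2020 turnover and return the raw answer."""
    response = client.chat.completions.create(
        model=MODEL,
        temperature=0.9,
        messages=[
            {
                "role": "user",
                "content": (
                    f"What was the reported revenue (turnover) of {company_name} "
                    "in 2020? Answer with a number only - no text, no currency. "
                    "If you do not know, answer 'unknown'."
                ),
            }
        ],
    )
    return response.choices[0].message.content.strip()

# Read the customer table from the lakehouse. With only 25 companies,
# collecting it to pandas and looping row by row keeps the code simple.
pdf = spark.read.table(TABLE_NAME).toPandas()
pdf["Turnover2020"] = pdf["CompanyName"].apply(get_turnover)

# Write the enriched rows back to the lakehouse table
spark.createDataFrame(pdf).write.mode("overwrite").saveAsTable(TABLE_NAME)

display(spark.read.table(TABLE_NAME))
```

For larger tables you would want batching, a UDF, or at least some rate limiting, but for 25 rows a plain loop is perfectly fine.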
To explain some of it in more detail: in the prompt we send to OpenAI, we specify that it should answer with a number only and nothing else, because that is the format we need in our DWH.
We set the temperature to 0.9. For the chat completions API it is a value between 0 and 2; the closer to 0, the more deterministic and precise the answers. But to coax the model into actually giving a specific turnover figure, we set it to 0.9 here.
The result from the code looks like this:

So, as of now, it can’t give us a result for every company, but we can handle that in our ETL flow later, as sketched below.
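One simple way to deal with the missing or non-numeric answers in the ETL step is to cast the raw text to a number and let everything that does not parse become NULL. A sketch, again assuming the customers table and column names from above:

```python
from pyspark.sql import functions as F

# Table and column names are assumptions carried over from the sketch above.
df = spark.read.table("customers")

# Casting the raw model answer to a double turns anything that is not a plain
# number (for example "unknown" or a full sentence) into NULL, which is easy
# to filter, flag or backfill later in the ETL flow.
cleaned = df.withColumn("Turnover2020Numeric", F.col("Turnover2020").cast("double"))

display(cleaned.select("CompanyName", "Turnover2020", "Turnover2020Numeric"))
```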
Conclusion
This example shows how easy it is to enrich structured data with external insight using generative AI.
With just a company name and a simple API call, we can add estimated financial values directly into our Fabric-based data warehouse.
Is the data perfect? No — ChatGPT is not a replacement for audited reports or official sources. But as a tool for prototyping, exploring data potential, or adding indicative values for segmentation or prioritization, it’s incredibly powerful.
As generative AI evolves, so will its ability to support data enrichment pipelines. And with tools like Fabric and OpenAI, the barrier to trying it out is lower than ever.