Generative AI in Practice: Techniques and Tips for Real-World Applications

Torsten Volk
FAUN — Developer Community 🐾
8 min read · Mar 15, 2024

--

One of my favorite features of GPT (generative pre-trained transformer) is its ability to ‘read stuff for me’ and provide a handy digest of the key themes and topics from large quantities of complex, highly technical text. For example, to identify and summarize the most important themes from the KubeCon 2024 cloud native conference in Paris, you simply feed it the 493 session titles and descriptions and wait a few seconds. The resulting summary is a fantastic digest of the key themes and topics at KubeCon that would take us humans many hours to create manually. I generally do not like to use the term ‘game changing’ at all, but the ability to pre-process vast amounts of unstructured data for humans to ingest at a glance is absolutely worthy of that label.

Unedited summary of the 493 KubeCon session titles and descriptions by GPT4-Turbo (March 15, 2024)

Learning through Experimentation

But relying on GPT’s ability to augment our limited human ability to process data comes with the responsibility for us humans to understand, at least in broad strokes, how this is done. Instead of starting with the technical part of things, let us run a quick experiment.

Experiment: Repeating the Same Analysis Five Times

What happens if you make GPT solve the same task of summarizing all 493 KubeCon 2024 session titles and descriptions, with a combined 66,432 words, five times in a row? If you want to rely on GPT to complete this type of task for you, it is crucial to understand the answer to this question.

Model Input

I used the exact same prompt, dataset (493 session summaries from KubeCon 2024 in Paris), and model parameters for this experiment. To prevent previous completions of the task from influencing subsequent ones, I made sure to discard the session state after each summarization request. Then I compared the results. The sketch below illustrates this setup.
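Here is a minimal sketch of the five-run experiment, assuming the OpenAI Python client; the model name, file name, and prompt are illustrative placeholders, not the exact ones used for the experiment. Because each request starts from a fresh message list, no prior completion can influence the next one.

```python
# Minimal sketch of the five-run experiment (assumes the OpenAI Python
# client; model name, file name, and prompt are illustrative placeholders).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = "Identify and summarize the most important themes across these sessions:"
sessions_text = open("kubecon_2024_sessions.txt").read()  # 493 titles + descriptions

summaries = []
for run in range(5):
    # A fresh messages list per request: no session state carries over
    # from one summarization run to the next.
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": f"{PROMPT}\n\n{sessions_text}"}],
    )
    summaries.append(response.choices[0].message.content)

for i, summary in enumerate(summaries, 1):
    print(f"--- Summary {i} ---\n{summary[:300]}...\n")
```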

Test Results

Comparison of the 5 summaries (extract)

After requesting the same summary of all KubeCon sessions five times and storing the results in an overview table, here are the key findings:

Consistent Identification and Explanation of Core Topics

GPT4-Turbo-generated topic map of the key concepts across all five summaries.

All 5 summaries included the three core topics that are ever-present across all sessions at KubeCon 2024: AI/ML, Security and Compliance, and Sustainability. Knowing that GPT can be relied upon to retrieve the most important information from the 66,432-word text is critical for us humans to trust the model with important tasks. While each summary included all three of these key topics, the individual summaries emphasized slightly different aspects of each topic:

Summary 3 (S3) looked at AI/ML from three different angles ((1) AI/ML for platform and development optimization, (2) AI/ML workloads on Kubernetes, and (3) AI and data processing tasks), while S1 and S5 only looked at AI/ML on Kubernetes, S4 focused only on LLMs and AI at the edge, and S2 focused on integrating AI/ML capabilities with cloud native application architecture. All of these summaries are completely legitimate and could have originated from a human subject matter expert instead of being authored by GPT. Most importantly, using only one of the five summaries would still provide the human reader with a comprehensive understanding of the three core topics of KubeCon 2024.

Explaining the Differences: There Is No One Best Way

While it is impressive that GPT correctly identified the key topics, how could a large language model (LLM) ‘decide’ to give a different flavor to each one of the five summaries? Isn’t the LLM supposed to precisely calculate the ‘one best’ way of summarizing my content?

Similarities to Human Information Ingestion

If I had taken many hours to sort through the 66,432 words of the 493 KubeCon 2024 sessions, I would have made my own intuitive choices about which aspects of the three core topics to emphasize. My initial choices would have individually and collectively influenced later ones. For example, the terms ‘Kubernetes’, ‘GPU’, and ‘scalability’ might have caught my eye in a few session titles and descriptions early on, and I may have made the subconscious decision to focus my summary of AI/ML on the ability of Kubernetes to run AI/ML workloads that consume GPUs in a scalable manner. Alternatively, my attention may have been caught by sessions that were all about harnessing AI/ML to optimize the operation of cloud native environments. In that case, my summary may have emphasized this aspect of AI/ML within the context of KubeCon 2024. In a nutshell, one small change in how I read a certain sentence or section can lead my brain to focus on one aspect of a topic over another when writing my KubeCon session summary. This is very similar to how GPT works as well.

Shifts in Attention

GPT4 exposes a user-configurable degree of freedom, known as the temperature parameter, that determines how the LLM handles uncertainty. Even the minimum amount of freedom provided, as in the case of my analysis of the KubeCon session catalog, introduces a small amount of stochasticity into the model’s text perception. The longer the text the model is asked to summarize, the more even these small amounts of randomness can add up, similar to the human eye ‘catching’ different aspects of a large body of text (as described in the previous paragraph). The sketch below shows how this plays out at the level of individual tokens.
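To make this concrete, here is a small, self-contained sketch of temperature-scaled token sampling; the vocabulary and logit values are made up for illustration and do not come from GPT4. Even at a low temperature, an alternative token is occasionally picked, and over tens of thousands of words these small divergences compound into noticeably different summaries.

```python
import numpy as np

def sample_next_token(logits, temperature, rng):
    """Sample one token index from temperature-scaled logits (softmax sampling)."""
    scaled = np.array(logits) / temperature   # lower temperature sharpens the distribution
    probs = np.exp(scaled - scaled.max())     # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(logits), p=probs)

# Made-up logits for four candidate continuations of "AI/ML on ..."
vocab = ["Kubernetes", "GPUs", "the edge", "cloud"]
logits = [4.0, 3.6, 3.2, 2.5]

rng = np.random.default_rng(seed=7)
for temperature in (0.2, 0.7):
    picks = [vocab[sample_next_token(logits, temperature, rng)] for _ in range(10)]
    print(f"T={temperature}: {picks}")
# At T=0.2 the top token dominates almost every draw; at T=0.7 the
# alternatives appear more often. This per-token randomness is what
# accumulates across a long summary into different emphases.
```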

But Isn’t There Only One Truth?

The stochastic nature of LLMs allows them to explore different interpretations of the text, leading to summaries that might focus on slightly different angles of the same set of themes. You can compare this to the model playing the role of five different A-students at school. Each one of them will get an A for their class work, despite all five essays having their own take on exactly how to complete the task. In other words, randomness is a concept inherent to us humans, too.

How Do the Results Compare to Traditional Algorithms?

TF/IDF is the most widely used algorithm for determining the importance of terms within large collections of text. TF/IDF always follows the same formula, and therefore, when you run it five times, the results will be identical each time. TF/IDF is effective because it highlights words that are characteristic of a document while filtering out common words that appear in many documents (such as “the”, “is”, and “and”). This makes it a fantastic tool for tasks like search engine optimization, where the goal is to identify and prioritize content that is most relevant to a specific query. It is also widely used in document classification, information retrieval, and natural language processing applications to improve accuracy by focusing on the most relevant terms. The sketch below shows how such a deterministic ranking is computed. Let’s look at the TF/IDF chart that shows the top 20 most important terms of the 493 KubeCon sessions and compare it to the GPT4-Turbo summary.
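For reference, here is a minimal sketch of such a term ranking using scikit-learn’s TfidfVectorizer; the file name is an illustrative placeholder, and this is not necessarily how the chart in this article was produced. Note that, unlike the GPT experiment above, running this five times produces identical output every time.

```python
# Deterministic TF/IDF term ranking with scikit-learn
# (file name is an illustrative placeholder).
from sklearn.feature_extraction.text import TfidfVectorizer

# One document per session title + description.
with open("kubecon_2024_sessions.txt") as f:
    sessions = [line.strip() for line in f if line.strip()]

# stop_words="english" filters out common words like "the", "is", "and".
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(sessions)

# Rank terms by their summed TF/IDF weight across all sessions.
scores = tfidf.sum(axis=0).A1  # flatten the 1 x vocabulary matrix
terms = vectorizer.get_feature_names_out()
top20 = sorted(zip(terms, scores), key=lambda pair: pair[1], reverse=True)[:20]

for term, score in top20:
    print(f"{term:20s} {score:.2f}")
# Same input, same formula, same output -- on every run.
```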

As expected, based on the GPT4 results, we find AI and security at the top of the chart. However, sustainability did not make it onto this chart. Why not? Did GPT hallucinate, given that it included the term in all five of its summaries? When looking at the complete body of text (all 493 session descriptions), it quickly becomes clear that ‘sustainability’ can be found ‘between the lines’ of a large number of sessions without being explicitly mentioned. The ability to pick up on this fact is a key strength of an LLM and the reason why LLMs require far more compute cycles than a quick TF/IDF calculation. These cycles are used to ‘understand’ underlying themes that are not described with explicit keywords. These are the themes missed by the purely quantitative analysis of TF/IDF.

Training as the Key Difference Between LLMs and TF/IDF

LLMs are trained on billions of documents that differ in style, type, and content. This provides LLMs with an ‘understanding’ of reality that goes far beyond ‘word counting’:

Contextual Understanding: LLMs are trained on a diverse dataset that includes a wide range of topics, discussions, and literature. This training enables them to understand context deeply and recognize themes that are becoming increasingly relevant, even if they are not the main focus of a text. For instance, sustainability might be mentioned in relation to technology and innovation in various contexts, allowing the LLM to understand its growing importance across industries.

Pattern Recognition: LLMs are adept at recognizing patterns in text that may indicate emerging trends. They can detect subtle cues and references that might elude a more straightforward analysis, such as the mention of “green technology” or “eco-friendly practices” in discussions about cloud-native applications and AI, signaling the theme of sustainability.

Cross-referencing: Through its extensive training data, an LLM can cross-reference the information presented in a static text with broader discussions happening across multiple sources. If the model notices that topics like AI and security are frequently discussed alongside sustainability in the broader tech discourse, it might infer the emerging importance of sustainability in the context of a conference like KubeCon.

Inference and Deduction: LLMs can make inferences based on the relationships between different concepts and topics. Even if a theme like sustainability is not explicitly highlighted as a main topic, the model can deduce its relevance from related discussions, such as the impact of technology on the environment or the industry’s response to climate change.

Semantic Analysis: Beyond just identifying keywords, LLMs perform semantic analysis to understand the meaning and significance of terms within the text. This allows them to grasp the importance of themes based on how they are discussed, even if they are not among the most frequently mentioned terms (see the sketch after this list for a simplified illustration of this idea).
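As a rough illustration of semantic analysis (not the mechanism GPT4-Turbo uses internally), here is a sketch using the sentence-transformers library: an embedding model can score a session description as related to ‘sustainability’ even when the word itself never appears. The model name and the example sentences are assumptions chosen for illustration.

```python
# Rough illustration: embeddings surface a theme that keyword counting
# misses. Model name and example sentences are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

theme = "sustainability"
sessions = [
    "Cutting the carbon footprint of your Kubernetes clusters",  # implies the theme
    "Energy-aware scheduling for GPU workloads",                 # implies the theme
    "Debugging gRPC connection timeouts in production",          # unrelated
]

theme_vec = model.encode(theme, convert_to_tensor=True)
session_vecs = model.encode(sessions, convert_to_tensor=True)

# Cosine similarity: higher = semantically closer to the theme, even
# though none of the sessions contains the word 'sustainability'.
for text, score in zip(sessions, util.cos_sim(theme_vec, session_vecs)[0]):
    print(f"{score.item():.2f}  {text}")
```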

While this article only scratches the surface of how LLMs really work, my goal is to give LLM users a conceptual understanding of what to expect when using LLMs for text processing tasks. In a follow-on article, I will provide a number of practical tips to help practitioners get the most out of LLMs.


--


Artificial Intelligence, Cognitive Computing, Automatic Machine Learning in DevOps, IT, and Business are at the center of my industry analyst practice at EMA.