Generative AI for Network Operations: Building an AI CableLabs Expert

Austin Pahl
Architect

Tyler Glenn
Principal Engineer

Oct 16, 2024

Key Points

ChatGPT and other large language models (LLMs) are enabling new ways to interact with computers, but they often make up false information about real subjects. This tendency to “hallucinate” makes it hard to rely on LLMs for real-world applications.
CableLabs is building a Domain Expertise Layer that solves the hallucination problem by giving LLMs access to CableLabs specifications and SCTE standards as sources of truth for more accurate answers, backed by a growing validation dataset.

This is the first in a series of blog posts about large language models and CableLabs’ efforts to apply them to transform knowledge-based work across the cable industry.

What happens if you ask ChatGPT cable-related questions? To conjure an example, we asked the AI chatbot to describe adjacency misalignment and its symptoms. Here is ChatGPT’s response:

Adjacency misalignment, also known as adjacency mismatch, is a common issue in network routing. It occurs when two routers have different views of their network adjacency. In other words, one router believes it has a direct connection to another router, but the second router does not have the same understanding. This can happen due to various reasons such as incorrect configuration, network congestion, or hardware failure.

[ChatGPT then describes symptoms of this issue.]

If you're unfamiliar with adjacency misalignment, the ChatGPT definition sounds pretty convincing! If you are familiar with adjacency misalignment, however, you probably noticed right away that the answer is completely wrong. Adjacency misalignment is actually a radio frequency (RF) impairment. ChatGPT's answer wasn't even at the right layer of the network stack.

ChatGPT and other tools of its kind are amazing for what they're beginning to achieve across industries and use cases, but answers like this aren’t helpful at all. CableLabs is actively solving this problem to help bring unprecedented AI advancements to the cable industry. Let's see how.

Why Is ChatGPT So Confidently Wrong?

ChatGPT and other generative AI products are built on revolutionary machine learning (ML) technology commonly known as large language models, or LLMs. The ability to talk naturally to a computer was science fiction only a few years ago, but LLMs have made human-to-AI conversation easier than entering a Google search.

Unfortunately, LLMs often fall short when it comes to technical, domain-specific questions like the example above. Unless you already know the answer to your question, it is difficult to check whether the LLM’s response is factually correct.

At their core, LLMs are like extremely sophisticated “predict-the-next-word” machines. When you provide an LLM with a sequence of text, the text gets chopped into small chunks called “tokens,” which the LLM can process. Then, the LLM estimates how likely each possible token is to come next and picks a likely one. The method may sound basic, but when each predicted token is appended to the sequence and the process repeats, you get a chatbot!
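
To make the predict-the-next-word loop concrete, here is a minimal sketch in Python. It assumes a toy bigram model that simply counts which word follows which in a tiny made-up corpus. Real LLMs use neural networks over subword tokens, and the names `train_bigrams` and `generate` are illustrative only.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each token, which tokens follow it (a toy stand-in for an LLM)."""
    tokens = text.lower().split()  # crude whitespace "tokenizer"
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, prompt, max_new_tokens=5):
    """Repeatedly append the most likely next token to the sequence."""
    sequence = prompt.lower().split()
    for _ in range(max_new_tokens):
        candidates = follows.get(sequence[-1])
        if not candidates:
            break  # the toy model has never seen anything follow this token
        next_token = candidates.most_common(1)[0][0]
        sequence.append(next_token)
    return " ".join(sequence)

# Made-up training text: the node only ever "sends power" here.
corpus = "the modem sends data . the modem sends data . the node sends power ."
model = train_bigrams(corpus)
completion = generate(model, "the node", max_new_tokens=3)
print(completion)
```

Note that the toy model completes "the node" with the most common continuation of "sends" in its corpus, even though the corpus never said that about the node. The result is fluent but unsupported, which is a small-scale version of the behavior discussed next.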

The main reason LLMs act so confident when they're wrong is that optimizing for "predicting the next word" makes them expert at sounding convincing, but not at telling truth from falsehood. If they don’t “know” an answer to a question, they still try to sound legitimate, and this often leads to making something up. The AI community has called this behavior “hallucination,” because the LLM acts like its own made-up writings are real. Hallucinations are really just an extension of the LLM’s generative nature; the generated text is simply not rooted in fact.

A Huge Opportunity Within Reach

If we can solve the hallucination problem — that is, build an AI that can reliably give accurate answers to cable-related questions — then we have the potential to revolutionize all sorts of knowledge work across the industry, including field operations, customer experience and innovation.

The industry has decades of technical, operational and business knowledge just waiting to be harnessed to help enable more seamless connectivity and network platform evolution, as well as pervasive intelligence, security and privacy. These key themes form the foundation of CableLabs’ Technology Vision.

For that reason, CableLabs has been hard at work building our solution to the knowledge accuracy problem faced by LLMs.

The Solution: CableLabs Expert LLM

CableLabs' Expert LLM is a first-of-its-kind chat application that answers questions based on a domain-expertise layer consisting of hand-curated, authoritative sources on cable technologies. If we ask the CableLabs Expert LLM the same question from the beginning of this article, we get an answer like this:

Adjacency misalignment is an RF impairment recognizable as a difference in channel power between adjacent channels or groups of channels. It may be observed multiple times at different points in the spectrum and can be attributable to RF combining or source issues at the headend or hub or in a node utilizing a broadcast/narrowcast overlay or a split band configuration when the two spectrums are not combined correctly.

When adjacency misalignment is observed, the lower power channels may indicate poor performance, seen as a poor modulation error ratio (MER) when the delta between channels is large. This condition can manifest as lost packets, video tiling, freezing, or in very extreme cases, black screens at a customer’s home. Because adjacency misalignment is introduced very early in the downstream signal path, it has the potential to impact a significant number of customers.

Sources:

- SCTE 280
- CM-SP-PHYv4.0-I06-221019

Much better!

As you can see, the CableLabs Expert even cites its sources. Currently, the system has access to the specifications for DOCSIS 3.0, 3.1 and 4.0, as well as select SCTE documents including 280 and 40. Soon, we will expand support to other key sources of information related to cable broadband technologies.

The application supports all the latest state-of-the-art LLMs, including the GPT series, Claude, Mistral, Llama and many more. Whenever a useful new model comes out, the application can be extended to support it in minutes, without expensive fine-tuning or training from scratch.

The CableLabs Expert LLM's capabilities are mainly thanks to a powerful technique known as Retrieval Augmented Generation (RAG). In a nutshell, RAG is like giving an LLM an open-book test. When a user asks a question, the words are converted into a numerical representation known as a "vector embedding," and that representation is used to automatically pick out the snippets of the CableLabs specifications and SCTE standards that are most likely to contain the answer. The LLM is given those snippets as context so that it can produce an accurate, fact-based answer. Additionally, RAG can run on cheap, low-end hardware, unlike alternative methods such as fine-tuning, which requires GPUs to complete in a timely manner.
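
The retrieval step can be sketched in a few lines of Python. This is a simplified illustration, not the CableLabs implementation: it assumes a bag-of-words count vector in place of a learned embedding model, and the snippet store, the `doc-a`/`doc-b`/`doc-c` labels and all function names are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (real systems use a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, snippets, k=1):
    """Return the k snippets most similar to the question."""
    q = embed(question)
    ranked = sorted(snippets, key=lambda s: cosine(q, embed(s["text"])), reverse=True)
    return ranked[:k]

def build_prompt(question, retrieved):
    """Place the retrieved snippets in front of the question as context for the LLM."""
    context = "\n".join(f"[{s['source']}] {s['text']}" for s in retrieved)
    return (
        "Answer using only the context below and cite the sources.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

# Hypothetical snippet store; the first text is taken from the example answer above.
snippets = [
    {"source": "doc-a", "text": "Adjacency misalignment is an RF impairment recognizable as a difference in channel power between adjacent channels."},
    {"source": "doc-b", "text": "Placeholder paragraph about cable modem registration."},
    {"source": "doc-c", "text": "Placeholder paragraph about upstream scheduling."},
]

question = "What is adjacency misalignment?"
top = retrieve(question, snippets, k=1)
prompt = build_prompt(question, top)
print(prompt)  # this string would be sent to whichever LLM is plugged in
```

Because the LLM only receives the prompt string, swapping in a different model does not require touching the retrieval code, which fits the point above about supporting new models quickly.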

In addition to the chat interface, the CableLabs Expert application provides a comprehensive validation dataset and benchmarking framework to automatically evaluate models against a large body of known questions and answers from varied sources. Model evaluation is a critical part of this process: We must be able to precisely understand how well our system is performing, especially when comparing specific approaches, datasets or models.
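
As a rough picture of what automated evaluation against known questions and answers can look like, here is a minimal Python sketch. The scoring metric is not described here, so the sketch assumes a simple token-overlap F1 score as a stand-in. The one-entry dataset and the two toy "models" are hypothetical and reuse the example from earlier in this post.

```python
import re

def token_f1(predicted, reference):
    """Token-overlap F1 between a model answer and a reference answer."""
    pred = re.findall(r"[a-z0-9]+", predicted.lower())
    ref = re.findall(r"[a-z0-9]+", reference.lower())
    common = sum(min(pred.count(t), ref.count(t)) for t in set(pred))
    if common == 0:
        return 0.0
    precision = common / len(pred)
    recall = common / len(ref)
    return 2 * precision * recall / (precision + recall)

def evaluate(answer_fn, dataset):
    """Average score of answer_fn over a list of (question, reference) pairs."""
    scores = [token_f1(answer_fn(question), reference) for question, reference in dataset]
    return sum(scores) / len(scores)

# Hypothetical validation set with a single entry.
dataset = [
    ("What is adjacency misalignment?",
     "An RF impairment seen as a difference in channel power between adjacent channels."),
]

def grounded_model(question):
    """Stand-in for a system that answers from the source documents."""
    return "An RF impairment: a difference in channel power between adjacent channels."

def hallucinating_model(question):
    """Stand-in for a model that makes up a plausible-sounding answer."""
    return "A routing issue where two routers disagree about their connection."

grounded_score = evaluate(grounded_model, dataset)
hallucinated_score = evaluate(hallucinating_model, dataset)
print(f"grounded: {grounded_score:.2f}, hallucinated: {hallucinated_score:.2f}")
```

Running the same dataset through every candidate model, dataset or retrieval approach gives one comparable number per configuration, which is the kind of side-by-side comparison described above.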

Building for the Future

Generative AI is here to stay. ChatGPT captured the imagination of people around the world, across all business sectors and walks of life. Everybody agrees that it is a disruptive force, but the real question is who will disrupt and who will be disrupted. At CableLabs, we are building a better future for the broadband industry using cutting-edge AI technologies.

Foundational discussions are happening now between CableLabs and our members to bring the industry together for generative AI innovation and interchange standards.

Stay tuned for future blog posts on generative AI for network operations, in which we'll take a closer look under the hood of the CableLabs Expert LLM! Next time, we'll explore evaluation and analysis of the Expert's writings.

If you want to know everything about CableLabs' work with LLMs and RAG, check out our technical paper, "The Conversational Network: AI-Powered Language Models for Smarter Cable Operations," which was presented at TechExpo 2024.

Immersive Media

Gaussian Splatting: Painting Immersive Scenes With Reality

Austin Pahl
Architect

Nov 3, 2023

The state of the art of immersive media is evolving so rapidly that it’s hard to keep up! It often feels as if you can choose a random limitation in the latest research, wait a few weeks and find a new paper that’s solved that problem. Here, we’ll look at an example of exactly that, pushing the bar to get faster, higher-quality immersive content.

If you didn’t catch our previous post, we looked at Neural Radiance Fields (NeRF) for their ability to “memorize” beautiful, photorealistic 3D snapshots of the real world. In the past three years, a lively community has sprung up around NeRF. Developers and artists have created content, built tools and pushed boundaries on all the ways NeRF can be used.

One of NeRF’s biggest remaining limitations is that real-time interactive viewing of NeRF-based content generally requires reducing image quality, which can cause fog-like visual artifacts and color inaccuracies in the scene.

As it turns out, the answer to this problem came in a paper at SIGGRAPH 2023: 3D Gaussian Splatting for Real-Time Radiance Field Rendering. Despite having little to no conceptual connection to the original NeRF methodology, Gaussian Splatting dramatically improved both visual fidelity and performance of real-time viewing. The results speak for themselves: In just the few months since the paper was released, we’ve seen dozens of product enhancements and launches incorporating Gaussian Splatting functionality.

The Bridge at Argenteuil, Claude Monet, 1874 (Collection of Mr. and Mrs. Paul Mellon)

How It Works: Computer-Generated Impressionism

If you’re a fan of Monet or Renoir, you’re likely familiar with Impressionism. This 19th-century art movement is known for large, distinct brushstrokes and an emphasis on larger forms, as you can see in the above example. Look too closely and you’ll mostly see brushstrokes; the full scene comes together when you gaze at it from far enough away.

As it turns out, Impressionism is a useful analogy for Gaussian Splatting. Creating a scene with Gaussian Splatting is like making an Impressionist painting, but in 3D. The scene is composed of millions of “splats,” also known as 3D Gaussians. Each splat is like a voluminous cloud painted onto an empty 3D space, and each splat can show different colors from various angles to mimic view-dependent effects like reflections. When you build a scene from lots of small splats, the result can be amazingly photorealistic!

Here’s an example. I recorded this cellphone video at the Duke Gardens in Durham, North Carolina:

Here’s the result as an interactive Gaussian Splatting scene via Luma AI. You can click and drag to move the scene around.

You can view a couple more examples from my visit to the gardens here and here.

From a technical perspective, 3D Gaussians are a unique variant of point clouds, where each point encodes spherical harmonics for view-dependent color and a covariance matrix for describing shape (roughly, an ellipsoid: a sphere that has been stretched and rotated). Although splat-based rendering has existed for a long time, the Gaussian Splatting paper was the first to show that 3D Gaussians serve as an excellent scene representation, and it describes new methods to create and efficiently render these scenes. For more details, refer to the SIGGRAPH paper or the video overview provided on the authors’ website.
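
To show how a covariance matrix can describe a splat's shape, here is a small Python sketch. It assumes the common parameterization in which the covariance is built from three per-axis scales and a rotation (given as a unit quaternion), that is, covariance = R S Sᵀ Rᵀ. The function names are illustrative, and the code uses plain nested lists rather than any particular graphics library.

```python
def quat_to_rotation(w, x, y, z):
    """Convert a unit quaternion to a 3x3 rotation matrix (row-major nested lists)."""
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(scale, quat):
    """Build the 3x3 covariance R * S * S^T * R^T from per-axis scales and a rotation."""
    r = quat_to_rotation(*quat)
    # M = R * S, where S is a diagonal matrix holding the three axis scales.
    m = [[r[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Covariance = M * M^T, which is symmetric by construction.
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)] for i in range(3)]

# An unrotated splat stretched along x: the covariance is diagonal with squared scales.
sigma = covariance(scale=(2.0, 1.0, 0.5), quat=(1.0, 0.0, 0.0, 0.0))
print(sigma)

# The same splat rotated 90 degrees about the z axis: the long axis now points along y.
half = 0.5 ** 0.5
rotated = covariance(scale=(2.0, 1.0, 0.5), quat=(half, 0.0, 0.0, half))
print(rotated)
```

Storing scales and a rotation instead of the raw matrix keeps every splat a valid ellipsoid, since any scale and rotation pair yields a symmetric, non-negative covariance.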

Gaussian Splatting scenes tend to be large compared to other scene formats, on the order of hundreds of megabytes to gigabytes. Each splat is 248 bytes, and a scene is typically composed of millions of splats. However, programmer Aras Pranckevičius has a great technical deep dive showing that Gaussian Splatting is ripe for compression, bringing sizes under a gigabyte with little to no visual impact, or smaller if you can accept “lossy” visuals.
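
The size estimate is simple arithmetic, sketched below in Python. The 248-byte figure comes from the paragraph above. The splat counts are hypothetical examples, and sizes use decimal units (1 GB = 10^9 bytes).

```python
BYTES_PER_SPLAT = 248  # uncompressed size per splat, as quoted above

def scene_size_bytes(num_splats, bytes_per_splat=BYTES_PER_SPLAT):
    """Uncompressed scene size: number of splats times bytes per splat."""
    return num_splats * bytes_per_splat

def human_readable(num_bytes):
    """Format a byte count using decimal units (1 GB = 10**9 bytes)."""
    for unit, factor in (("TB", 10**12), ("GB", 10**9), ("MB", 10**6)):
        if num_bytes >= factor:
            return f"{num_bytes / factor:.2f} {unit}"
    return f"{num_bytes} bytes"

# Hypothetical scene sizes: one million, five million and one billion splats.
for count in (1_000_000, 5_000_000, 1_000_000_000):
    print(f"{count:>13,} splats -> {human_readable(scene_size_bytes(count))}")
```

One million splats already come to 248 MB uncompressed, and five million to 1.24 GB, which is why compression matters for delivering these scenes over a network.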

Network Traffic of the Future

With all this said about Gaussian Splatting, where are we going next?

The dust hasn’t settled on immersive scene representations. A new research preprint already proposes combining the strengths of NeRF and Gaussian Splatting into a hybrid approach. Still, the field is moving so fast that the state of the art could change any day. When things do settle, the next step will be standardization.

If Gaussian Splatting is here to stay, we should expect file sizes to grow with the scale of the use cases at play. For example, a real estate agent selling a house may want to deliver an online virtual tour that allows viewers to experience granular details like the sparkle of a fine granite countertop while also walking through the rooms and seeing the house from the outside.

Going even bigger, consider a power transmission/distribution company that constructs a visual digital twin of its entire power grid across a city, then syncs it across cloud simulations and user interfaces. Whereas previously we discussed scenes on the order of millions of splats, eventually we’ll need billions and beyond.

CableLabs’ Immersive Media Experiences team engages with immersive standards activities and monitors the state of the art of immersive media to understand and communicate key trends and their impact on the cable industry. Subscribe to our blog for more updates from the Immersive Media Team and other activities at CableLabs.

Innovation

How NeRF Technology Is Creating the Next Generation of Media

Austin Pahl
Architect

Jun 8, 2023

The ways we create and consume visual media are constantly evolving, allowing us to experience places and things as if we’re physically present in those environments. Today, thanks to a fast-growing technology called a Neural Radiance Field ("NeRF"), anyone with a regular camera can make and share "3D photographs" of the real world. NeRFs have been around since the 2020 publication of Representing Scenes as Neural Radiance Fields for View Synthesis, but recent developments have made it easier than ever to start making immersive 3D media.

If you’ve ever viewed a 3D house tour or a 3D piece of furniture on an e-commerce website, you might be wondering: What makes NeRF unique? The answer is that NeRF introduces unprecedented photorealistic detail, including the ability to see reflections and transparencies like never before. You can see an example of NeRF in this capture created by our intern, Tyler McCormick:

NeRF makes high-quality 3D content creation fast and intuitive. CableLabs' Immersive Media Experiences team has been following the developments surrounding NeRF and other forms of immersive media to understand how these technologies transform the ways we live, learn, work and play. In time, immersive applications may emerge as major drivers of network traffic, so we’re working to understand the resources required to deliver these next-generation experiences.

In this blog post, we take a look at how NeRF works, how to use it yourself and how it’s influencing the future of immersive media.

NeRF in a Nutshell: How It Works

Essentially, NeRF is a machine learning system that takes photos or videos of a subject and memorizes the appearance of that subject in 3D. The NeRF-creation process looks something like this:

1. Record a regular video or take a set of photos of your subject. Your phone will do!
2. Take each of those images and figure out the camera positions relative to each other. You can do this with sensors fixed to the camera or, more easily, with a structure-from-motion pipeline such as COLMAP.
3. Train a multi-layer perceptron (a kind of neural network) to behave like a renderer that’s specialized at producing images of this subject.
4. Now, you have a NeRF! You can use this neural network to create new images and videos of your subject, as in the above example.

When NeRF was first published in 2020, this creation process took hours. Today, advancements such as NVIDIA’s Instant Neural Graphics Primitives have brought the time down to the order of minutes or even seconds!

When we called NeRF a “3D photograph” earlier, we meant it. Essentially, a NeRF describes the color and density of light emitted at each point in a 3D space. If you look at the same point of a real object from various angles, you might see different colors. NeRF reproduces this effect by letting the color at each point depend on the viewing direction, while the density (how much light the point blocks) determines what is solid, empty or transparent. Together, these achieve reflections and transparencies, just as if you were viewing a real 3D object.
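
To show how per-point color and density turn into a pixel, here is a minimal Python sketch of the standard volume-rendering sum used by NeRF-style renderers: walk along a camera ray, and let each sample contribute its color in proportion to its opacity and to how much light still gets through from the samples in front of it. In a real NeRF, the neural network supplies the density and color for each sample; here the samples are hard-coded, hypothetical values.

```python
import math

def render_ray(samples):
    """Composite (density, rgb, segment_length) samples, ordered near to far, into one pixel color."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light that still reaches the camera
    for density, rgb, length in samples:
        alpha = 1.0 - math.exp(-density * length)  # opacity of this segment
        weight = transmittance * alpha
        color = [c + weight * channel for c, channel in zip(color, rgb)]
        transmittance *= 1.0 - alpha  # light is attenuated by what it just passed through
    return color

# Hypothetical samples along one ray: empty space, a faint red haze, then a dense blue surface.
samples = [
    (0.0, (0.0, 0.0, 0.0), 1.0),
    (0.5, (1.0, 0.0, 0.0), 1.0),
    (50.0, (0.0, 0.0, 1.0), 1.0),
]
pixel = render_ray(samples)
print([round(c, 3) for c in pixel])
```

The semi-transparent haze tints the pixel red while the surface behind it still shows through, which is how a density field can represent transparency without any explicit surfaces.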

The NeRF process results in a high level of detail, but there’s one catch: The NeRF model assumes that you’re working with a still, unchanging scene. Light-based effects are “baked in,” meaning that you can’t add new objects to the scene and see them cast shadows or appear in reflections. If subjects move or change over time in the input video, the NeRF output will appear blurry or misshapen. New research papers have identified ways around these limitations, but those solutions haven’t yet reached wider adoption. In the meantime, anyone want to bring back the Mannequin Challenge?

Getting Started

It’s easy to start playing with NeRF. For example, Luma AI has built an app for iPhones and the web that automatically builds NeRFs from your videos. Once you have a NeRF, you can make videos and export them to other content-creation tools, including the Unreal game engine. Luma has a gallery of diverse NeRF-based content submitted by their users here.

If you want to take a more hands-on approach to NeRF creation, nerfstudio is a free, open-source toolset for creating NeRFs and designing advanced 3D graphics pipelines with the new technology. The learning curve is steeper, but power users and developers may enjoy the increased flexibility that this method offers.

NeRF and Next-Generation Media

Improved 3D capture of real-world subjects opens up opportunities across multiple industries. Here are a few examples.

Digital productions and VFX artists are already finding ways to incorporate NeRF into creative workflows. The most obvious use in content creation is converting real-world subjects to 3D representations that can be combined with synthetic content, but NeRF can also be used to smooth camera movements or compose multiple camera shots into unified sequences. To see for yourself, check out this Corridor Crew video on YouTube and this McDonald’s commercial about the Chinese New Year (including the additional behind-the-scenes content in the replies).

Digital twins and simulations, as described by platforms like NVIDIA Omniverse, have presented a compelling value proposition for accurate digital modeling of real-world systems such as factories and autonomous vehicles. Where applicable, NeRF may be an effective way to digitize real-world environments for use in models and simulations. One example in the wild is Wayve Technologies’ effort to build city-scale NeRFs for autonomous vehicle simulations, as presented at NVIDIA GTC 2023.

Finally, metaverse initiatives often aim to empower users to build and share their own content and experiences. Games like Minecraft and Roblox provide user-friendly content-creation tools, but photorealistic content creation is usually reserved for experts with training on professional tools or access to specialized photogrammetry software. Now, tools like Luma and nerfstudio make it possible to generate photorealistic content in minutes, starting from nothing more than smartphone footage.

NeRF Is Accelerating Immersive Media

Immersive media comes in many forms, including but not limited to virtual reality, augmented reality, mixed reality and light field displays. NeRF alone isn’t going to make or break any of these technologies as they continue to mature and enter the market, but it gives creators and developers another tool to get one step closer to a photorealistic holographic immersive experience.

In the past, we’ve asked readers to imagine that we had a way to capture life-like holograms of subjects. Thanks to NeRF and related technologies, there's no need for make-believe. Subscribe to our blog for more updates from the Immersive Media Team and other activities at CableLabs.

Innovation

10G and Immersive Media Experiences

Austin Pahl
Architect

Feb 10, 2022

Imagine if you could create a life-like hologram of a given subject and then study and experience every detail of that subject later without being physically near it. Sounds like science fiction, right? We’re living in an era when such futuristic technology is already available to us! To make this kind of experience a reality, we can capture the rays of light that bounce off a particular subject, and what makes this possible is “light field media.”

CableLabs’ Immersive Media Experiences team has been researching the ways that light fields can transform the ways we live, learn, work and play. Today, there are already many ways to capture light fields, ranging from the latest smartphone cameras to professional light stage studios that capture the tiniest of details. To view a light field, the latest holographic displays provide high-resolution 3D video without the necessity of headwear or face tracking. Experiencing this technology in person feels like magic!

OTOY’s LightStage

How 10G Will Deliver an Immersive Future

10G will bring unprecedented speed, reliability and security to the world, which is why it’s essential for enabling light field media. Light fields require tremendous amounts of data, far more than traditional photo and video formats. Traditional photographs and videos store only a grid of pixels, whereas light fields also track the colors and directions of many more light rays. As the ecosystem continues to evolve and reach more people, the cable industry is preparing to deliver these immersive experiences over the network.

As part of our strategy to support the emergence of immersive media, CableLabs is a contributing member of the Immersive Digital Experiences Alliance (IDEA), a collaboration between diverse experts across immersive media technologies. IDEA is producing royalty-free specifications that enable standardized end-to-end conveyance of immersive media. These standards will make it possible to create, distribute and enjoy immersive content as the landscape grows richer over time.

From CableLabs’ Near Future series

Watch our 10G and immersive media experiences video, in which the Immersive Media Experiences team demonstrates how light fields work and showcases the latest commercially available light field displays.
