데브허브 | DEVHUB | How Nvidia Reduced JSON Processing Costs by Over 80%
In this video, we're going to learn how Nvidia cut JSON processing time by 4x and reduced costs by over 80%.
Clarification on the parallel tokenization:
Let's start with the problem: character-by-character processing is slow because it is inherently sequential. The parser must track depth and string state while scanning every character in order, so no character can be interpreted until everything before it has already been parsed.
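To make the sequential bottleneck concrete, here is a minimal sketch (illustrative only, not Nvidia's code): a scanner that records open/close events must carry depth and in-string state forward through every single character.

```python
# Illustrative sketch: sequential JSON structural scan.
# Each step depends on state (depth, in_string) accumulated from ALL
# previous characters, so this loop cannot be parallelized naively.
doc = '{"inventory": {"quantity": 42}}'

depth = 0
in_string = False
events = []  # (position, kind, depth) for structural braces/brackets
for i, ch in enumerate(doc):
    if ch == '"' and (i == 0 or doc[i - 1] != '\\'):
        in_string = not in_string  # quotes flip string state
    elif not in_string:
        if ch in '{[':
            depth += 1
            events.append((i, 'open', depth))
        elif ch in '}]':
            events.append((i, 'close', depth))
            depth -= 1

print(events)
# [(0, 'open', 1), (14, 'open', 2), (29, 'close', 2), (30, 'close', 1)]
```

The key point: the `in_string` and `depth` variables form a chain of dependencies across the whole input, which is exactly what a GPU-friendly design has to break.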
In parallel tokenization, the parsing of the JSON is split into independent tasks, with each thread handling a specific portion of the structure. For $.inventory.quantity, one thread first identifies the depth and position of the "inventory" key and its associated object, while another thread begins looking for the "quantity" key inside the "inventory" object, based on pre-computed positions. These threads work simultaneously without waiting for each other to finish, allowing the GPU to efficiently extract the desired value by directly jumping to "quantity" at the correct nesting level.
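The two-stage idea can be sketched in a few lines of Python (a CPU simulation, not Nvidia's actual GPU kernels; the document names no code): stage 1 has workers tokenize independent slices of the input in parallel, recording absolute positions of structural characters, and stage 2 uses those precomputed positions to jump to the desired value instead of re-parsing every character.

```python
import json
from concurrent.futures import ThreadPoolExecutor

doc = '{"inventory": {"quantity": 42, "sku": "A1"}, "store": "north"}'

# Stage 1 (parallel): each worker scans one chunk and records the absolute
# positions of structural characters. Workers do not wait on each other,
# mimicking GPU threads tokenizing independent slices of the input.
def find_structural(args):
    start, chunk = args
    return [start + i for i, ch in enumerate(chunk) if ch in '{}[]:,"']

n_workers = 4
size = (len(doc) + n_workers - 1) // n_workers
chunks = [(i, doc[i:i + size]) for i in range(0, len(doc), size)]

with ThreadPoolExecutor(max_workers=n_workers) as pool:
    positions = sorted(p for part in pool.map(find_structural, chunks) for p in part)

# Stage 2: with the token-position index precomputed, extraction for a path
# like $.inventory.quantity can jump straight to the right nesting level.
# (Here we lean on json.loads for brevity; a real engine walks the index.)
value = json.loads(doc)["inventory"]["quantity"]
print(value)  # 42
```

A real implementation also has to resolve chunk-boundary ambiguity (e.g. a chunk starting mid-string), which the production tokenizers handle with an extra reconciliation pass; that detail is omitted here for clarity.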
----
Also, apologies to earlier viewers of this video. I previously had a section called "Testing" which said: "Their test dataset consisted of 200k rows of JSON data across five columns, totaling 9.5GB uncompressed. The results were compelling. The processing time for their production workload decreased from 16.7 hours to just 3.8 hours." To clarify: the 200k rows were their test dataset, while the timing figures came from the retailer's production environment.
Timeline:
00:00:00 - Intro
00:00:21 - Background
00:01:38 - Issue with CPU processing
00:02:18 - Rapid Accelerator
00:03:06 - Issues with GPU processing
00:05:04 - Fixes / optimization
00:06:55 - Future improvements
00:07:11 - Ending
Nvidia resources:
https://developer.nvidia.com/blog/acc...
https://docs.nvidia.com/cuda/cuda-c-p...
https://docs.nvidia.com/spark-rapids/...
Apache Spark:
https://aws.amazon.com/what-is/apache...
Videos:
• Hanging up flip phone (VINE)
Music credit:
♪ [non copyright music] Lofi Type Beat - Saturday Morning | aesthetic lofi music / Lofiru