Neural networks have grown substantially in recent years, building on the first iterations of Generative Pre-trained Transformers. As NVIDIA has refined its latest graphics cards and Tensor Cores, the capabilities of those networks have greatly improved. Where these models traditionally handled just text-based inputs, NVIDIA has been leveraging TensorRT for deep learning and audio interpretation. In two separate hands-on demonstrations with Inworld AI technology at GDC 2024, we had the opportunity to try out both Ubisoft’s NEO NPCs and NVIDIA’s Covert Protocol.
Ask any DM who has run a session of Dungeons & Dragons that I’ve taken part in over the past decade: taking the opportunity to push the boundaries of the campaign (and morality) is where the fun is. Both demos offered me free rein to control each experience almost entirely by voice while my handler sat off to the side, presumably taking notes or shaking their head and debating whether or not to pull the plug. By the time I finally got to see Ubisoft’s NEO NPCs program up close, I was already working on my second cup of coffee before 10:00 AM and abuzz with a fount of PG-13 rated creativity.
For Ubisoft Paris’ Inworld demonstration, I went hands-on with a combination of Inworld’s AI Character Engine and NVIDIA’s Audio2Face automatic facial animation. The buzzword ‘authenticity’ popped up a few times throughout my session and, for the most part, it held true: guardrails were in place to preserve the illusion that I was being recruited into “the resistance” by a beanie-wearing urbanite named Bloom.
To gamify the Inworld AI experience, Ubisoft kept a popup list of recommended activities to engage in, such as learning about the megacorps, Bloom, and the resistance at a high level. Any time I tried to veer the conversation off course, Bloom would put up those invisible barriers, offer an insightful little quip about the humor of my request, then go right back to trying to get me to grill him about the resistance. My actions, or rather my words, carried a great deal of persistence throughout the demonstration. When asked about my unique set of skills, I presented myself as a barista-cum-crypto enthusiast who would’ve much preferred to be known as Poncho rather than Nelson. Bloom was all too glad to oblige, and we shared some anecdotes back and forth before my effort to gain some insight into the purpose of the demo came to a close.
The second portion of the demo moved into more of a freeform brainstorming session with Iron, a female lead in the resistance who seemed to significantly outrank Bloom. My mission the second time around was to discuss methods of breaking into a vault. As I picked up a controller and walked around the virtual briefing room, I was greeted by lists and keywords that Ubisoft clearly wanted me to bring up with Iron so we could collaborate on a solution to a light B&E after brunch.
Although I had the tools laid before me, I tried to assemble them in more creative ways. A ladder was an obvious way to climb up and enter the building’s second floor, but why couldn’t I take that ladder with me, use it to climb up to a security camera in the hallway, and hang up a Polaroid photo of the hallway it was guarding in order to trick the camera into letting me pass? Knocking out a guard who kept me from crossing from the entry balcony into the office would be a simple enough affair for anyone who’s played a Metal Gear Solid title, but how do you convey that intent via voice commands? Iron would simply shrug off my requests to load a grappling hook into the business end of a shotgun, or to fall back on the old standard of climbing inside a cardboard box and sneaking past as a form of nonviolent resistance. No, I couldn’t actually progress to the next step without coming up with a way to incapacitate the guard, despite my vocal objections. Because of all of my scuttlebutt and wasted time, I was quietly ushered off the demo before I could come up with a final solution for breaking into the office and making off with the precious intel that was promised.
NVIDIA’s Covert Protocol demo was a much less guided experience, and the premise of that particular session was to discover the room number of a person of interest staying at a five-star hotel. My cover story of choice? A courier with some illicit goods that would require a direct signature to release. The bellhop, the first Inworld NPC I interacted with, seemed far more agreeable to my suggestions than either Bloom or Iron turned out to be. While his conversation was far more reactive, he did seem open to the suggestions that I get a foot massage or invite him out for karaoke once my mission was complete. Sadly, I could not convince the AI to belt out some showtunes on the fly, but he did make a point of telling me that his old karaoke standby was Psy’s megahit ‘Gangnam Style’. In a spark of creativity, I gave him a bit of hell when he brought up my haircut, informing him that I was bald and not in need of any styling trips. Ending my conversation, I walked up to the entry doors of the hotel and was about to set foot in the reception when I saw a tanned visage in the glass staring back at me, complete with a well-shined top; certainly not a mophead in the slightest. If I had the time to play a second time, I would be curious to see whether the player character would actually come equipped with luscious locks depending on the player’s responses, or whether each demo would start with a bald and beautiful gentleman.
The other two AI NPCs were far less malleable in their responses, and neither seemed to want to give an outsider to the story the time of day. That was until I located a conveniently placed work badge off to one side. Once I had that in my inventory, both AI characters seemed to actually listen to my requests and provide me with worthwhile information towards my intended goal. This flipped switch took a bit away from the immersion. I understand that, for gameplay reasons, there do need to be objectives for the player to engage with instead of just conversing back and forth about the weather or favorite shampoo flavors, but being forced into discrete objectives took away from the sense of roleplaying and the promise of freeform gameplay.
No matter how absurd my request was, as long as I kept the inquiries PG-13 and avoided anything truly obscene, the language models would acquiesce and typically respond with humorous and insightful retorts. The quality of the experience with these characters and language models is ultimately only as good as what they’re trained on and the processing power available to decipher unusual requests. Ubisoft and NVIDIA have each shown a different path towards creating realistic conversation with the player while keeping guardrails in place, and either approach could prove useful to integrate into games in the future. Whether these tools can actually be integrated into games before the current console generation comes to a close remains to be seen. Still, my skepticism has abated for now, and I’ve found myself interested to see what the future can hold for these AI-empowered NPCs.