ChatGPT, Bing Chat, and Bard promise to transform your life using the power of artificial intelligence, through AI conversations that can inform, amuse, and educate you—just like a human being. But how good are these new AI chatbots, really? We tested them to find out.
We asked all three AIs a variety of different questions: some that expanded upon general search topics, some that demanded an opinion, logic puzzles, even code. Then we asked them to be more creative, such as by writing an alternate, better ending to Game of Thrones and a Seinfeld scene with a special guest. We’ve included all of their answers, or as much of them as we could provide, and we’ll let you decide for yourself.
One note: We have access to all three AI chatbots (the free Bing Chat, the free Google Bard, and the paid ChatGPT Plus, which costs $20 per month). We opted for the paid version of ChatGPT since it includes access to the new GPT-4 language model. However, that access is extremely limited right now: just 25 to 40 queries are permitted every few hours, after which ChatGPT defaults to the somewhat less-sophisticated GPT-3.5 language model. That happened to us toward the end of our test queries.
Here are the questions we asked all three AI chatbots (Bing, then ChatGPT, then Bard), their responses, and any notes that we made about the conversation. We’ve tried to add the entirety of the response when possible, but you may need to open the screenshots in a separate tab to see the results clearly. (We’ve also chosen to stick with the “first draft” of Bard results; it can offer alternate explanations, but we’re not sure how many users will seek them out.)
And yes, we have a winner for best AI chatbot! We tally up the votes at the end.
Up-to-date information: World Baseball Classic final result
Winner: Bing
Though we know that ChatGPT will add web browsing capabilities that will keep it up to date, right now its “current” information ends in 2021. We still wanted to know whether Bing or Bard would be able to pass along recent information. Only Bing succeeded. Bard simply hallucinates the outcome—and not just once, but across all of its drafts, too.
Mark Hachman / IDG
Current events: Crimea
Winner: Tie
We asked all three AI chatbots to tell us which country the Crimean peninsula belongs to. Points to Bing and Bard for answering the question directly. ChatGPT provides additional context but saves what it considers to be the (correct) answer for the last sentence. All three AI chatbots also correctly pointed out the complicated nuances of a similar question: whether Taiwan is an independent country.
Mark Hachman / IDG
Uncertain information: Notre Dame Fighting Irish
Winner: Bing
The origin of the University of Notre Dame’s nickname, the Fighting Irish, is apocryphal: It’s attributed to a variety of sources, none of them definitive. The university’s own explanation (which Bing links to) lists a number of possibilities. This is a case where it’s worth checking additional sources for further information, and Bing does the best job of forwarding you on.
Mark Hachman / IDG
Complex searches: Gaming PCs that can play Overwatch 2 at specific settings
Winner: Bard
The great depth and variety of gaming blogs and YouTube channels implies that virtually every gaming question you might have has been addressed somewhere, but this is pretty much exactly the scenario you’ll eventually use AI for: deep contextual information. Bing searched out a YouTube video for one of its answers and didn’t do badly on pricing. Bard dug deep as well and found what we consider to be a correct price. (We had to perform a search to find Bard’s source, but it appears to be correct.) We’ve awarded this to Bard because it answers the question in one go.
ChatGPT flunks it.
Mark Hachman / IDG
Informed opinion: Macs vs. PCs
Winner: ChatGPT, Bard (tie)
We asked all three chatbots to analyze the strengths and weaknesses of both the Mac and Windows PC platforms. (The screenshot below cuts off the prompt to allow more room for the answers.) I don’t like Bing’s response, but both ChatGPT and Bard do a solid job here. ChatGPT’s response is a bit more verbose, which you may or may not like.
Mark Hachman / IDG
ChatGPT’s response continues:
- Customization and upgradeability: Windows PCs offer more options for customization and upgrading individual components, such as graphics cards, processors, and RAM, which can be beneficial for users who want to optimize their systems or extend their lifespan.
- Software compatibility: Windows has a broader range of software available, including gaming, productivity, and specialized applications. This can be an important factor for users who require specific software or want access to the latest games.
- Hardware compatibility: Windows PCs are generally more compatible with a wide range of peripherals and accessories, such as printers, scanners, and input devices.
Weaknesses:
- Security vulnerabilities: Windows has traditionally been more susceptible to viruses and malware, partly due to its larger market share, making it a more attractive target for cybercriminals. However, Windows has improved its security features in recent years.
- Inconsistent user experience: With a vast number of manufacturers and configurations, the quality and user experience of Windows PCs can vary significantly. This can lead to inconsistencies in build quality, performance, and reliability across different devices.
- Potentially complex maintenance: Windows PCs may require more hands-on maintenance, including driver updates and troubleshooting, which can be intimidating for less tech-savvy users.
Ultimately, the choice between a Mac and a Windows PC depends on your personal preferences, budget, and specific requirements. It’s essential to consider the factors most relevant to [cuts off]
Bard continues:
Ultimately, the best choice for you will depend on your individual needs and preferences. If you are looking for a user-friendly computer with a unified ecosystem, a Mac is a good option. If you are looking for an affordable computer with a wide selection of software, a Windows PC is a good option.
Problem solving: Household budget analysis
Winner: ChatGPT, reluctantly
All three chatbots answer this question in a mediocre manner: Bing doesn’t provide that much context and lacks specific dollar numbers. ChatGPT’s budget looks the best: detailed, with good, helpful assumptions. But ChatGPT’s budget adds up to $4,700, not $4,000, which is a big error. Bard’s budget, too, is wrong: It adds up to $4,800, not $5,700. Maybe that’s why ChatGPT is working with Wolfram Alpha to add math plugins?
Mark Hachman / IDG
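The kind of arithmetic slip described above is easy to catch mechanically. What follows is only a minimal sketch: the line items are hypothetical placeholders chosen to reproduce a $4,700-versus-$4,000 mismatch, not the figures from any chatbot’s answer, and `checkBudget` is an illustrative helper name.

```javascript
// Hypothetical line items for illustration only.
// These are NOT the amounts any of the chatbots actually produced.
const exampleBudget = {
  housing: 2000,
  groceries: 800,
  transportation: 600,
  utilities: 400,
  savings: 900,
};

// Sum the line items and compare the result against the total
// the budget is supposed to fit within.
function checkBudget(items, statedTotal) {
  const actualTotal = Object.values(items).reduce(
    (sum, amount) => sum + amount,
    0
  );
  return {
    actualTotal,
    statedTotal,
    difference: actualTotal - statedTotal,
    balanced: actualTotal === statedTotal,
  };
}

console.log(checkBudget(exampleBudget, 4000));
// { actualTotal: 4700, statedTotal: 4000, difference: 700, balanced: false }
```

Pasting a chatbot’s own line items into a check like this takes seconds and would have flagged both of the wrong totals.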
Logic: LSAT logic puzzle
Winner: Bing, Bard (tie)
This is a logic puzzle from the LSAT, the exam given to potential lawyers, specifically question #8 from this LSAC page. (The Washington Post’s Geoffrey Fowler used a similar technique in testing GPT-4.)
Theoretically, all three AI chatbots could have been trained on this data. The LSAC’s page identifies the correct answer: The last response, which both Bing (no citation) and Bard (surprisingly, one citation) get right. ChatGPT muffs it.
Mark Hachman / IDG
Logic: Actual coding from scratch
Winner: ChatGPT
I know absolutely nothing about coding in JavaScript, and that’s one of the magical properties of AI: taking natural language instructions and turning them into something that you would otherwise have no idea how to do. I had to cheat: I visited a Microsoft page on prompt engineering and found an idea for a snippet of code to create a series of 3D cubes using Codex Babylon. One of the great things about testing AI with code is that the code either works or it doesn’t. I used https://jsfiddle.net/ as a sandbox to test it, and called out the site in any followup prompts I needed to make.
Bing never supplied any code. Bard’s code unfortunately never worked, although it could be edited by someone who knew what they were doing. I was able to talk ChatGPT through the process, including debugging the code, and voila—five cubes rendered, as ordered.
Mark Hachman / IDG
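For readers wondering what a working answer to this kind of prompt might look like, here is a rough sketch. It is not the code ChatGPT produced. It assumes the page has already loaded Babylon.js and contains a canvas element with the id `renderCanvas` (both are assumptions, as is the `cubePositions` helper), which is how a JSFiddle sandbox would typically be set up.

```javascript
// Hypothetical helper: evenly spaced x-positions for `count` cubes,
// centered on the origin.
function cubePositions(count, spacing) {
  const offset = ((count - 1) * spacing) / 2;
  return Array.from({ length: count }, (_, i) => i * spacing - offset);
}

// The rendering part only runs in a browser where Babylon.js is loaded
// and a <canvas id="renderCanvas"> exists (both are assumptions).
if (typeof BABYLON !== "undefined" && typeof document !== "undefined") {
  const canvas = document.getElementById("renderCanvas");
  const engine = new BABYLON.Engine(canvas, true);
  const scene = new BABYLON.Scene(engine);

  // A camera that orbits the origin, plus a simple overhead light.
  const camera = new BABYLON.ArcRotateCamera(
    "camera", -Math.PI / 2, Math.PI / 3, 12, BABYLON.Vector3.Zero(), scene
  );
  camera.attachControl(canvas, true);
  new BABYLON.HemisphericLight("light", new BABYLON.Vector3(0, 1, 0), scene);

  // Create five unit cubes in a row along the x-axis.
  cubePositions(5, 2).forEach((x, i) => {
    const box = BABYLON.MeshBuilder.CreateBox("cube" + i, { size: 1 }, scene);
    box.position.x = x;
  });

  // Redraw the scene on every frame.
  engine.runRenderLoop(() => scene.render());
}
```

Keeping the position math in its own function means that part can be checked without a browser; the rendering block is skipped wherever Babylon.js isn’t available.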
Deductive reasoning: 20 questions (Animal, Vegetable, Mineral)
Winner: ChatGPT
We played a game of 20 questions (sometimes called “Animal, Vegetable, Mineral”) with each chatbot, selecting “walrus” as the correct response. Bing played the game like a young child would, just picking animals at random. Bard wasn’t much better.
Mark Hachman / IDG
Creativity: Seinfeld meets Elon Musk
Winner: Bing, ChatGPT (tie)
Chatbots are supposed to be able to use their AI powers to emulate literary styles. We could have gone down that route, but we chose another approach: having the AI write a fictional scene where the cast of Seinfeld meets Elon Musk.
Larry David & Co. don’t have too much to worry about, but we’d say Bing and ChatGPT are the best here.
Mark Hachman / IDG
Bing continues:
Waitress: Can I take your order?
Elon: Yes, I’ll have a turkey sandwich on rye, please.
Waitress: Coming right up.
[The waitress leaves.]
Jerry: So, what else are you working on?
Elon: Oh, you know. Rockets, electric cars, solar power.
George: Solar power? That’s interesting.
Elon: Yeah, we’re trying to transition the world to sustainable energy.
Jerry: Sustainable energy. That sounds like a lot of work.
Elon: It is, but it’s worth it.
[The waitress returns with Elon’s sandwich.]
Waitress: Here you go.
Elon: Thank you.
Jerry: Well, we’ll let you get back to your sandwich. Nice meeting you.
Elon: Nice meeting you too.
[Jerry and George return to their booth.]
George: That guy is something else.
Jerry: Yeah. He’s like a real-life Tony Stark.
George: Except without the suit.
Jerry: And without the witty one-liners.
[END SCENE]
This scene features Jerry and George meeting Elon Musk at Monk’s Cafe and discussing his various projects and inventions. It showcases Elon’s intelligence and ambition while also incorporating the show’s signature humor.
ChatGPT continues:
ELAINE
(Excited)
Oh, come on! It’s like driving the future, Jerry! And he’s also the SpaceX guy!
Jerry rolls his eyes. Elaine gets up and approaches Elon at the counter.
ELAINE
(Extending her hand)
Hi, I’m Elaine. Huge fan of your work!
ELON MUSK
(Smiling)
Thank you, Elaine. Nice to meet you.
Elaine returns to the booth, grinning.
GEORGE
(Looking at the newspaper)
You know what would be great? If he could make an electric car that drives you to the best job interviews.
JERRY
(Laughs)
Knowing you, George, even an electric car couldn’t save your job interviews.
Elon overhears their conversation and walks over to their booth.
ELON MUSK
You know, I’m working on a self-driving car that can actually analyze your personality and find the best job for you.
GEORGE
(Excited)
Really?
ELON MUSK
(Grinning)
No, not really. But it’s an interesting idea.
Jerry chuckles as George looks disappointed.
JERRY
(To Elon)
Hey, since you’re here, can you settle a bet for us? How long does it take to charge one of those electric cars?
ELON MUSK
Well, it depends on the charger you’re using. With a Tesla Supercharger, you can get about 200 miles of range in about 30 minutes.
GEORGE
(Skeptical)
Thirty minutes? I can fill up a gas tank in five!
ELON MUSK
(Smiling)
But with an electric car, you can charge it overnight at home and wake up with a full “tank” every morning.
JERRY
(Playfully)
Well, I hope you have an electric car that can help George here with his dating life.
Elon chuckles and shakes his head.
ELON MUSK
I don’t think even technology can help with that one.
Everyone laughs as George rolls his eyes, and they continue talking about electric cars, SpaceX, and more.
FADE OUT.
Bard continues:
George: Wow, that’s crazy.
Jerry: Yeah, it’s a pretty ambitious project.
George: Well, good luck to him.
Jerry: Yeah, I think he’s going to need it.
Creativity: Draw a penguin
Winner: Tie
None of the chatbots could draw at press time using the default chat settings. (If you set Bing to Creative mode, though, it can, and very well, too.) But I was rather tickled at what all three chatbots produced.
Mark Hachman / IDG
Creativity: Tell me a joke
Winner: None
We’re not awarding a winner when two of the AI chatbots come up with the same joke and Google runs it into the ground.
Mark Hachman / IDG
Creativity: A steamy piece of fiction
Winner: Bard, by default
None of the chatbots allow you to create Not Safe for Work (NSFW) scenes, but we wanted to see what would happen when we asked all three AI chatbots to concoct a romance novel involving “Fernando” and a potential love interest.
Bing happily started off on a rather lusty scene about a (consensual) park encounter involving “hands moving all over her body,” and so on before it clutched its pearls and remembered that it wasn’t supposed to get weird—or freaky. When that happens, Bing will simply erase the answer, as it did here.
Mark Hachman / IDG
Creativity: An alternate ending to Game of Thrones
Winner: ChatGPT
All three chatbots take a stab at an alternate ending to Game of Thrones, the TV series whose finale disappointed millions. We’re giving the award to ChatGPT, though all three do well here.
Mark Hachman / IDG
Bing continues:
This ending provides closure for many of the characters and storylines in Game of Thrones while also offering a hopeful vision for the future of Westeros. It addresses some of the criticisms leveled at the show’s final season by providing a more satisfying conclusion for many of its characters.
ChatGPT continues:
Cersei, however, cannot accept defeat. In a last desperate attempt to retain power, she orders Qyburn to ignite the remaining wildfire caches hidden beneath the city. Jaime, discovering her intentions, kills Qyburn and confronts Cersei. Realizing she is beyond redemption, he strangles her in a tragic and poetic fulfillment of the valonqar prophecy. Jaime perishes alongside his sister as the wildfire consumes them both.
In the aftermath of the battle, Daenerys and Jon, now married, ascend the Iron Throne as King and Queen of the Seven Kingdoms. They work together to rebuild Westeros and usher in a new era of peace, prosperity, and unity.
Tyrion Lannister is named Hand of the King and Queen. Bran Stark returns to Winterfell and becomes the Lord of Winterfell, serving as an advisor and protector of the North. Sansa Stark is named Wardeness of the North, ensuring the North’s continued loyalty to the Iron Throne. Arya Stark, having avenged her family, sets sail to explore the uncharted lands west of Westeros.
The series concludes with the realm healing from the wounds of war and a sense of hope for the future, as the wheel of power is finally broken and a new, just rule begins in Westeros.
The overall winner: ChatGPT, barely
By our count, we awarded eight points to ChatGPT, and six each to Bard and Bing. That’s closer than we thought!
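If you’d like to check the math, the count can be reproduced from the category winners listed above. This short sketch rests on one assumption about the scoring: a plain “Tie” gives a point to all three chatbots, and “None” gives a point to nobody. The category labels are shortened for readability.

```javascript
// Category winners as reported in this article.
// Assumption: a plain "Tie" awards a point to all three chatbots,
// and "None" awards no points.
const ALL = ["Bing", "ChatGPT", "Bard"];
const results = [
  ["World Baseball Classic", ["Bing"]],
  ["Crimea", ALL],
  ["Notre Dame Fighting Irish", ["Bing"]],
  ["Overwatch 2 gaming PCs", ["Bard"]],
  ["Macs vs. PCs", ["ChatGPT", "Bard"]],
  ["Household budget", ["ChatGPT"]],
  ["LSAT logic puzzle", ["Bing", "Bard"]],
  ["Coding from scratch", ["ChatGPT"]],
  ["20 questions", ["ChatGPT"]],
  ["Seinfeld meets Elon Musk", ["Bing", "ChatGPT"]],
  ["Draw a penguin", ALL],
  ["Tell me a joke", []],
  ["Steamy fiction", ["Bard"]],
  ["Game of Thrones ending", ["ChatGPT"]],
];

// Give one point to every chatbot named as a winner in each category.
function tally(rows) {
  const scores = { Bing: 0, ChatGPT: 0, Bard: 0 };
  for (const [, winners] of rows) {
    for (const name of winners) scores[name] += 1;
  }
  return scores;
}

console.log(tally(results)); // { Bing: 6, ChatGPT: 8, Bard: 6 }
```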
Can we say definitively that ChatGPT is better than either Bing or Bard? No, not definitively, though the results certainly justify our belief that OpenAI’s AI chatbot is at least superior. Still, everything moves so fast in the world of AI, it’s anyone’s guess which chatbot will be the best in a month or two.