AI Build Week – Day 1: Prompt Arena Showdown 🔥

Welcome to Day 1 of AI Build Week – Prompt Arena!
Today we’re kicking off AI Build Week by building a Chatbot Arena to compare LLMs in real time.

:clock10: Live at 10am PT — streaming right here
:speech_balloon: Say hey in the comments if you’re tuning in! We’ll be chatting, sharing builds, and answering Qs all day.


Today we’re diving into prompt engineering and testing different LLMs side by side in a Chatbot Arena.
@angeliklaboy will show you how to spin up an app that compares outputs from GPT-4o, Claude, LLaMA, and more — and why prompt testing beats benchmarks every time.


What’s inside:

  • Quick overview of the LLM landscape
  • Why prompt testing > model benchmarks
  • How to build your own Prompt Arena in Retool

Jump in:
:inbox_tray: Download today’s resources
:ballot_box: Poll: Which LLM are you using most right now? (Vote below!)
:speech_balloon: Drop your first impressions or prompt results in the thread
:gift: Best Q or comment = surprise prize at the end of the week :eyes:




:ballot_box: Poll: Which LLM are you reaching for most these days?

  • GPT-4o
  • Claude 3
  • Gemini
  • LLaMA / Mistral
  • Wait… what’s an LLM?
  • I’m just here for the vibes

Thanks for the tutorial! I get the following errors when running:
updateModelInDB failed (0.247s): type "llama" does not exist
updateModelInDB failed (0.288s): type "claude" does not exist
updateModelInDB failed (0.328s): trailing junk after numeric literal at or near "4o"
Any tips on what I did wrong?

I put quotes around the model_string entries in the updateModelInDB query and that resolved it:

'{{ modelA.value.model_string }}'
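
For anyone curious why the quotes matter, here is a minimal sketch of what probably happens when the value is pasted into the SQL text as-is instead of being sent as a bound parameter. The `renderSql` helper, the `models` table, and the `wins` column are made up for illustration; they are not from the tutorial.

```javascript
// Hypothetical stand-in for pasting a {{ }} value straight into raw SQL text.
// The table and column names below are assumptions, not the tutorial's schema.
function renderSql(template, value) {
  return template.replace("{{ modelA.value.model_string }}", value);
}

const unquoted =
  "UPDATE models SET wins = wins + 1 WHERE model_string = {{ modelA.value.model_string }}";
const quoted =
  "UPDATE models SET wins = wins + 1 WHERE model_string = '{{ modelA.value.model_string }}'";

console.log(renderSql(unquoted, "gpt-4o"));
// UPDATE models SET wins = wins + 1 WHERE model_string = gpt-4o
console.log(renderSql(quoted, "gpt-4o"));
// UPDATE models SET wins = wins + 1 WHERE model_string = 'gpt-4o'
```

In the unquoted version the database sees `gpt-4o` as an expression (`gpt` minus `4o`) rather than a string, which would explain the "trailing junk after numeric literal" error. Hand-quoting works for a fixed list of model names, but for free-form user input a parameterized query is the safer route.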

Nice - thanks for sharing how you solved it!


Thanks to everyone who joined us live for Day 1 of AI Build Week — our Prompt Arena Showdown! :microphone::robot:
We explored how to compare LLM outputs and write smarter prompts, and y’all asked some :fire: questions. Here’s a quick recap of the top Qs (and answers):


:brain: Prompting & LLM Access

Q: Do I need to bring my own API keys to access different LLMs in Retool?
A: You’ll have access to default models in Retool AI, but if you want to use your own OpenAI, Anthropic, or other API keys, you can configure them directly.

Q: Are there any extra costs when using AI in Retool self-hosted?
A: If you're self-hosting and using your own model keys, charges will come directly from your LLM provider — Retool doesn’t add extra fees for that.

Q: Will we be discussing specific prompts in the demo?
A: Yep — we showcased prompt construction live, and you can rewatch the session anytime above!


:electric_plug: External APIs & Integration

Q: Can I integrate external APIs with these workflows?
A: Absolutely. Whether you’re calling third-party services, webhooks, or internal APIs, you can integrate them directly into your workflows or app logic.

Q: Can I make a chatbot built in Retool public?
A: Yes — Retool apps can be shared publicly, but make sure to configure proper permissions and security depending on what data and features you expose.


:mortar_board: Learning Retool & Sharing Takeaways

Q: Is there a certification or course I can take to learn more?
A: Check out Retool University for guided tutorials and educational content to help you get hands-on.

Q: Will these sessions be recorded?
A: Yep — all session recordings are available in the AI Build Week Community Hub. You can also rewind and rewatch anytime.


Let us know your top takeaway from Day 1 — or share how you’re thinking about building with LLMs! :point_down:
:mag_right: Up next: Day 2 – RAG + Vectors


Correct me if I'm wrong...

If there are multiple actions defined in the event handler, they won't be executed in a synchronized or sequentially blocking manner. Instead, they'll be triggered in order but will run in parallel.

This means that at 23:15, even if the first step is newMatch.trigger(), the following steps—modelAQuery.trigger() and modelBQuery.trigger()—won’t wait for the first one to complete. If newMatch.trigger() executes quickly, everything should work fine. However, if there's a delay (e.g., due to a database issue), the subsequent steps might either reuse cached model selections from a previous run or error out because the model hasn't been properly defined yet.

So I think the more reliable approach is to use a custom script?

// Start the model queries only after newMatch has finished successfully.
newMatch.trigger({
    onSuccess: function() {
        modelAQuery.trigger();
        modelBQuery.trigger();
    }
});
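
The same ordering could also be written with `async`/`await`, assuming `trigger()` returns a Promise that resolves when the query finishes. The sketch below uses stub objects instead of real Retool queries so it can run anywhere; `makeQuery` and `runMatch` are hypothetical helpers, and the delays are arbitrary.

```javascript
// Stub standing in for a Retool query. The only assumption is that
// trigger() returns a Promise that resolves once the query is done.
function makeQuery(name, delayMs, log) {
  return {
    trigger: () =>
      new Promise((resolve) => {
        log.push(`${name} started`);
        setTimeout(() => {
          log.push(`${name} finished`);
          resolve(name);
        }, delayMs);
      }),
  };
}

// Wait for newMatch, then run both model queries in parallel.
async function runMatch(newMatch, modelAQuery, modelBQuery) {
  await newMatch.trigger();
  return Promise.all([modelAQuery.trigger(), modelBQuery.trigger()]);
}

const log = [];
const done = runMatch(
  makeQuery("newMatch", 50, log),
  makeQuery("modelAQuery", 10, log),
  makeQuery("modelBQuery", 10, log)
).then(() => console.log(log.join("\n")));
```

Even though newMatch is the slowest stub here, the log shows it finishing before either model query starts, which is the guarantee the separate event handler actions don't give you.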