k9294's comments

What is your agentic development experience with Elixir? I used to like Elixir a lot in the pre-agentic era, but with coding agents it feels like the language isn't the best choice: slow compile times, a weak type system (at least it was a year ago; I know there's work on that front), a small ecosystem...

I find agents work well with Elixir, but you should reach for it when the product benefits from Elixir/BEAM features. Slow compile times are a minor annoyance compared to the larger architectural decision. Elixir hot-reloading with Tidewave works well for agent loops, though.

Small piece of advice: make one repo the "main" one and link to it from the website, instead of linking to the organisation.

I wanted to star the project to track its progress, but it feels a bit weird... Which repo should I track? The server? The CLI? It looks like a pile of misc repos.


Ooh, I thought I did that - thanks.

That's really cool!

One thing I'm confused about is how to create a shared resource, e.g. a Redis server, and connect to it from other VMs. Right now it looks quite cumbersome to set up Tailscale or connect via SSH between VMs. Also, what about egress? My guess is that all traffic is billed at $0.07 per GB. It looks like this cloud is made for running stateful agents and personal isolated projects, and distributed systems or horizontal scaling aren't a good fit for it?

I'm also curious why there's no Railway-like pricing model billed on resource utilization. It's very convenient, and I'd argue it's built for the agent era.

I set up a Railway project for my friends and family that spawns a VM with a disk (stateful service) via a Telegram bot and runs an OpenClaw-like agent - it costs me something like $2 to run 9 VMs this way.


We at ottex.ai use bunny.net to deploy an OpenRouter-like speech-to-text API globally (5 continents, 26 locations, $3 idle cost).

Highly recommend their Edge Containers product: super simple, with nice primitives for deploying globally for low-latency workloads.

We connect all containers to one Redis pub/sub server to push important events like user billing overages, top-ups, etc. Super simple, very fast, one config to manage all locations.
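
For a rough idea of what the subscriber side of that looks like, here's a minimal Go sketch using go-redis v9 - the address, channel name, and payload shape are illustrative assumptions, not our actual config:

  package main

  import (
      "context"
      "log"

      "github.com/redis/go-redis/v9"
  )

  func main() {
      ctx := context.Background()

      // One central Redis instance shared by every location (placeholder address).
      rdb := redis.NewClient(&redis.Options{Addr: "redis.example.com:6379"})

      // Each edge container subscribes to the same channel; go-redis
      // re-subscribes automatically if the connection drops.
      sub := rdb.Subscribe(ctx, "billing-events")
      defer sub.Close()

      for msg := range sub.Channel() {
          // Payload format is an assumption, e.g. {"type":"overage","user":"123"}.
          log.Printf("event: %s", msg.Payload)
      }
  }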


That's inspirational. Do you perhaps have an architectural writeup somewhere?


Nope, but I'll think about it, thank you for the idea. Maybe it's time to start a technical blog for Ottex.


Are cold starts an issue?


There are no cold starts at all - it's running non-stop.

Bunny bills on resources actually used (not provisioned), and since we run the backend in Go, an idle container consumes something like 0.01 CPU and 15 MB of RAM and costs pennies.


Anecdotally, I've been seeing a lot of weird behavior from Opus when it decides, mid-execution, to switch to a different "simpler" solution, and that really pisses me off.

At one point, I carefully designed a spec document, forced Opus to reread it, create a plan with the planning tool that followed the spec, and use the task tool to track the implementation... AND AFTER OPUS READS THE FIRST FUCKING FILE, it says, "Oh, there are missing dependencies in project X. It’ll be hard to add them, so I’m going to throw away the whole plan and just do a simple fix..."

After that, I canceled my $200 Max plan, which I'd been subscribed to since June 2025, and decided to check out Codex.


Try ottex.ai - it has an OpenRouter-like gateway with most STT models on the market (Gemini, OpenAI, Groq, Deepgram, Mistral, AssemblyAI, Soniox), so you can try them all and choose what works best for you.

My favorites are Gemini 3 Flash and Mistral Voxtral Transcribe 2. Gemini when I need special formatting and clean-up, and Voxtral when I need fast input (mostly when working with AI).


You can test Gemini 3.1 Flash Lite's transcription capabilities at https://ottex.ai — the only dictation app supporting Gemini models with native audio input.

We benchmarked it for real-life voice-to-text use cases:

                  <10s  10-30s  30s-1m    1-2m    2-3m
  Flash           2548    2732    3177    4583    5961
  Flash Lite      1390    1468    1772    2362    3499
  Faster by      1.83x   1.86x   1.79x   1.94x   1.70x

  (latency in ms, median over 5 runs per sample, non-streaming;
  measurement sketch below the takeaways)

Key takeaways:

- 1.8x faster than Gemini 3 Flash on average

- ~1.4 sec transcription time for short to medium recordings

- ~$0.50/mo for heavy users (10h+ transcription)

- Close-to-SOTA audio understanding and formatting-instruction following

- Multilingual: one model, 100+ languages
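
For anyone curious how a "median over 5 runs" figure like the ones above can be collected, here's a rough Go sketch; transcribe() is a stand-in for the real non-streaming API call, not our actual benchmark harness:

  package main

  import (
      "fmt"
      "sort"
      "time"
  )

  // medianLatency times each request end-to-end and returns the
  // middle value over the given number of runs.
  func medianLatency(transcribe func() error, runs int) (time.Duration, error) {
      latencies := make([]time.Duration, 0, runs)
      for i := 0; i < runs; i++ {
          start := time.Now()
          if err := transcribe(); err != nil {
              return 0, err
          }
          latencies = append(latencies, time.Since(start))
      }
      sort.Slice(latencies, func(i, j int) bool { return latencies[i] < latencies[j] })
      return latencies[len(latencies)/2], nil
  }

  func main() {
      med, err := medianLatency(func() error {
          time.Sleep(50 * time.Millisecond) // stand-in for the real request
          return nil
      }, 5)
      if err != nil {
          panic(err)
      }
      fmt.Println("median:", med)
  }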

Gemini is slowly making $15/month voice apps obsolete.


You know what would be great? A lightweight wrapper model for voice that can use heavier ones in the background.

That much is easy, but what if you could also speak to and interrupt the main voice model and keep giving it instructions? Like talking to customer support, except instead of putting you on hold, you can ask several questions and get live updates.


It's actually a nice idea - an always-on micro AI agent with voice-to-text capabilities that listens and acts on your behalf.

Actually, I'm experimenting with this kind of stuff and trying to find a nice UX to make Ottex a voice command center - to trigger AI agents like Claude, open code to work on something, execute simple commands, etc.


Can you show some WER comparisons against other ASR models? Especially for non-English.


I've been experimenting with Gemini 3.1 Flash Lite and the quality is very good.

I haven't found official benchmarks yet, but you can find Gemini 3 Flash word error rate benchmarks here: https://artificialanalysis.ai/speech-to-text/models/gemini — they are close to SOTA.

I speak daily in both English and Russian and have been using Gemini 3 Flash as my main transcription model for a few months. I haven't seen any model that provides better overall quality in terms of understanding, custom dictionary support, instruction following, and formatting. It's the best STT model in my experience. Gemini 3 Flash has somewhat uncomfortable latency though, and Flash Lite is much better in this regard.
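
If you want to try it raw, here's a hedged sketch of sending audio to Gemini's public REST generateContent endpoint from Go. The model ID ("gemini-2.5-flash" below), prompt, and file name are placeholders - swap in whichever Flash/Flash Lite variant you're testing:

  package main

  import (
      "bytes"
      "encoding/base64"
      "encoding/json"
      "fmt"
      "io"
      "net/http"
      "os"
  )

  func main() {
      audio, err := os.ReadFile("sample.wav")
      if err != nil {
          panic(err)
      }

      // Audio goes in as an inline base64 part next to the text prompt.
      body, _ := json.Marshal(map[string]any{
          "contents": []any{map[string]any{
              "parts": []any{
                  map[string]any{"inline_data": map[string]any{
                      "mime_type": "audio/wav",
                      "data":      base64.StdEncoding.EncodeToString(audio),
                  }},
                  map[string]any{"text": "Transcribe this audio verbatim."},
              },
          }},
      })

      url := "https://generativelanguage.googleapis.com/v1beta/models/" +
          "gemini-2.5-flash:generateContent?key=" + os.Getenv("GEMINI_API_KEY")
      resp, err := http.Post(url, "application/json", bytes.NewReader(body))
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()

      out, _ := io.ReadAll(resp.Body)
      fmt.Println(string(out)) // transcript sits at candidates[0].content.parts[0].text
  }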


Gemini 3.1 Flash-Lite is our most cost-efficient Gemini model, optimized for low-latency, high-volume, cost-sensitive LLM traffic.

It provides a significant quality increase over the Gemini 2.0 and 2.5 Flash-Lite models, matching Gemini 2.5 Flash performance across key capability areas:

- Improved response quality: aims to match 2.5 Flash performance and align with target Flash-Lite use cases.

- Improved instruction following: targeted improvements to serve as a reliable migration path for complex chatbot and instruction-heavy workflows.

- Improved audio input: better audio-input quality for tasks like automatic speech recognition (ASR).

- Expanded thinking support: you can control how much reasoning the model performs by choosing from minimal, low, medium, or high thinking levels, letting you balance response quality and speed for your specific use case.

---

Already available in Google AI Studio and OpenRouter

https://openrouter.ai/google/gemini-3.1-flash-lite-preview
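
If you want to poke at the thinking levels through OpenRouter, something along these lines should work - a sketch using OpenRouter's unified "reasoning" parameter, where how its effort values map onto the minimal/low/medium/high thinking levels above is my assumption, and the prompt is made up:

  package main

  import (
      "bytes"
      "encoding/json"
      "fmt"
      "io"
      "net/http"
      "os"
  )

  func main() {
      body, _ := json.Marshal(map[string]any{
          "model": "google/gemini-3.1-flash-lite-preview",
          "messages": []map[string]string{
              {"role": "user", "content": "Give me three formatting rules for dictated text."},
          },
          // OpenRouter's cross-provider reasoning control; assumed to map
          // onto the model's thinking levels.
          "reasoning": map[string]string{"effort": "low"},
      })

      req, _ := http.NewRequest("POST",
          "https://openrouter.ai/api/v1/chat/completions", bytes.NewReader(body))
      req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENROUTER_API_KEY"))
      req.Header.Set("Content-Type", "application/json")

      resp, err := http.DefaultClient.Do(req)
      if err != nil {
          panic(err)
      }
      defer resp.Body.Close()

      out, _ := io.ReadAll(resp.Body)
      fmt.Println(string(out))
  }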


Try ottex with Gemini 3 Flash as the transcription model. I'm bilingual as well and frequently switch between languages - Gemini handles this perfectly, even when I speak two languages in a single transcription.


You can try ottex for this use case - it has both context capture (app screenshots) and native LLM support, meaning it can send the audio AND a screenshot directly to Gemini 3 Flash to produce a bespoke result.

