Why isn't everyone using Cerebras?

3 points by tghack a day ago

I work at a mid-sized startup dealing with latency issues in customer-facing flows that use LLMs. Using OSS-120B seems preferable to 5-mini or Anthropic's models in many cases where we need speed, intelligence, and cost control. Is there a catch here beyond needing to secure higher rate limits?
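For context, the kind of call we're evaluating looks roughly like this. It's only a sketch: it assumes Cerebras's OpenAI-compatible endpoint at api.cerebras.ai/v1 and the gpt-oss-120b model id, so check both against the current docs.

```
# Sketch: calling Cerebras through its OpenAI-compatible endpoint.
# The base URL and model id below are assumptions; verify them in the docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)

resp = client.chat.completions.create(
    model="gpt-oss-120b",
    messages=[{"role": "user", "content": "Summarize this support ticket in one sentence: ..."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```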

jpau 21 hours ago

I love Cerebras. I also love that they've started to scale rate limits to useful levels (which is relatively new).

I still don't know how long they'll support our chosen model.

On Oct 22, I got an email saying:

```
- qwen-3-coder-480b will be available until Nov 5, 2025
- qwen-3-235b-a22b-thinking-2507 will be available until Nov 14, 2025
```

That's not a lot of notice!

I don't want to spend all my time re-benchmarking new models for features I've already built, and I don't want my users' experience disrupted every few months.
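One way to soften that churn: keep the model id in config and resolve it against whatever the provider currently lists, so a deprecation email becomes a config change rather than a code change. A rough sketch; MODEL_CANDIDATES and pick_model are hypothetical names, and the fallback model is just a placeholder, not a recommendation.

```
# Rough sketch: resolve the model id from config against the provider's
# current model list (assumes the standard OpenAI-compatible /models route).
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.cerebras.ai/v1",
    api_key=os.environ["CEREBRAS_API_KEY"],
)

# Ordered by preference; the first entry still being served wins.
MODEL_CANDIDATES = [
    os.environ.get("PRIMARY_MODEL", "gpt-oss-120b"),
    "llama-3.3-70b",  # placeholder fallback; substitute whatever you've benchmarked
]

def pick_model() -> str:
    """Return the first preferred model the provider currently lists."""
    available = {m.id for m in client.models.list().data}
    for candidate in MODEL_CANDIDATES:
        if candidate in available:
            return candidate
    raise RuntimeError(f"No configured model is available; saw {sorted(available)}")
```

It doesn't save you from re-benchmarking quality, but it does turn a two-week deprecation window into a config flip rather than a code change.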