Hacker news


FLUX.2 [Klein]: Towards Interactive Visual Intelligence (https://bfl.ai)

202 points by GaggiX about 21 hours ago | 55 comments

vunderba about 14 hours ago |

I haven’t gotten around to adding Klein to my GenAI Showdown site yet, but if it’s anything like Z-Image Turbo, it should perform extremely well.

For reference, Z-Image Turbo scored 4 out of 15 points on GenAI Showdown. I’m aware that doesn’t sound like much, but given that one of the largest models, Flux.2 (32b), only managed to outscore ZiT (a 6b model) by a single point and is significantly heavier-weight, that’s still damn impressive.

Local model comparisons only:

https://genai-showdown.specr.net/?models=fd,hd,kd,qi,f2d,zt

codezero about 19 hours ago |

I am amazed, though not entirely surprised, that these models keep getting smaller while the quality and effectiveness increase. Z-Image Turbo is wild; I'm looking forward to trying this one out.

An older thread on this has a lot of comments: https://news.ycombinator.com/item?id=46046916

pajtai about 15 hours ago |

It cannot create an image of a pogo stick.

I was trying to get it to create an image of a tiger jumping on a pogo stick, which is apparently way beyond its capabilities, but it can't even create an image of a pogo stick in isolation.

psubocz about 17 hours ago |

> FLUX.2 [klein] 4B The fastest variant in the Klein family. Built for interactive applications, real-time previews, and latency-critical production use cases.

I wonder what kind of use cases could be "latency-critical production use cases"?

pavelstoev about 17 hours ago |

If we think of GenAI models as a compression implementation, then a puzzle appears: text generally compresses extremely well, while images and video do not. Yet state-of-the-art text-to-image and text-to-video models are often much smaller (in parameter count) than large language models like Llama-3. Maybe vision models are small because we're not actually compressing very much of the visual world. The training data covers a narrow, human-biased manifold of common scenes, objects, and styles, while the combinatorial space of visual reality remains largely unexplored. I'm curious what else is out there, outside of that human-biased manifold.

Mashimo about 12 hours ago |

Neat, I really enjoyed Flux 1. Currently using Z-Image Turbo for messing around.

I will wait for Invoke to add Flux.2 Klein.

Nora23 about 8 hours ago |

How does this compare to GPT version in terms of interactive capabilities?

dfajgljsldkjag about 17 hours ago |

I appreciate that they released a smaller version that is actually open source. It creates a lot more opportunities when you do not need a massive budget just to run the software. The speed improvements look pretty significant as well.

airstrike about 16 hours ago |

2026 will be the year of small/open models

SV_BubbleTime about 18 hours ago |

Flux.2 Klein isn’t some generational leap or anything. It’s good, but let’s be honest, this is an ad.

What will be really interesting to me is the release of Z-Image: if that goes the way it’s looking, it’ll be a natural-language SDXL 2.0, which seems to be what people really want.

Releasing the Turbo/Distilled/Finetune months ago was a genius move, really. It hurt the Flux and Qwen releases on the mere implication of what might be coming.

If this was intentional, I can’t think of the last time I saw such shrewd marketing.

tonyhart7 about 14 hours ago |

damn, they really counter-attacked after the Z-Image release, huh

good competition breeds innovation