Last week I built a local pipeline where a state machine + LLM watches my security cam and yells at Amazon drivers peeing on my house. The state machine is the magic: it flips the system from passive (just watching) to active (video/audio ingest + ~1s TTS out) only when a trigger hits. That keeps things deterministic and way more reliable than letting the LLM run solo. The LLM handles the fuzzy stuff (vision + reasoning) while the state machine handles control flow. Together it’s solid. It could just as easily be swapped to spot trespassing, log deliveries, or recognize gestures. TL;DR: gave my camera a brain and a mouth + a state machine to keep it focused. Repo in comments to see how it’s wired up. submitted by /u/Weary-Wing-6806
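For anyone who wants the passive/active split in code form, here is a minimal sketch. It is not the code from the repo: `CameraStateMachine`, `describe_scene`, `speak`, and `idle_limit` are made-up names for illustration, frames are plain strings, and the "trigger" is just a boolean handed in by whatever detector you use. The LLM and TTS are stubbed as plain callables so the control flow stays deterministic and testable.

```python
from enum import Enum, auto
from typing import Callable, List


class Mode(Enum):
    PASSIVE = auto()  # just watching, no LLM calls
    ACTIVE = auto()   # frames go to the LLM, replies go to TTS


class CameraStateMachine:
    """Hypothetical controller: the state machine owns control flow,
    the LLM (describe_scene) only handles the fuzzy part."""

    def __init__(
        self,
        describe_scene: Callable[[str], str],  # stand-in for the vision LLM
        speak: Callable[[str], None],          # stand-in for TTS output
        idle_limit: int = 2,                   # assumed cool-down, in frames
    ) -> None:
        self.describe_scene = describe_scene
        self.speak = speak
        self.idle_limit = idle_limit
        self.mode = Mode.PASSIVE
        self.idle_frames = 0

    def on_frame(self, frame: str, triggered: bool) -> Mode:
        if self.mode is Mode.PASSIVE:
            if not triggered:
                return self.mode       # stay passive, never touch the LLM
            self.mode = Mode.ACTIVE    # trigger hit: wake up

        # From here on we are ACTIVE.
        if triggered:
            self.idle_frames = 0
        else:
            self.idle_frames += 1
            if self.idle_frames >= self.idle_limit:
                self.mode = Mode.PASSIVE  # nothing happening: go back to sleep
                self.idle_frames = 0
                return self.mode

        self.speak(self.describe_scene(frame))
        return self.mode


# Tiny demo with stubbed LLM and TTS.
spoken: List[str] = []
machine = CameraStateMachine(
    describe_scene=lambda frame: f"saw {frame}",
    speak=spoken.append,
)
events = [("f0", False), ("f1", True), ("f2", False), ("f3", False), ("f4", False)]
modes = [machine.on_frame(frame, hit) for frame, hit in events]
```

In the demo only `f1` and `f2` ever reach the stubbed LLM; everything before the trigger and after the cool-down is dropped by the state machine, which is the whole point of keeping the LLM out of the control loop.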
edit/fyi: i originally posted this on their official sub, but they literally locked the thread…
Traditional search engines have historically relied on keyword search.
By Harshad Sane. Ranker is one of the largest and most complex services at Netflix. Among many…
Large language models (LLMs) perform well on general tasks but struggle with specialized work that…
The flexibility of Google Cloud allows enterprises to build secure and reliable architecture for their…
Gebbia was reportedly spotted at a San Francisco coffee shop using an unidentified pair of…