The First AI Software Engineer.
Devin, the "first AI Software Engineer", reportedly performs 2x-3x better than the current industry standard on SWE-bench and can solve a multitude of software engineering problems, from basic debugging to full-on Computer Vision Upwork jobs (or at least that's what the demo shows).
My thoughts? It is important to note that this benchmark is fairly recent (it comes from ICLR 2024), so I would be a bit conservative about the numbers, but the result is still very impressive. The demo also claims Devin solves problems "unassisted" (whilst Claude 2, Llama 13B, GPT-4, etc. need assistance even when they're fine-tuned). I would be conservative here again as to what exactly "unassisted" means.
It is also important to note that trends come and go (remember AutoGPT?). Even non-OpenAI LLM use cases are moving away from LangChain-based frameworks, which were all the hype last year, and replacing them with LlamaIndex or even simpler RAG chatbots.
Overall, this is definitely an exciting step in the "right" direction, but we're still quite a way from a future of "1 PM - 1 Devin" product teams. Gartner Hype Cycle thinking would suggest this will likely become a prominent reality in 5-10 years, but who knows? Thoughts, anyone?