You might have seen the video of Devin, the so-called “first AI software engineer” from Cognition Labs, floating around. As the next step, you probably searched online forums to find out how much truth there is to it. If you are easily swayed, chances are that you have succumbed to the rhetoric and narratives there, and are now thinking about how developers are going to be replaced forever.
The truth is, AI is the new plastic, except, it’s not that bad, and it is here to stay.
I blame the short attention span and lack of depth and clarity in the general thinking process of the masses today. Now, if that is too much to ask for, I suggest you read the rest of the blog to bust these myths.
Devin’s claims and promises
To begin, the claim from Cognition Labs is a well-crafted pitch for investors. I don’t blame them. If you or I had come up with an invention, turning it into a successful business would have been our choice too, wouldn’t it? I do give them credit, because they have successfully managed to stir up a buzz around Devin.
As per the claims, Devin was able to resolve 13.86% of the GitHub issues and PRs in a benchmarking test conducted using SWE-bench. This may not sound like much, but it is almost 8x better than GPT-4, and 26x better than GPT-3.5. ChatGPT was first made public in November 2022, and the last time both were evaluated was in Oct 2023. Without being biased, I think this is a great achievement, and it had to happen: if not now, then at some point in the near future. LLMs have opened Pandora’s box, but in a good way. Most of the time, though, we fail to see the good side of it.
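To put those multipliers in perspective, here is a back-of-the-envelope sketch. It only divides the 13.86% figure by the "8x" and "26x" claims quoted above, so the baseline rates it prints are implied approximations and not official benchmark numbers. The names `implied_baseline` and `CLAIMED_MULTIPLIERS` are mine, made up for illustration.

```python
# Figures below come from the paragraph above. The baseline rates are
# back-calculated from the rounded "8x" and "26x" claims, not quoted
# from any benchmark report, so treat them as rough approximations.
DEVIN_RESOLVED_PCT = 13.86
CLAIMED_MULTIPLIERS = {"GPT-4": 8, "GPT-3.5": 26}


def implied_baseline(devin_pct: float, multiplier: float) -> float:
    """Resolve rate a baseline would need for Devin to be `multiplier` times better."""
    return devin_pct / multiplier


implied = {
    name: round(implied_baseline(DEVIN_RESOLVED_PCT, multiplier), 2)
    for name, multiplier in CLAIMED_MULTIPLIERS.items()
}

for name, pct in implied.items():
    print(f"{name}: roughly {pct}% of issues resolved")

# Share of benchmark issues that Devin did not resolve.
unresolved_pct = round(100 - DEVIN_RESOLVED_PCT, 2)
print(f"Devin: {unresolved_pct}% of issues left unresolved")
```

In other words, the jump is large in relative terms, but every number involved is still small in absolute terms.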
Cognition Labs has a few YouTube videos that demonstrate some use cases with Devin. They have been able to replicate how a developer would build software when assigned a task. Looking at those, if I had to make a hiring decision based on Devin’s current capabilities, my decision would be a no. As a product manager, I find that the work does not meet my expectations, and moreover it takes significant chat time and effort to make it deliver a working result. Cognition Labs may have claimed it to be the “first AI software engineer”, but I am sure even they know there is a loonng way to go.
While all of this is happening, it is important to note that as a company they need to pitch the highest promise to position themselves. That’s not a bad attempt – look how developers around the globe are just waiting for Devin to come out in the public “and take their jobs”.
What happens when Devin scores 100%?
For a moment, let’s assume Devin is able to resolve all the GitHub issues successfully on SWE-bench. To add fuel to your fire, let’s also assume that there are other models available for cheap or free at 99% accuracy. We cannot deny this possibility, can we? So that leaves us with the question: will software engineers become obsolete? I mean, this is a good theme for science fiction. But how can we completely ignore the amount of work that goes into developing a product or service?
Think about it: teams are hired for UI/UX design, for developing frontends and backends, for CI/CD pipelines, etc. Then there are architectural aspects like tightening security, following best practices, ensuring HA and DR, database and storage routines, etc. All these efforts have to align with a vision and roadmap that are unique to each organization and its customers.
One thing I know about AI – and this is my firm belief – is that it will never develop independent consciousness. As an analogy, medical science has been successful in synthesizing biological body parts, but nobody has been able to figure out how to create a living entity by imparting a life pulse.
The other thing I know about AI is that it works on the data that exists today. Tech businesses, and especially startups, thrive because they want to disrupt industries with innovation, or at least bring something new to the table. That requires consciousness, which results in visions and dreams, which in turn are required for the progress of humanity in general. Without it, we would still be in the stone age, or perhaps not even that. Devin is an outcome of the same.
Forget about the ultra broad/god level/jargony language in the last paragraph. When farm tractors were invented, they didn’t put farmers out of work. They simply scaled their produce. When we, the developers, were busy automating linear tasks and putting people “out of a job” (well, they got reskilled/upskilled) without any guilt, why cry now? Innovation and evolution have always been opposed by the world, out of sheer laziness.
If Devin and similar AI products get to 100%, then I think we should break the whining pattern, welcome these changes, and adapt. This is a tall assumption for 2024 anyway, so we still have enough time to adapt! We are good at it, and it is definitely not as if we were born with a predefined goal of writing x lines of code in this lifetime.
Let’s get real
Software engineering is one of the most complex jobs that exists today. It is an intricate blend of art and science, which requires us to think extremely logically at various levels while leaving enough room for a human touch. Think of it this way: it is probably easier for Gen AI models to perform C-level work than the work of a software engineer.
What happened to ChatGPT? Has it replaced non-tech text based professionals? Or is it the case that these professionals are still working under the grace of their employers? In today’s world of mass layoffs, this is hard to believe.
Instead we have learnt to use Gen AI products to boost our productivity. And I mean boost, not cheat. Sure, you can blindly cheat but that will take you nowhere – we all know that. We have grown to identify the bs being generated by AI models. I am not even talking about measures taken by search engines or AI detector tools. Even we as individuals are able to gauge whether a piece of text is generated by AI or not.
Honestly, I think this is what would happen practically with tools like Devin as well (if it gets better). Perhaps, we can use Devins to resolve low-medium impact bugs and create a PR to begin with, while we let the real developers focus their efforts on more important issues.
To take this a step further, like any other ChatGPT offshoot “AI App”, if Devin is just a wrapper around a Gen AI model, then it is no big deal. We could easily do the wrapper’s job ourselves, and do it much better. Coding assistants like GitHub Copilot and Amazon CodeWhisperer still make more sense that way.
And on top of all this, there are the regulatory concerns, plus the question of organizations giving access to their private data and code bases to an AI software engineer with 13.86% accuracy… hmm.
PS: I am having way more fun reading about this topic on Reddit. Example. Just want to settle the uncertainty amongst new developers coming into tech. I understand it sucks to hear such news, but I personally think there is no need to worry in the near future.