AI code assistants have emerged as powerful tools that can aid in the software development life-cycle and can improve developer productivity. Unfortunately, such assistants have also been found to produce insecure code in lab environments, raising significant concerns about their usage in practice. In this paper, we conduct a user study to examine how users interact with AI code assistants to solve a variety of security related tasks. Overall, we find that participants who had access to an AI assistant wrote significantly less secure code than those without access to an assistant. Participants with access to an AI assistant were also more likely to believe they wrote secure code, suggesting that such tools may lead users to be overconfident about security flaws in their code. To better inform the design of future AI-based code assistants, we release our user-study apparatus and anonymized data to researchers seeking to build on our work at this link.
↫ Neil Perry, Megha Srivastava, Deepak Kumar, and Dan Boneh
I’m surprised somewhat randomly copying other people’s code into your program – violating their licenses, to boot – leads to crappier code. Who knew!
But doesn’t this also show an opportunity? If GPT starts giving more secure code, then average code quality will improve.
It’s bound to be better than copying and pasting from Stack Overflow!
Adurbe,
People are lashing out at AI over their moral objections to it; there may even be a desire for AI not to get better because of those objections. But I don’t think they will stop AI from getting better with time, and these technical criticisms will no longer apply in the future.
To be clear, when I say “AI”, I intend it as a superset of language models. I am very impressed with how far language models have come, but combining them with other techniques could address the known computational shortcomings of language models. Systems trained with reinforcement learning routinely beat humans in specialized domains. If we manage to combine the two, it seems like a code generator that is superior to the best humans could be within reach.
Where do you think it got its code from…
LLMs get their code from examples. So if you’re below the average coder, then yes, AI helps make your code better; but if you’re already better than most, use AI but double-check what it spits out. Think of it as another coworker rather than some god-like AI.
Exactly this. AI is a great tool for coding and debugging, IMO. But you still need to know what you’re doing in order for it to be useful.
I think you definitely need to add value to the code as a programmer/engineer and make it appropriate to the work you do. I see it as a resource, not a replacement.
But if I ask the AI to write me (say) an HTTP endpoint, it could return: “HTTP isn’t recommended; please use HTTPS unless you have a specific need. Here is an example with HTTPS…”
Right now it’s quite in/out, but given time it could start providing examples that incorporate best practice.
If your starting point is one where it’s done in a “secure” way, you are more likely to end up with a secure solution (unless you choose to strip out the security aspects… then that’s on you!)
I understand your point, but that’s not how these LLMs work. An LLM can’t reason. Only if it has been trained on text that responds to prompts for HTTP endpoints by suggesting HTTPS instead will it give you a response like that. So being unsatisfied with its security suggestions is a criticism of the training data, and it can most easily be corrected by more carefully curating the training data for this aspect, which is time- and expertise-intensive. In addition to that, the sources for training are starting to dry up, and will continue to in light of AI training on them for free. There needs to be a better compensation model and some incentive for people to provide training data. Otherwise it will be a time capsule of computing from the dawn of the internet age until 2023. Just imagine if the AI had been trained on a corpus of C from the K&R era, then tried writing a modern C app. It would be a trip.
Adurbe,
I don’t think it would be hard to train a simple NN to do this, but if I’m being honest, I often find such canned responses unhelpful when they come up on sites like Stack Overflow.
Anyway, I do agree with you that outputting “best practices” notes and other documentation could be helpful along with the code.
Hrm. Seems like I’ve overlooked some backslashes, turning my “quote ends” into “quote starts”, and the ability to edit comments for a few minutes has disappeared.
Brendan,
You’re right: garbage in, garbage out.
Would you agree that Stack Overflow answers are useful for both AI and humans? How would you pollute the data for one but not the other?
The quotes make it sound like my head canon!
You are completely right, the source is important. If it’s Stack Overflow, it can benefit from the same peer review system, or a GPT could only use official docs.
I would be really interested in a solution that did that. Most needs/mistakes arise because people haven’t read the source docs, but instead guessed/interpreted around them or simply didn’t understand how a function works.
A code example sourced only from official Amazon (AWS) and official Java docs would be incredibly powerful to me.
Adurbe,
Actually, what about the reverse? Given the source code as input, generate accurate and up-to-date documentation automatically. I think this would be an awesome feature! Comprehensive and up-to-date documentation is something that even competent developers can struggle to stay on top of.
To be fair, sometimes official documentation is useless/incomplete/out of date, but it depends on the project. Sometimes I end up having to write a test stub and/or dig up clues from the library’s own source code.
Inadequate documentation of memory semantics can lead to software that inadvertently leaks memory. In my libav project, valgrind found tons of memory leaks, which I thought were my fault, but after cutting everything out of my own code, valgrind was still identifying upstream leaks. Obviously, getting this right consistently is something human developers find extremely difficult. Rust builds explicit correctness into the language, but for unsafe languages, perhaps this is another avenue where an AI assistant could help humans improve code quality.
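To make that concrete, here is a minimal sketch (purely illustrative, not taken from libav or my own code) of the kind of defect a leak checker like valgrind reports: a buffer released on the success path but leaked on an early-return error path.

```rust
use std::alloc::{alloc, dealloc, Layout};

// Illustrative only: the buffer is freed on success but leaked whenever the
// error path is taken -- exactly the pattern leak checkers flag.
fn process(fail: bool) -> Result<(), &'static str> {
    let layout = Layout::array::<u8>(4096).unwrap();
    let buf = unsafe { alloc(layout) };
    if buf.is_null() {
        return Err("out of memory");
    }
    if fail {
        // Bug: early return without dealloc(buf, layout); 4 KiB leaks per call.
        return Err("decode error");
    }
    // ... fill and use buf here ...
    unsafe { dealloc(buf, layout) };
    Ok(())
}

fn main() {
    for _ in 0..1000 {
        let _ = process(true); // every iteration leaks another buffer
    }
}
```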
That would be awesome… Hire me when you do it! *work for food
Nitpicking here, but since you mention Rust, it’s worth pointing out that leaks aren’t considered a safety issue there, and are perfectly possible to cause with no unsafe blocks present.
PhilPotter,
I’m sorry, I’m having a bit of trouble understanding that last sentence.
If you’re saying safe languages can have bugs, then that’s true. Rust is useful against memory/thread faults, but it’s possible to write valid Rust code that does the wrong thing. While Rust is smart enough to stop you from losing a reference to allocated memory without freeing it, it won’t stop you from keeping unneeded references around.
A high-level code review may determine this to be a bug, but from a low-level perspective it’s not considered “unsafe” to hold on to references longer than needed.
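To illustrate what I mean, here is a minimal sketch (the names are made up) of perfectly safe Rust that still grows memory without bound, simply because nothing ever lets go of the data it holds.

```rust
use std::collections::HashMap;

// Illustrative only: every session ever seen is retained forever. A reviewer
// might flag this as a logic bug, but the compiler has no reason to: holding
// data longer than needed is not "unsafe".
struct SessionCache {
    sessions: HashMap<u64, Vec<u8>>,
}

impl SessionCache {
    fn new() -> Self {
        Self { sessions: HashMap::new() }
    }

    fn record(&mut self, id: u64, payload: Vec<u8>) {
        self.sessions.insert(id, payload); // inserted, never evicted
    }
}

fn main() {
    let mut cache = SessionCache::new();
    for id in 0..1_000_000u64 {
        cache.record(id, vec![0u8; 1024]); // roughly 1 GiB retained, all "safe"
    }
    println!("entries held: {}", cache.sessions.len());
}
```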
As an amateur programmer, I did copy-paste code from the internet long ago. It is OK to grab an example to play around with and to help understand an API or a concept, but final code should always be perfectly understood, original code.
gagol2,
That’s true, some may look to the internet to “do their homework” so to speak, but frankly good examples can be extremely valuable to experienced programmers as well. Just recently I was hunting down example code for libav/ffmpeg, not because I intended to copy somebody else’s work, but because examples show you exactly what you need to do with an API that requires quite a lot of unclear boilerplate code. Sometimes working code is the best documentation.
If we take the attitude that “using knowledge acquired from copyrighted sources is infringement”, then everyone is guilty of it. It’s not a matter of malice, but just a matter of how we learn. If someone wants to take the attitude that humans should be allowed to learn from copyrighted sources but machines should not, then so be it, but at least own up to that and don’t pretend it’s not a double standard for AI.
Thom Holwerda,
That’s not automatically true though.
1) Fair use is a thing, especially as the contribution from any given work becomes infinitesimally small.
2) Copying the expression of something infringes copyright, but rewriting it (as these AI assistants do) is not traditionally considered infringing.
3) In principle, one could use a license compatible with the training data, which is explicitly allowed.
It’s easy to see you’ve been bent on this message lately, but AI is technically doing the exact same thing every fleshy software developer has been doing all along. The difference is that it’s automated. Before punishing this (very human!) behavior, we should seriously discuss what it means for all developers, not just the AI. Given too much power to inflict damages for copying bits of knowledge here and there, we could easily end up with abusive copyright trolls in the same vein as patent trolls.
No, that is not the case. Fair use is 13 seconds, be it an idea, a video, or a piece of music. The US is, however, excluded from the Berne Convention and can steal whatever it likes, on the basis of “what are you going to do about it?” Edison vs. Marconi is one such example.
NaGERST,
What isn’t the case, exactly?
Anyway, unfortunately today some massive copyright holders have turned to harassing those making fair-use reproductions. If you were to publish even just a 10-second excerpt of commercial music monitored by one of these copyright abusers, you’d very likely get a takedown notice and accumulate copyright strikes on your channel despite the law. YouTube/Google has become complicit in automatically favoring the takedown notice and denying fair-use rights by default. In this way many YouTube creators have become victims, disallowed from practicing their fair-use rights 🙁
Yes and yes to both questions. However, give an AI the “task” of creating an 8-bit operating system and it will just steal GeOS. AI is not there yet, but it is great.
The idea that AIs simply reproduce what they have seen can only be taken seriously by people who have not talked to GPT-4.
Came across this article today…
“The tech sector is pouring billions of dollars into AI. But it keeps laying off humans”
https://www.cnn.com/2024/01/13/tech/tech-layoffs-ai-investment/index.html
I think it would be beneficial to cover the socioeconomic problems relating to AI replacing human jobs more directly. Human job displacement is a significant long-term threat, and I think this is going to be a far bigger challenge for society than perceived AI copyright issues or quality issues.
Alfman – sorry, it will only let threaded comments reach five levels deep, so I couldn’t reply above. I was attempting to point out that Rust will in fact let you lose a reference to allocated memory without freeing it. An example is freeing/deallocating a node within a shared reference cycle, but there are other possibilities too (albeit often explicit ones).
PhilPotter,
Yeah, the problem with cycles is that the reference count never goes to zero.
https://doc.rust-lang.org/book/ch15-06-reference-cycles.html
GC languages like Java can catch this (are there any exceptions?), but Rust has no runtime process to scan the memory hierarchy. I haven’t looked much into them, but it does appear that some people have worked on garbage collectors for Rust…
https://github.com/withoutboats/shifgrethor
Although part of the appeal of Rust is to have memory safety without a garbage collector.
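For reference, here is a minimal sketch of the cycle case from that chapter (the struct and field names are mine): once the two nodes point at each other, dropping the local handles only takes each strong count from 2 to 1, so neither allocation is ever freed, and no unsafe is involved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Illustrative only: two reference-counted nodes that point at each other.
struct Node {
    other: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { other: RefCell::new(None) });
    let b = Rc::new(Node { other: RefCell::new(Some(Rc::clone(&a))) });
    *a.other.borrow_mut() = Some(Rc::clone(&b)); // completes the cycle

    println!("strong count of a: {}", Rc::strong_count(&a)); // 2
    println!("strong count of b: {}", Rc::strong_count(&b)); // 2
    // When a and b go out of scope here, each count only drops to 1,
    // so both Nodes leak. Using Weak (Rc::downgrade) for one direction
    // of the link is the usual way to break the cycle.
}
```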
Yeah, if memory serves me right, the first Rust iterations from way back when actually did support/include an optional GC. That could just be a ‘hallucination’ on my part, though 😉
Don’t get me wrong though, big fan of the language so far based on what I’ve experienced. Trying to think of how I can sneak it in through the back door at work!