

the 1000x before bit has quite a few sideffects to it as well.
- lesser used languages suffer because there’s not enough training data. this gets annoying quickly when it overrides your static tools and suggests nonsense.
- larger training sets contain more vulnerabilities as most code is pretty terrible and may just be snippets that someone used once and threw away. owasp has a top 10 for a reason. take input validation for example, if I’m working on parsing a string there’s usually context such as is this trusted data or untrusted? if i don’t have that mental model where I’m thinking about the data i might see generated code and think it looks correct but in reality its extremely nefarious.
they promised Delamain and all we got is Brendan