Securing AI Software Development Tools: IDEsaster

I will just quote two statements from MaccariTA (Ari Marzouk):

IDEs were not initially built with AI agents in mind. Adding AI components to existing applications create new attack vectors, change the attack surface and reshape the threat model. This leads to new unpredictable risks.

[…]

AI IDEs effectively ignored the base IDE software as part of the threat model, assuming it’s inherently safe because it existed for years. However, once you add AI agents that can act autonomously, the same legacy features can be weaponized into data exfiltration and RCE primitives.

and, as you can read in the original post, it is not only about “risks” but also a quite long list of vulnerabilities (with 24 CVEs) which affect, one way or another, almost all AI IDE tools.

The issue is one we have seen many times in the past: first come features and functionality, then we fix security. I agree that without features and functionality a software product makes no sense, but with security as an afterthought there is the well-known risk of having to pay a large security bill for a long time.
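
To illustrate the kind of guardrail this implies, here is a purely hypothetical sketch (not any vendor’s actual mitigation): treat every workspace file as untrusted input and flag instruction-like content before an AI agent consumes it. The patterns and names below are made up for illustration, and real detection is much harder than this.

```python
# Hypothetical illustration: scan workspace files for instruction-like text
# that could be a prompt-injection attempt aimed at an AI coding agent.
# The patterns are illustrative only; real detection is much harder.

import re
from pathlib import Path

SUSPICIOUS_PATTERNS = [
    r"ignore (all|any) previous instructions",
    r"run the following (shell|terminal) command",
    r"curl\s+https?://\S+\s*\|\s*(sh|bash)",
    r"send .* to https?://\S+",
]

def flag_untrusted_content(root: str) -> list[tuple[str, str]]:
    """Scan text files under `root` and return (file, matched pattern) pairs."""
    findings = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for pattern in SUSPICIOUS_PATTERNS:
            if re.search(pattern, text, flags=re.IGNORECASE):
                findings.append((str(path), pattern))
    return findings

if __name__ == "__main__":
    for file, pattern in flag_untrusted_content("."):
        print(f"possible prompt injection in {file}: matches {pattern!r}")
```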

The “Bizarre” Case Involving ChatGPT, a Divorce and Coffee Cup Reading

I am not commenting on the case itself, which you can read about here for example, but just thinking about how good an AI with a Human Interface, such as a chatbot, is:

  • writes / talks as well as or better than an average human
  • is very convincing
  • answers questions on everything (knows it all!)
  • provides references (if asked) about anything, even imaginary ones!
  • is always available and answers within a few seconds
  • is very good at explaining
  • etc.

How can we not trust it?

AI and Security Bug Bounty

This is not an AI problem, it is a Human problem.

A Security Bug Bounty rewards those who find a security bug in an application. But what if I ask an AI chatbot to produce a report of a “new” vulnerability in an application and then send it to the application maintainer, hoping to get the reward?

Actually, it seems that this has been going on for some time (see here for example) and is starting to overwhelm application maintainers.

AI tools can be very helpful in analyzing and discovering security vulnerabilities in applications, but they must be used as one of the tools in the security practitioner’s toolbox.

Artificial Intelligence and “Artificial Science”

“A weird phrase is plaguing scientific papers – and we traced it back to a glitch in AI training data” is an interesting article about what can go wrong in training Machine Learning models. An error in scanning old printed scientific papers, and a similar error in translating from Farsi to English, meant that the nonsense phrase “vegetative electron microscopy” became part of the training datasets of many current advanced AI models and started appearing in published scientific papers.

The problem now is how to get rid of this and other similar errors in AI training data.
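
Just to make the point concrete, here is a toy sketch (purely illustrative, not what any AI lab actually does) of the most obvious, and clearly insufficient, cleanup step: filtering a corpus against a blocklist of known nonsense phrases. The hard part is discovering such phrases in the first place.

```python
# Toy illustration: drop documents containing known nonsense phrases before
# they end up in a training dataset. Blocklist and corpus are made up.

KNOWN_NONSENSE_PHRASES = {
    "vegetative electron microscopy",
}

def filter_corpus(documents: list[str]) -> list[str]:
    """Keep only documents containing no known nonsense phrase (case-insensitive)."""
    clean = []
    for doc in documents:
        lowered = doc.lower()
        if any(phrase in lowered for phrase in KNOWN_NONSENSE_PHRASES):
            continue  # contaminated: exclude from the training data
        clean.append(doc)
    return clean

if __name__ == "__main__":
    corpus = [
        "Electron microscopy of vegetative cells shows ...",           # legitimate
        "Using vegetative electron microscopy, the samples were ...",  # contaminated
    ]
    print(len(filter_corpus(corpus)), "of", len(corpus), "documents kept")
```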

Are these errors going to be our future “digital fossils”?

LLMs Still Not So Good at Math Reasoning

A recent study (here the paper, and here some comments) shows that the latest LLMs, though they are getting good at mathematical computation, still lack mathematical reasoning, that is, the ability to provide a detailed and exact proof of a mathematical statement with rigorous reasoning (unless they have already been trained on the proof or have access to it). The researchers evaluated some of the top LLMs on the six problems of the 2025 USA Math Olympiad just hours after their release, ensuring in this way that the detailed solutions were not yet known to the LLMs.

Anthropic: “Reasoning models don’t always say what they think”

Interesting article by Anthropic: it seems that there is still a lot to understand before we reach “Explainable AI”. Quoting: “our results point to the fact that advanced reasoning models very often hide their true thought processes, and sometimes do so when their behaviors are explicitly misaligned.”

AI and Professional Work

Apologies if I am late with these considerations, but the implementation of the AI Act brings up an interesting aspect.

As ethically required, we will need to declare which parts of professional jobs are performed not by us humans but by AI assistants/agents. But there are also free, or almost free, AI assistants/agents that can be used directly by customers.

I am not a market expert, but I can imagine that this will lead to (and it reminds me of years ago, when Google Search arrived): 1) professional jobs performed with little or no contribution from AI assistants/agents, and 2) customers using AI assistants/agents directly to perform the job themselves; with very little in between.

A non-IT example can be the legal profession: asking an AI chatbot for legal advice is free (or can appear to be), but it is not the same as paying a lawyer for legal support.

Image Recognition and Advanced Driving Assistance

“Securing the Perception of Advanced Driving Assistance Systems Against Digital Epileptic Seizures Resulting from Emergency Vehicle Lighting” is an interesting research study on the current status of image recognition for advanced driving assistance and autonomous vehicle systems. The study found that some standard Driving Assistance Systems can be completely confused by emergency vehicle flashers, with the risk of them becoming the cause of serious incidents. Machine Learning models can be part of the cause of this vulnerability, as well as part of the solution proposed by the researchers, called “Caracetamol”.

Quantum Computers and Error Correction / Mitigation

Error correction remains one of the main hurdles in the development of Quantum Computers. Recent developments by IBM (see here) point to first gaining performance improvements and error mitigation on Quantum Computers with a limited number of qubits, such as IBM’s latest 156-qubit Heron processor, instead of pushing for Quantum Computers with thousands of qubits without having a better approach to error mitigation and correction.
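
To give a feel for what “error mitigation” (as opposed to full error correction) typically means, here is a minimal generic sketch of zero-noise extrapolation, one common mitigation technique. The noise model, numbers and function names are purely illustrative, and this is not IBM’s specific method.

```python
# Minimal, generic sketch of zero-noise extrapolation (ZNE): run the same
# circuit at artificially amplified noise levels, then extrapolate the measured
# expectation value back to the zero-noise limit. noisy_expectation() is a
# stand-in model, not real hardware output.

import numpy as np

def noisy_expectation(scale: float, ideal: float = 1.0, decay: float = 0.15) -> float:
    """Pretend to run a circuit with noise amplified by `scale`; model the
    result as exponential damping of the ideal expectation value."""
    return ideal * float(np.exp(-decay * scale))

def zero_noise_extrapolation(scales=(1.0, 2.0, 3.0), degree: int = 2) -> float:
    """Fit a polynomial to (noise scale, expectation value) pairs and
    evaluate it at scale = 0 to estimate the noiseless result."""
    values = [noisy_expectation(s) for s in scales]
    coeffs = np.polyfit(scales, values, deg=degree)
    return float(np.polyval(coeffs, 0.0))

if __name__ == "__main__":
    raw = noisy_expectation(1.0)            # what the noisy device would report
    mitigated = zero_noise_extrapolation()  # post-processed estimate
    print(f"raw: {raw:.4f}  mitigated: {mitigated:.4f}  ideal: 1.0000")
```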