Artificial Intelligence, or more precisely Machine Learning, is increasingly becoming part of everyday IT, but it is still unclear (at least to me) what its real potential, limits and risks are.
For example, very recently there have been two somewhat contradictory news items from Google/Alphabet-funded research in AI/ML:
- the paper “Underspecification Presents Challenges for Credibility in Modern Machine Learning” (here the full paper) studies some possible reasons why “ML models often exhibit unexpectedly poor behaviour when they are deployed in real-world domains”, but it is unclear (at least to me) how far these challenges can be overcome;
- CASP has announced (here the announcement) that DeepMind AlphaFold has practically solved the problem of predicting how proteins fold, which is fundamental to finding cures for almost all diseases, including cancer, dementia and even infectious diseases such as COVID-19.
Artificial Intelligence (AI), in all its different fields from Machine Learning to Generative Adversarial Networks, has been the subject of a study (here the link to the paper), or probably better an evaluation, by a group of Subject Matter Experts (SMEs) aimed at identifying the riskiest scenarios in which attackers could use it, abuse it or defeat it. The scenarios include cases in which AI is used for security purposes and an attacker is able to defeat it, cases in which AI is used for other purposes and an attacker is able to abuse it to commit a crime, and cases in which an attacker uses AI to build a tool to commit a crime.
Overall the SMEs identified 20 high-level scenarios and ranked them by multiple criteria, including the harm or profit of the crime and how difficult this type of crime could be to stop or defeat.
It is very interesting to see which six scenarios are considered to carry the highest risk:
- Audio/video impersonation
- Driverless vehicles as weapons
- Tailored phishing
- Disrupting AI-controlled systems
- Large-scale blackmail
- AI-authored fake news.
More details can be found in the above-mentioned paper.
According to the following articles, it looks as if the digital management of our personal information (aka Privacy) is not doing so well:
Theoretical mathematical results often have little immediate practical application, and in some cases they can initially seem obvious. Still, they usually are not obvious at all, since it is quite different to imagine that a result holds true and to prove it mathematically in a rigorous way. Moreover, such a proof often helps explain the reasons behind the result and its possible applications.
Very recently a theoretical (mathematical) result in Machine Learning (the current main incarnation of Artificial Intelligence) has been announced: the paper can be found in Nature here and a comment here.
Learnability can be defined as the ability to make predictions about a large data set by sampling only a small number of data points. This is essentially what Machine Learning does. The mathematical result is that, in general, this problem is ‘undecidable’: it is impossible to prove that there always exists a limited sampling set which allows one to ‘learn’ (for example, to always recognise a cat in an image after training on a limited number of cat images). The mathematicians have proven that Learnability is related to fundamental mathematical problems going back to Cantor’s set theory and the work of Gödel and Alan Turing, and that it is connected to the theory of compressibility of information.
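As a toy illustration of learnability in this sampling sense (a minimal sketch of the general idea, not the specific framework of the Nature paper), here is a small Python example: it estimates a hidden one-dimensional threshold rule, a classically learnable class of functions, from a small labelled sample, and then checks its predictions against a much larger set. The threshold value, sample sizes and variable names are arbitrary choices for the demo.

```python
import random

random.seed(0)
TRUE_THRESHOLD = 0.42  # the hidden rule, unknown to the learner

def label(x):
    """Ground-truth labelling rule: 1 if x is above the hidden threshold."""
    return 1 if x >= TRUE_THRESHOLD else 0

# "Learn" from a small sample: take the midpoint between the largest
# negatively-labelled point and the smallest positively-labelled point
# as the estimated threshold.
sample = [random.random() for _ in range(50)]
negatives = [x for x in sample if label(x) == 0]
positives = [x for x in sample if label(x) == 1]
estimate = (max(negatives) + min(positives)) / 2

# Evaluate the learned rule on a much larger "population": for threshold
# functions the error rate shrinks quickly as the sample grows, which is
# exactly the kind of guarantee that, for other classes of problems, the
# paper shows cannot always be proven.
population = [random.random() for _ in range(100_000)]
errors = sum(1 for x in population
             if (1 if x >= estimate else 0) != label(x))
error_rate = errors / len(population)
print(f"estimated threshold = {estimate:.3f}, error rate = {error_rate:.4f}")
```

With only 50 training points the learned threshold already lands close to the true one, so the error on the 100,000-point population stays small; the undecidability result says that this comfortable picture cannot be guaranteed for every learning problem.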
This result poses some theoretical limits on what Machine Learning can ever achieve, even if it does not seem to have any immediate practical consequence.