A New Open Source Competitor in the Large Language AI Models Arena

This Chinese Startup Is Winning the Open Source AI Race” is an interesting article from Wired on Yi-34B, from Chinese AI Startup 01.AI, which is currently leading many leaderboards comparing the power of AI models. Moreover, together with Meta’s Llama 2 from which it borrows part of its architecture, Yi-34B is one of the few top LLM to be Open Source. Yi-34B adopts a new approach to model training which seems better than what used by many competitors and possibly part of the reason of its current success.

A lot has changed in the AI arena in the last couple of years, and one notable fact is that most of the leading models now are Closed Source. Possible advantages of being Open Source are that it is easier to make external contributions to the model’s development (mostly from university researchers), and that there should be a lower barrier to build an “app” ecosystem around it.

On Deceptive Large Language AI Models

Interesting research article about how to remove backdoors and deceptive behaviour from Large Language AI models (LLM). The simplest example is a model trained to write secure code when the prompt states that the year is 2023, but insert exploitable code when the stated year is 2024. The result of the study is that it can be very difficult to remove such behaviour using current standard safety training techniques. Deceptive behaviour can be introduced intentionally in the LLM during the training but it could also happen by poor training. Applying current techniques to identify and remove backdoors, such as Adversarial training, can actually fail and end up providing a false sense of security. Another result of the study is that the larger LLM seem more prone to be “Deceptive”.

Writing (in-) Secure Code with AI Assistance

This is an interesting research article on the security of code written with AI Assistance; the large-scale user study shows that code written with an AI Assistant is usually less secure, that is contains more vulnerabilities, than code written without AI support.

Thus, at least as of today, relying on an AI Assistant to write better and more secure code could work out badly. But AI is changing very rapidly, soon it could learn math and to write secure and super efficient code. We’ll see…

Is Quantum Computing Harder than Expected?

This is a quite interesting article on Quantum Computing and how hard it really is.

It is well known that Quantum Computers are prone to Quantum Errors, and this issue grows with the number of Qubits. The typical estimate is that an useful Quantum Computer would need approx. 1.000 physical Qubits to correct the Quantum Errors of a single “logical” Qubit. Even if there are advancements in this topic (see for example this post), this is still a problem to be solved in practice.

Another potential issue is that Quantum Computers have been proposed to efficiently solve many problems including optimization, fluid dynamics etc. besides those problems for which a Quantum Computer would provide exponential speed-up, such as factoring large numbers and simulating quantum systems. But if a Quantum Computer does not provide an exponential speed-up in solving a problem, there is the possibility that actually it would be slower than a current “classical” computer.

But the big question remains: will a real useful Quantum Computer arrive soon? If yes, how soon?

Is the “Turing Test” Dead?

This is a very good question in these times of Generative and Large Language Artificial Intelligence models, which some researchers answered in the affirmative, see here and here for their proposals to replace the Turing Test.

But… other researchers still believe in the Turing Test and applied it with somehow surprising results: Humans 63%, GPT-4 41%, ELIZA 27% and GPT-3.5  14%. We, humans, are still better than GPT-4, but the surprise is the third position by ELIZA, a chatbot from the ’60s, ahead of GPT-3.5 (see here and here).

“Error Suppression” for Quantum Computers

Recently IBM announced that it has integrated in its Quantum Computers an “Error Suppression” technology from Q-CTRL which can reduce even by orders of magnitude the likelihood of quantum errors when running an algorithm on a Quantum Computer (see for example here).

Quantum errors are inherent to Quantum Computing, and the likelihood of error usually grows with the number of qubits. Theoretical Quantum Error Correction Codes exist, but their practical implementation is not easy; for example, the simplest codes can require even a thousand error correction qubits for each computation qubit.

The approach by Q-CTRL seems to adopt a mixture of techniques to identify the more efficient and less quantum error-prone way of running a computation, for example by optimizing the distribution of quantum logical gates on the qubits and by monitoring quantum errors to design more efficient quantum circuits.

Surely it is an interesting approach, we’ll see how effective it will really turn out to be in reducing the likelihood of quantum errors.

More Weaknesses of Large Language Models

Interesting scientific article (here) on new techniques to extract training data from Large Language models such as LLaMA and ChatGPT. The attack on ChatGPT is particularly simple (and for sure by now blocked by OpenAI): it was enough to ask it to “repeat forever” a word like “poem” and in the output, after some repetition of the word, it appeared random data and also a small fraction of training data, as for example a person’s email signature. This is a “divergence” attack on LLMs, in the sense that after the initial response, the model output starts to diverge from what is expected.

We still know too little about these models, their strengths and weaknesses, so we should take much care when adopting and using them.

On Open Source Software and the Proposed EU Cyber Resilience Act

I have not been following this, but I hear and read quite alarming comments about it (see eg. here).

If I understand it right (and please correct me if I don’t), the proposed Act starts from the absolutely correct approach that if someone develops some software, she or he is responsible for it and must provide risk assessments, documentation, conformity assessments, vulnerability reporting within 24 hours to the European Union Agency for Cybersecurity (ENISA), etc. This should work well for any corporation and medium/big size companies but requirements should be well balanced for example for open source distributed projects, or code released for free by single developers. Also taking into consideration that, as usual, not compliance with the Act will lead to fines.

Note added on December 10th, 2023: the final version of the CRA appears to address those concerns (see here for example).

New US Executive Order on Artificial Intelligence

Due to the US leading role in AI/ML development, it is of interest that President Biden issued an Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence (here a Fact Sheet). By quickly glancing at it, the order requires that:

  • Developers of the most powerful AI systems share their safety test results and other critical information with the U.S. government;
  • Standards, tools, and tests to help ensure that AI systems are safe, secure, and trustworthy are developed;
  • There are protections against the risks of using AI to engineer dangerous biological materials by developing strong new standards for biological synthesis screening;
  • Americans are protected from AI-enabled fraud and deception by establishing standards and best practices for detecting AI-generated content and authenticating official content;
  • An advanced cybersecurity program to develop AI tools to find and fix vulnerabilities in critical software is established.

These are very high-level goals, and we need that they are achieved not only in the US but worldwide (see eg. also the upcoming EU AI Act).

AI Transparency not doing so well

Stanford University researchers just released a report in which a “Foundation Model Transparency Index” (here) is presented. The first evaluation did not go so well, since the highest score is 54 out of 100. Comments by reviewers and experts in the field point out that “transparency is on the decline while capability is going through the roof” as Stanford CRFM Director Percy Liang told Reuters in an interview (see also here).