A Red Cross Report on Cyber Attacks

The International Committee of the Red Cross (ICRC) has published an interesting report on the humanitarian consequences of cyber attacks. It can be downloaded here (PDF), and a short summary can be found here.

It is really difficult to realize how pervasive Information Technology (IT) and the Internet are today, and what consequences cyber attacks can have on everyday life.

Another nail in the coffin of SHA1

This recent paper, “From Collisions to Chosen-Prefix Collisions – Application to Full SHA-1” by G. Leurent and T. Peyrin, puts another nail in the coffin of SHA1. The authors present a chosen-prefix collision attack on SHA1 which allows client impersonation in TLS 1.2 and peer impersonation in IKEv2, at an expected cost between US$1.2 million and US$7 million. The authors expect that it will soon be possible to bring the cost down to US$100,000 per collision.

As for CA certificates, the attack makes it possible, at the same cost, to create a rogue CA and forge certificates whose signatures use the SHA1 hash, but only if the real CA does not randomize the serial number field in its certificates.
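The randomization countermeasure is cheap. As a minimal sketch (mine, not from the paper) of what a CA can do: generate each serial with at least 64 bits of CSPRNG output, so the attacker cannot predict the exact to-be-signed bytes before submitting a request, which spoils the precomputed chosen-prefix collision.

```python
import secrets

def random_serial(bits: int = 64) -> int:
    """Random, positive certificate serial number with `bits` bits of entropy.

    Because the CA picks this value only after receiving the request, an
    attacker cannot know in advance the exact content that will be signed.
    """
    # Force the top bit so the serial always has a fixed encoded length.
    return secrets.randbits(bits) | (1 << (bits - 1))

s = random_serial()
assert s > 0 and s.bit_length() == 64
```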

It has been known for almost 15 years that SHA1 is insecure: NIST deprecated it in 2011, and it should not have been used after 2013, being replaced by SHA2 or SHA3. By 2017 all browsers were supposed to have removed support for SHA1, but the problem, as always, lies with legacy applications that still use it: how many of them are still out there?

Patching timing is getting tight

The US Cybersecurity and Infrastructure Security Agency (CISA) has recently revised its guidelines (actually a Binding Operational Directive for US Federal Agencies) on patching (see the Directive here and a comment here).

Now vulnerabilities rated “critical” (according to CVSS version 2) must be remediated within 15 days (previously it was 30 days), and “high” vulnerabilities within 30 days, from the date of initial detection by CISA's weekly scanning.
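The rule is simple enough to sketch in a few lines (illustrative code, not anything CISA publishes):

```python
from datetime import date, timedelta

# Remediation windows, in days from initial detection, per the revised Directive.
REMEDIATION_DAYS = {"critical": 15, "high": 30}

def remediation_deadline(detected_on: date, severity: str) -> date:
    """Return the date by which a finding of the given severity must be remediated."""
    return detected_on + timedelta(days=REMEDIATION_DAYS[severity.lower()])

# Example: a critical finding from a scan on 1 May 2019 is due by 16 May 2019.
print(remediation_deadline(date(2019, 5, 1), "critical"))  # 2019-05-16
```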

Given the short time between detection and remediation, applying patches in time is going to be difficult, both because vendors may not yet have made patches available and because of the time needed anyway to test them in each IT environment.

This implies that processes must be in place to design, test and deploy temporary countermeasures that reduce the risks due to these vulnerabilities to an acceptable level. And these processes must be fast: they should take at most a few days.

The evolution of DNS

DNS, that is the Domain Name System protocol and services, is a fundamental pillar of the Internet, since it resolves domain names into IP addresses. Recently the number and severity of attacks on the DNS infrastructure have increased noticeably (see for example this US-CERT Alert). At the same time, the discussion on who should manage this global infrastructure, and how, keeps expanding.
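As a reminder of what “resolving” means in practice, here is a minimal sketch using the system resolver (which ultimately queries DNS for non-local names):

```python
import socket

def resolve(hostname: str) -> list:
    """Resolve a hostname to its IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    # Each entry is (family, type, proto, canonname, (address, port)).
    return sorted({info[4][0] for info in infos})

print(resolve("localhost"))  # typically ['127.0.0.1']
```

Every connection a browser makes starts with a lookup like this one, which is why tampering with the answers is such an attractive attack.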

Alternative proposals to the ICANN-overseen global DNS infrastructure have appeared, from the “.onion” hidden TOR domains to, among others, the more recent OpenNIC project and the Blockchain-based BDNS system.

The security and privacy of Internet access and navigation depend crucially on the resolution of domain names to IP addresses. Even if the deployment of DNSSEC will help improve security and privacy, DNS badly needs more consideration, and help in designing a forward path for it as a global service able to guarantee access, privacy, security, integrity, fairness etc. It is a lot to ask, but we will really need it.

On Firefox “Send”

Mozilla has just released a new service, Firefox Send, to share files with a higher level of security. Firefox Send is quite easy to use: just access the web-page and upload a file (up to 1GB; one needs to register to upload files up to 2.5GB). The service then returns a link to download the file, which the user can choose to keep valid for up to 1 week and up to 100 downloads. For an extra layer of security, the user can also add a Password which is then requested before the download.

Under the hood, the file is encrypted and decrypted in the user's browser using the Web Crypto API and 128-bit AES-GCM. A short description of how encryption is implemented is provided in this page. The secret encryption key is generated by the browser and appended to the download link returned by the server, as in (this is not a valid URL)

https://send.firefox.com/download/abc123/#secretKeyBase64

where the last part of the URL, the fragment after the “#”, is the secret key; browsers do not send URL fragments to the server.
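The browser-side scheme can be sketched as follows. This is a minimal illustration of the same idea using the third-party `cryptography` package, not Mozilla's actual code; the names and the URL are illustrative.

```python
import base64
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM  # third-party package

# Send-style scheme: a random 128-bit key encrypts the file with AES-GCM,
# and the key travels only in the URL fragment, never to the server.
key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)  # 96-bit nonce, the usual size for GCM

plaintext = b"the file contents"
ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)

# The server stores only nonce + ciphertext; the key rides in the '#' fragment.
fragment = base64.urlsafe_b64encode(key).rstrip(b"=").decode()
link = f"https://example.invalid/download/abc123/#{fragment}"

# The recipient's browser reads the key back from the fragment and decrypts.
recovered_key = base64.urlsafe_b64decode(fragment + "=" * (-len(fragment) % 4))
assert AESGCM(recovered_key).decrypt(nonce, ciphertext, None) == plaintext
```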

This is very nice and simple, but to achieve a higher level of security the user has to find a secure way to share the download link with the other party, and sending it by email is not a good idea.

Obviously, using a Password, which can be communicated by other means (e.g. by telephone), makes the exchange more secure. Still, the Password is not used to encrypt the file: it is used to create a signing key (with HMAC SHA-256) which is uploaded to the server, and the server then checks that a user requesting the file knows the Password by making her sign a nonce. So the Password serves only for an authentication exchange.
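The exchange can be sketched like this (a toy version with hypothetical names; Send's actual key derivation differs, the point is only the nonce-signing shape of the protocol):

```python
import hashlib
import hmac
import os

def signing_key(password: str, salt: bytes) -> bytes:
    """Derive a signing key from the password (PBKDF2 here, purely illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

# Upload time: the client derives the key and uploads it to the server.
salt = os.urandom(16)
key = signing_key("correct horse", salt)

# Download time: the server issues a fresh nonce, the client signs it.
nonce = os.urandom(16)
client_sig = hmac.new(key, nonce, hashlib.sha256).digest()

# The server verifies the signature against the stored key; neither the
# Password nor the file-encryption key travels in this exchange.
expected = hmac.new(key, nonce, hashlib.sha256).digest()
assert hmac.compare_digest(client_sig, expected)
```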

I have not found a full security and threat-scenario description for this service (some information can be found here and here), but it would be nice to know which use-cases Mozilla has considered for it. Moreover, from a very quick look at the available documentation, it is not very clear to me what information the server can access during the full life-cycle of an uploaded encrypted file.

In any case, Firefox Send seems to be a new and possibly very interesting competitor in the arena of online file sharing services together with Dropbox, Google Drive etc.

Recent Results on Information and Security

I recently read two articles which made me think that we still do not understand well enough what “information” is. Both articles consider ways information can leak through “side channels” or “covert channels”. In other words, whatever we do produces much more information than we believe.

The first article is “Attack of the week: searchable encryption and the ever-expanding leakage function” by cryptographer Matthew Green, in which he explains the results of this scientific article by P. Grubbs et al. The scenario is an encrypted database, that is a database in which column data in a table is encrypted, so that whoever accesses the DB has no direct access to the data (this is not the case in which the database files are encrypted on the filesystem). The encryption algorithm is such that a remote client who knows the encryption key can run some simple kinds of encrypted searches (queries) on the encrypted data and extract the encrypted results; data can be decrypted only on the remote client. Now an attacker (even a DB admin) who has some generic knowledge of the type of data in the DB and can monitor which encrypted rows are returned by each query (whose parameters she cannot read) is nonetheless able, under some mild assumptions and by applying some advanced statistical mathematics from learning theory, to reconstruct the contents of the table with good precision. A simple example is a table containing the two columns employee_name and salary, both with encrypted values. In practice this means that this type of encryption leaks much more information than we believed.
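A toy illustration of access-pattern leakage (NOT the attack in the paper, which is far more sophisticated and needs far less favourable queries): if the attacker sees only which opaque row ids each range query returns, she can still rank the rows by their hidden values without decrypting anything.

```python
import random

random.seed(7)

# The server stores opaque row ids with "encrypted" salaries it cannot read,
# yet answers range queries; the attacker observes only the returned row ids.
salaries = dict(zip(
    [f"row{i}" for i in range(8)],
    random.sample(range(20_000, 120_000), 8),  # distinct hidden values
))

def observed_result(max_salary: int) -> set:
    """What the attacker observes for the query `salary <= max_salary`."""
    return {rid for rid, s in salaries.items() if s <= max_salary}

# Sweep thresholds and record when each row id first appears in a result:
# rows that show up at lower thresholds must hold lower salaries.
first_seen = {}
for t in range(20_000, 120_000):
    for rid in observed_result(t):
        first_seen.setdefault(rid, t)

recovered = sorted(first_seen, key=first_seen.get)
truth = sorted(salaries, key=salaries.get)
assert recovered == truth  # rows ranked by salary, nothing decrypted
```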

The second article is “ExSpectre: Hiding Malware in Speculative Execution” by J. Wampler et al. and, as the title suggests, is an extension of the Spectre CPU vulnerability. The Spectre and Meltdown attacks also have to do with information management, but there the information is managed internally by the CPU and was supposed not to be accessible from outside it. In this particular article the idea is actually to hide information: the authors have devised a way of splitting a malware into two components, a “trigger” and a “payload”, such that both components appear benign to standard anti-virus and reverse-engineering techniques. The malware is thus hidden from view. When both components are executed on the same CPU, the trigger alters the internal state of the CPU's branch predictor so as to make the payload execute malicious code as a Spectre speculative execution. This does not alter the correct architectural execution of the payload program, but through Spectre extra speculative instructions are executed, and these can, for example, implement a reverse shell giving an attacker external access to the system. Since the extra speculative instructions are discarded by the CPU at the end of the speculative execution, and never architecturally committed, it appears as if they had never been executed, and thus they seem untraceable. Currently this attack is mostly theoretical, difficult to implement and very slow. Still, it is based on managing information in covert channels, just as Spectre and Meltdown are CPU vulnerabilities which also exploit cache-based information side channels.

Hardware Enclaves and Security

Hardware enclaves, such as Intel Software Guard Extensions (SGX), are hardware security features of recent CPUs which allow the isolated execution of critical code. The typical threat model of hardware enclaves assumes the totally isolated execution of trusted code in the enclave, considering all the rest of the code and data, operating system included, un-trusted. Software running in a hardware enclave has limited access to data outside the enclave, whereas everything else, hypervisor, operating system and anti-virus included, has no access to what is inside the enclave. Hardware enclaves can thus protect, with very high security, applications such as password and secret-key managers, crypto-currency wallets, DRM etc.

But what could happen if a malware, for example a ransomware, is loaded in a hardware enclave?

First of all, a malware hidden in a hardware enclave cannot be detected, since neither the hypervisor, the operating system nor any kind of anti-virus can access it. On the other hand, the software to be loaded in a hardware enclave must be signed by a trusted entity, for SGX by Intel itself or by a trusted developer; this makes it more difficult to distribute hardware-enclave malware, but not completely impossible. Finally, applications running inside a hardware enclave have very constrained access to outside resources, and it was believed that a malware could use a hardware enclave (that is, part of it could run in one) but could not run fully inside an enclave without any component outside it.

M. Schwarz, S. Weiser and D. Gruss have instead recently shown in this paper that, at least theoretically, it is possible to create a super-malware running entirely from within a hardware enclave. This super-malware would be undetectable and could act on the rest of the system like a normal malware. At the moment countermeasures are not available, but, as with Spectre and Meltdown, they could require hardware modifications and/or impact the speed of the CPUs.

A New Theoretical Result on the Learnability of Machine Learning (AI)

Theoretical mathematical results often have little immediate practical application and in some cases can initially seem obvious. Still, they usually are not obvious as such, since it is quite different to suspect that a result holds and to prove it mathematically in a rigorous way. Moreover, such a proof often helps explain the reasons behind the result and its possible applications.

Very recently a theoretical (mathematical) result in Machine Learning (the current main incarnation of Artificial Intelligence) has been announced: the paper can be found in Nature here and a comment here.

Learnability can be defined as the ability to make predictions about a large data set by sampling a small number of data points, which is what Machine Learning usually does. The mathematical result is that, in general, this problem is ‘undecidable’: it is impossible to prove that there always exists a limited sampling set which allows one to ‘learn’ (for example, to always recognise a cat in an image after training on a limited number of cat images). Mathematicians have proven that learnability is related to fundamental mathematical problems going back to Cantor’s set theory and the work of Gödel and Alan Turing, and that it is connected with the theory of compressibility of information.

This result poses some theoretical limits on what Machine Learning can ever achieve, even if it does not seem to have any immediate practical consequence.

Like a Movie Plot: the 3ve Defrauding Scheme

Recently I read about the interesting fraud on digital advertising named “3ve”. It is possible to read about it for example on ArsTechnica here or in a paper by Google and White Ops here. But I kept thinking about this story and how much (at least to me) it resembles a movie plot, something like an Ocean’s movie. So, as the first blog entry of 2019 I have decided to write down a short background of facts and ideas that resembles a movie plot. Obviously, in what follows most technical details are skipped or not completely described, but if interested you can read the articles I mentioned above on the true story.

So here it goes.

We are all used to the advertisements which appear on web-sites and mobile app pages. Indeed, it is quite simple to make a little money by reserving space for advertisements on a web-site. These advertisement spaces are used by digital advertising companies. The idea is that when a visitor clicks on an advertisement, the owner of the web-site earns a very small amount of money. But how can one get a lot of money out of it?

By now it is a well-known fraud to create web-sites with advertisements and use programs to click on them. Digital advertising companies have therefore introduced countermeasures to distinguish between a real person and a program.

But as usual, it is possible to have “smart” ideas…

To simulate real “persons” it is possible to:

  • create a web-site with plenty of space for advertisement
  • make a contract with a digital advertising company to place advertisements on your web-site
  • develop a special program which clicks on the advertisements on your web-site
  • rent one or more botnets of Personal Computers (PCs) infected with malware
  • install your program through the malware on these PCs.

This will make it look as if the owner of the PC has visited your web-site and clicked on the advertisements. In principle you will be paid accordingly.

However, there are costs, not only for renting the botnet but also for the special program, which must be continuously developed and updated. Indeed, digital advertising companies are well aware of this fraud and monitor all clicks on advertisements to distinguish between a real person and a program. They collect a lot of information on the visitor making the click: cookies, fingerprints of the browser and PC, navigation on the web-site, language of the user etc. The special program must fake all this information and defeat all the checks the digital advertising companies keep introducing.

And this is not all: anti-virus products sooner or later identify the malware and the special program and clean up the PCs, which means starting all over again.

Moreover, digital advertising companies also check the internet IP address of the PC, its geolocation and the time of access, to make sure they are consistent. For example: millions of different “persons” cannot click on the same advertisement from the same unique IP address, and millions of Americans will not click on an advertisement in Europe in the middle of the night in a language they typically do not understand.
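A sketch of the kind of consistency check involved (my own illustration; real fraud-detection pipelines are of course far richer): derive the click's local time from the IP's geolocated timezone and flag clicks in the middle of the local night.

```python
from datetime import datetime, timedelta, timezone

def plausible_click(click_utc: datetime, utc_offset_hours: int) -> bool:
    """Is the click at a plausible local hour for a human visitor?

    `utc_offset_hours` stands in for the timezone geolocated from the
    click's source IP address.
    """
    local = click_utc + timedelta(hours=utc_offset_hours)
    return 7 <= local.hour <= 23  # flag clicks in the dead of the local night

# A click at 08:00 UTC from a US East Coast IP (UTC-5) is 3 a.m. locally.
click = datetime(2019, 3, 1, 8, 0, tzinfo=timezone.utc)
print(plausible_click(click, -5))  # False: suspicious
print(plausible_click(click, +1))  # True: 9 a.m. in Europe
```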

To bypass these checks, it is simpler to adopt the following:

  • set up your own servers (without anti-viruses)
  • run on these servers multiple copies of the special program
  • assign to the servers appropriate IP addresses that mimic a real person's location, timezone, language etc.

This eliminates the need for botnets, the related malware and the updates forced by anti-virus detection.

But how to get appropriate IP addresses? And here comes the “smart” idea…

First of all, it is necessary to create a few Internet Provider companies, for example one in Europe, one in North America etc., which host the servers and also provide internet access to some normal companies so as to gain business credibility.

The next step requires a short reminder on how internet IP addresses are assigned. Regional Internet Registries like ARIN, RIPE and APNIC assign blocks of IP addresses to companies which ask for them. A company that asks for IP addresses is assigned one (or more) Autonomous System (AS) numbers, to which in turn the blocks of IP addresses are assigned.

However, if a company closes, goes bankrupt etc., for some time the AS number and the blocks of IP addresses remain assigned to the company but go unused. So here is the trick:

  • identify valid but unused AS numbers and blocks of IP addresses
  • create fake contracts between the companies that are the rightful assignees of the AS numbers and the Internet Providers you have created, to fake business credibility
  • assign these AS numbers to the Internet Provider routers (this is called “BGP Hijacking”)
  • assign the related IP addresses to the servers running your program, rotating them rapidly.

Until recently, this way of hijacking IP addresses had low chances of detection.

Another way of hijacking blocks of IP addresses is to steal unused IP addresses assigned to active companies, that is, addresses under AS numbers still in use. But in this case there are higher chances of detection, due to the checks the companies run.

There exists a relatively high number of unused AS numbers and unused blocks of IP addresses with different geolocations, and this makes it possible to fake millions of clicks that bypass the elaborate checks of the digital advertising companies. In this way it is possible to steal millions of dollars from them.

In a movie plot the story would end here, the money would be collected and the entire operation would be closed down forever.

In real life it is not easy to keep such a big operation unnoticed. Indeed, sooner or later the digital advertising companies would wonder why a certain web-site generates so many clicks, and they would start to investigate and “follow the money”. From the technical point of view, a deeper investigation would reveal that the visits to that web-site all come from the same Internet Provider, and that most of the companies which are “customers” of that Internet Provider are actually closed or bankrupt. Moreover, companies that monitor traffic on all their used and unused IP addresses will easily detect if some addresses have been hijacked.

In the real case of 3ve, after the scheme had defrauded the digital advertising companies of $29M, some of the culprits were indeed identified and apprehended.

AutoCAD malware

Recently I have paid some attention to AutoCAD and similar software. Not that I use them, or know much about them, but both the complexity and the amazing features of some of these applications definitely struck me. And with complexity, a large number of features and sheer size of code come vulnerabilities, including security vulnerabilities.

A few days ago I noticed this article (here a less technical summary) about AutoCAD malware, which has been around for more than 10 years. The purpose of this malware can be twofold: just another infection channel or, more likely, a very targeted attack channel. Indeed, CAD software is used for designing buildings, bridges, tunnels, roads etc., and some blueprints can be worth millions. Companies have taken notice of this, and security features have been introduced in the applications.

But the issue which does not seem to be appreciated enough (I have no statistics, though, so I may be wrong on this) is the patching process (and this is not limited to CAD software but applies to other specialised software, for example digital audio or gaming). It seems to me that some of these applications are seldom updated (one needs to download/buy a new version), or that security patches are bundled together with new functionalities which can come at a cost, at least after the initial few years of support.

In my opinion, in an ideal world security patches should be provided for free to anyone for as long as the program is supported. Obviously this can have economic impacts on the company producing the software and could require changes in the way software is built, sold and distributed (costs again).