What an exciting event! This was my first time participating in the OpenStack Summit series, and the May 2014 summit, held in hot and rainy Atlanta, GA, left me with a sense of being part of something big and a strong desire to participate in the upcoming event (and not just because of the Paris location). As you entered the event, you could see the sponsor wall proudly presenting PrivateCore among many great OpenStack companies.
The show floor was very busy, and the casual dress code suggested this was going to be a fun event where I would get my fair share of geeking-out time. As you can read below, I wasn’t disappointed.
OpenStack is a growing force, as indicated by the biannual user survey, which tracks the Dev/QA, PoC, and Production deployment stages independently. Thank you, OpenStack community, for some great information!
Being a founder of a security company, I have a slight security bias, and the first two days offered a wealth of security-related talks. Below are some notes that I thought might be interesting to PrivateCore blog readers.
Russell Haering’s talk on Multi-Tenant Bare Metal Provisioning with Ironic triggered a set of questions around firmware security. The problem raised by several attendees was the following: “how could one detect or prevent a bare metal tenant’s attempt to reflash the BIOS firmware or any other IO-device firmware?” My best recommendation for detecting changes to firmware that runs on the main CPU is to take advantage of the Trusted Platform Module (TPM) chip on your servers to validate the firmware before any sensitive data touches the server. Our vCage Manager can be of help here. As for IO-device firmware, unfortunately, the answer is not as simple. My design recommendation is to assume these IO devices are malicious and to build your stack to defend against them.
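To make the TPM recommendation a bit more concrete, here is a minimal Python sketch of the "extend" operation a TPM 1.2 chip applies to a Platform Configuration Register (PCR) each time a boot component is measured: the new value is the SHA-1 of the old value concatenated with the SHA-1 of the component. This is only a software simulation for illustration. The firmware blobs and the function names extend and replay are made up, and a real check would rely on a signed quote from the TPM itself rather than on values computed on the host.

```python
import hashlib

PCR_SIZE = 20  # a TPM 1.2 PCR holds one SHA-1 digest


def extend(pcr: bytes, component: bytes) -> bytes:
    """Simulate a PCR extend: new = SHA1(old || SHA1(component))."""
    measurement = hashlib.sha1(component).digest()
    return hashlib.sha1(pcr + measurement).digest()


def replay(components) -> bytes:
    """Replay a boot sequence, starting from an all-zero PCR."""
    pcr = bytes(PCR_SIZE)
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Hypothetical firmware blobs, standing in for real BIOS and option-ROM images.
known_good = [b"bios-v1.0", b"option-rom-v2.3"]
expected_pcr = replay(known_good)

# A tenant reflashes the BIOS: the replayed value no longer matches.
reflashed = [b"bios-v1.0-tampered", b"option-rom-v2.3"]

print(replay(known_good) == expected_pcr)
print(replay(reflashed) == expected_pcr)
```

Because every extend folds the previous value into the next one, a change to any measured component, or to the order in which components are measured, produces a different final PCR value.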
Next was Bryan D. Payne’s talk on Security for Private OpenStack Clouds. The talk was more of an open discussion with OpenStack operators than a presentation, providing the opportunity to hear back from the community about their best practices. What caught my attention was a comment from one of the security operators at Yahoo. His claim (if I understood correctly) was that they assume every guest VM will be compromised. So far no big news. Then he added that they assume compromised guest VMs will successfully escape to the hypervisor. Now that is a bold statement. Later he explained to me that, thanks to Nova message signing, even a compromised hypervisor does not have much of a say over the control plane. Unfortunately, our conversation was interrupted, and I was left without understanding the full architecture. I hope to catch up with him back in the Bay Area.
While walking the expo floor I had a chance encounter at the demo theater with an interesting technology from HGST. HGST is working on an open architecture that turns a hard disk into a Linux server. The hard disk has a dedicated CPU, memory and Ethernet port. It runs Linux and allows applications such as a distributed file system to run directly on the disk, saving CPU cycles and all the related trips across the server bus. My interest in this advancement relates to the possibility of turning it into a “hardware implant for script-kiddies”. In my blog earlier this year, I touched on a leaked NSA software implant called IRATEMONK, a firmware implant that affects many vendors’ hard-disk controllers and allows stealthy MBR code injection. With the new work from HGST, anyone capable of writing a Linux application will likely be able to do the same. Technology innovation frequently happens without considering the security implications.
As sponsors of the event we had a space to present our warez, and had many lively discussions with the summit crowd. To my pleasant surprise, most attendees we spoke with understood TPMs, Intel Trusted Execution Technology (Intel TXT) and general Trusted Computing concepts. This resulted in lots of deep discussions about implementation of the technology in their environment – the OpenStack crowd understood the value of system integrity controls that PrivateCore brings to OpenStack.
If you had a chance to join Keith Basil’s TripleO talk, you may have noticed the slide showcasing PrivateCore’s technology integration into OpenStack on OpenStack (TripleO). We have not publicly shared details of the integration, but if you are interested in learning how trusted computing plays directly into cloud deployment and management, please get in contact with us for a preview.
See you all at November’s OpenStack Summit in Paris!
* Replace Target with your favorite retail chain.
The recent news that Target, Neiman Marcus and perhaps three other retailers suffered breaches in which large volumes of data were pilfered is raising concerns among retail security professionals. While details are sketchy and there are plenty of unknowns, it appears that “memory scraping” (also called “RAM scraping”) malware might have played a part in the compromise. There are plenty of research and alerts around memory scraping malware found here, here and here. This sort of malware has been around a while – check out this Dark Reading article from 2009 and this 2009 Verizon Data Breach Investigations piece.
What is memory-scraping malware? What we have seen to date has affected retail point-of-sale (POS) systems and potentially backend systems that are processing various types of payment cards (credit cards, debit cards, prepaid cards, etc.). While standards like the Payment Card Industry Data Security Standard (PCI DSS) call for encrypting cardholder information while at rest (storage) and in transit (in motion on the network), cardholder information is typically unencrypted while in use (memory). If you can access the POS system or server memory, you can extract its contents including the cardholder information.
The data format of such information is clearly defined (see ISO/IEC 7813 and 7816), so attackers can simply implement suitable algorithms in malware which is then installed on the POS machines to harvest cardholder information in memory with those formats in mind.
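The same well-defined format also helps defenders: a buffer (say, a log file or a memory dump taken during an audit) can be checked for cardholder data that should not be sitting there in the clear. The Python sketch below is a rough illustration under a few assumptions: it looks only for the Track 2 layout from ISO/IEC 7813 (start sentinel ';', account number, '=', expiry as YYMM, three-digit service code, discretionary digits, end sentinel '?'), it assumes account numbers of 13 to 19 digits, and it filters candidates with the Luhn checksum. The function names and the sample buffer are made up, and the account number is a widely published test value.

```python
import re

# Track 2 layout: ';' PAN '=' YYMM service-code discretionary-digits '?'
TRACK2 = re.compile(rb";(\d{13,19})=(\d{4})(\d{3})\d*\?")


def luhn_ok(pan: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(pan)):
        digit = int(ch)
        if i % 2 == 1:          # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_track2(buf: bytes):
    """Return (offset, masked account number) for each plausible Track 2 hit."""
    hits = []
    for match in TRACK2.finditer(buf):
        pan = match.group(1).decode("ascii")
        if luhn_ok(pan):
            masked = "*" * (len(pan) - 4) + pan[-4:]
            hits.append((match.start(), masked))
    return hits


# Made-up buffer containing a well-known test account number.
sample = b"\x00\x17noise;4111111111111111=25121010000000000?\x00more"
print(find_track2(sample))
```

The Luhn filter is what keeps false positives manageable, since random digit runs that happen to match the pattern fail the checksum nine times out of ten.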
How can you protect against this sort of malware? Antivirus is certainly a necessary component, required by PCI DSS for systems handling cardholder information, but AV has been demonstrated to be less than effective in stopping sophisticated threats, and updating AV on isolated networks is cumbersome.
One promising countermeasure is attestation. Attestation protects against persistent malware on immutable, “gold” base software images, and ensures – using cryptographic principles and components – that both hardware and software are unchanged. Attesting to the integrity of server and POS systems would validate that the machine (hardware and software) is clean of malware. If a machine were infected, it would fail attestation and could be examined and remediated. Proper attestation supported by strong cryptography would leave little chance for malware to persist undetected.
Naturally, there could be some infection that occurs after attestation that could exploit vulnerabilities, but periodically attested systems (which would typically require a reboot) minimize this window of vulnerability (or opportunity, depending on your perspective). In this situation, malware could infect a machine after it was attested in a known, good state, but that malware would be wiped away the moment the system reboots and that would be validated when the system re-attests.
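The comparison at the heart of attestation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the software image is modeled as an in-memory dictionary of file paths and contents, the names measure and attest are made up, and the measurement is taken in plain software. Real attestation roots the measurement in hardware (for example a TPM), so that malware on the machine cannot simply report the expected value.

```python
import hashlib
import hmac


def measure(image: dict) -> str:
    """Fold every file path and content hash into one digest, in a fixed order."""
    h = hashlib.sha256()
    for path in sorted(image):
        h.update(path.encode("utf-8"))
        h.update(b"\0")
        h.update(hashlib.sha256(image[path]).digest())
    return h.hexdigest()


def attest(image: dict, gold_digest: str) -> bool:
    """Return True only if the image matches the known-good digest."""
    return hmac.compare_digest(measure(image), gold_digest)


# Hypothetical gold image for a POS terminal.
gold_image = {"/boot/kernel": b"kernel-build-42", "/opt/pos/app": b"pos-app-1.7"}
gold_digest = measure(gold_image)

# Malware drops a file after the machine was attested.
running = dict(gold_image)
running["/tmp/scraper"] = b"malicious payload"
print(attest(running, gold_digest))

# A reboot reloads the immutable gold image, and the machine re-attests cleanly.
running = dict(gold_image)
print(attest(running, gold_digest))
```

An added, removed, or modified file changes the digest, which is why an infected machine fails attestation while a freshly rebooted stateless machine passes again.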
A normal, stateful machine suffers from malware that can use its hard drive, or other components, to persist. A stateless machine that relies on a locked-down base software image and is periodically attested avoids malware that might try to burrow its way into a stateful component. POS systems, as well as transaction processing backend systems, are not intended to run arbitrary code. Validating (attesting) such systems against a known, good software image would dramatically reduce the window of opportunity for attackers.
Security measures typically require some change in technology and processes. One consequence of periodically attesting systems is downtime while systems reboot and applications restart. For POS machines, the impact could be minimized by rebooting during off hours; for mission-critical servers, reboots could proceed in a round-robin fashion across a high-availability (HA) cluster. POS systems are natural candidates for being stateless as they handle stateless data.
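The round-robin idea for an HA cluster can be sketched as a simple loop: take one node out of service at a time, reboot and re-attest it, and only return it to service if it passes. The Python below is a toy model under assumptions of my own. The node names, the min_in_service threshold, and the reboot_and_attest callback are all hypothetical placeholders for whatever your cluster manager and attestation service actually provide.

```python
def rolling_reattest(nodes, reboot_and_attest, min_in_service):
    """Reboot nodes one at a time; return the nodes that failed attestation."""
    in_service = set(nodes)
    quarantined = []
    for node in nodes:
        # Refuse to drain a node if that would leave too few serving traffic.
        if len(in_service) - 1 < min_in_service:
            raise RuntimeError("not enough nodes in service to continue")
        in_service.discard(node)            # drain the node
        if reboot_and_attest(node):         # hypothetical callback
            in_service.add(node)            # clean: back into rotation
        else:
            quarantined.append(node)        # failed: keep out for examination
    return quarantined


# Toy run: node "b" fails attestation after its reboot.
cluster = ["a", "b", "c"]
failed = rolling_reattest(cluster, lambda node: node != "b", min_in_service=1)
print(failed)
```

The threshold check is the important design choice: if earlier nodes fail attestation, the loop stops rather than draining the cluster below the capacity it needs to stay available.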
No security countermeasure is going to stop all attacks all the time – technology is extremely complex and attackers are very clever. While details of the exact circumstances around the breaches at Target, Neiman Marcus, and other retailers are still unknown, my speculation is that attesting systems would have reduced the chance of a successful attack and minimized the damage of any successful attack by reducing the attack duration.
We’ve recently seen a spate of news stories about hardware-based attacks. For instance, two recent attempted bank heists at Santander and Barclays involved criminals attempting to steal millions of dollars via malicious hardware devices. More concerning, recently leaked documents indicate that the NSA may have collaborated with hardware manufacturers to subvert cryptographic hardware implementations. Researchers recently proposed new ways to create hardware backdoors at the sub-gate level, making them hard to detect even for someone inspecting circuit layouts. But are these hardware risks relevant to servers that we use in the cloud?
Modern servers are comprised of many components: processors (CPUs), memory (RAM), disks, buses, network cards, and human interface devices. Each of these components passes through the hands of manufacturers, vendors, supply chains, integrators, and service personnel before ultimately ending up inside servers processing your sensitive data. That server itself may be housed off-premise or be leased from another organization, such as a cloud service provider. Most organizations rely on at least some servers that are outside their physical control.
The risk with this loss of control is that anyone with access to those components, at any moment in time, could compromise a component or substitute a malicious device in its place. There are many well-known attacks on boot integrity in which an attacker subverts firmware in a system, for example the “Evil Maid Attack”.
Network cards are particularly risky since they have Direct Memory Access (DMA) to all system memory and can exfiltrate stolen data over the network. Some network devices have been found to have remotely exploitable vulnerabilities that allow an attacker to take control of the card and subsequently the host system.
Enterprises are increasingly adopting in-memory architectures to reduce application latency. However, as in-memory architectures become more common, more sensitive data is kept in memory for long periods of time. While commonly used RAM is generally considered volatile, it can actually retain its contents for a short time after a system loses power, and for longer when the memory chips are chilled. This allows an attacker to literally freeze memory and read its contents in what’s called a “cold boot” attack.
More worrisome, RAM is becoming more persistent with technologies like non-volatile memory in DDR3 form factors. By design, these memory technologies retain the contents of memory when power is lost, just like a disk. Attackers could simply walk away with a memory DIMM containing not only private data and software, but also the critical cryptographic keys used for data-at-rest encryption.
PrivateCore’s approach to addressing vulnerabilities in hardware and the risk of persistent memory is to minimize the number of components that users must trust in a server. With today’s technology, it’s possible to reduce a server’s security boundary to just one component: the CPU. From just the CPU, it’s possible to establish a trusted computing base that is safe from the rest of the components in the system.
Protecting against hardware threats is not easy. Some organizations closely audit their supply chains to ensure the provenance of firmware and devices in their systems, and operate their servers in tightly controlled physical environments. For cloud environments or remote locations, this may not be an option. In those cases, minimizing the trust perimeter to a single component may be the best option to reduce the threat of hardware vulnerabilities and protect sensitive data in-use.