Code Reuse Gone Rogue: The Dangers of Overrelying on Third-Party JavaScript Code

March 25, 2020

Cristian-Alexandru Staicu



Time: 11:00am
Location: Zoom5 - https://zoom.us/j/5911012202 (pass: s3)

Traditionally, the server-side code of websites was written in languages such as PHP or Java, whose security issues are well studied and well understood. Recently, though, full-stack JavaScript web applications have emerged, in which both the client-side and the server-side code are written in JavaScript. The benefits of this approach are clear, e.g., easy knowledge transfer across tiers and uniform tooling. However, JavaScript was designed as a scripting language with a thin API and was expected to run in a tightly controlled environment, e.g., a sandbox. Taking JavaScript outside of the browser and using it as a general-purpose programming language represents a paradigm shift for the web community, and the npm ecosystem emerged to support this change. npm is reportedly the largest software repository in the world, with more than a million reusable packages. Nevertheless, the lack of code isolation and code vetting, the various ways to abuse the JavaScript language, and the plethora of reported vulnerabilities and malware incidents make npm a dangerous ecosystem with unique challenges for the security community. In this talk, we start by analyzing the attack surface of npm, showing that transitive dependencies and the large number of human agents in the ecosystem represent an important risk. We then show that vulnerabilities in npm packages affect real-world websites and that a motivated attacker can craft exploits against production websites by analyzing third-party, open-source code. Finally, we present a technique for boosting the recall of existing security analyses of JavaScript code that relies heavily on third-party libraries. More precisely, our approach extracts taint specifications for npm packages by using dynamic analysis and by leveraging test suites available in clients of the target package.
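
To make the notion of transitive dependencies concrete, here is a minimal TypeScript sketch. It is not taken from the talk: the dependency graph, the package names, and the `transitiveDependencies` helper are all hypothetical and do not describe real npm packages or any npm API. Under those assumptions, it computes every package that an application implicitly trusts when it declares only a few direct dependencies.

```typescript
// Hypothetical dependency graph: package name -> direct dependencies.
// The names are illustrative only and do not describe real npm packages.
type DependencyGraph = Record<string, string[]>;

const graph: DependencyGraph = {
  "my-app": ["web-framework", "logger"],
  "web-framework": ["router", "body-parser"],
  "router": ["path-utils"],
  "body-parser": ["path-utils", "string-utils"],
  "logger": [],
  "path-utils": [],
  "string-utils": [],
};

// Collect all direct and indirect dependencies of `root`
// using an iterative depth-first traversal.
function transitiveDependencies(g: DependencyGraph, root: string): Set<string> {
  const seen = new Set<string>();
  const stack: string[] = [...(g[root] ?? [])];
  while (stack.length > 0) {
    const pkg = stack.pop()!;
    if (seen.has(pkg)) continue; // already visited (shared or cyclic dependency)
    seen.add(pkg);
    for (const dep of g[pkg] ?? []) stack.push(dep);
  }
  return seen;
}

const direct = (graph["my-app"] ?? []).length;
const total = transitiveDependencies(graph, "my-app").size;
console.log(`direct: ${direct}, transitive: ${total}`);
```

In this toy graph the application declares two direct dependencies but ends up depending on six packages in total, each of which is code (and a set of maintainers) that the application has to trust.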