Browser Extensions Updates

Title Browser Extensions Updates
Summary Clustering and analyzing browser extensions by update frequency
Keywords Web Security; Browser Extensions; Malware; ML
Prerequisites Python; JavaScript
Supervisor Pablo Picazo
Level Master
Status Draft

Extensions are small applications that either add new functionality to the browser or modify its appearance. In Chrome, every extension published in the Web Store has a unique 32-character identifier that does not change across versions. Every extension must include one compulsory file, the manifest. Additionally, extensions can include as many static files as needed, e.g., HTML, CSS, JavaScript, fonts, and images. Extensions are hosted in repositories that most browser vendors manage themselves, and users install extensions from them. One example is the Chrome Web Store, the largest browser extension repository, which contained 200,381 extensions as of December 2019.
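A crawler will typically need to recognize extension identifiers when it collects them. The sketch below checks the 32-character length mentioned above. It additionally assumes that identifiers use only the lowercase letters a to p, which is how Chrome encodes them but is not stated in this description. The name is_extension_id is a hypothetical helper.

```python
import re

# Assumption: Chrome extension IDs are exactly 32 characters drawn from
# the lowercase letters 'a' through 'p'.
EXTENSION_ID_RE = re.compile(r"[a-p]{32}")


def is_extension_id(candidate: str) -> bool:
    """Return True if the string has the shape of a Chrome extension ID."""
    return EXTENSION_ID_RE.fullmatch(candidate) is not None
```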

Once a user starts the installation procedure, Chrome downloads the extension's .crx package to a temporary directory, extracts all the files, and parses the only mandatory file, manifest.json. Finally, Chrome detects the family to which the extension belongs, moves it to a permanent directory, and installs it in the browser.
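For the analysis, the extraction and parsing steps can be approximated offline. The following is a minimal sketch, not Chrome's actual procedure. It assumes that a .crx package is a ZIP archive preceded by a binary header, and it relies on Python's zipfile module tolerating such leading data. The function name read_manifest is hypothetical.

```python
import io
import json
import zipfile


def read_manifest(crx_bytes: bytes) -> dict:
    """Parse manifest.json out of the raw bytes of a .crx package.

    Assumption: the package is a ZIP archive with a header prepended;
    zipfile locates the archive from the end of the data, so the header
    is skipped without being parsed explicitly.
    """
    with zipfile.ZipFile(io.BytesIO(crx_bytes)) as archive:
        with archive.open("manifest.json") as handle:
            # utf-8-sig drops a byte-order mark if one is present.
            return json.loads(handle.read().decode("utf-8-sig"))
```

The "version" field of the returned dictionary is the value a monitor would compare between two crawls. Manifests that are not strict JSON are not handled by this sketch.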

In this project, the students' goal is to cluster and analyze extensions based on their update frequency. To do that, the students will have to crawl the Web Store and monitor it to check how often extensions are added, updated, or removed. The final goal is to cluster the extensions based on these parameters and check whether there is a correlation between the resulting clusters and malware.
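One possible way to derive the added/updated/removed events is to compare consecutive crawl snapshots. The sketch below assumes each snapshot is a dictionary mapping an extension identifier to its version string. This data layout and the names diff_snapshots and count_updates are illustrative assumptions, not part of the project specification.

```python
from collections import Counter


def diff_snapshots(old: dict, new: dict) -> dict:
    """Compare two crawl snapshots (extension ID -> version string)."""
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        # Present in both crawls, but the version string changed.
        "updated": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


def count_updates(snapshots: list) -> Counter:
    """Count observed updates per extension over a sequence of snapshots."""
    counts = Counter()
    for old, new in zip(snapshots, snapshots[1:]):
        counts.update(diff_snapshots(old, new)["updated"])
    return counts
```

The per-extension counts, divided by the length of the observation period, give a simple update-frequency feature that could serve as input to a clustering algorithm. Updates that happen between two crawls are only seen once, so the crawl interval bounds the resolution of this measure.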