Collaborative Filtering Recommendation System Location Content-based

Title: Collaborative Filtering Recommendation System Location Content-based
Summary: Analyze the content stored in a Collaborative Filtering Recommendation System based on the location of the users
Keywords: Web Security; Location-based; CFRS; Privacy
Prerequisites: Python; JavaScript
Supervisor: Pablo Picazo
Level: Flexible
Status: Finished

A Collaborative Filtering Recommendation System (CFRS) is a system that keeps track of users’ preferences and later uses them to offer suggestions to other users. YouTube, Amazon, and Netflix are examples of applications that implement CFRSs. This is also the case with the Chrome Web Store, the online marketplace where browser extensions are freely distributed. For each browser extension, the Web Store shows, among other information, its category, the developer’s name, the company, a general description, some privacy practices, users’ reviews, the number of downloads, the ratings given by users, and metadata such as the extension’s version and the date it was last updated.
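
To make the idea of collaborative filtering concrete, the following is a minimal sketch of a user-based recommender. It is not the algorithm used by any of the services named above, which is not public. The install data, the extension names, and the helper names `jaccard` and `recommend` are all hypothetical, and Jaccard similarity is just one simple choice of similarity measure.

```python
from collections import Counter

# Hypothetical install data: user -> set of extensions they installed.
installs = {
    "alice": {"adblock", "dark-mode", "password-manager"},
    "bob": {"adblock", "dark-mode"},
    "carol": {"adblock", "translator"},
    "dave": {"dark-mode", "password-manager"},
}


def jaccard(a, b):
    """Similarity of two item sets: size of intersection over size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, installs, top_n=3):
    """Suggest items the user lacks, weighted by how similar their owners are."""
    own = installs[user]
    scores = Counter()
    for other, items in installs.items():
        if other == user:
            continue
        similarity = jaccard(own, items)
        # Every item the other user has and this user lacks gets a vote
        # proportional to the similarity between the two users.
        for item in items - own:
            scores[item] += similarity
    return [item for item, _ in scores.most_common(top_n)]


suggestions = recommend("bob", installs)
```

In this toy data set, the extensions installed by the users most similar to `bob` are suggested first. The same mechanism is what makes such systems sensitive to fabricated users.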

The Web Store implements a CFRS in such a way that extensions are ranked or featured to make it easier for users to find high-quality content. This ranking is performed by a heuristic that considers user ratings and usage statistics, such as the number of downloads and uninstalls over time.
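
The actual heuristic is not documented, so the following is only an illustrative sketch of how ratings and usage statistics could be combined into a single score. The `Extension` fields, the smoothing prior, and the way the three factors are multiplied are assumptions made for this example, not the Web Store's formula.

```python
import math
from dataclasses import dataclass


@dataclass
class Extension:
    name: str
    avg_rating: float   # mean user rating, from 1.0 to 5.0
    num_ratings: int    # how many users rated the extension
    downloads: int      # total downloads over the observed period
    uninstalls: int     # total uninstalls over the same period


def score(ext, prior_rating=3.0, prior_weight=10):
    """Hypothetical ranking score combining ratings, retention, and popularity."""
    # Smoothed rating: with few ratings the value stays close to the prior,
    # so a couple of 5-star votes cannot dominate the ranking.
    smoothed = (ext.avg_rating * ext.num_ratings + prior_rating * prior_weight) / (
        ext.num_ratings + prior_weight
    )
    # Fraction of downloads that were not followed by an uninstall.
    retention = 1 - ext.uninstalls / ext.downloads if ext.downloads else 0.0
    # Logarithm dampens the effect of very large download counts.
    popularity = math.log10(1 + ext.downloads)
    return smoothed * retention * popularity


def rank(extensions):
    """Return extensions ordered from highest to lowest score."""
    return sorted(extensions, key=score, reverse=True)
```

Under these assumptions, an extension with many ratings and a large, stable user base outranks one with a perfect average from only a handful of votes.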

Even though the algorithms used by CFRSs are usually not public, researchers have found attacks against recommendation systems, pollution attacks being the most common. Such attacks consist of generating fake data, typically in the form of new users who interact with the system by watching videos, reading books, and rating or downloading items. By doing so, attackers may promote or demote items as desired. The proliferation of crowdsourcing sites such as Zeerk, Peopleperhour, Freelancer, and Upwork, as well as Facebook groups, has made such attacks easier to carry out. One incentive is financial: developers whose apps appear popular among users may find it easier to obtain funding from venture capitalists.
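
The effect of polluted data on a simple aggregate can be shown with a small worked example. The sketch below assumes the system displays a plain arithmetic mean of all ratings, which is a simplification, and the function name `mean_after_injection` is hypothetical.

```python
def mean_after_injection(ratings, fake_rating, n_fake):
    """Mean rating after n_fake identical fake ratings are added."""
    total = sum(ratings) + fake_rating * n_fake
    return total / (len(ratings) + n_fake)


# An item with few genuine ratings is easy to shift ...
small_item = mean_after_injection([3] * 20, fake_rating=5, n_fake=20)

# ... while the same number of fake ratings barely moves a well-rated item.
large_item = mean_after_injection([3] * 2000, fake_rating=5, n_fake=20)
```

With 20 genuine 3-star ratings, 20 fake 5-star ratings raise the mean from 3.0 to 4.0, whereas with 2000 genuine ratings the mean stays close to 3.0. This is why items with little genuine feedback are the most exposed.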

The Web Store implements a set of fraud detection and defense mechanisms so that attackers cannot easily alter the ranking. As on Google Play for Android, users can review and rate an extension only if they 1) are logged in to the Web Store and 2) have installed it first, which makes it easier for Google to detect fake users trying to exploit the CFRS. However, this is not the case with downloads. To download and install extensions, users only need a Chromium-based browser, e.g., Chromium, Chrome, or Brave. Therefore, the number of downloads can be easily inflated by automated processes, and it is difficult to distinguish real users from automated downloads.

In this project, students will analyze the behavior of the most important browser extension repositories (Chrome and Firefox). To do so, students will crawl and compare both the Web Store and the Firefox Add-ons market to see how extensions are ranked and presented to users. The goal is to determine which extension metadata (features) are used in these two CFRSs and which are the most important.
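
One possible first step for the feature analysis is to correlate each crawled feature with the position at which an extension is listed. The sketch below uses Spearman rank correlation, implemented with the standard library only. The record layout (a `position` key plus numeric feature keys) and the sample values are hypothetical, and a strong correlation only suggests, but does not prove, that a feature is used by the ranking.

```python
def ranks(values):
    """Return 1-based ranks of the values, averaging the ranks of ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def feature_importance(rows, position_key="position"):
    """Order features by the strength of their correlation with placement."""
    # Positions are negated so that a positive value means
    # "a higher feature value goes with a better (smaller) position".
    placement = [-row[position_key] for row in rows]
    features = [key for key in rows[0] if key != position_key]
    scores = {f: spearman([row[f] for row in rows], placement) for f in features}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical crawled records, one per listed extension.
rows = [
    {"position": 1, "downloads": 90000, "rating": 4.1},
    {"position": 2, "downloads": 50000, "rating": 4.8},
    {"position": 3, "downloads": 20000, "rating": 3.9},
    {"position": 4, "downloads": 8000, "rating": 4.5},
    {"position": 5, "downloads": 1000, "rating": 4.0},
]
importance = feature_importance(rows)
```

In this made-up sample, `downloads` follows the listing order exactly and therefore comes out ahead of `rating`. On real crawled data, the same comparison can be run separately for each store.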