Posts by Collection

portfolio

Manifold

A fully integrated 360° video camera supporting 6DoF head motion parallax requires overcoming many technical hurdles, including camera placement, optical design, sensor resolution, system calibration, real-time video capture, depth reconstruction, and real-time novel view synthesis.
Code available at: https://github.com/facebook/facebook360_dep

Intertect

An interactive learning tool for computer architecture, in which students build a MIPS emulator from scratch through a web UI.
Code available at: https://github.com/yashpatel5400/intertect

Neuropath

Branch prediction has become an essential part of the CPU pipeline. This work explored using a perceptron for branch prediction in the Gem5 x86 CPU simulator. LTAGE and other existing branch predictors were found to perform better than the perceptron overall, although they were slightly less accurate in some of the cases tested.
Code available at: https://github.com/yashpatel5400/neuropath
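For readers unfamiliar with the idea, the following is a minimal sketch of a perceptron branch predictor in the general style described by Jimenez and Lin (2001). It is not the Neuropath or Gem5 code: the class name, table size, history length, and the toy alternating-branch trace are all assumptions chosen for illustration.

```python
class PerceptronPredictor:
    """Toy perceptron branch predictor (illustrative only, not the Neuropath code)."""

    def __init__(self, num_entries=64, history_len=8):
        self.num_entries = num_entries
        # One weight vector per table entry: a bias weight plus one weight per history bit.
        self.weights = [[0] * (history_len + 1) for _ in range(num_entries)]
        # Global history of recent outcomes, encoded as +1 (taken) or -1 (not taken).
        self.history = [-1] * history_len
        # Training threshold suggested by Jimenez and Lin for a given history length.
        self.theta = int(1.93 * history_len + 14)

    def _output(self, pc):
        w = self.weights[pc % self.num_entries]
        return w[0] + sum(wi * xi for wi, xi in zip(w[1:], self.history))

    def predict(self, pc):
        """Predict taken when the perceptron output is non-negative."""
        return self._output(pc) >= 0

    def update(self, pc, taken):
        """Train on a misprediction or a low-confidence output, then shift the history."""
        y = self._output(pc)
        t = 1 if taken else -1
        w = self.weights[pc % self.num_entries]
        if (y >= 0) != taken or abs(y) <= self.theta:
            w[0] += t
            for i, xi in enumerate(self.history):
                w[i + 1] += t * xi
        self.history = self.history[1:] + [t]


def accuracy(predictor, trace):
    """Fraction of branches in (pc, taken) trace that were predicted correctly."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# Hypothetical trace: a single branch that alternates taken / not taken.
trace = [(0x40, i % 2 == 0) for i in range(2000)]
alternating_accuracy = accuracy(PerceptronPredictor(), trace)
```

The alternating branch is a pattern that correlates perfectly with the most recent outcome, so the weights on the history bits grow quickly and the predictor settles after a short warm-up.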

MIS Ray Tracing

Implementation from scratch of multiple importance sampling for ray tracing. The renderer currently supports rendering an outdoor collection of balls and the Cornell box.
Code available at: https://github.com/yashpatel5400/raytrace-montecarlo
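As a rough illustration of multiple importance sampling, the sketch below combines two sampling strategies with the balance heuristic on a one-dimensional toy integral. It is not taken from the renderer: the two densities (uniform and linear), the integrand, and all function names are assumptions. In a ray tracer the two strategies would typically be light sampling and BSDF sampling.

```python
import math
import random


def balance_weight(pdf_this, n_this, pdf_other, n_other):
    """Balance heuristic weight for a sample drawn from 'this' strategy."""
    return (n_this * pdf_this) / (n_this * pdf_this + n_other * pdf_other)


def mis_estimate(f, n_uniform, n_linear, rng):
    """Estimate the integral of f over [0, 1] with two combined strategies."""
    pdf_uniform = lambda x: 1.0      # p1(x) = 1 on [0, 1]
    pdf_linear = lambda x: 2.0 * x   # p2(x) = 2x on [0, 1]
    total = 0.0
    for _ in range(n_uniform):
        x = rng.random()
        w = balance_weight(pdf_uniform(x), n_uniform, pdf_linear(x), n_linear)
        total += w * f(x) / pdf_uniform(x) / n_uniform
    for _ in range(n_linear):
        # Inverse-CDF sampling of p2: the CDF is x^2, so x = sqrt(u) with u in (0, 1].
        x = math.sqrt(1.0 - rng.random())
        w = balance_weight(pdf_linear(x), n_linear, pdf_uniform(x), n_uniform)
        total += w * f(x) / pdf_linear(x) / n_linear
    return total


# Toy integrand whose exact integral over [0, 1] is 1.
estimate = mis_estimate(lambda x: 3.0 * x * x, 20000, 20000, random.Random(0))
```

The weights of the two strategies sum to one at every point, which is what keeps the combined estimator unbiased.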

Synalyze

An app that analyzes business meetings and conversations and produces analyses of the participants. Awarded Best Use of Machine Learning at HackPrinceton Spring 2017.
Devpost: https://devpost.com/software/synergy (Note: the project was originally named Synergy)
Code available at: https://github.com/yashpatel5400/synalyze

Wizard Chess

Interactive 3D chess game built atop three.js.
Code available at: https://github.com/yashpatel5400/chesswizard

DeepGIF

Video style transfer using convolutional networks, with tracking and masks for GIFs.
Code available at: https://github.com/yashpatel5400/DeepGIF

Manhattan Project Visualization

Visualization of the involvement of people from Princeton University in the Manhattan Project. All moving dots become interactive once they settle in their final locations: clicking one plays a short audio clip about that person’s involvement at that location. Note: rendering is fairly compute-intensive, so the animation may be choppy before it settles down.
Code available at: https://github.com/yashpatel5400/manhattan

publications

Optimal Charging Station Locations

Tesla and electric vehicles (EVs) have become more prevalent in the last decade. With the steep projected growth in EVs, the question of where to place charging stations has moved to the forefront of customers’ and business owners’ minds alike. We seek to address this problem by investigating policies for determining the optimal locations for charging stations in a city setting. For this task, we developed a lookup-table model with altered updating equations and tested several learning policies: online and offline Knowledge Gradient Exploration (KG), Interval Estimation (IE), Boltzmann Exploration, and Pure Exploitation. We found that the Knowledge Gradient policy was the most effective at maximizing total usage over all stations within our time horizon. We therefore recommend it as a baseline for building future policies that aim to maximize station utilization. Future studies may wish to expand on the bottleneck employed in the model for charging stations and on time inhomogeneity.
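To make the policy names concrete, here is a minimal sketch of three of them (Pure Exploitation, Interval Estimation, and Boltzmann Exploration) acting on a lookup-table belief with independent normal priors. It uses the standard Bayesian updating equations rather than the altered ones from the study, omits the Knowledge Gradient policy, and every name, prior, and parameter value is an assumption made for illustration.

```python
import math
import random


def update_belief(mean, precision, observation, noise_precision):
    """Standard normal-normal update of one lookup-table entry."""
    new_precision = precision + noise_precision
    new_mean = (precision * mean + noise_precision * observation) / new_precision
    return new_mean, new_precision


def pure_exploitation(means, precisions, rng):
    """Always pick the location with the highest estimated usage."""
    return max(range(len(means)), key=lambda i: means[i])


def interval_estimation(means, precisions, rng, z=2.0):
    """Pick the location with the highest upper confidence bound."""
    return max(range(len(means)),
               key=lambda i: means[i] + z / math.sqrt(precisions[i]))


def boltzmann(means, precisions, rng, temperature=1.0):
    """Sample a location with probability proportional to exp(mean / temperature)."""
    top = max(means)  # subtract the max for numerical stability
    weights = [math.exp((m - top) / temperature) for m in means]
    return rng.choices(range(len(means)), weights=weights)[0]


def run_policy(policy, true_means, horizon, noise_std, seed=0):
    """Simulate one policy online and return (total observed usage, final means)."""
    rng = random.Random(seed)
    means = [0.0] * len(true_means)
    precisions = [1.0 / 100.0] * len(true_means)  # hypothetical prior variance of 100
    noise_precision = 1.0 / noise_std ** 2
    total = 0.0
    for _ in range(horizon):
        i = policy(means, precisions, rng)
        obs = rng.gauss(true_means[i], noise_std)
        means[i], precisions[i] = update_belief(
            means[i], precisions[i], obs, noise_precision)
        total += obs
    return total, means


# Hypothetical usage: three candidate locations with made-up mean usage levels.
total_usage, final_means = run_policy(
    interval_estimation, [5.0, 8.0, 6.5], horizon=200, noise_std=1.0)
```

Swapping the first argument of run_policy for pure_exploitation or boltzmann runs the same simulation under a different policy, which is the kind of comparison the abstract describes.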

An Analysis of Selfish Mining Attacker Incentives in Bitcoin and Ethereum

Both the Bitcoin and Ethereum decentralized systems rely on the same distributed public blockchain mining model for transmitting and recording history. It was previously thought that this system would be held in check by a balanced proof-of-work incentive system. However, previous studies have revealed an attack dubbed “selfish mining” whereby miners can exploit this incentive system to increase their expected rewards. Such models have further been applied to the transaction fee system that is expected to largely replace the block reward system over the coming years. Despite extensive past study, such models have failed to include the effects of selfish mining attacks on exchange rates, which is the primary focus here. These models are further extended to the Ethereum network, which has not previously been studied with respect to selfish mining. In addition, this study sought to compare the current empirical status of the Bitcoin and Ethereum networks to the model results, to determine whether it is currently in miners’ economic interest to engage in selfish mining. In the end, the necessary devaluation was obtained as a function of the attacker’s hashrate, selfish mining (SM) hashrate proportion, SM engagement delay, and uncle block reward (Ethereum). The current states of Bitcoin and Ethereum were found to be highly conducive to selfish mining, making countermeasures a topic of interest for future studies.

FairTear: Automated Probabilistic Analysis on Dataset Models

Given the extent to which machine learning algorithms have come to shape lives, on both a daily and a long-term basis, the study of their ingrained biases is much in order. Many tools have emerged to understand such biases, both those that explicitly examine the underlying classifier code (white-box) and those that are agnostic to it (black-box). White-box tools can provide greater insight, but are typically limited in the types of models they can analyze. A new tool, FairSquare, provides a method of applying white-box techniques to more complex models. However, since FairSquare requires a new classifier syntax and knowledge of an underlying population model, it leaves much to be desired for an end user. We present a tool, FairTear, which provides a clean UI through which end users can feed in their classifier and view its analysis result from the FairSquare tool. Our tool automates both the process of generating the population model and the process of converting a classifier to the FairSquare syntax. In turn, the user is fully abstracted from the FairSquare back-end, allowing them to determine the fairness of their algorithm without any knowledge beyond what is contained in their code. FairTear is capable of using nearly all of the supported FairSquare functionality, including multi-level conditioning of population model features and different feature distributions (Gaussian and multi-step uniform). FairTear also integrates with the popular scikit-learn Python machine learning package, supporting several of its classifiers (decision trees, SVMs, and neural networks) as well as preprocessing steps (StandardScaler). In doing so, we hope to allow a variety of end users, from academia and industry alike, to take advantage of our system in real-world machine learning pipelines. Tests revealed full automation on all ends (i.e. supporting each of the classifiers referenced above), with fairness results being displayed on the front-end and an appropriate classifier decomposition visible on the back-end. In line with that, we considered further extensions to both our tool and FairSquare. These largely revolve around supporting more of the sklearn library, including additional distributions, preprocessing features, and classifiers.

Deanonymizing Bitcoin Transactions: An Investigative Study on Large-scale Graph Clustering

Bitcoin has recently moved from the fringes of technology into the mainstream. With speculation rampant, it has increasingly drawn harsh criticism over what its use case actually is. Unfortunately, much of Bitcoin’s present use is for transactions in online black markets. Towards that end, various studies have sought to partially deanonymize Bitcoin transactions, identifying wallets associated with major players in the space to help forensic analysts taint wallets involved with criminal activity. Relevant past studies, however, have rigidly enforced manually constructed heuristics to perform such deanonymization, paralleling an extensive union-find algorithm. We wish to extend this work by introducing many more heuristics than were previously considered, constructing a separate “heuristics graph” layered atop the transactions graph and performing graph clustering on this heuristics graph. Towards that end, we explored the performance of various clustering algorithms on the SBM (stochastic block model) as a prototype of the heuristics graph. We additionally tested graph preprocessing algorithms, specifically sparsification and coarsening, to determine the extent to which they could speed up computation while retaining reasonable accuracy. We found hierarchical spectral clustering and METIS to have the best performance by the standard purity, NMI, and F-score clustering accuracy metrics. We also found that sparsification and coarsening yielded little reduction in time, with the former severely reducing accuracy and the latter less so, suggesting that coarsening holds potential given implementation optimization in future studies. METIS was subsequently employed to cluster a subset of the full graph due to major time concerns with hierarchical spectral clustering. Several wallet clusters were identified as a result, though their accuracy could not be determined due to the limited ground truth available. Future extensions of this work should seek to address the time deficiencies of the hierarchical spectral clustering algorithm and extend the ground truth available.
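The union-find baseline mentioned above can be sketched as follows. This is an illustration rather than code from the study: it assumes the widely used multi-input heuristic (all input addresses of one transaction are treated as belonging to the same wallet), and the class, function, and address names are hypothetical.

```python
class UnionFind:
    """Disjoint-set structure over arbitrary hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        """Return the representative of x's set, registering x if unseen."""
        self.parent.setdefault(x, x)
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Group addresses into wallets; each transaction is a list of input addresses."""
    uf = UnionFind()
    for inputs in transactions:
        for addr in inputs:
            uf.find(addr)  # register every address, including lone inputs
        for addr in inputs[1:]:
            uf.union(inputs[0], addr)
    groups = {}
    for addr in list(uf.parent):
        groups.setdefault(uf.find(addr), set()).add(addr)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical input: four transactions, each listed by its input addresses.
clusters = cluster_addresses([["a", "b"], ["b", "c"], ["d"], ["e", "f"]])
```

Because a single shared address merges two clusters irrevocably, one wrong heuristic link propagates through everything connected to it, which is one motivation for clustering a weighted heuristics graph instead.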

Yash Patel