I am a Research Scientist at Softmax. I was a visiting academic at NYU. I recently graduated with a PhD in Computer Science and Engineering from the University of Michigan, advised by David Fouhey.
After working on the sun and hands, I moved to diffusion models for world generation and scene editing. I did my master's at Michigan and my bachelor's at the University of Maryland, where I studied Neuroscience and Computer Science.
During my PhD, I worked on visual representation learning. Most recently, I worked on learning compositional verb representations for image editing using latent diffusion. Prior to that, I trained neural nets to segment scenes from pseudolabels (ascribing motion to either camera or hands). I have also used satellite imagery of the sun to improve estimates of the solar magnetic field.
‣ A latent diffusion image editing method that separates states and actions into composable neural noun and verb representations.
paper / site
‣ An improved method for constructing a synthetic instrument, trained on aligned SDO/HMI and Hinode/SOT-SP data, that produces high-quality magnetograms.
paper / site / github
‣ A new dataset, tasks, and model for understanding more complex hand interactions, including bimanual manipulation and tool use.
github
‣ We use disagreement with a background motion model as a pseudolabel to train hand and held-object grouping and association.
paper / site / github
‣ We made EPIC-KITCHENS VISOR, a new dataset of annotations for segmenting hands and active kitchen objects in egocentric video.
paper / site
‣ I trained a neural network to produce synthetic magnetic field inversions, and we used its outputs to fix biases in the satellite processing pipeline.
paper
‣ We built a system that predicts segmentation masks for objects held by hands. The system trains from person, object, and background pseudolabels made by subtracting detected people from optical flow.
paper
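As a rough illustration of that pseudolabeling idea, here is a minimal sketch: the function name, label scheme, and motion threshold are my own inventions for the example, not the published method.

```python
import numpy as np

def motion_pseudolabels(flow, person_mask, motion_thresh=1.0):
    """Toy pseudolabel sketch (assumed names and threshold):
    0 = background (static), 1 = person (from a detector),
    2 = held object (moving, but not explained by the person)."""
    speed = np.linalg.norm(flow, axis=-1)           # per-pixel flow magnitude
    moving = speed > motion_thresh                  # pixels with significant motion
    labels = np.zeros(speed.shape, dtype=np.int64)  # background by default
    labels[person_mask] = 1                         # detected person pixels
    labels[moving & ~person_mask] = 2               # residual motion -> object
    return labels

# Tiny usage example: one moving object pixel, one moving person pixel.
flow = np.zeros((4, 4, 2))
flow[0, 0] = [2.0, 0.0]
flow[1, 1] = [3.0, 0.0]
person_mask = np.zeros((4, 4), dtype=bool)
person_mask[1, 1] = True
labels = motion_pseudolabels(flow, person_mask)
```

The point of the sketch is only the "subtraction": motion that the detected person cannot account for becomes an object pseudolabel for free, with no manual annotation.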
‣ Hinode's SOT-SP measures small areas of the sun at high spatial and spectral resolution to estimate the magnetic field. SDO/HMI measures the full disk at lower spatial and spectral resolution.
‣ By training a neural network to accurately predict Hinode's estimated field using only HMI's input, we created a virtual observatory that melds the best parts of both instruments.
‣ I trained a UNet to predict magnetic field parameters on the sun from polarized light (Stokes IQUV) recorded by the Solar Dynamics Observatory's HMI sensor.
paper / site / github / talk / poster
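To make the setup concrete, here is a toy pixelwise stand-in for the inversion task. The data is synthetic, the model is a linear fit rather than the actual UNet, and all names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Each pixel carries 4 Stokes observations (I, Q, U, V) and a target
# field value; here the relation is linear by construction.
n_pixels = 1000
stokes = rng.normal(size=(n_pixels, 4))      # fake IQUV vectors per pixel
true_w = np.array([0.1, 2.0, -1.5, 3.0])     # hidden linear relation
field = stokes @ true_w + 0.01 * rng.normal(size=n_pixels)

# Pixelwise linear "inversion": recover the field from Stokes vectors.
w, *_ = np.linalg.lstsq(stokes, field, rcond=None)
pred = stokes @ w
```

The real inversion is nonlinear and spatially coupled, which is why a UNet (which sees neighboring pixels) beats a per-pixel fit; this sketch only fixes the shape of the problem: Stokes vectors in, field parameters out.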
‣ I constructed topologically associating domains and analyzed RNA-seq data to identify differential gene expression using bioinformatics libraries in R and Python.
‣ I wrote MATLAB functions that classified windows of mouse EEG recordings as seizure/not seizure with max-margin unsupervised learning.
‣ I wrote a Java application that converted underwater images into false-color analogs for different cone opsins, to understand fish conspicuity.
‣ The idea was that these bright fish all have very different color cones and might actually be disguised in the eyes of predators.
‣ I created a world-generation system that combines DUSt3R/MASt3R with Stable Diffusion for 3D-aware outpainting.
‣ We can also use this system to create a synthetic multi-view dataset.
‣ By allocating more of a fixed compute budget to value function improvement, we trained a reinforcement learning model that converged more quickly than the baseline.
‣ The idea is that policy and value networks may need very different batch sizes and GPU allocations for stability in different environments, and that this allocation can be learned.
‣ We evaluated the frequency of vehicle safety messages to identify adjustments that could improve vehicle safety and reduce network congestion for self-driving cars.
github
‣ We finetuned a Squeeze-and-Excitation ResNet to classify objects in road-scene images. Finished in the top 10 for the class.
github
‣ We trained a DenseNet-style language model, exploring how dense convolutional connectivity can compare favorably to RNNs.
github
‣ We built a (briefly) state-of-the-art multilabel x-ray disease classifier.
presentation / github
‣ I built a stacked hourglass model with residual connections for nerve segmentation.
‣ I released a Keras residual unit on GitHub for use as a configurable building block.
‣ We created a service for deep dreaming your Facebook profile picture with convolutional neural networks.
github
‣ We created a mobile app for sharing photos and videos by location through a map interface, similar to Snap Map.
‣ I created a virtual reality browser for exploring the internet as if it were a 3D city with websites as buildings.
‣ The idea here is that similar and interlinked websites should be located near one another.
‣ I trained a convolutional neural network in Caffe to do plankton image classification.
github
‣ I built stock-forecasting models, first in PyBrain and later in PyTorch, using convolutional neural networks and policy gradients.
github
‣ We made an Android app for tracking air quality, UV index, and pollen count.
video / video demo / github
‣ We made Android arcade games for Google Glass, controlled through head motions.
‣ We modeled a horse robot in CAD, tested gaits in a simulator, 3D-printed it, and surpassed the course's distance requirements.
video
‣ I created a cellular automaton of medieval entities that grew and battled on a gridworld.
‣ Like ants, the automata would expand, building new castles and spawning more fighters.
‣ We wrote a 5,000-line True BASIC text-based adventure game.
‣ You played as a spell-casting knight who grew strong enough to slay a dragon.
‣ Working on grid worlds like it's 2007.
softmax.com
‣ I worked on hand pose estimation, 3D scene reconstruction, solar magnetic field prediction, and image editing while mentoring students and finishing my PhD.
cs.nyu.edu
‣ I developed a system for estimating 4D hand pose as a computer vision research intern on the Ego How-To team.
meta.com
‣ I led discussions, created assignments in Python with NumPy and PyTorch, graded projects, and hosted office hours for ~150 upper-level CS students.
web.eecs.umich.edu/~fouhey/teaching/EECS442_W19/
‣ I integrated and trained object detection networks for a video analysis platform that identifies objects in dashcam footage.
voxel51.com
‣ I built a style-transfer service on AWS that processed millions of images per day.
‣ I built a GAN that performs face attribute transformation for a social media company.
‣ I built a CNN backend providing object recognition in a Fortune 500 company's iOS app.
‣ I designed many CNN computer vision systems for Fortune 500 clients across industries.
‣ We built an automated document extraction service on AWS, with custom LSTMs for OCR.
minimill.co/unscan
‣ I developed machine learning tools to automatically scale Kubernetes pods based on network requests, CPU, and memory usage.
ycombinator.com/companies/redspread
‣ I led multi-hour discussion sections on cardiac function, the renal system, the nervous system, pharmacology, digestion, and more.
science.umd.edu/classroom/bsci440
‣ I was the primary contact with landlords, handled house finances, and organized housing for the next school year.
chum.coop
‣ I taught children how to sail and not crash into expensive boats.
woodsholeyachtclub.org