I am a PhD student in Stanford University's Computer Science Department, advised by Professor Jiajun Wu. Previously, I was a graduate research assistant in Carnegie Mellon's Robotics Institute, co-advised by Professors Chris Atkeson and Oliver Kroemer. Recently, my research has focused on learning for manipulation, especially how robots can use sound while manipulating objects.
We record thousands of room impulse responses and music clips in a variety of real rooms, with humans standing at different positions in each room. Learning-based models can exploit the resulting minute differences in room acoustics to track, identify, or detect humans in the room. Our data can be used to develop more robust and sample-efficient methods, with applications in home assistants, security, and robotics.
We introduce a framework for accomplishing long-horizon soft-body manipulation tasks and show that it can learn to make dumplings with a variety of tools, using very little training data per tool. We also show that the framework can learn to use tools for other soft-body manipulation tasks, such as shaping dough into target shapes, autonomously selecting a tool for each step of the task.
We collect 150,000 annotated recordings of impact sounds from 50 everyday objects, captured at 600 distinct microphone locations. We show how our data can be used to tune and validate acoustic simulations, or applied directly to downstream audio and audiovisual tasks.
Differentiable physics-based models provide a useful inductive bias for learning from impact sounds, letting us solve both forward and inverse problems on impact audio. We show that we can infer model parameters from recordings in the wild and then use the inferred models to perform source separation better than generic learning-based alternatives.
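To make the idea concrete, here is a minimal, hypothetical sketch of a differentiable impact-sound model: the sound is rendered as a sum of exponentially decaying sinusoidal modes, and the mode parameters are fit to a recording by gradient descent on a spectrogram loss. The mode count, parameterization, and loss below are illustrative assumptions, not the implementation from the paper.

```python
# Minimal sketch (assumptions throughout): fit a modal impact-sound model
# to a recording by gradient descent through a differentiable synthesizer.
import math
import torch
import torch.nn.functional as F

SR = 16000          # sample rate in Hz (assumed)
DUR = 0.5           # clip length in seconds
N_MODES = 16        # number of resonant modes to fit (illustrative)

t = torch.arange(int(SR * DUR)) / SR

# Learnable modal parameters: amplitude, damping, and frequency per mode.
amps  = torch.randn(N_MODES, requires_grad=True)
damps = torch.randn(N_MODES, requires_grad=True)
freqs = torch.randn(N_MODES, requires_grad=True)

def synthesize():
    """Render the impact as a sum of exponentially decaying sinusoids."""
    decay = torch.exp(-100.0 * F.softplus(damps)[:, None] * t[None, :])
    hz = 100.0 + 4000.0 * torch.sigmoid(freqs)   # keep modes in an audible range
    carrier = torch.sin(2 * math.pi * hz[:, None] * t[None, :])
    return (amps[:, None] * decay * carrier).sum(dim=0)

def spectrogram(x):
    """Magnitude spectrogram, a more forgiving domain for matching audio."""
    return torch.stft(x, n_fft=512, hop_length=128, return_complex=True).abs()

# Placeholder target: in practice this would be a real impact recording.
target_spec = spectrogram(torch.randn(int(SR * DUR)))

# Fit the modal parameters to the recording by gradient descent.
opt = torch.optim.Adam([amps, damps, freqs], lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = F.l1_loss(spectrogram(synthesize()), target_spec)
    loss.backward()
    opt.step()
```

Once fit, the per-mode parameters act as a physically interpretable representation of the sound, which is what makes inverse tasks like source separation tractable.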
Deep learning-based data-driven models can both predict the effects of a scooping operation on a granular material from vision and learn to use audio as feedback for scooping and pouring.
Deep learning-based models can accurately predict the amount of granular material a robot pours or shakes based only on audio recordings. With machine learning, recordings from a $3 microphone can outperform the measurement resolution of a $3,000 wrist-mounted force-torque sensor.
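As a hypothetical illustration of this kind of pipeline (not the actual models from the paper), one could summarize each pouring recording with coarse spectral-band energies and fit a simple regressor to predict the poured mass; the features, regressor, and placeholder data below are all assumptions.

```python
# Minimal sketch (assumed pipeline): regress poured mass from audio features.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

SR = 16000        # sample rate in Hz (assumed)
N_BANDS = 32      # number of frequency bands used as features (illustrative)

def band_energies(audio):
    """Log energy in N_BANDS evenly spaced frequency bands of one recording."""
    spectrum = np.abs(np.fft.rfft(audio)) ** 2
    bands = np.array_split(spectrum, N_BANDS)
    return np.log1p(np.array([b.sum() for b in bands]))

# Placeholder dataset: each entry is one pour, labeled with the poured mass in grams.
recordings = [np.random.randn(SR * 2) for _ in range(200)]   # stand-in for real audio
masses = np.random.uniform(10, 100, size=200)                # stand-in for scale readings

X = np.stack([band_energies(a) for a in recordings])
X_train, X_test, y_train, y_test = train_test_split(X, masses, test_size=0.2)

model = Ridge(alpha=1.0).fit(X_train, y_train)
print("mean absolute error (g):", np.mean(np.abs(model.predict(X_test) - y_test)))
```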
We took the Matrix Cube, a tool for visualizing time-evolving graphs, and developed a new 3D interface controlled by hand gestures. Gestures were captured with a Leap Motion device.
With data collected from a user's brief interactions with common UI elements, such as checkboxes and sliders, machine learning models can uniquely identify the user. Such a system could authenticate a mobile device's user continuously and seamlessly, without many of the vulnerabilities common to traditional authentication methods.
This project was the precursor to my paper on predicting the effects of scooping. I attempted to learn a scooping policy with deep reinforcement learning, posing the task of scooping a target mass as a contextual bandit problem, and adapted popular techniques such as Actor-Critic and the Cross-Entropy Method. What I learned from these experiments was very helpful in refining my approach for the later publication.
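For illustration, here is a minimal sketch of the contextual bandit formulation solved with the Cross-Entropy Method: the target mass acts as the context, a single scoop is the one-step action, and CEM iteratively refits a Gaussian over scoop parameters to the highest-reward samples. The action parameterization, reward, and stand-in dynamics below are hypothetical; in practice the reward would come from executing scoops in simulation or on the robot, or from a learned reward model.

```python
# Minimal sketch (assumptions: action = scoop depth/angle, context = target mass,
# reward/dynamics are placeholders) of scooping as a contextual bandit solved with CEM.
import numpy as np

ACTION_DIM = 2            # e.g. scoop depth and scoop angle (illustrative)
POP, ELITE, ITERS = 64, 8, 20

def execute_scoop(action, target_mass):
    """Placeholder for running one scoop and returning a reward,
    e.g. negative error between scooped and target mass."""
    scooped = 50.0 * action[0] + 10.0 * action[1]      # stand-in dynamics
    return -abs(scooped - target_mass)

def cem_scoop(target_mass):
    """Cross-Entropy Method: refit a Gaussian over actions to the elite
    (highest-reward) samples for this particular context (target mass)."""
    mean, std = np.zeros(ACTION_DIM), np.ones(ACTION_DIM)
    for _ in range(ITERS):
        actions = np.random.normal(mean, std, size=(POP, ACTION_DIM))
        rewards = np.array([execute_scoop(a, target_mass) for a in actions])
        elites = actions[np.argsort(rewards)[-ELITE:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mean

print("best scoop parameters for a 30 g target:", cem_scoop(target_mass=30.0))
```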
We designed a mask that walks an untrained rescuer, in real time, through performing standard-of-care CPR on a cardiac arrest victim. The mask is equipped with sensors that monitor the state of the victim and the quality of the CPR, and it uses a speaker and LEDs to give instructions and cues to the rescuer.
GPA: 4.17/4.33
GPA: 4.0/4.0