Data projects - Roy Keyes

Radiation dose estimation via Machine Learning

In 2016 I started a side project aimed at estimating radiation doses for cancer therapy using machine learning algorithms instead of physics-based Monte Carlo simulation.

You can watch the talk I gave about this on 14 July 2017 at SciPy 2017 in Austin, TX here and check out the slides here [PDF version].

Intro to Deep Learning

Over the span of a few months in 2016 and 2017 I gave some introductory talks on neural networks and deep learning for the Houston Data Science Meetup group. The slides are linked below.

Part 1: The basics of neural networks [PDF] (Jupyter notebook)
Part 2: From shallow to deep networks [PDF]
Part 3: Common deep learning networks [PDF]

slots - a multi-armed bandit library in Python

slots is a Python library that lets you explore and use several strategies for the multi-armed bandit problem. slots is available for installation from PyPI via "pip install slots".

You can read about slots and the multi-armed bandit problem in my blog post here and check it out on GitHub here.
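For readers new to the problem, here is a minimal sketch of one common bandit strategy, epsilon-greedy. It deliberately does not use the slots API; the function name, parameters, and payout probabilities are assumptions made up for this illustration.

```python
import random


def epsilon_greedy(payout_probs, trials=10_000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy strategy.

    payout_probs holds the true (normally unknown) payout probability of
    each arm. Returns (pull counts per arm, estimated payout rate per arm).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms
    estimates = [0.0] * n_arms

    for _ in range(trials):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        # Update the running mean reward for the chosen arm.
        estimates[arm] += (reward - estimates[arm]) / pulls[arm]

    return pulls, estimates


if __name__ == "__main__":
    # Hypothetical three-armed bandit.
    pulls, estimates = epsilon_greedy([0.1, 0.5, 0.9])
    print("pulls per arm:", pulls)
    print("estimated payout rates:", [round(e, 3) for e in estimates])
```

A larger epsilon spends more pulls exploring; a smaller one commits sooner to whichever arm currently looks best.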

Klackers strategy

Klackers (a.k.a. Shut the Box) is a dice game, often played in bars. The Klackers box has nine "tiles" numbered 1-9. A player rolls two dice, then flips down tiles that sum to the value of their roll. The player continues to roll the dice and flip tiles until they can no longer find a combination of tiles that sums to the roll or they have flipped all of the tiles. The player's score is the sum of the un-flipped tiles.

To determine the best simple strategy for Klackers, I ran a series of Monte Carlo simulations written in Python. The code is found here on GitHub.
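The general idea of such a simulation can be sketched as follows. This is not the code from the repository: the two strategies compared (flip the fewest tiles vs. flip the most tiles) and all function names are assumptions chosen for illustration, and the rules follow the description above (always two dice).

```python
import random
from itertools import combinations


def tile_options(open_tiles, roll):
    """Return every combination of open tiles whose values sum to the roll."""
    tiles = sorted(open_tiles)
    return [
        combo
        for size in range(1, len(tiles) + 1)
        for combo in combinations(tiles, size)
        if sum(combo) == roll
    ]


def play_game(choose, rng):
    """Play one game; return the score (sum of the un-flipped tiles)."""
    open_tiles = set(range(1, 10))
    while open_tiles:
        roll = rng.randint(1, 6) + rng.randint(1, 6)
        options = tile_options(open_tiles, roll)
        if not options:
            break  # no combination matches the roll: the game ends
        open_tiles -= set(choose(options))
    return sum(open_tiles)


def fewest_tiles(options):
    """Hypothetical strategy: flip as few tiles as possible."""
    return min(options, key=len)


def most_tiles(options):
    """Hypothetical strategy: flip as many tiles as possible."""
    return max(options, key=len)


def average_score(choose, games=20_000, seed=0):
    """Monte Carlo estimate of the mean score for a strategy (lower is better)."""
    rng = random.Random(seed)
    return sum(play_game(choose, rng) for _ in range(games)) / games


if __name__ == "__main__":
    for strategy in (fewest_tiles, most_tiles):
        print(strategy.__name__, round(average_score(strategy), 2))
```

Any function that takes the list of legal options and returns one of them can be passed as a strategy, which makes it easy to compare alternatives.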

Chutes and Ladders via Markov chains in D3

Markov chains can be used to model probabilistic processes, such as financial markets or, in this case, children's games. This project is a visualization of the Markov chain model described by Nick Barry in a popular post contrasting the Monte Carlo and Markov chain methods.

For this project I created the Markov chain simulation with Python and NumPy and created the visualization with the D3.js JavaScript library. Because the visualization is rendered as Scalable Vector Graphics (SVG), it may not appear correctly in older browsers. Chrome, Firefox, and Safari should work. The code is found here on GitHub.
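To give a flavor of the Markov chain approach, here is a minimal sketch on a tiny, made-up board. It is not the project code: it uses plain Python lists rather than NumPy so it runs without dependencies, and the 10-square board, the single ladder and chute, and the "stay put on overshoot" rule are all assumptions for illustration.

```python
def transition_matrix(n_squares, jumps, die_sides=6):
    """Build a row-stochastic transition matrix for a simple board game.

    State 0 is the off-board start; state n_squares is the final square and
    is absorbing. jumps maps a landing square to where a chute or ladder
    sends the player.
    """
    size = n_squares + 1
    matrix = [[0.0] * size for _ in range(size)]
    for start in range(n_squares):
        for roll in range(1, die_sides + 1):
            end = start + roll
            if end > n_squares:
                end = start  # assumed rule: an overshoot means no move
            else:
                end = jumps.get(end, end)  # follow a chute or ladder, if any
            matrix[start][end] += 1.0 / die_sides
    matrix[n_squares][n_squares] = 1.0  # the game stays finished
    return matrix


def step(distribution, matrix):
    """Advance the probability distribution over squares by one turn."""
    size = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(size))
        for j in range(size)
    ]


def finish_probabilities(matrix, turns):
    """Probability that the game is finished after each of the first turns."""
    distribution = [1.0] + [0.0] * (len(matrix) - 1)
    finished = []
    for _ in range(turns):
        distribution = step(distribution, matrix)
        finished.append(distribution[-1])
    return finished


if __name__ == "__main__":
    # Hypothetical board: a ladder from 3 to 8 and a chute from 9 to 2.
    board = transition_matrix(10, {3: 8, 9: 2})
    for turn, p in enumerate(finish_probabilities(board, 10), start=1):
        print(f"turn {turn:2d}: P(finished) = {p:.3f}")
```

Unlike a Monte Carlo simulation, this gives exact probabilities for every turn from repeated matrix-vector products, with no random sampling involved.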

How bikeable is Albuquerque?

As someone who enjoys getting around town by bicycle, I thought it would be interesting to try to quantify how "bikeable" different parts of Albuquerque are. Using data made available by the city of Albuquerque, Samat Jain and I put together this "bikeability" map as part of the 2013 ABQ Hack Day.

Using Python and pandas, we converted XML files to JSON, extracted the values of interest, and calculated a score based on the presence and type of biking infrastructure. The map was created with Leaflet.js via Folium, OpenStreetMap data, and Jinja2. The code is found here on GitHub.
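The shape of that pipeline can be sketched roughly as below. Everything specific here is hypothetical: the XML element and attribute names, the sample data, and the scoring weights are invented for illustration and do not reflect the city's actual data format or the scoring we used. The sketch also uses only the standard library rather than pandas.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical weights: better infrastructure earns a higher score per km.
FACILITY_WEIGHTS = {"trail": 3.0, "lane": 2.0, "route": 1.0}

# Hypothetical sample input; the real city data has a different schema.
SAMPLE_XML = """
<segments>
  <segment id="a" type="trail" length_km="1.5"/>
  <segment id="b" type="lane" length_km="2.0"/>
  <segment id="c" type="none" length_km="4.0"/>
</segments>
"""


def xml_to_records(xml_text):
    """Extract the values of interest from XML into JSON-ready dicts."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": seg.get("id"),
            "type": seg.get("type"),
            "length_km": float(seg.get("length_km")),
        }
        for seg in root.iter("segment")
    ]


def bikeability_score(records):
    """Length-weighted average of the facility weights (0.0 if no data)."""
    total_km = sum(r["length_km"] for r in records)
    if total_km == 0:
        return 0.0
    weighted = sum(
        FACILITY_WEIGHTS.get(r["type"], 0.0) * r["length_km"] for r in records
    )
    return weighted / total_km


# Convert XML to JSON, then score the round-tripped records.
records = xml_to_records(SAMPLE_XML)
as_json = json.dumps(records, indent=2)
score = bikeability_score(json.loads(as_json))

if __name__ == "__main__":
    print(as_json)
    print("bikeability score:", round(score, 3))
```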

A talk about data science

In 2013 I gave a talk with Steve Koch at the ABQ Tech Fiesta titled "Data Science, Big Data, and other buzzwords". We decided to make a more generic version (PDF) of that talk and open source it. This talk was aimed at a general technical audience and discussed big data, its history, and data science and its components, including data munging, statistics, machine learning, and visualizations. Hopefully others will find the slides and graphics useful for their purposes.

These slides were made using the Beamer package for LaTeX. Our original graphics were created as SVGs in Inkscape. The presentation is primarily licensed under the CC BY-SA 4.0 terms (see the License.txt file for full details). The source code is found here on GitHub.
