Latest Posts
-
SPACEKIT Computer: evaluate and predict
The spacekit.computer class is for generating predictions and evaluating model metrics. The example here evaluates a neural network for classifying stars with orbiting exoplanets using time series signal analysis (light curves).
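As a rough illustration of what "evaluating model metrics" means for a binary classifier like this one, here is a minimal sketch in plain Python. It is not the spacekit.computer API: the `binary_metrics` helper and the sample labels are hypothetical and only show how accuracy, precision and recall follow from the predicted and true labels.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision and recall for 0/1 labels (1 = star with an exoplanet).

    Hypothetical helper for illustration; not part of spacekit.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels for illustration only, not results from the post.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

On a dataset where planet-hosting stars are rare, recall and precision usually say more than accuracy alone, since a model that always predicts "no planet" can still score a high accuracy.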
-
SPACEKIT Builder: convolutional neural networks
spacekit.builder
building and fitting convolutional neural networks
-
SPACEKIT Transformer: signal processing
spacekit.transformer
tools for converting and preprocessing timeseries signals for ML
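One common preprocessing step for a flux time series is standardization (zero mean, unit variance) before it is fed to a model. The sketch below assumes that step purely as an example; the `standardize` function and the sample values are hypothetical and do not reflect the actual spacekit.transformer API.

```python
from statistics import fmean, pstdev
from typing import List, Sequence


def standardize(flux: Sequence[float]) -> List[float]:
    """Rescale a non-empty signal to zero mean and unit (population) variance.

    Hypothetical helper for illustration; not part of spacekit.
    """
    mean = fmean(flux)
    std = pstdev(flux)
    if std == 0:
        # A constant signal carries no variation; return all zeros.
        return [0.0 for _ in flux]
    return [(x - mean) / std for x in flux]


# Made-up flux values for illustration only.
flux = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
scaled = standardize(flux)
print(scaled)
```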
-
SPACEKIT Analyzer: frequency spectrographs
Plotting a SpecGram
-
SPACEKIT Analyzer: plotting light curves
timeseries flux signal analysis
-
SPACEKIT Radio: scraping NASA data
spacekit.radio
This module is used to access the STScI public dataset [astroquery.mast.cloud] hosted in S3 on AWS. In this post, I’ll show you how to scrape and download NASA space telescope datasets that can be used in astronomical machine learning projects (or any astronomy-based analysis and programming). For this demonstration we’ll call the API to acquire FITS files containing the time-validated light curves of stars with confirmed exoplanets. The datasets all come from the K2 space telescope (Kepler phase 2).
-
SPACEKIT: Machine Learning for Astrophysics
Overview
-
Tensorflow Deep Learning on AWS EC2
Configure an AWS Deep Learning EC2 Container image for TensorFlow model training on a CPU instance with Python 3.6 and run a machine learning model.
-
Digdag PostgreSQL Tutorial
The Digdag Postgres project demonstrates how to use SQL queries with the digdag and embulk open source libraries to automate ingesting and analyzing data in a PostgreSQL database.
-
Digdag MySQL Tutorial
In this project, we’ll create a digdag workflow that executes an embulk script for ingesting CSV files into a MySQL database. We’ll then write SQL queries to prepare and analyze the data.
-
Starskøpe 2: Spectrograph Image Classification
For the next phase of the Starskøpe planet hunting project, I used Google Colab to generate spectrograph images of the same star flux signals dataset from Kaggle. Due to memory constraints, I started out by using only a portion of this already small dataset as a test round. I ran the images through a Keras 2D Convolutional Network using TensorFlow, similar to the 1D CNN I built in the first phase of the project. The accuracy score was lower (0.86 vs. 0.99), but this was to be expected given the constraints of the dataset.
-
STARSKØPE: Cyberoptic Artificial Telescope
“Mathematicians […] are often led astray when ‘studying’ physics because they lose sight of the physics. They say: ‘Look, these differential equations–the Maxwell equations–are all there is to electrodynamics; it is admitted by the physicists that there is nothing which is not contained in the equations. The equations are complicated, but after all they are only mathematical equations and if I understand them mathematically inside out, I will understand the physics inside out.’ Only it doesn’t work that way. Mathematicians who study physics with that point of view–and there have been many of them–usually make little contribution to physics and, in fact, little to mathematics. They fail because the actual physical situations in the real world are so complicated that it is necessary to have a much broader understanding of the equations.” -Richard Feynman, The Feynman Lectures on Physics: Volume 2, Chapter 2-1: “Differential Calculus of Vector Fields”
-
Detecting Dead Stars in Deep Space
A supervised machine learning feature classification project that uses Decision Trees and XGBoost to predict and classify signals as either a pulsar or radio frequency interference (noise).
-
Visualizing Time Series Data
Time Series Forecasting with SARIMAX and Gridsearch is a housing market prediction model that uses seasonal ARIMA time-series analysis and GridsearchCV to recommend the top 5 zip codes for purchasing a single-family home in Westchester, New York. The top 5 zip code recommendations rely on the following factors: highest ROI, narrowest confidence intervals, and shortest commute time from Grand Central Station. Along with several custom time series analysis helper functions I wrote for this project, I also draw on the USZIPCODE PyPI library to account for several exogenous factors, including average income levels.
-
SQL Northwind Database
The Northwind SQL Database Project demonstrates how to use SQL queries and hypothesis testing to recommend business strategies for increasing sales and reducing costs for the fictitious “Northwind” company. The Northwind SQL database was created by Microsoft as a sample database, and it is a common choice for practicing SQL queries and hypothesis testing.
-
Predicting Home Values
Ask any realtor what are the top 3 most important variables for measuring property value, and they will all say the same thing: 1) location, 2) location, and 3) location. I asked a friend who has been doing real estate for about 20 years (we’ll call her “Mom”) what other factors besides location tend to have some impact and she mentioned the following:
-
AWS Redshift Database Management
Configuring, Managing and Performing Remote SQL Queries on AWS Redshift.
-
Scientist Artist Engineer
It was the summer of 2019. I owned my own marketing and video production company, had my own apartment in Hollywood, friends that cared about me, and a girlfriend I loved. So why was I so miserable? Why was I waking up every day filled with dread, and staying awake at night, restless and consumed by anxiety? I hated the way I was living, but I was even more afraid of dying. “Yet Another Existential Crisis” - I thought I knew who I was now, I thought I loved myself. Was all of it a lie?