Photo by Burak K from Pexels

As an aspiring master of all things data, I often think about my favorite aspect of the many things one can do with data: visualizations (data viz)! If you’ve ever taken the time to visit one of the many websites using data visualizations (like https://pudding.cool/ or https://setosa.io/#/) and play with their charts and graphs, you can quickly get a sense of the potential and power of data visualization. In my personal work with data I’ve used a combination of Matplotlib and Tableau to make visualizations, but I’ve always been intrigued by the flexibility and power of the JavaScript library D3.js. This library is an amazing tool, but it also has a reputation for being very dense and having a steep learning curve. While that may be true, I’ve decided to finally dive into the deep end and see if my knowledge of JavaScript and data will give me enough intuition to stay afloat. I intend to use this and future blog entries as a learning journal to share with others who might be thinking of taking that same leap. This post is meant as a basic intro and setup, plus a look at how to use selectors. …



Welcome to part 2 and entry number 5 in my Tableau tutorials series. In this two-part series we are looking at how to group your data. There are countless scenarios where one would want to use groupings of some sort, but you can think of groupings as a way to get a more detailed view of some subset of your data. For example, if we are using a COVID-19 dataset, we may want to see only cases among people aged 50 to 60. There are a few ways we could go about viewing that particular group. What if we want to see people aged 50 to 60 who live in California? There are ways to do that as well, although we may need to use different grouping methods. In part one of this blog we covered groups, hierarchies, and ordering, which are ways to group data. …


Photo by Mike from Pexels

Welcome to the fourth entry in my ongoing Tableau tutorials series. This blog will be a two-part journey, with this entry covering groups and hierarchies and next week’s entry covering sets and dynamic groupings using calculations and parameters. In past posts I’ve covered a basic data-connection-to-dashboard workflow, calculations, and filtering, while touching only briefly on groups and hierarchies. In this post I will drill down into a more detailed explanation of groups and hierarchies.

When working with large datasets you may end up with more columns than the Parthenon. For our sanity and ease of analysis, we want to be able to order and group our data in some way. Tableau gives us the option to use sets, groups, and hierarchies, as well as the calculations and parameters mentioned above for groupings that update dynamically with the data or user input. …


Photo by Oleg Magni from Pexels

This blog marks the third entry in my ongoing “Teaching Tableau” series. In our previous installments I showed how to create a basic dashboard from start to finish and how to work with filters. This week’s tutorial will go over “Calculations”. What is a calculation? It seems simple, but Tableau has three designations for what a calculation is. Calculations are computed either as part of the query made on the data source or after the query. This will make more sense in a second. There are Basic Calculations, Level of Detail (LOD) Expressions, and Table Calculations. Basic calculations can be aggregate calculations, like a sum or average across many rows, or row-level calculations that compute a value for each individual row of the data source. Basic calculations are made as part of the query. LOD Expressions are also made as part of the Tableau query. Tableau states that they allow us to “compute aggregations that are not at the level of detail of the view”. LOD expressions can work at a more or less detailed level than the view through the use of the INCLUDE, EXCLUDE, and FIXED keywords. …
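To make the distinction a bit more concrete, here is a minimal sketch in plain Python rather than Tableau's own calculation syntax. It is only an analogy: the rows, the column names (`region`, `price`, `quantity`), and the values are all made up for illustration. It shows a row-level calculation, an aggregate calculation, and a rough stand-in for what a FIXED LOD expression does, which is to compute an aggregate at a chosen dimension and attach it to every row.

```python
from collections import defaultdict

# Hypothetical data: column names and values are invented for this sketch.
rows = [
    {"region": "East", "price": 10.0, "quantity": 2},
    {"region": "East", "price": 5.0, "quantity": 4},
    {"region": "West", "price": 8.0, "quantity": 1},
]

# Row-level basic calculation: one result per row of the data source.
for row in rows:
    row["revenue"] = row["price"] * row["quantity"]

# Aggregate basic calculation: one result summarizing all rows.
total_revenue = sum(row["revenue"] for row in rows)

# Rough analogue of a FIXED LOD expression: aggregate at the "region"
# level, then make that value available on every row in that region.
revenue_by_region = defaultdict(float)
for row in rows:
    revenue_by_region[row["region"]] += row["revenue"]
for row in rows:
    row["region_revenue"] = revenue_by_region[row["region"]]

print(total_revenue)
print([(r["region"], r["revenue"], r["region_revenue"]) for r in rows])
```

In Tableau itself these would be written as calculated fields; the sketch only mirrors the idea that the regional total is computed independently of whatever level of detail a view happens to show.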


Photo by Maurício Mascaro from Pexels

In my last blog post I touched on a few basics for building simple graphs and dashboards. In this post I intend to dive a bit deeper into customizing graphs to make them interactive through the use of filters. I will be using a data set that I created myself from insideschools.org. This data set is a collection of 22,662 comments from all non-charter public schools in New York City. To begin, let’s take a look at some ways to filter categorical variables.


Photo by Lukas from Pexels

GOAL OF THIS BLOG

Upon completing Flatiron’s data science program I found that I had a solid range of experience with the standard libraries and programs in a data science stack, like Pandas, NumPy, Matplotlib, scikit-learn, etc. Although libraries like Matplotlib and Seaborn can create static and interactive data visualizations, it seemed like most businesses were using Tableau in their production pipeline for reporting on insights gained from the data. To make myself industry ready, I decided to take some time to learn some of the ins and outs of Tableau’s data visualization platform. The following series of blog posts will be a learning journal to track my progress and to provide an easy-to-follow roadmap, along with some tips and tricks, for new users ready to jump into the Tableau ecosystem. …


Photo by Francesco Ungaro from Pexels

Since the time of early civilizations the human impulse to record data has manifested and evolved from tick marks, to hieroglyphs, to cloud-computing databases filled with millions of entries. Our forms of data collection and science have evolved to meet our needs as a society in a way that has benefited the human race in immeasurable ways. …



According to the Oxford Dictionary of Phrase and Fable, the phrase “Garbage in, garbage out” has been used in the realm of computer science since the 1950s to mean that poor programming input will yield poor results, because computers can’t think for themselves and self-adjust in a relevant way. This adage rings particularly true in the processing of text data. Having gone through the process multiple times now, I wanted to write a short guide to serve as a rough checklist on how to pre-process text for input into a machine learning model or for techniques like LDA topic modeling. For this guide I did my text pre-processing in Pandas and spaCy, so all code and examples will be formatted as such, but feel free to use other tools like NLTK. For my dataset I scraped the comments section for each NYC school on insideschools.org, with the goal of performing topic modeling to find topics of concern for communities regarding their local schools. …


Policing issues are and have been a constant dark cloud hanging over the nation since municipal police departments first formed in the late 1800s (and before with more informal forms of “community policing”). The air around policing has become thick with distrust and anger on both sides of the issue. Data science has an interesting role in the debate, as I believe the best we can do to begin to chip away at the problem is to be impartial and speak to the facts. …


As a former NYC public school teacher I can say from personal experience that lack of attendance can be a tricky dilemma to solve. Some students’ guardians work night shifts and get to bed right before a student is about to wake up, or they may leave for work before a student even wakes up to get ready. Some students may even be in a purely neglectful situation, may be living in a shelter, or may live far away and have difficulty finding transportation. With a gamut of challenges to getting students in their seats, it is troubling to know that many schools get state funding based on students in attendance for the day. There is also a body of studies and literature suggesting that time spent in the classroom is connected to educational outcomes. …

About

Kevin Macias-Matsuura

Former English teacher turned Data Scientist/Analyst interested in data, design, and storytelling.
