Visualizations at Meetings

You probably also hate wasting time at meetings. What I don’t understand is how some meetings can keep recurring without proper presentation skills; appallingly, this was a meeting of senior technical leads. Look, it doesn’t even have to be BPMN, although that’s a good start. I don’t understand how the first half of the meeting revolved around a draft of some standards that was never screen-shared, so we spent 30 minutes talking in circles. The second half showed a dashboard (from a vendor, not customized). The presenter was trying to show some trends, but there was no narrative, and no custom dashboard or analysis to support the hypotheses. The proposed scheme of action was buried in an email (like, can’t you just copy it to a PowerPoint slide and even make it point form?!) and was convoluted enough that it was unclear what was really happening. ...

February 17, 2025 · 1 min · Shen Ting

You Should not be Using Anaconda

Imagine you’re a newcomer to Python and data analytics and some website tells you to use conda. Days later, you get an email from Anaconda telling you that you’re in breach of their licensing terms because the organisation you’re working for has more than 200 employees! Confused, you do a quick google and find this: https://www.anaconda.com/blog/is-conda-free Now you’re even more confused. Meanwhile on the second search result, the answer is clearer: https://stackoverflow.com/questions/74762863/are-conda-miniconda-and-anaconda-free-to-use-and-open-source ...

August 1, 2024 · 2 min · Shen Ting

DataScience SG Talk: Data Challenges

I gave a talk last night at Data Science SG entitled “Trustable Data: Challenges in a National Sports Association”. It gives an outline of what I’ve encountered and done in the past few years for SCBA. Talk slides can be found here

April 25, 2024 · 1 min · Shen Ting

Why you probably shouldn't outsource data work

I was having a conversation with David and the subject of outsourcing came up. I’ll start by stating I am not against outsourcing. There are definitely situations where it makes sense, especially for resource-constrained organizations who can’t possibly cover every single function by themselves. (On a personal level, hosting this site on SquareSpace is also a form of outsourcing.) Outsourcing does work for one-off projects where it doesn’t make sense for an organization to hire long-term. So, if you’re working on a one-off data project, it probably makes sense to outsource the work. ...

July 2, 2021 · 2 min · Shen Ting

Mandatory Reading on Names

Yes, we’re talking about actual names. It’s a pity that I came across this 6 months too late, as it would have saved me an hour of repeating myself thrice on why we can’t use names as a unique key to join across different data sources. Thankfully my point eventually got across, but to any fellow developer/data scientist/engineer having to explain this to stakeholders, hopefully this helps. And no, you do not want to use email addresses as a unique key either: ...

April 29, 2021 · 1 min · Shen Ting

Organizing Data Science Projects

In the past 8 months, I’ve probably worked on close to 10 different projects. While half of these consist of no more than a few Jupyter notebooks, the others consist of intermediate data and different notebooks for preprocessing and modelling. Cookiecutter seems to be a good solution and framework: https://drivendata.github.io/cookiecutter-data-science/ Refactoring those projects will take some effort, but I believe it will be well worth the time to do so.
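
To give a rough idea of what such a refactor might aim for, here is a minimal sketch that scaffolds a project skeleton using only the Python standard library. It does not call Cookiecutter itself; the directory names and the `scaffold_project` helper are assumptions for illustration, loosely following the idea of keeping raw data, intermediate data, notebooks, and source code apart.

```python
from pathlib import Path
import tempfile

# Hypothetical layout (an assumption, not the template's exact output):
# raw, intermediate, and processed data live apart from notebooks and code.
PROJECT_DIRS = [
    "data/raw",
    "data/interim",
    "data/processed",
    "notebooks",
    "src",
    "models",
]


def scaffold_project(root: Path) -> list[Path]:
    """Create the project skeleton under root and return the created directories."""
    created = []
    for rel in PROJECT_DIRS:
        path = root / rel
        # parents=True builds nested folders; exist_ok=True makes reruns safe
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_project(Path(tmp) / "my-project"):
            print(path.relative_to(tmp))
```

In practice the Cookiecutter template would generate the structure, so a helper like this is mainly useful for seeing which folders an existing notebook-only project would need before moving files over.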

January 18, 2020 · 1 min · Shen Ting