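Open Source Daily recommends one high-quality open-source GitHub project and one selected English-language article on technology or programming every day. Keep reading Open Source Daily and build the habit of learning something every day.
Today’s recommended open-source project: “PySnooper: No Need to Debug with print Anymore”
Today’s recommended English article: “Tips for Effective Data Visualization”
Open Source Daily, Issue 407: “PySnooper: No Need to Debug with print Anymore”
Today’s recommended open-source project: “PySnooper: No Need to Debug with print Anymore” Link: GitHub link
Why we recommend it: When it comes to debugging, a debugger is a good choice: breakpoints and variable watches both help in tracking down problems. For small experimental projects, though, setting up a dedicated debugger can feel like overkill, so some people scatter print calls through the code as a makeshift debugger. With this project you no longer need line after line of print(1) or print of some variable. Run the code once and read its output log: the variable changes and the current execution position are all laid out, which is far more useful than print.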
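To make the idea concrete, here is a rough standard-library sketch of what such a log contains. It is not PySnooper's implementation, and the helper name `trace_locals` is made up for this illustration. PySnooper itself is applied as a decorator, `@pysnooper.snoop()`, on the function you want to watch. The sketch relies on Python's `sys.settrace` hook and records a line number and value each time a local variable changes.

```python
import sys


def trace_locals(func, *args, **kwargs):
    """Run func and record (line number, name, repr(value)) whenever a local changes.

    Hypothetical helper for illustration only; not part of PySnooper.
    """
    events = []
    last_seen = {}
    target_code = func.__code__

    def tracer(frame, event, arg):
        # Only follow the function we were asked to watch.
        if frame.f_code is not target_code:
            return None
        if event in ("line", "return"):
            for name, value in frame.f_locals.items():
                snapshot = repr(value)
                if last_seen.get(name) != snapshot:
                    last_seen[name] = snapshot
                    events.append((frame.f_lineno, name, snapshot))
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        # Always restore whatever tracer was active before.
        sys.settrace(previous)
    return result, events


def sum_to(n):
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


result, events = trace_locals(sum_to, 3)
for lineno, name, value in events:
    print(f"line {lineno}: {name} = {value}")
```

The printed log shows each variable as it takes on a new value, which is the information people usually try to get from hand-written print calls.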
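Today’s recommended English article: “Tips for Effective Data Visualization” by Suraj Thatte
Original link: https://medium.com/@sursthatte/tips-for-effective-data-visualization-d4b2af91db37
Why we recommend it: how to improve the viewing experience in data visualization.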

Tips for Effective Data Visualization

Data visualization has a strong design element to it. Given the differences in domains, applications and audiences, it’s hard to put a structure around the best way to visualize your data. However, there definitely are wrong ways of doing it! I have come across a multitude of such instances, which were the driving force behind this post.

In this article I’d like to share some useful tips to help prevent blunders in your visuals.

1. Choose the right visual

This one may seem too obvious! But I have seen several people trying to demonstrate their artistic side for no reason.
Always remember that “form follows function”: the purpose of a visual should be the starting point of its design.
Ask yourself: are you trying to compare values, show trends, explore a distribution or show a relationship between variables? Then choose the appropriate visual(s) for the message you are trying to convey.
Consider the following charts. The underlying dataset contains product complaints/defects. We are trying to show which products need attention based on their defect rate in ppm (parts per million). A bar chart is a simple yet effective way to display this data. One of the downsides of tree maps and packed bubble charts is that they ask the reader to compare areas instead of heights, which is visually taxing.
Choosing the right visual based on its function
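As a small illustration of preparing data for such a bar chart, the sketch below computes a defect rate in ppm and ranks products from worst to best. The product names and counts are invented for the example, and the helper name `defect_rate_ppm` is an assumption, not something from the original dataset.

```python
def defect_rate_ppm(defects, units):
    """Defects per million units produced."""
    return defects * 1_000_000 / units


# Hypothetical (defects, units produced) per product.
counts = {
    "Product A": (12, 40_000),
    "Product B": (3, 60_000),
    "Product C": (45, 50_000),
}

rates = {name: defect_rate_ppm(d, u) for name, (d, u) in counts.items()}

# Sort descending so the bar that needs the most attention comes first.
ranked = sorted(rates.items(), key=lambda item: item[1], reverse=True)

for name, ppm in ranked:
    print(f"{name}: {ppm:.0f} ppm")
```

Sorting before plotting lets the reader compare bar heights in order, which is the comparison a bar chart makes easy.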

2. Trivial are many but vital are few (data points)

Do not merely slap a visual on the dataset. Analyze and convert your data into an information ‘nugget’ that the audience can grasp.
The figure alongside shows a time series chart of a process variable. Line A-A’ marks the point in time when both the period and the amplitude changed. This was the onset of an issue that ultimately led to the sudden dip in the variable’s value a few cycles later. Which chart does a better job of uncovering this insight?

The chart on the top simply converts the data into a visual whereas the one at the bottom “conditions” it to provide an insight.

Let’s say that we want to show total annual spend by project ID. There are 41 unique categories. The chart on the left looks crowded with all the categories crammed into it. A better way to display this is to show the top 5 categories and combine the rest into an ‘Others’ bucket.
Suppressing the “noise” in the data
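The top-5-plus-Others grouping can be sketched roughly as follows. The project IDs and spend figures are generated placeholders, and the function name `top_n_with_others` is hypothetical.

```python
def top_n_with_others(totals, n=5, others_label="Others"):
    """Keep the n largest categories and sum the remainder into one bucket."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    top, rest = ranked[:n], ranked[n:]
    if rest:
        top.append((others_label, sum(value for _, value in rest)))
    return top


# Hypothetical annual spend for 41 project IDs (P01 .. P41).
spend = {f"P{i:02d}": 1000 * i for i in range(1, 42)}

grouped = top_n_with_others(spend, n=5)
for label, value in grouped:
    print(f"{label}: {value}")
```

Because the remainder is summed rather than dropped, the grouped values still add up to the original total.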

Also, beware that filtering out the ‘Others’ category may exaggerate the areas in a pie chart or change the percent-of-total calculations. This can be misleading! When displaying percentages, always make sure that they add up to 100%, or explain what is being excluded and why.
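A quick numeric check shows how the shares shift once a category is filtered out. The figures below are made up, and `percent_shares` is a hypothetical helper.

```python
def percent_shares(items):
    """Return each label's share of the total, in percent."""
    total = sum(value for _, value in items)
    return [(label, 100 * value / total) for label, value in items]


# Hypothetical spend figures, in arbitrary units.
with_others = [("A", 50), ("B", 30), ("Others", 20)]
without_others = [item for item in with_others if item[0] != "Others"]

shares_full = percent_shares(with_others)
shares_filtered = percent_shares(without_others)

for label, pct in shares_full:
    print(f"{label}: {pct:.1f}% of all spend")
for label, pct in shares_filtered:
    print(f"{label}: {pct:.1f}% of displayed categories only")
```

With these numbers, category A is 50% of all spend but 62.5% of what remains after filtering, so a chart built on the filtered data should say what was left out.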

3. Figures don’t lie, but liars figure!

Visuals should reflect reality and not distort it. Formatting of the chart plays an important role as it sets up a frame of reference for the audience.
In the example below, the yield of a process increased from 56% to 67% over a period of 6 months. The chart on the left overstates the improvement by starting the y-axis at 50%. The latter visual paints an accurate picture: its y-axis starts at 0 and it also includes a goal line.
Figures don’t lie, but liars figure!
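The size of the distortion can be estimated with simple arithmetic. The sketch below compares how tall the 67% bar is drawn relative to the 56% bar for two choices of y-axis minimum. The function name `apparent_growth` is hypothetical, and the calculation assumes bar heights are drawn proportionally from the axis minimum.

```python
def apparent_growth(start, end, axis_min):
    """Ratio of the end bar's drawn height to the start bar's drawn height."""
    if start <= axis_min:
        raise ValueError("start value must lie above the axis minimum")
    return (end - axis_min) / (start - axis_min)


# Yield values from the example: 56% rising to 67%.
true_ratio = apparent_growth(56, 67, axis_min=0)
truncated_ratio = apparent_growth(56, 67, axis_min=50)

print(f"y-axis from 0:  end bar is {true_ratio:.2f}x the start bar")
print(f"y-axis from 50: end bar is {truncated_ratio:.2f}x the start bar")
```

With the axis starting at 0 the second bar is about 1.2 times the first, which matches the data. With the axis starting at 50 it is drawn about 2.8 times as tall, even though the underlying yield values are identical.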

4. Use color wisely

Use color to add information or to highlight key data points in a visual. In all other cases, it is redundant and distracting.
This article by Lisa Charlotte Rost has several pointers to consider when choosing a color scheme for your visuals. I would also recommend trying the Viz Palette tool developed by Susie Lu and Elijah Meeks.

5. How important are aesthetics as compared to functionality?

Visualization tools available today allow us to create stunning, rich visuals in a few mouse clicks. However, overdoing the aesthetic elements may distract people from the key message of the visual. One of the seven wastes in the Lean philosophy is ‘over-processing’.
Download the Open Source Daily app: https://openingsource.org/2579/
Join us: https://openingsource.org/about/join/
Follow us: https://openingsource.org/about/love/