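Every day we recommend one quality open-source project on GitHub and one selected English-language tech or programming article in its original form. Follow Open Source Daily. QQ group: 202790710; Weibo: https://weibo.com/openingsource; Telegram group: https://t.me/OpeningSourceOrg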

July 8, 2018: Open Source Daily, Issue 122

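Today's recommended open-source project: "pell, a tiny editor" (GitHub link)

Why we recommend it: It looks like an ordinary editor... but it is tiny. It is less than 5 KB in size, yet it is still fully featured: bold, underline, paragraphs, code blocks, and inserting images and links are all there. The one catch is that you can only insert images by URL. That hardly matters, though; this size is enough to make it stand out among the many editors out there.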


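Today's recommended English article: "Don’t learn Machine Learning in 24 hours" by Rwiddhi Chakraborty

Original link: https://towardsdatascience.com/dont-learn-machine-learning-in-24-hours-3ea3624f9881

Why we recommend it: Learning is a long process. The goal is to know not only that something works but also why it works, rather than to rely on 7 lines of code or some other shortcut. Of course, if you just need to reach your goal as fast as possible and have no need for anything afterwards, that is a different matter.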

Don’t learn Machine Learning in 24 hours

Recently, I came across a wonderful article by Peter Norvig — “Teach yourself programming in 10 years”.

This is a witty and slightly satirical headline, taking a dig at all those coffee table programming books that aim to teach you programming in 24 hours, 7 days, 10 days, *insert a ridiculously short timeline*.

Dr. Norvig makes quite a strong case. Yes, you may come to grips with the syntax, nature, and style of a programming language in 24 hours, but that doesn’t mean you’ve become adept at the art of programming. Because programming isn’t about a language at all. Programming is about intelligent design, a rigorous analysis of time and space complexity, understanding when a certain language works over another, and so much more.

Of course you could write a Hello World program in C++ in 24 hours, or a program to find the area of a circle in 24 hours, but that’s not the point. Do you grasp object oriented programming as a paradigm? Do you understand the use cases of namespaces and templates? Do you know your way around the famed STL? If you do, you certainly didn’t learn all this in a week, or even a month. It took you a considerable amount of time. And the more you learned, the more you realised that the abyss is deeper than it looks from the cliff.

I’ve found a similar situation in the current atmosphere surrounding Machine Learning, Deep Learning, and Artificial Intelligence as a whole. Feeding the hype, thousands of blogs, articles, and courses have popped up everywhere. Thousands of them have the same kind of headlines: “Machine Learning in 7 lines of code”, “Machine Learning in 10 days”, etc. This has, in turn, led people on Quora to ask questions like “How do I learn Machine Learning in 30 days?”. The short answer is, “You can’t. No one can. And no expert (or even anyone comfortable with its ins and outs) did.”


Even if we were to forget the 10,000 hours rule for a second, you can’t do machine learning in 7 lines of code.

Why? Because those 7 lines of code do not explain how you fared in the bias-variance tradeoff, what your accuracy value means, whether accuracy is an appropriate performance metric in the first place, whether your model overfits, how your data is distributed, or whether you’ve chosen the right model for the data you have. And there’s so much more to it even after you’ve answered these questions.
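To make the point about accuracy concrete, here is a minimal sketch in plain Python. The data is a made-up toy set (95 negative labels, 5 positive), and the "model" is a hypothetical one that always predicts the majority class; neither comes from a real project. It only illustrates why a high accuracy value can say very little on its own.

```python
# Hypothetical toy data: 95 negative labels and 5 positive ones.
labels = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
predictions = [0] * len(labels)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of the truly positive examples that the model found."""
    found = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return found / total if total else 0.0


acc = accuracy(labels, predictions)
rec = recall(labels, predictions)
print(f"accuracy = {acc:.2f}, recall = {rec:.2f}")
# prints: accuracy = 0.95, recall = 0.00
```

The model scores 95% accuracy while finding none of the positive cases. Whether that is acceptable depends on the problem, which is exactly the kind of question 7 lines of code will not answer for you.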

And since you couldn’t interpret your model, you tweak the parameters in sklearn, get a minimal improvement in accuracy, and go home happy. But did you really learn?


In short, don’t do it in 7 lines of code. Do it over 6 months, a year. You’ll know in the middle of that period whether it interests you. Forget the glamour for now, and really get into the depths of this amazing field of research. You should definitely read this. I found it to be the best introduction for a newbie in this field. You don’t need to know math or code to read it. But after reading this, you will realise the entire gamut of concepts you need to understand in order to be fluent with this field, to think in ML, so to speak.

There are indeed fascinating blogs to follow on this subject. Here are some of my personal favourites:

  1. http://colah.github.io/
  2. http://mark.reid.name/blog/
  3. http://karpathy.github.io/

Medium is also a wonderful place to learn. I follow this publication almost exclusively.

If you’re old school, take Andrew Ng’s CS229 at Stanford. This is more involved than his course on Coursera, which is also a good introduction.

An unfortunate result of hype is that we “drown in information and starve for knowledge”. So many people do it that we frequently lose sight of the bigger picture. Machine Learning is wonderful. It is a serious field of research and development, and is driving so many 21st-century use cases.

Just don’t do it in 24 hours.

