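Open Source Daily recommends one high-quality open source project on GitHub and one selected English-language technology or programming article every day. Keep reading Open Source Daily to maintain a good habit of daily learning.
Today's recommended open source project: "USTC USTC-CS-Courses-Resource"
Today's recommended English article: "Dev, Ops, and Determinism"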

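Today's recommended open source project: "USTC USTC-CS-Courses-Resource". Link: GitHub link
Why we recommend it: a collection of course resources from the University of Science and Technology of China, mostly in computer science and technology. It covers databases, algorithms, computer architecture, and more, so you can pick whatever interests you and study it. If you occasionally want a change of pace from technical material, the humanities and social science or physics courses may suit you.
Today's recommended English article: "Dev, Ops, and Determinism", by J. Paul Reed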
Original article: https://medium.com/@jpaulreed/dev-ops-and-determinism-966a57e3a5cc
Why we recommend it: developers and operations engineers may think quite differently about some things.

Dev, Ops, and Determinism

I've noticed an interesting pattern when discussing incidents with engineers over the years.

One of the topics that invariably comes up is the concept of "root cause," a notion faithful followers of my Twitter stream know that I have at least a few thoughts about. Many organizations base their entire process of understanding incidents on the concept, and many of the techniques they use to facilitate that understanding, such as "The Five Whys," are firmly rooted in this concept of a "linearity of events."

Challenging this idea, and suggesting that in complex systems this linearity is soothingly deceptive (but deceptive nonetheless), always prompts a fascinating discussion, and often results in impassioned arguments that the idea of a root cause is crucial to understanding how incidents unfold.

The interesting pattern I've noticed is the way developers react to this idea versus the way operations engineers react: in my experience, developers tend to argue with more vehemence that root cause matters and that cause and effect can be concretely established. Operations engineers, on the other hand, tend to nod and engage with the idea that linear narratives of the complex world may be deceptive.

I've always wondered why this is: what is it about developers and their experience that tends to make them react to the idea that "root cause is a myth" like an immune system seeking out a foreign agent, while operations engineers tend to at least entertain the idea?

I'm not entirely sure, but I do have an idea, and it has to do with the different contexts in which the two roles go about their daily work.

Developers work with tools that tend to be deterministic. Compilers, linkers, and operating systems are complex beasts, certainly, but if we give them the same inputs, we generally expect the same outputs. And if there is a problem with that output (a "bug"), then the way developers go about solving it is to analyze the inputs (either from the user, or to the suite of tools that make up the development process), find the "error," and then change the inputs. This will fix the "bug."

How do I fix the bug? A core assumption of software development: the same inputs reliably and deterministically create the same outputs.

In fact, non-determinism itself is considered a bug: if the unexpected or errant output isn't reproducible, then developers tend to extend their investigation into other parts of the stack (operating system, network, etc.) that we more or less assume should behave in the same way as long as we can reproduce the inputs… and if they don't, then it's still a bug. It's just an operating system or networking bug.

Either way, determinism is a basic, almost unstated assumption of much of the work developers do.
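
As a rough illustration of that assumption, here is a minimal Python sketch. The names `checksum`, `RoundRobin`, `seeded_pick`, and `is_reproducible` are hypothetical and do not come from the article; they only contrast a function whose output depends solely on its arguments with one that also depends on hidden state, and show how making the hidden input explicit restores reproducibility.

```python
import hashlib
import random


def checksum(data):
    # Pure function: the result depends only on the argument.
    return hashlib.sha256(data).hexdigest()


class RoundRobin:
    # Hypothetical backend picker with hidden state (a call counter).
    def __init__(self, backends):
        self._backends = list(backends)
        self._calls = 0

    def pick(self):
        choice = self._backends[self._calls % len(self._backends)]
        self._calls += 1
        return choice


def seeded_pick(backends, seed):
    # A fresh generator per call makes the seed an explicit input.
    return random.Random(seed).choice(backends)


def is_reproducible(fn, runs=10):
    # Call fn() repeatedly; True only if every call returned the same value.
    return len({fn() for _ in range(runs)}) == 1


if __name__ == "__main__":
    print(is_reproducible(lambda: checksum(b"same input")))          # True
    print(is_reproducible(RoundRobin(["a", "b"]).pick))              # False
    print(is_reproducible(lambda: seeded_pick(["a", "b", "c"], 7)))  # True
```

The second case only looks non-deterministic because the counter is not treated as an input; once every input is pinned down, as in the third case, the same-inputs-same-outputs expectation holds again.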

But for any operations engineer who's spent time racking and stacking hardware in a data center or arguing with a cloud API, this idea of a fully deterministic world (as long as we can map out all the inputs!) is a fleeting notion at best. Venerable Bastard Operator From Hell jokes about sunspots aside, seasoned operations engineers have seen all sorts of weirdness in the physical world and know that even a noisy neighbor can ruin your day.

The Operational reality (complete with tissues… and salt?)

So, poking holes in the notion that there exists a root cause of our incidents, and that tools like "The Five Whys" will faithfully (and repeatably!) lead us to that singular root cause, isn't that far of a leap for operations engineers to make. In effect, it challenges an idea that, when many operations engineers look back on their own careers, often never really matched their experience anyway. So the reaction is different.

I do not, of course, mean to imply that developers' reactions are silly or stupid, or that they are incapable of understanding how linearity may be deceptive. Seasoned developers are likely to have seen their fair share of non-determinism in the world.

But I do think the reaction I tend to get from developers in these discussions has to do with the fact that the concept of determinism generally serves them well in the day-to-day execution of their work. And their run-ins with non-determinism aren't as frequent as operations engineers' fights with Schrödinger's cat pawing at their infrastructure.

Ultimately, whether or not this fully explains the reactions I see, it is a potent reminder that the substance of our reactions is a complex amalgam of not only the topic at hand, but numerous other factors too.

And this is important to remember, whether we're debriefing a single incident, collaborating across a software delivery pipeline, or making sense of our broader world.