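Open Source Daily recommends one high-quality GitHub open source project and one selected English-language technology or programming article every day. Follow us to keep up. QQ group: 202790710; Weibo: https://weibo.com/openingsource; Telegram group: https://t.me/OpeningSourceOrg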

December 1, 2018: Open Source Daily, Issue 268

Today's recommended open source project: "Blue Cat's 500 Mischievous Questions: DeepLearning-500-questions". Link: GitHub link

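Why we recommend it: Do you have questions about parts of deep learning you don't understand? This project introduces deep learning in the form of questions. By answering a wide range of questions, it walks readers through the field, from the simplest fundamentals to the currently popular NLP (though that part is still being completed). While looking for a related question, you may well be surprised: "This is exactly the question I wanted to ask!"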


Today's recommended English article: "Understand your program's memory" by Tiago Antunes

Original link: https://medium.com/@tm.antunes/understand-your-programs-memory-92431fa8c6b

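Why we recommend it: Understand your program's memory. If your first programming language was C or C++, memory problems have probably bitten you more than once. This article explains how an application allocates memory, focusing on the stack and the heap, and may reduce the chances of running into these problems again.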

Understand your program’s memory

When coding in a language like C or C++, where you can interact with memory at a lower level, you sometimes run into problems you never had before: segfaults. These errors are annoying, can cause a lot of trouble, and often indicate that you're using memory you shouldn't be using.

One of the most common problems is accessing memory that has already been free’d — memory that you’ve either released with free or memory that your program has automatically released (from the stack for example).

Understanding all this is not hard, and it will definitely help you program better and in a smarter way.

How memory is divided

[Figure: memory layout diagram (image not available). Caption: High stands for high addresses]

Memory is divided into multiple segments. Two of the most important ones (for this post) are the stack and the heap. The stack is an ordered place where data is inserted and removed in sequence, while the heap has no such order: memory is allocated wherever a free block is available.

Stack memory is worked on through a fixed set of operations (it's where some of your processor's register information gets saved), and it's where the relevant information about your running program goes: which functions were called, which variables you created, and some more information. This memory is also managed by the program and NOT by the developer.

The heap is often used to allocate big amounts of memory that is supposed to exist as long as the developer wants. That said, it’s the developer’s job to control the usage of the memory on the heap. When building complex programs, you often need to allocate big chunks of memory, and that’s where you use the heap. We call this Dynamic Memory.

You're placing things on the heap every time you use malloc to allocate memory for something. A local variable declared inside a function, like int i;, uses stack memory. Knowing this is really important so that you can easily find errors in your program and further improve your search for the cause of a segfault!

Understanding the stack

Although you may not be aware of it, your program is constantly allocating stack memory in order to work. Every local variable and every function call goes there. Because of this, a lot of things can happen, most of them unwanted: buffer overflows, accesses to incorrect memory, and so on.

So how does it really work?
The stack is a LIFO (Last-In-First-Out) data structure, which you can imagine as a box of perfectly fitted books: the last book you place is the first one you take out. By using this structure, the program can easily manage all its operations and scopes with 2 simple operations: push and pop. These 2 do exactly the opposite of each other: push inserts a value at the top of the stack, while pop takes the top value from it.

[Figure: Push and Pop operations (image not available)]

To keep track of the current position in memory, there is a special processor register called the Stack Pointer. Every time you need to save something (a variable or the return address of a function), the program pushes it and moves the stack pointer up. Every time you exit from a function, it pops everything from the stack pointer back to that function's saved return address. It's simple!

To test if you understood, let’s use the following example (try and find the bug alone ☺️):

[Figure: code listing (image not available). Caption: Everything looks ok, until you run it]

If you run it, the program will simply segfault. Why does this happen? Everything looks in place! Except about… the stack.

When we call the function createArray, the stack saves the return address and the function creates arr in stack memory, because we didn't use malloc. It then returns arr, which at that point is just a pointer to the memory location holding the array's data. After the function returns, the program pops that data from the stack, since we have no control over stack operations, and reuses the space as it needs. When we then try to fill in the array, we write to memory that no longer belongs to it and corrupt it, making the program segfault.

Understanding the heap

In contrast to the stack, the heap is what you use when you want something to exist for some time independently of functions and scopes. To use this memory, the C standard library (stdlib.h) provides two very useful functions: malloc and free.

Malloc (memory allocation) asks the system for the requested amount of memory and returns a pointer to its starting address (or NULL if the request fails). Free tells the system that the memory we asked for is no longer needed and can be used for other tasks. It looks really simple, as long as you avoid mistakes.

Since the system can't reclaim what developers asked for, it is up to us, humans, to manage it with the 2 functions above. This opens the door to one human error: memory leaks.

A memory leak is memory that was requested by the program but never free'd, either because free was never called before the program ended or because the pointers to it were lost. It makes the program use much more memory than it was supposed to. To avoid this, every time we no longer need a heap-allocated element, we free it.

[Figure: code listing (image not available). Caption: Pointers: bad vs good]

In the picture above, the bad way never frees the memory we used. This ends up wasting 20 * 4 bytes (the usual size of an int on 64-bit systems) = 80 bytes. This might not look like much, but imagine never freeing memory in a giant program. We can end up wasting gigabytes!

Managing your heap memory is essential to making your programs memory efficient. But you also need to be careful about how you use it. Just like with stack memory, accessing or using heap memory after it has been free'd might cause a segfault.

Bonus: Structs and the heap

One of the common mistakes when using structs is to free only the struct itself. That is fine as long as we didn't allocate memory for pointers inside the struct. If we did, we need to free those first and only then free the entire struct.

[Figure: code listing (image not available). Caption: Look at how I used free]

How I solve my memory leak problems

Most of the time when I program in C I'm using structs, so I always have 2 mandatory functions for my structs: the constructor and the destructor. These 2 functions are the only places where I use malloc and free on the struct, which makes it really simple to solve my memory leaks (if you would like to know more about making code easier to read, check my post on abstraction).

[Figure: code listing (image not available). Caption: A way to create, and a way to destroy!]

Don’t forget to follow me!

Besides posting here on Medium, I’m also on Twitter.

If you have any questions or suggestions, don’t hesitate to contact me!

