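Every day we recommend one high-quality open source project on GitHub and one selected English-language technology or programming article. Follow Open Source Daily to keep up. QQ group: 202790710; Telegram group: https://t.me/OpeningSourceOrg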

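Why we recommend it: Uppy is a sleek, modular file uploader that works with almost any browser. It is fast and easy to use, so you can spend your attention on things that matter more than a file upload widget.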

How to use Uppy?

How Uppy handles browser crashes:

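People transfer files over the network every day, and transfers that fail partway through are commonplace, so limiting what the user loses when something goes wrong has become an important problem. Uppy addresses it with Golden Retriever. Golden Retriever saves the files you have selected inside your browser's storage, so that after a browser crash Uppy can find everything again and continue uploading. If, however, the browser's data is cleared after the crash and before anything has been restored, there is nothing left to recover.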

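Golden Retriever combines three storage mechanisms to preserve data:

Local Storage holds the state and metadata of every file and restores them when the browser starts, but the file contents themselves cannot be accessed through it
A Service Worker keeps references to the files once they have been turned into BLOBs (binary large objects, containers that can hold binary files), although these references are lost after a browser crash
IndexedDB can store BLOBs persistently, but is subject to a number of limits on how much it can hold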

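So when Uppy starts, the saved information is restored from Local Storage, which tells Uppy what happened. The BLOBs are then recovered from the Service Worker and IndexedDB. Once a file has been uploaded successfully, or the user decides to remove it, Golden Retriever cleans it up automatically.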
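The recovery flow described above can be modelled in a few lines. This is only a toy sketch in Python, not Uppy's code: the class, its method names and the 10-byte size cap are hypothetical, and plain dictionaries stand in for Local Storage, the Service Worker and IndexedDB.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserStores:
    """Toy stand-ins for the three browser stores (all names are hypothetical)."""
    local_storage: dict = field(default_factory=dict)   # file id -> metadata
    service_worker: dict = field(default_factory=dict)  # file id -> bytes, memory only
    indexed_db: dict = field(default_factory=dict)      # file id -> bytes, persistent
    indexed_db_limit: int = 10                          # assumed per-file cap, in bytes

    def save(self, file_id: str, name: str, data: bytes) -> None:
        self.local_storage[file_id] = {"name": name, "size": len(data)}
        self.service_worker[file_id] = data
        if len(data) <= self.indexed_db_limit:
            self.indexed_db[file_id] = data

    def crash(self) -> None:
        # A browser crash wipes whatever only lived in memory.
        self.service_worker.clear()

    def restore(self) -> dict:
        """Return file id -> (metadata, data or None) for every remembered file."""
        restored = {}
        for file_id, meta in self.local_storage.items():
            data = self.service_worker.get(file_id)
            if data is None:
                data = self.indexed_db.get(file_id)
            restored[file_id] = (meta, data)
        return restored

    def cleanup(self, file_id: str) -> None:
        # Called after a successful upload or when the user removes the file.
        for store in (self.local_storage, self.service_worker, self.indexed_db):
            store.pop(file_id, None)


stores = BrowserStores()
stores.save("a", "small.txt", b"hello")
stores.save("b", "huge.bin", b"x" * 50)  # too large for the persistent store
stores.crash()
recovered = stores.restore()
```

After the simulated crash the small file comes back with its contents, while the oversized one comes back with metadata only, which mirrors the case where the user has to add the file again.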

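There are still cases where Golden Retriever cannot rescue lost progress, for example when a very large file is interrupted mid-upload or the browser crashes completely. Even then it keeps some key information about the file, such as its name and preview, so that the user can add it again.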

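About tus:

tus is a protocol for resumable file uploads, with client and server implementations for all major platforms. When an upload in progress is cut off, the server keeps the data it has already received (provided no internal error occurred and storing the data would not violate its rules). To resume, the client sends a request to the upload's URL to learn how much data the server already holds, and then continues uploading from that point. If the upload is still incomplete after a longer period (a week, for example), the server deletes the unfinished upload.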
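The resume flow can be sketched as a small in-memory model. In the real tus protocol the client asks for the stored offset with a HEAD request and sends the remaining bytes with PATCH; the class and function names below are hypothetical stand-ins rather than the API of any tus library, and the one-week expiry is simply the example figure from the text.

```python
import time


class ToyUploadServer:
    """In-memory stand-in for a resumable-upload server (hypothetical, not a tus library)."""

    def __init__(self, expiry_seconds: float = 7 * 24 * 3600):
        self.uploads = {}  # url -> {"data": bytearray, "length": int, "touched": float}
        self.expiry_seconds = expiry_seconds

    def create(self, url: str, length: int) -> None:
        self.uploads[url] = {"data": bytearray(), "length": length, "touched": time.time()}

    def offset(self, url: str) -> int:
        # Plays the role of the offset query: how many bytes are stored so far.
        return len(self.uploads[url]["data"])

    def append(self, url: str, offset: int, chunk: bytes) -> int:
        upload = self.uploads[url]
        if offset != len(upload["data"]):
            raise ValueError("offset mismatch")
        if offset + len(chunk) > upload["length"]:
            raise ValueError("chunk exceeds declared length")
        upload["data"] += chunk
        upload["touched"] = time.time()
        return len(upload["data"])

    def purge_expired(self, now: float) -> None:
        # Drop unfinished uploads that have been idle for longer than the expiry.
        stale = [url for url, up in self.uploads.items()
                 if len(up["data"]) < up["length"]
                 and now - up["touched"] > self.expiry_seconds]
        for url in stale:
            del self.uploads[url]


def upload(server, url, payload, chunk_size=4, fail_after=None):
    """Send payload from the server's current offset; optionally stop early to mimic an interruption."""
    sent = 0
    offset = server.offset(url)
    while offset < len(payload):
        if fail_after is not None and sent >= fail_after:
            return offset  # the connection "drops" here
        offset = server.append(url, offset, payload[offset:offset + chunk_size])
        sent += 1
    return offset


payload = b"resumable upload"
server = ToyUploadServer()
server.create("/files/1", len(payload))
interrupted_at = upload(server, "/files/1", payload, fail_after=2)  # stops after two chunks
finished_at = upload(server, "/files/1", payload)                   # resumes from the stored offset
```

The second call never resends the first two chunks: it starts from whatever offset the server reports, which is the whole point of a resumable protocol.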

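About the author:

Artur Paikin: he enjoys travelling, growing vegetables, making coffee, writing code and building websites, and is keen on home automation. He posts photos on Instagram and writes stories on Facebook and Twitter.

Personal homepage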

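Today's recommended English article: "How to Choose the Right Open Source Database" by Vivek Ratan

Original article: https://opensourceforu.com/2018/04/how-to-choose-the-right-open-source-database/

Why we recommend it: Whatever kind of development you do, you can hardly avoid databases. How do you pick a suitable open source database? How do the different databases compare? Along which dimensions should a database product be judged?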

How to Choose the Right Open Source Database

Databases store data in an organised manner so that it is easy to retrieve and to manage. There are many factors to consider, however, before choosing a database for a particular software application. Let's take a look at how we can make the right choice.

Given the number of users working with online applications nowadays, the database is among the most important parts of any software application and can have a huge impact on its performance. A database that carries a very large volume of data, or that has to operate on a wide variety of data sets, becomes slow and difficult to run. Other factors, such as the security of the data and the cost (if any) of the database, also influence the choice. A free database helps users avoid large costs, which is why open source databases play an important role in many widely used software applications across the globe.

Different kinds of databases available in the market

Different users around the world rely on different types of database, depending on their requirements. The following is a broad classification of databases.

1. Relational databases: Relational databases are the most common type. The data is stored in tables; each table has a unique key field, and that key is used to connect the table to other tables, so the tables are related to one another through their key fields. Such databases are widely used in industries like media and telecom, and this is probably the type of database you are most likely to come across.

2. Operational databases: An operational database is very important for organisations, as it supports the customer database and the inventory database. It helps companies keep track of inventories and store details of the customers who buy their products. The data stored in operational databases can be analysed and used according to the requirements of the company.

Figure 1: Interaction of a user with a DBMS

3. Database warehouses: There might be a requirement for different organisations to keep some of the relevant data for several years. Such data sets act as significant sources of information to compare and analyse the present year's data with that of previous years, which makes it quite easy to determine key trends. All such data from previous years is maintained and stored in a large database warehouse. Since the stored data set has already gone through various kinds of editing, screening and integration, it no longer requires any editing or alteration.

4. Distributed databases: Many organisations have several regional offices, manufacturing plants, branch offices and a head office. Each such work group may have its own set of databases, which collectively form the main database of the company. Such a system of databases is known as a distributed database.

5. End user databases: A variety of data sits on the workstations of an organisation's end users. Each workstation acts like a small database in itself, holding data in the form of spreadsheets, presentations, downloaded files, Word documents and Notepad text files. All such small databases together form an end user database.
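To make the relational model from item 1 concrete, here is a minimal sketch of two tables connected through a key field. It uses SQLite through Python's standard library purely for convenience; the table names, column names and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Each table has a unique key; orders.customer_id points at customers.id.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " item TEXT NOT NULL)"
)
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ben")])
conn.executemany("INSERT INTO orders (customer_id, item) VALUES (?, ?)",
                 [(1, "router"), (1, "cable"), (2, "modem")])

# The key field is what lets a query relate rows from the two tables.
rows = conn.execute(
    "SELECT c.name, o.item FROM orders AS o"
    " JOIN customers AS c ON c.id = o.customer_id"
    " ORDER BY c.name, o.item"
).fetchall()
```

The join returns one row per order with the customer's name attached, even though the name is stored only once in the customers table.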

Now let's have a look at some of the popular open source databases available in the market.

MySQL

MySQL has been in use since 1995 and is currently owned by Oracle. In addition to its open source version, there are paid editions that offer some additional features, such as cluster geo-replication and automatic scaling. MySQL has become an industry standard: it runs on almost every operating system and is written in C and C++. It also suits international users, since the server can return error messages to clients in a number of languages.

Pros:

Offers a flexible privilege and password system
Can be used even if there is no available network
Uses host-based verification
Is also available as libraries that can be embedded into standalone applications
Encrypts all password traffic
The server can run as a separate program in a client/server networked environment

Cons:

Some users feel that MySQL no longer counts as free and open source software
It is no longer community driven, so bug fixes and patches are slow to arrive
Falls behind comparable alternatives because of its slow updates

PostgreSQL

PostgreSQL, also called Postgres, is an object-relational DBMS that emphasises extensibility and standards compliance. As a database server, its main job is to store data securely and return it in response to requests from other software applications. It can handle workloads ranging from small single-machine applications to large Internet-facing applications with many concurrent users. It is developed by the PostgreSQL Global Development Group and is available as free and open source software.

Pros:

It is transactional and ACID (atomicity, consistency, isolation, durability)-compliant
Supports updatable views and materialised views
Supports functions, stored procedures and triggers
Handles concurrency through a system known as MVCC (multi-version concurrency control)
Provides three levels of transaction isolation, which are: Read Committed, Repeatable Read and Serialisable
Supports serialisability using the serialisable snapshot isolation technique
Supports a large variety of data types like Boolean, binary, etc

Cons:

It has no bug tracker (bug reports go through a submission form that feeds into the pgsql-bugs mailing list), which makes it somewhat difficult to follow the status of bugs
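The atomicity and consistency parts of ACID, listed among the pros above, can be shown with a short sketch. PostgreSQL needs a running server, so this example swaps in SQLite through Python's standard library as a stand-in; the accounts table, the names and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 20)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)       # fits within alice's balance
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

In the failed transfer the credit to bob has already been executed when the debit violates the CHECK constraint, yet the rollback undoes it, so no money appears from nowhere.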

Figure 2: Global database markets at a glance

MariaDB

MariaDB is widely used by large organisations such as Wikipedia, Facebook and even Google. It was created by developers who had previously worked on MySQL, and it is a database server designed as a drop-in replacement for MySQL. Data security is one of the top priorities for MariaDB's developers: in almost every release they merge in all of MySQL's security patches and enhance them where required.

Pros:

It provides real-time access to the data sets
It supports most of MySQL's core functionality (it is an alternative to MySQL)
It supports high scalability with easier integration
It provides a couple of alternate patches, storage engines and server optimisations

Cons:

It does not provide support for the memcached interface
The password complexity plugin is not available
Has no optimiser trace

Choosing the best available open source database

1. Size of data: When choosing a database, we should always consider the volume of critical application data that we need to store in it and retrieve from it. How much data can be stored and retrieved depends on the chosen data structure and on the database's ability to partition data sets across multiple servers and file systems. We therefore need to consider both the volume of data the application generates at a given rate and the size of the data that must be retrieved from the database.

2. Speed and scalability: We must also gauge the speed we need for reading data from and writing data to the database, that is, the time taken to service all the incoming reads and writes of the application. Some databases are optimised for read-heavy applications, whereas others are designed for write-heavy ones. Selecting a database that can handle our application's input/output needs goes a long way towards a scalable architecture.

3. Structure of data: The structure of the data determines how we need to store and retrieve it. Since an application deals with data in diverse formats, we should pick the right data structure for storing and retrieving the data sets before selecting a database. If we choose the wrong structure for persisting the data, the application may take longer to retrieve it, and more development effort may be needed to work around the resulting data issues.

4. Accessibility of data: We should also think about the number of users that may access the database concurrently to perform operations on the data, and about the level of computation involved in accessing any set of data. The processing speed of an application suffers if the chosen database cannot handle large loads.

5. Safety and security of data: We must also check the level of security that a database provides for the data stored in it. For highly confidential data, the application needs a highly secure database. The safety measures a database implements for system crashes or failures are another important factor to look at before choosing it.
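The first two criteria above lend themselves to a quick back-of-the-envelope calculation before any product is shortlisted. The sketch below is one possible way to do that arithmetic; the function names and the workload figures (200 writes and 5,000 reads per second, 512-byte records, one year of retention) are made up for illustration.

```python
def estimate_storage_bytes(writes_per_second: float, avg_record_bytes: int,
                           retention_days: int, overhead_factor: float = 1.0) -> int:
    """Rough stored volume: write rate x record size x retention, scaled by an overhead factor."""
    seconds = retention_days * 24 * 60 * 60
    return int(writes_per_second * avg_record_bytes * seconds * overhead_factor)


def read_write_ratio(reads_per_second: float, writes_per_second: float) -> float:
    """Reads served per write; values well above 1 suggest a read-heavy workload."""
    return reads_per_second / writes_per_second


# Hypothetical workload: 200 writes/s of 512-byte records kept for 365 days.
raw_bytes = estimate_storage_bytes(200, 512, 365)
raw_gib = raw_bytes / 2**30
ratio = read_write_ratio(5000, 200)
```

With these invented numbers the raw volume comes to roughly 3.2 TB a year before indexes or replication, and the workload sees 25 reads for every write, which would point towards a database tuned for read-heavy use.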


19 April 2018: Open Source Daily, Issue 42
