Open Source Daily recommends one quality GitHub open-source project and one hand-picked English tech or programming article every day. Keep reading Open Source Daily and keep up the good habit of daily learning.
Today's recommended open-source project: "To Truly Know, Do It Yourself: simple-computer"
Today's recommended English article: "Getting to Know Natural Language Understanding"

Today's recommended open-source project: "To Truly Know, Do It Yourself: simple-computer". Portal: GitHub link
Why we recommend it: But How Do It Know? is a book that explains how computers work, and this project was created by the project's author to simulate the computer described in the book. Implementing it yourself, from the simple pieces up to the complex whole, is quite a chore, but along the way you learn things that only practice can teach; and once you have new knowledge, testing it in practice is the best way to make it stick. What comes from paper always feels shallow; to truly know a thing, you must do it yourself.
Today's recommended English article: "Getting to Know Natural Language Understanding", by ODSC - Open Data Science
Original link: https://medium.com/@ODSC/getting-to-know-natural-language-understanding-f18a0dc5c97d
Why we recommend it: a brief introduction to natural language understanding.

Getting to Know Natural Language Understanding

We like to imagine talking to computers the way Picard spoke to Data in Star Trek: The Next Generation, but in reality, natural language processing is more than just teaching a computer to understand words. The subtext of how and why we use the words we do is notoriously difficult for computers to comprehend. Instead of Data, we get frustration with our assistants and endless SNL jokes.

Related article: An Introduction to Natural Language Processing (NLP):https://opendatascience.com/an-introduction-to-natural-language-processing-nlp/

The Challenges of AI Language Processing

Natural Language Understanding (NLU) is a subfield of NLP concerned with teaching computers to comprehend the deeper contextual meanings of human communication. It's considered an AI-hard problem for a few notable reasons. Let's take a look at why computers can win chess matches against world champions and calculate billions of bits of data in seconds but can't seem to grasp sarcasm.

Humans Make Mistakes

The first obstacle is teaching a computer to understand despite typos and misspellings. Humans aren't always accurate in what they write, and a simple typo that you could skip right over without missing a beat can be enough to trip up the filters a computer relies on for understanding.
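The typo problem can be sketched in a few lines. The following toy example (hypothetical vocabulary, not from the article) contrasts a naive exact-match filter with fuzzy matching from Python's standard library: the misspelling "wether" defeats the exact lookup but is recovered by similarity matching.

```python
# Contrast an exact keyword filter with typo-tolerant fuzzy matching.
from difflib import get_close_matches

# A hypothetical vocabulary of words the system knows.
vocabulary = ["weather", "schedule", "reminder"]

def exact_match(word):
    # Naive filter: only recognizes perfectly spelled words.
    return word if word in vocabulary else None

def fuzzy_match(word, cutoff=0.8):
    # Tolerates small typos by matching on similarity ratio
    # instead of strict equality.
    matches = get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(exact_match("wether"))   # None: the typo defeats exact lookup
print(fuzzy_match("wether"))   # weather: fuzzy matching recovers it
```

Real systems use far more sophisticated spelling models, but the gap between the two functions is exactly the gap the paragraph above describes.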

Human Speech Requires Context

We mentioned sarcasm above, but understanding the true meaning of utterances requires a strong grasp of context. Sarcastic replies are one problem; another is that not every negative utterance contains an explicitly negative word. Ask "How was lunch?" and receive the reply "I spent the entire time waiting at the doctor": the meaning is clear to you (lunch was bad) but not necessarily to a computer trained to search for negative words such as "no" or "not".
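The lunch example above can be made concrete with a toy word-list classifier (the word list is a hypothetical illustration, not a real sentiment lexicon). The doctor's-office reply contains no explicitly negative word, so the classifier labels it neutral even though a human reads it as negative.

```python
# A toy keyword-based sentiment check, illustrating why searching
# for explicitly negative words misses implicit negativity.
NEGATIVE_WORDS = {"no", "not", "bad", "terrible", "awful", "hate"}

def keyword_sentiment(utterance):
    # Label the utterance "negative" only if it contains a word
    # from the negative-word list; otherwise call it "neutral".
    words = utterance.lower().replace("?", "").split()
    return "negative" if any(w in NEGATIVE_WORDS for w in words) else "neutral"

print(keyword_sentiment("The food was terrible"))
# negative: an explicit negative word is present
print(keyword_sentiment("I spent the entire time waiting at the doctor"))
# neutral: no negative word, though a human reads this as a bad lunch
```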

Human Language is Irregular

Language understanding also requires handling variation within the same language. British English and American English are broadly similar, but differences in spelling and meaning can trip up a computer. And those are just two of the many, many varieties of English, which, for all its irregularity, remains the most parsed language in all of NLP. What about the other languages?
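One common way to cope with such variation is to normalize text to a single dialect before further processing. Below is a minimal sketch using a tiny hand-made British-to-American spelling table (an illustration only; real systems use much larger lexicons and must also handle meaning shifts, not just spelling).

```python
# Normalize British spellings to American ones with a lookup table.
BR_TO_US = {
    "colour": "color",
    "analyse": "analyze",
    "favourite": "favorite",
}

def normalize(text):
    # Map each token through the table, leaving unknown words as-is.
    return " ".join(BR_TO_US.get(w, w) for w in text.split())

print(normalize("my favourite colour"))  # my favorite color
```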

Related article: The Promise of Retrofitting: Building Better Models for Natural Language Processing:https://opendatascience.com/models-for-natural-language-processing/

What Is Natural Language Understanding?

Natural Language Processing is the umbrella term for handling machine/human language interactions, but NLU is narrower than that. When in doubt, use NLU to refer specifically to the act of machines understanding what we say.

NLU is post-processing. Once your algorithms have scrubbed the text and added annotations such as part-of-speech tags, you begin to work with the real context of what's going on. This post-processing is what starts to reveal the true meaning of the text to the computer, not just a surface understanding.
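To give a feel for the kind of annotation mentioned above, here is a deliberately tiny, hypothetical part-of-speech tagger: it looks each word up in a small lexicon so that later stages can reason about grammatical structure rather than raw strings. (Production taggers are statistical or neural; this is only a sketch of the idea.)

```python
# A toy lexicon-based part-of-speech tagger.
LEXICON = {
    "the": "DET",
    "dog": "NOUN",
    "cat": "NOUN",
    "chased": "VERB",
}

def pos_tag(sentence):
    # Tag each word from the lexicon; unknown words get "X".
    return [(w, LEXICON.get(w, "X")) for w in sentence.lower().split()]

print(pos_tag("The dog chased the cat"))
# [('the', 'DET'), ('dog', 'NOUN'), ('chased', 'VERB'), ('the', 'DET'), ('cat', 'NOUN')]
```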

NLU is a hard problem and an ongoing research area because computers that could recognize and process human language with human-like accuracy would open enormous possibilities. Computers could finally stand in for low-paid customer service agents, capable of understanding human speech and its intent.

In language teaching, students often complain that they can understand their teacher's language, but that understanding doesn't transfer when they walk outside the classroom. Computers are similar to these language students. When researchers formulate test texts, for example, they may unconsciously formulate them in ways that avoid the three common problems above, a luxury not afforded in a real-world context. A Twitter user isn't going to scrub tweets of misspellings and ambiguous language before publishing, but that's precisely what the computer must understand.

The subfield relies heavily on both training lexicons and semantic theory. We can quantify semantics to an extent as long as we have large amounts of training data to provide context. As computers consume this training data, deep learning begins to make sense of intent.

The biggest draw of NLU is a computer's ability to interact with humans unsupervised. The algorithms classify speech into a structured ontology, and AI then takes over to organize the intent behind the words. This deep learning approach allows computers to learn context and form rules from ever larger amounts of training input.

What Are The Implications?

Aside from everyone having their very own Data? Cracking Natural Language Understanding is the key to computers learning to understand human language without extraordinary intervention from humans themselves.

NLU can provide predictive insights for businesses by analyzing unstructured data feeds such as news reports. This is especially valuable in areas such as high-frequency trading, where trades are handled by automated systems.

Unlocking NLU would also rocket AI assistants like Siri and Alexa into what finally counts as real human interaction. Siri still makes numerous errors that shows like SNL exploit for humor, and those errors plague developers in search of human-like accuracy. If developers want off the SNL joke circuit, cracking NLU is the key.

Humans are still the reigning champions of language understanding despite its roadblocks (mispronunciations, misspellings, colloquialisms, implicit meaning), but solving the NLU problem could unlock the final door we need for machines to step up to our level.
Download the Open Source Daily app: https://openingsource.org/2579/
Join us: https://openingsource.org/about/join/
Follow us: https://openingsource.org/about/love/