A “lexicalised” parser might do even better. Take the Groucho Marx joke, “One morning I shot an elephant in my pyjamas. How he got in my pyjamas, I’ll never know.” The first sentence is ambiguous (which makes the joke)—grammatically both “I” and “an elephant” can attach to the prepositional phrase “in my pyjamas”. But a lexicalised parser would recognise that “I [verb phrase] in my pyjamas” is far more common than “elephant in my pyjamas”, and so assign that parse a higher probability.
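To see how this might work in code, here is a minimal sketch of a lexicalised attachment scorer. Everything in it (the counts, the function names) is invented for illustration; a real parser would estimate such statistics from a large treebank.

```python
# Toy lexicalised PP-attachment scorer. The counts are invented for
# illustration; a real parser would learn them from annotated text.
from collections import Counter

# How often each (head word, preposition) pair was seen in training data.
attachment_counts = Counter({
    ("shot", "in"): 250,      # verb attachment: "[I] shot ... in [X]"
    ("elephant", "in"): 10,   # noun attachment: "elephant in [X]"
})

def attachment_probability(head, preposition):
    """Probability of attaching the PP to this head, relative to rivals."""
    total = sum(c for (h, p), c in attachment_counts.items() if p == preposition)
    return attachment_counts[(head, preposition)] / total

for head in ("shot", "elephant"):
    print(head, round(attachment_probability(head, "in"), 2))
# shot 0.96      -> the parser prefers "I shot ... in my pyjamas"
# elephant 0.04  -> the joke reading gets a low score
```

Run on the Groucho sentence, the verb attachment wins by a wide margin, which is exactly the behaviour the passage describes.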
The line seems to be a famous one: it was apparently chosen as one of the American Film Institute – 100 Years, 100 Movie Quotes. Common sense tells us that an elephant could never fit inside someone’s pyjamas, so we read the sentence as “I did the shooting while wearing my pyjamas”. The joke is funny precisely because it shakes that judgment. Beginning language learners often cannot laugh at such jokes because this taken-for-granted knowledge is not yet second nature to them, so there is nothing there to be shaken.
One morning, I shot an elephant in my pajamas. How he got in my pajamas, I don't know. Then we tried to remove the tusks. The tusks. That's not so easy to say, tusks. You try that some time...As I say, we tried to remove the tusks, but they were embedded in so firmly that we couldn't budge them. Of course, in Alabama, the Tusk-a-loosa. But, uh, that's entirely ir-elephant to what I was talking about.
“Who plays Thor in ‘Thor’?” Your correspondent could not remember the beefy Australian who played the eponymous Norse god in the Marvel superhero film. But when he asked his iPhone, Siri came up with an unexpected reply: “I don’t see any movies matching ‘Thor’ playing in Thor, IA, US, today.” Thor, Iowa, with a population of 184, was thousands of miles away, and “Thor”, the film, has been out of cinemas for years. Siri parsed the question perfectly properly, but the reply was absurd, violating the rules of what linguists call pragmatics: the shared knowledge and understanding that people use to make sense of the often messy human language they hear. “Can you reach the salt?” is not a request for information but for salt. Natural-language systems have to be manually programmed to handle such requests as humans expect them, and not literally.
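The last sentence is worth making concrete. Below is a toy sketch of the kind of hand-written pragmatic rule the passage alludes to; the patterns and intent labels are entirely hypothetical, not any real assistant’s API.

```python
import re

# Hand-written pragmatic rules of the sort described above: map an
# utterance's literal form to the action the speaker actually intends.
# All patterns and intent names here are invented for illustration.
PRAGMATIC_RULES = [
    (re.compile(r"can you (reach|pass) the (\w+)", re.I), "HAND_OVER_OBJECT"),
    (re.compile(r"who plays (.+) in ['\"](.+)['\"]", re.I), "LOOK_UP_CAST"),
    (re.compile(r"what(?:'s| is) playing in (\w+)", re.I), "LOCAL_SHOWTIMES"),
]

def interpret(utterance):
    """Return the intended action, falling back to a literal reading."""
    for pattern, intent in PRAGMATIC_RULES:
        match = pattern.search(utterance)
        if match:
            return intent, match.groups()
    return "LITERAL_ANSWER", ()

print(interpret("Can you reach the salt?"))    # ('HAND_OVER_OBJECT', ('reach', 'salt'))
print(interpret("Who plays Thor in 'Thor'?"))  # ('LOOK_UP_CAST', ('Thor', 'Thor'))
```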
In the print edition the article ran under the headline “Say AR”. Just as the magazine introduced speech-recognition technology with “Now we’re talking” (roughly, “at last!”), this headline plays on “All in favor, say aye”, the formula used when putting a motion to a vote, so I take it as framing AR technology in a positive light. The Economist certainly loves new technology.
THE history of computers is one of increasing intimacy. At first users rented time on mainframe machines they did not own. Next came the “personal computer”. Although PCs were confined to desks, ordinary people could afford to buy them, and filled them with all manner of personal information. These days smartphones go everywhere in their owners’ pockets, serving as everything from a diary to a camera to a voice-activated personal assistant.
The next step, according to many technologists, is to move the computer from the pocket to the body itself. The idea is to build a pair of “smart glasses” that do everything a smartphone can, and more. A technology called “augmented reality” (AR) would paint computerised information directly on top of the wearers’ view of the world. Early versions of the technology already exist (see article). If it can be made to work as its advocates hope, AR could bring about a new and even more intimate way to interact with machines. In effect, it would turn reality itself into a gigantic computer screen.
What makes The Economist worth reading closely is that its writing is well organised and logical. This piece opens by summing up the entire history of computers in one bold phrase: “THE history of computers is one of increasing intimacy.” (You could quibble that “increasing intimacy” is two words rather than one.) The claim is that “augmented reality” (AR), the subject of this article, lies at the end of that trajectory of growing intimacy. Setting aside whether that is true, it is a masterful way to frame the piece.
Designing a nifty piece of technology, though, is not the same as ushering in a revolution. Social factors often govern the path to mass adoption, and for AR, two problems stand out. One is aesthetic. The HoloLens is an impressive machine, but few would mistake it for a fashion item. Its alien appearance makes its wearers look more creepy than cool. One reason the iPhone was so successful was that it was a beautiful piece of design. Its metal finish and high-quality components, allied with a big advertising push from Apple, all helped establish it as a desirable consumer bauble.
The other big problem surrounds consent. The history of one much-hyped set of smart glasses should give the industry pause. In 2013 Google launched its “Glass” headsets to a chosen segment of the public. As well as those who thought the product looked silly, plenty found the glasses sinister, worrying that their users were covertly filming everyone they came into contact with. “Glassholes” became social pariahs. Two years later, Google withdrew Glass from sale.
The leader does write “Both of these problems are solvable.”, sounding cautious but hopeful about the technology’s prospects. You can gauge how seriously The Economist takes the subject from the fact that this week’s Science & Technology section is devoted to this single topic, with an article of nearly 3,000 words; the section normally carries four or five articles, so the departure stands out. The leader only sketches the broad outlines of the promise and the problems, but the longer article gives a much more detailed view of the latest developments and the finer issues.
AR is a close cousin to virtual reality (VR). There is, though, a crucial difference between them: the near-opposite meanings they ascribe to the term “reality”. VR aims to drop users into a convincing, but artificial, world. AR, by contrast, supplements the real world by laying useful or entertaining computer-generated data over it. Such an overlay might be a map annotated with directions, or a reminder about a meeting, or even a virtual alien with a ray gun, ripe for blasting. Despite the hype and prominence given recently to VR, people tend to spend more time in real realities than computer-generated ones. AR thus has techies licking their lips in anticipation of a giant new market. Digi-Capital, a firm of merger and acquisitions advisors in California, reckons that of the $108 billion a year which it predicts will be spent by 2021 on VR and AR combined, AR will take three-quarters.
The article also keeps you abreast of the latest developments. I had no idea Google had released a new device.
At the end of last year Google and Lenovo, a Chinese hardware manufacturer, unveiled the Phab 2 Pro, the first phone to implement a piece of Google technology called Tango. The idea is that, by giving the phone an extra set of sensors, it can detect the shape of the world around it. Using information from infra-red detectors, a wide-angle lens and a “time-of-flight” camera (which measures how long pulses of light take to reflect off the phone’s surroundings) Tango is able to build up a three-dimensional image of those surroundings. Armed with all this, a Tango-enabled phone can model a house, an office or any other space, and then use that model as a canvas upon which to draw things.
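The “time-of-flight” idea reduces to simple physics: a pulse of light travels to the surface and back, so the distance is half the round-trip time multiplied by the speed of light. A minimal sketch:

```python
# A time-of-flight camera emits a light pulse and times its return;
# the depth of each pixel follows from distance = (c * round_trip) / 2.
SPEED_OF_LIGHT = 299_792_458.0  # metres per second

def depth_from_round_trip(seconds):
    """Distance to the reflecting surface, in metres."""
    return SPEED_OF_LIGHT * seconds / 2

# A wall about 3 m away returns the pulse after roughly 20 nanoseconds:
print(depth_from_round_trip(20e-9))  # ~3.0 metres
```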
To give an idea of what is possible, Google has written apps that would be impossible on Tango-less phones. “Measure”, for instance, overlays a virtual tape measure on the phone’s screen. Point it at a door, and it will tell you how wide and high that portal is. Point it at a bed, and you get the bed’s dimensions—letting you work out whether it will fit through the door. Another Tango app is the oddly spelled “Woorld”, which lets users fill their living rooms with virtual flowers, houses and rocket ships, all of which will interact appropriately with the scenery. Place the rocket behind a television, for instance, and the set will block your view of it.
Such glasses do exist. So far, though, they have made a bigger impact on the workplace than in the home. Companies such as Ubimax, in Germany, or Vuzix, in New York, make AR spectacles that include cameras and sensors, and which use a projector mounted on the frame to place what looks like a small, two-dimensional screen into one corner of the wearer’s vision.
Used in warehouses, for instance, that screen—in combination with technology which tracks workers and parcels—can give an employee instructions on where to go, the fastest route to get there and what to pick up when he arrives, all the while leaving both of his hands free to move boxes around. Ubimax reckons that could bring a 25% improvement in efficiency. At a conference in London in October, Boeing, a big American aeroplane-maker, described how it was using AR glasses to give workers in its factories step-by-step instructions on how to assemble components, as well as to check that the job had been done properly. The result, said Paul Davies of Boeing’s research division, is faster work with fewer mistakes.
IN “STAR TREK” it was a hand-held Universal Translator; in “The Hitchhiker’s Guide to the Galaxy” it was the Babel Fish popped conveniently into the ear. In science fiction, the meeting of distant civilisations generally requires some kind of device to allow them to talk. High-quality automated translation seems even more magical than other kinds of language technology because many humans struggle to speak more than one language, let alone translate from one to another.
Language: Finding a voice. Computers have got much better at translation, voice recognition and speech synthesis, says Lane Greene. But they still don’t understand the meaning of language
“I’M SORRY, Dave. I’m afraid I can’t do that.” With chilling calm, HAL 9000, the on-board computer in “2001: A Space Odyssey”, refuses to open the doors to Dave Bowman, an astronaut who had ventured outside the ship. HAL’s decision to turn on his human companion reflected a wave of fear about intelligent computers.
When the film came out in 1968, computers that could have proper conversations with humans seemed nearly as far away as manned flight to Jupiter. Since then, humankind has progressed quite a lot farther with building machines that it can talk to, and that can respond with something resembling natural speech. Even so, communication remains difficult. If “2001” had been made to reflect the state of today’s language technology, the conversation might have gone something like this: “Open the pod bay doors, Hal.” “I’m sorry, Dave. I didn’t understand the question.” “Open the pod bay doors, Hal.” “I have a list of eBay results about pod doors, Dave.”
Speech recognition: I hear you. Computers have made huge strides in understanding human speech
Perhaps the most important feature of a speech-recognition system is its set of expectations about what someone is likely to say, or its “language model”. Like other training data, the language models are based on large amounts of real human speech, transcribed into text. When a speech-recognition system “hears” a stream of sound, it makes a number of guesses about what has been said, then calculates the odds that it has found the right one, based on the kinds of words, phrases and clauses it has seen earlier in the training text.
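As a concrete illustration, here is a minimal bigram language model of the sort described, with a deliberately tiny invented training text. It scores competing guesses about “what has been said” by how plausible each word sequence is:

```python
# A minimal bigram language model: it scores competing transcription
# guesses by the probability of the word sequence. The training corpus
# is invented for illustration; real models use vast amounts of text.
from collections import Counter

corpus = "recognise speech with a language model . wreck a nice beach".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def sequence_score(words, alpha=0.1):
    """Product of smoothed bigram probabilities P(w_i | w_{i-1})."""
    score = 1.0
    vocab = len(unigrams)
    for prev, word in zip(words, words[1:]):
        score *= (bigrams[(prev, word)] + alpha) / (unigrams[prev] + alpha * vocab)
    return score

# Two acoustic guesses for the same stream of sound:
for guess in (["recognise", "speech"], ["wreck", "a", "nice", "beach"]):
    print(" ".join(guess), sequence_score(guess))
```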
******
Advance knowledge of what kinds of things the speaker might be talking about also increases accuracy. Words like “phlebitis” and “gastrointestinal” are not common in general discourse, and uncommon words are ranked lower in the probability tables the software uses to guess what it has heard. But these words are common in medicine, so creating software trained to look out for such words considerably improves the result. This can be done by feeding the system a large number of documents written by the speaker whose voice is to be recognised; common words and phrases can be extracted to improve the system’s guesses.
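The adaptation step can be sketched in a few lines: extract word frequencies from the speaker’s documents and interpolate them with the general model, so that words like “phlebitis” climb the probability tables. All figures below are invented:

```python
# Sketch of the domain adaptation described above: mix word frequencies
# extracted from the speaker's own documents into the general language
# model, so rare-in-general words like "phlebitis" rank higher.
from collections import Counter

def adapt(general_probs, domain_docs, weight=0.3):
    """Interpolate general word probabilities with domain frequencies."""
    domain_counts = Counter(word for doc in domain_docs for word in doc.split())
    total = sum(domain_counts.values())
    vocab = set(general_probs) | set(domain_counts)
    return {w: (1 - weight) * general_probs.get(w, 0.0)
              + weight * domain_counts[w] / total
            for w in vocab}

general = {"the": 0.05, "water": 0.001, "phlebitis": 0.0000001}
notes = ["patient presents with phlebitis", "phlebitis of the left leg"]
adapted = adapt(general, notes)
print(adapted["phlebitis"])  # far higher than the general-model estimate
```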
Hasta la vista, robot voice. Machines are starting to sound more like humans
But prosody matters when someone is telling a story. Pitch, speed and volume can be used to pass quickly over things that are already known, or to build interest and tension for new information. Myriad tiny clues communicate the speaker’s attitude to his subject. The phrase “a German teacher”, with stress on the word “German”, may, in the context of a story, not be a teacher of German, but a teacher being explicitly contrasted with a teacher who happens to be French or British.
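Synthesisers can be told about such stress explicitly. SSML, a W3C markup standard for speech synthesis, provides <emphasis> and <prosody> tags; how faithfully a given engine renders them varies by vendor. A small sketch that marks the contrastive word:

```python
# Build SSML (a W3C speech-synthesis markup standard) that stresses one
# word, as in "a GERMAN teacher". How a given TTS engine renders the
# markup varies by vendor; this only constructs the string.
def emphasise(sentence, stressed_word):
    """Wrap one word of a sentence in an SSML emphasis tag."""
    marked = sentence.replace(
        stressed_word, f'<emphasis level="strong">{stressed_word}</emphasis>', 1)
    return f'<speak><prosody rate="medium">{marked}</prosody></speak>'

print(emphasise("She spoke to a German teacher.", "German"))
```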
Meaning and machine intelligence: What are you talking about? Machines cannot conduct proper conversations with humans because they do not understand the world
How do natural-language platforms know what people want? They not only recognise the words a person uses, but break down speech for both grammar and meaning. Grammar parsing is relatively advanced; it is the domain of the well-established field of “natural-language processing”. But meaning comes under the heading of “natural-language understanding”, which is far harder.
“Common sense” may sound like an overstatement, but things we understand without ever thinking about them can apparently be hard work for a machine to process. The article illustrates the point with the sentence “This is not drinking water.”
First, parsing. Most people are not very good at analysing the syntax of sentences, but computers have become quite adept at it, even though most sentences are ambiguous in ways humans are rarely aware of. Take a sign on a public fountain that says, “This is not drinking water.” Humans understand it to mean that the water (“this”) is not a certain kind of water (“drinking water”). But a computer might just as easily parse it to say that “this” (the fountain) is not at present doing something (“drinking water”).
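The ambiguity can be reproduced with a toy grammar. The sketch below uses NLTK’s chart parser and a deliberately small, invented grammar in which both readings are legal; running it prints two parse trees for the same five words:

```python
import nltk  # pip install nltk

# A toy grammar (invented for illustration) in which "this is not
# drinking water" is ambiguous: "drinking water" is either a noun
# compound (a kind of water) or a progressive verb phrase (the act
# of drinking water).
grammar = nltk.CFG.fromstring("""
S -> NP 'is' 'not' PRED
PRED -> NP | VPROG
VPROG -> GER NP
NP -> 'this' | NOM
NOM -> GER N | N
GER -> 'drinking'
N -> 'water'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("this is not drinking water".split()):
    tree.pretty_print()  # prints both parse trees
```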
Shared information is also built up over the course of a conversation, which is why digital assistants can struggle with twists and turns in conversations. Tell an assistant, “I’d like to go to an Italian restaurant with my wife,” and it might suggest a restaurant. But then ask, “is it close to her office?”, and the assistant must grasp the meanings of “it” (the restaurant) and “her” (the wife), which it will find surprisingly tricky. Nuance, the language-technology firm, which provides natural-language platforms to many other companies, is working on a “concierge” that can handle this type of challenge, but it is still a prototype.
Such a concierge must also offer only restaurants that are open. Linking requests to common sense (knowing that no one wants to be sent to a closed restaurant), as well as a knowledge of the real world (knowing which restaurants are closed), is one of the most difficult challenges for language technologies.
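A toy “concierge” makes both challenges visible: it must remember what “it” and “her” refer to as the dialogue unfolds, and it must consult real-world data (opening hours) before answering. Everything below, data included, is invented for illustration and is in no way Nuance’s actual system:

```python
from datetime import datetime

# Invented restaurant data standing in for real-world knowledge.
RESTAURANTS = [
    {"name": "Trattoria Roma", "cuisine": "italian", "opens": 18, "closes": 23},
    {"name": "Piccola Cucina", "cuisine": "italian", "opens": 11, "closes": 15},
]

class Concierge:
    def __init__(self):
        self.referents = {}  # pronoun -> entity, built up during the dialogue

    def suggest(self, cuisine, companion=None, hour=None):
        """Offer only places that are open, and remember what was offered."""
        hour = hour if hour is not None else datetime.now().hour
        open_now = [r for r in RESTAURANTS
                    if r["cuisine"] == cuisine and r["opens"] <= hour < r["closes"]]
        if not open_now:
            return "Nothing suitable is open right now."
        choice = open_now[0]
        self.referents["it"] = choice          # later "it" means this place
        if companion:
            self.referents["her"] = companion  # later "her" means the companion
        return f"How about {choice['name']}?"

    def resolve(self, pronoun):
        return self.referents.get(pronoun, "<unresolved>")

bot = Concierge()
print(bot.suggest("italian", companion="wife", hour=19))  # How about Trattoria Roma?
print(bot.resolve("it"))  # {'name': 'Trattoria Roma', ...}
```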
Proper conversation between humans and machines can be seen as a series of linked challenges: speech recognition, speech synthesis, syntactic analysis, semantic analysis, pragmatic understanding, dialogue, common sense and real-world knowledge. Because all the technologies have to work together, the chain as a whole is only as strong as its weakest link, and the first few of these are far better developed than the last few.
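The weakest-link point can be put in numbers. If each of the eight stages were, say, 95% reliable (an invented figure), chaining them would already drop the end-to-end success rate to about two-thirds:

```python
# Chained reliability: the pipeline succeeds only if every stage does.
stages = ["speech recognition", "speech synthesis", "syntactic analysis",
          "semantic analysis", "pragmatic understanding", "dialogue",
          "common sense", "real-world knowledge"]

per_stage_accuracy = 0.95  # an illustrative figure, not a measured one
end_to_end = per_stage_accuracy ** len(stages)
print(f"{end_to_end:.2f}")  # 0.66
```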
The hardest part is linking them together. Scientists do not know how the human brain draws on so many different kinds of knowledge at the same time. Programming a machine to replicate that feat is very much a work in progress.
UI here means user interface. The Economist argues that “being able to talk to computers abolishes the need for the abstraction of a ‘user interface’ at all”, that is, once we can simply talk to computers, the UI itself becomes unnecessary.
This is a huge shift. Simple though it may seem, voice has the power to transform computing, by providing a natural means of interaction. Windows, icons and menus, and then touchscreens, were welcomed as more intuitive ways to deal with computers than entering complex keyboard commands. But being able to talk to computers abolishes the need for the abstraction of a “user interface” at all. Just as mobile phones were more than existing phones without wires, and cars were more than carriages without horses, so computers without screens and keyboards have the potential to be more useful, powerful and ubiquitous than people can imagine today.
Although deep learning means that machines can recognise speech more reliably and talk in a less stilted manner, they still don’t understand the meaning of language. That is the most difficult aspect of the problem and, if voice-driven computing is truly to flourish, one that must be overcome. Computers must be able to understand context in order to maintain a coherent conversation about something, rather than just responding to simple, one-off voice commands, as they mostly do today (“Hey, Siri, set a timer for ten minutes”). Researchers in universities and at companies large and small are working on this very problem, building “bots” that can hold more elaborate conversations about more complex tasks, from retrieving information to advising on mortgages to making travel arrangements. (Amazon is offering a $1m prize for a bot that can converse “coherently and engagingly” for 20 minutes.)
We’ve been able to welcome over half a million guests… our outstanding pastry chefs have baked 200,000 holiday cookies… and Barack has treated the American people to countless dad jokes. They are great jokes. Not so funny. Although a few got a…Frosty reception.
(Urban Dictionary) Dad Jokes: An indescribably cheesy and/or dumb joke made by a father to his children.
Since this is the last Christmas message of his eight-year term, he makes a point of showcasing the administration’s achievements.
THE PRESIDENT: And the greatest gift that Michelle and I have received over the last eight years has been the honor of serving as your President and First Lady. Together, we fought our way back from the worst recession in 80 years, and got unemployment to a nine-year low. We secured health insurance for another twenty million Americans, and new protections for folks who already had insurance. We made America more respected around the world, took on the mantle of leadership in the fight to protect this planet for our kids, and much, much more. By so many measures, our country is stronger and more prosperous than it was when we first got here. And I’m hopeful we’ll build on the progress we’ve made in the years to come.