XPRIZE, the non-profit organization developing and managing competitions to find solutions to social challenges, has named two grand prize winners in the Elon Musk-backed Global Learning XPRIZE.
The companies, Kitkit School out of South Korea and the U.S., and onebillion, operating in Kenya and the U.K., were announced at an awards ceremony hosted at the Google Spruce Goose Hangar in Playa Vista, Calif.
XPRIZE set each of the competing teams the task of developing scalable services that could enable children to teach themselves basic reading, writing, and arithmetic skills within 15 months.
Musk himself was on hand to award $5 million checks to each of the winning teams.
Each of the five finalists received $1 million to continue developing their projects: New York-based CCI, which developed lesson plans and a development language so non-coders could create lessons; Chimple, a Bangalore-based learning platform enabling children to learn reading, writing and math on a tablet; RoboTutor, a Pittsburgh-based company that used Carnegie Mellon research to develop an Android tablet app teaching reading and writing with speech recognition, machine learning and human-computer interaction; and the two grand prize winners.
The tests required each product to be field-tested in Swahili, reaching nearly 3,000 children in 170 villages across Tanzania.
All five finalists' solutions have been open-sourced, so anyone can build on the toolkits each team developed in competition to improve them or create local solutions.
Kitkit School, with a team from Berkeley, Calif. and Seoul, developed a program with a game-based core and flexible learning architecture to help kids learn independently, while onebillion merged numeracy content with literacy material to provide directed learning and activities, alongside monitoring that personalizes responses to children’s needs.
Both teams are going home with $5 million to continue their work.
More than 250 million children around the world can’t read or write, and one in five children worldwide aren’t in school, according to data from UNESCO.
The problem of access is compounded by a shortage of teachers at the primary and secondary school level. Some research cited by XPRIZE indicates that the world needs to recruit another 68.8 million teachers to provide every child with a primary and secondary education by 2040.
Before the Global Learning XPRIZE field test, 74% of the children who participated were reported as never having attended school; 80% were never read to at home; and 90% couldn’t read a single word of Swahili.
After the 15-month program, in which children worked on donated Google Pixel C tablets pre-loaded with the learning software, those numbers were cut in half.
“Education is a fundamental human right, and we are so proud of all the teams and their dedication and hard work to ensure every single child has the opportunity to take learning into their own hands,” said Anousheh Ansari, CEO of XPRIZE, in a statement. “Learning how to read, write and demonstrate basic math are essential building blocks for those who want to live free from poverty and its limitations, and we believe that this competition clearly demonstrated the accelerated learning made possible through the educational applications developed by our teams, and ultimately hope that this movement spurs a revolution in education, worldwide.”
After the grand prize announcement, XPRIZE said it will work to secure and load the software onto tablets; localize the software; and deliver preloaded hardware and charging stations to remote locations so all finalist teams can scale their learning software across the world.
A coalition of child protection and privacy groups has filed a complaint with the Federal Trade Commission (FTC) urging it to investigate a kid-focused edition of Amazon’s Echo smart speaker.
The complaint against Amazon Echo Dot Kids, which has been lodged with the FTC by groups including the Campaign for a Commercial-Free Childhood, the Center for Digital Democracy and the Consumer Federation of America, argues that the ecommerce giant is violating the Children’s Online Privacy Protection Act (Coppa) — including by failing to obtain proper consents for the use of kids’ data.
As with Amazon’s other Echo smart speakers, the Echo Dot Kids continually listens for a wake word and then responds to voice commands by recording and processing users’ speech. The difference with this Echo is that it’s intended for children to use — which makes it subject to US privacy regulation intended to protect kids from commercial exploitation online.
The complaint, which can be read in full via the group’s complaint website, argues that Amazon fails to provide adequate information to parents about what personal data will be collected from their children when they use the Echo Dot Kids; how their information will be used; and which third parties it will be shared with — meaning parents do not have enough information to make an informed decision about whether to give consent for their child’s data to be processed.
They also accuse Amazon of providing at best “unclear and confusing” information per its obligation under Coppa to also provide notice to parents to obtain consent for children’s information to be collected by third parties via the online service — such as those providing Alexa “skills” (aka apps the AI can interact with to expand its utility).
A number of other concerns are also being raised about Amazon’s device with the FTC.
Amazon released the Echo Dot Kids a year ago — and, as we noted at the time, it’s essentially a brightly bumpered iteration of the company’s standard Echo Dot hardware.
There are differences in the software, though. In parallel Amazon updated its Alexa smart assistant — adding parental controls, aka its FreeTime software, to the child-focused smart speaker.
Amazon said the free version of FreeTime that comes bundled with the Echo Dot Kids provides parents with controls to manage their kids’ use of the product, including device time limits; parental controls over skills and services; and the ability to view kids’ activity via a parental dashboard in the app. The software also removes the ability for Alexa to be used to make phone calls outside the home (while keeping an intercom functionality).
A paid premium tier of FreeTime (called FreeTime Unlimited) also bundles additional kid-friendly content, including Audible books, ad-free radio stations from iHeartRadio Family, and premium skills and stories from the likes of Disney, National Geographic and Nickelodeon.
At the time it announced the Echo Dot Kids, Amazon said it had tweaked its voice assistant to support kid-focused interactions — saying it had trained the AI to understand children’s questions and speech patterns, and incorporated new answers targeted specifically at kids (such as jokes).
But while the company was ploughing resource into adding a parental control layer to Echo and making Alexa’s speech recognition kid-friendly, the Coppa complaint argues it failed to pay enough attention to the data protection and privacy obligations that apply to products targeted at children — as the Echo Dot Kids clearly is.
Or, to put it another way, Amazon offers parents some controls over how their children can interact with the product — but not enough controls over how Amazon (and others) can interact with their children’s data via the same always-on microphone.
More specifically, the group argues that Amazon is failing to meet its obligation as the operator of a child-directed service to provide notice and obtain consent for third parties operating on the Alexa platform to use children’s data — noting that its Children’s Privacy Disclosure policy states it does not apply to third party services and skills.
They are also objecting to how Amazon is obtaining parental consent — arguing its system for doing so is inadequate because it merely asks that a credit or debit/gift card number be inputted.
“It does not verify that the person ‘consenting’ is the child’s parent as required by Coppa,” they argue. “Nor does Amazon verify that the person consenting is even an adult because it allows the use of debit gift cards and does not require a financial transaction for verification.”
Another objection is that Amazon is retaining audio recordings of children’s voices far longer than necessary — keeping them indefinitely unless a parent actively goes in and deletes the recordings, despite Coppa requiring that children’s data be held for no longer than is reasonably necessary.
They found that additional data (such as transcripts of audio recordings) was also still retained even after audio recordings had been deleted. A parent must contact Amazon customer service to explicitly request deletion of their child’s entire profile to remove that data residue — meaning that to delete all recorded kids’ data a parent has to nix their access to parental controls and their kids’ access to content provided via FreeTime — so the complaint argues that Amazon’s process for parents to delete children’s information is “unduly burdensome” too.
Their investigation also found the company’s process for letting parents review children’s information to be similarly arduous, with no ability for parents to search the collected data — meaning they have to listen/read every recording of their child to understand what has been stored.
They further highlight that audio recordings captured by the Echo Dot Kids can of course include sensitive personal details — such as when a child uses Alexa’s ‘remember’ feature to ask the AI to retain personal data like their address and contact details, or personal health information like a food allergy.
The group’s complaint also flags the risk of other children having their data collected and processed by Amazon without their parents’ consent — such as when a child has a friend or family member visiting on a playdate and they end up playing with the Echo together.
Responding to the complaint, Amazon has denied it is in breach of Coppa. In a statement a company spokesperson said: “FreeTime on Alexa and Echo Dot Kids Edition are compliant with the Children’s Online Privacy Protection Act (COPPA). Customers can find more information on Alexa and overall privacy practices here: https://www.amazon.com/alexa/voice.”
At the time of writing the FTC had not responded to a request for comment on the complaint.
Over in Europe, there has been growing concern over the use of children’s data by online services. A report by England’s children’s commissioner late last year warned kids are being “datafied”, and suggested profiling at such an early age could lead to a data-disadvantaged generation.
Responding to rising concerns the UK privacy regulator launched a consultation on a draft Code of Practice for age appropriate design last month, asking for feedback on 16 proposed standards online services must meet to protect children’s privacy — including requiring that product makers put the best interests of the child at the fore, deliver transparent T&Cs, minimize data use and set high privacy defaults.
A set of new features for Android could alleviate some of the difficulties of living with hearing impairment and other conditions. Live transcription, captioning, and relay use speech recognition and synthesis to make content on your phone more accessible — in real time.
Announced today at Google’s I/O event in a surprisingly long segment on accessibility, the features all rely on improved speech-to-text and text-to-speech algorithms, some of which now run on-device rather than sending audio to a datacenter to be decoded.
The first feature to be highlighted, live transcription, had already been previewed by Google. It’s a simple but very useful tool: open the app and the device will listen to its surroundings and display any speech it recognizes as text on the screen.
We’ve seen this in translator apps and devices, like the One Mini, and in the meeting transcription highlighted yesterday at Microsoft Build. One would think that such a straightforward tool is long overdue, but in fact everyday circumstances, like talking with a couple of friends at a cafe, can be remarkably difficult for natural language systems trained on perfectly recorded single-speaker audio. Improving the system to the point where it can track multiple speakers and display accurate transcripts quickly has no doubt been a challenge.
Another feature enabled by this improved speech recognition ability is live captioning, which essentially does the same thing as above, but for video. Now when you watch a YouTube video, listen to a voice message, or even take a video call, you’ll be able to see what the person in it is saying, in real time.
That should prove incredibly useful not just for the millions of people who can’t hear what’s being said, but also those who don’t speak the language well and could use text support, or anyone watching a show on mute when they’re supposed to be going to sleep, or any number of other circumstances where hearing and understanding speech just isn’t the best option.
Captioning phone calls is something CEO Sundar Pichai said is still under development, but the “live relay” feature demoed on stage showed how it might work. A person who is hearing-impaired or can’t speak will likely find an ordinary phone call pretty much worthless. But live relay turns the call into text in real time, and turns text responses into speech the person on the line can hear.
Live captioning should be available in Android Q when it releases, with some device restrictions. Live Transcribe is available now, though a warning states it is still in development. Live relay is yet to come, but showing it on stage in such a complete form suggests it won’t be long before it appears.
As part of its rather bizarre news dump before its flagship Build developer conference next week, Microsoft today announced a slew of new pre-built machine learning models for its Cognitive Services platform. These include an API for building personalization features, a form recognizer for automating data entry, a handwriting recognition API and an enhanced speech recognition service that focuses on transcribing conversations.
Maybe the most important of these new services is the Personalizer. There are few apps and websites, after all, that aren’t looking to provide their users with personalized features. That’s difficult, in part, because it often involves building models based on data that sits in a variety of silos. With Personalizer, Microsoft is betting on reinforcement learning, a machine learning technique that doesn’t need the kind of labeled training data typically used in machine learning. Instead, the reinforcement agent constantly tries to find the best way to achieve a given goal based on what users do. Microsoft argues that it is the first company to offer such a service, and the company has been testing it on Xbox, where it saw a 40% increase in engagement with its content after implementing the service.
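Microsoft doesn't detail Personalizer's internals here, but the reinforcement-learning idea it describes can be illustrated with the simplest possible agent: an epsilon-greedy bandit that tries content options, observes engagement, and gradually favors whatever works, with no labeled training set. A toy sketch, in which the option names, click rates and parameters are all invented for illustration:

```python
import random

class EpsilonGreedyBandit:
    """Toy reinforcement learner: usually picks the content option with the
    best observed reward, but explores a random one with probability epsilon."""

    def __init__(self, options, epsilon=0.1, seed=0):
        self.options = list(options)
        self.epsilon = epsilon
        self.counts = {o: 0 for o in self.options}
        self.values = {o: 0.0 for o in self.options}  # running mean reward
        self.rng = random.Random(seed)

    def choose(self):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.options)          # explore
        return max(self.options, key=lambda o: self.values[o])  # exploit

    def update(self, option, reward):
        # Incremental mean: learns from user feedback alone.
        self.counts[option] += 1
        n = self.counts[option]
        self.values[option] += (reward - self.values[option]) / n

# Simulated engagement: users click "news" 70% of the time, "games" 30%.
bandit = EpsilonGreedyBandit(["news", "games"], epsilon=0.1, seed=42)
click_rate = {"news": 0.7, "games": 0.3}
for _ in range(2000):
    choice = bandit.choose()
    reward = 1.0 if bandit.rng.random() < click_rate[choice] else 0.0
    bandit.update(choice, reward)
# After the loop, the bandit's value estimate for "news" exceeds "games".
```

A production system like Personalizer adds context features and more sophisticated exploration strategies, but the feedback loop (choose, observe, update) is the same shape.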
The handwriting recognition API, or Ink Recognizer as it is officially called, can automatically recognize handwriting, common shapes and documents. That’s something Microsoft has long focused on as it developed its Windows 10 inking capabilities, so maybe it’s no surprise that it is now packaging this up as a cognitive service, too. Indeed, Microsoft Office 365 and Windows use exactly this service already, so we’re talking about a pretty robust system. With this new API, developers can now bring these same capabilities to their own applications, too.
Conversation Transcription does exactly what the name implies: it transcribes conversations and it’s part of Microsoft’s existing speech-to-text features in the Cognitive Services lineup. It can label different speakers, transcribe the conversation in real time and even handle crosstalk. It already integrates with Microsoft Teams and other meeting software.
Also new is the Form Recognizer, an API that makes it easier to extract text and data from business forms and documents. This may not sound like a very exciting feature, but it solves a very common problem. The service needs only five samples to learn how to extract data, and users don’t have to do any of the arduous manual labeling that’s often involved in building these systems.
Form Recognizer is also coming to cognitive services containers, which allow developers to take these models outside of Azure and to their edge devices. The same is true for the existing speech-to-text and text-to-speech services, as well as the existing anomaly detector.
In addition, the company also today announced that its Neural Text-to-Speech, Computer Vision Read and Text Analytics Named Entity Recognition APIs are now generally available.
Some of these existing services are also getting some feature updates, with the Neural Text-to-Speech service now supporting five voices, while the Computer Vision API can now understand more than 10,000 concepts, scenes and objects, together with 1 million celebrities, compared to 200,000 in a previous version (are there that many celebrities?).
Forty-one percent of voice assistant users are concerned about trust, privacy and passive listening, according to a new report from Microsoft focused on consumer adoption of voice and digital assistants. And perhaps people should be concerned — all the major voice assistants, including those from Google, Amazon, Apple and Samsung, as well as Microsoft, employ humans who review the voice data collected from end users.
But people didn’t seem to know that was the case. So when Bloomberg recently reported on the global team at Amazon that reviews audio clips from commands spoken to Alexa, some backlash occurred. In addition to the discovery that our AI helpers also have a human connection, there were concerns over the type of data the Amazon employees and contractors were hearing — criminal activity and even assaults in a few cases, as well as the otherwise odd, funny or embarrassing things the smart speakers picked up.
The report said the team auditing Alexa commands has had access to location data and, in some cases, can find a customer’s home address. This is because the team has access to the latitude and longitude coordinates associated with a voice clip, which can be pasted into Google Maps to tie the clip to where it came from. Bloomberg said it wasn’t clear how many people have access to the system where the location information is stored.
This is precisely the kind of privacy violation that could impact user trust in the popular Echo speakers and other Alexa devices — and, by extension, other voice assistant platforms.
While some users may not have realized the extent of human involvement on Alexa’s backend, Microsoft’s study indicates an overall wariness around the potential for privacy violations and abuse of trust that could occur on these digital assistant platforms.
For example, 52 percent of those surveyed by Microsoft said they worried their personal information or data was not secure, and 24 percent said they don’t know how it’s being used. Thirty-six percent said they didn’t even want their personal information or data to be used at all.
These numbers indicate that the assistant platforms should offer all users the ability to easily and permanently opt out of data collection — one click to say that their voice recordings and private information will go nowhere, and will never be seen.
Forty-one percent of people also worried their voice assistant was actively listening or recording them, and 31 percent believed the information the assistant collected from them was not private.
Fourteen percent also said they didn’t trust the companies behind the voice assistant — meaning Amazon, Google and all the others.
“The onus is now on tech builders to respond, incorporate feedback and start building a foundation of trust,” the report warns. “It is up to today’s tech builders to create a secure conversational landscape where consumers feel safe.”
Though the study indicates people have worries about their personal information, it doesn’t necessarily mean people want to entirely shut off access to that data — some may want to offer their email and home address so Amazon can ship an item to their home, when they order it by voice, for instance. Other people may even opt into sharing more information if offered a tangible reward of some kind, the report also notes.
Despite all these worries, people largely said they preferred performing tasks using voice instead of keyboards and touch screens. Even at this early stage, 57 percent said they would rather speak to a digital assistant, and 34 percent said they like to both type and speak, as needed.
A majority — 80 percent — said they were “somewhat” or “very” satisfied with their digital assistants. More than 66 percent said they used digital assistants weekly, and 19 percent used them daily. (This refers to not just voice, but any digital assistant, we should note.)
These high satisfaction numbers mean digital and voice assistants are not likely going away, but the mistrust issues and potential for abuse could lead consumers to decrease their use — or, in time, switch to a brand that offers more security.
Imagine, for example, if Amazon et al. failed to clamp down on employee access to data, as Apple launched a mass market voice device for the home, similar in functionality and pricing to a Google Home mini or Echo Dot. That could shift the voice landscape further down the road.
The full report, which also examines voice trends and adoption rates, is here.
Voice recognition is a standard part of the smartphone package these days, and a corresponding part is the delay while you wait for Siri, Alexa, or Google to return your query, either correctly interpreted or horribly mangled. Google’s latest speech recognition works entirely offline, eliminating that delay altogether — though of course mangling is still an option.
The delay occurs because your voice, or some data derived from it anyway, has to travel from your phone to the servers of whoever operates the service, where it is analyzed and sent back a short time later. This can take anywhere from a handful of milliseconds to multiple entire seconds (what a nightmare!), or longer if your packets get lost in the ether.
Why not just do the voice recognition on the device? There’s nothing these companies would like more, but turning voice into text on the order of milliseconds takes quite a bit of computing power. It’s not just about hearing a sound and writing a word — understanding what someone is saying word by word involves a whole lot of context about language and intention.
Your phone could do it, for sure, but it wouldn’t be much faster than sending it off to the cloud, and it would eat up your battery. But steady advancements in the field have made it plausible to do so, and Google’s latest product makes it available to anyone with a Pixel.
Google’s work on the topic, documented in a paper here, built on previous advances to create a model small and efficient enough to fit on a phone (it’s 80 megabytes, if you’re curious), but capable of hearing and transcribing speech as you say it. No need to wait until you’ve finished a sentence to think whether you meant “their” or “there” — it figures it out on the fly.
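The model itself isn't public, but the streaming behavior described above, emitting and refining a transcript as audio arrives instead of waiting for the utterance to end, can be sketched with a stub recognizer. Everything below (the fake audio chunks, the homophone rule) is invented for illustration; it shows the streaming shape, not Google's algorithm:

```python
from typing import Iterator, List

def stub_recognize(chunk: str, context: List[str]) -> str:
    """Stand-in for an on-device acoustic/language model: maps one chunk of
    'audio' to a word, using the words decoded so far as context. Here it
    just resolves an invented ambiguous chunk using the previous word."""
    if chunk == "thair":  # pretend this chunk sounds like "their"/"there"
        return "their" if context and context[-1] == "of" else "there"
    return chunk

def streaming_transcribe(audio_chunks) -> Iterator[str]:
    """Yield a partial transcript after every chunk, rather than waiting
    for the end of the utterance (the batch approach)."""
    words: List[str] = []
    for chunk in audio_chunks:
        words.append(stub_recognize(chunk, words))
        yield " ".join(words)  # partial result, available immediately

# Partial transcripts appear as each chunk arrives; the ambiguous chunk
# is resolved on the fly from the words already decoded.
partials = list(streaming_transcribe(["most", "of", "thair", "friends"]))
```

The hard part Google's paper addresses is doing this step with a model small and fast enough to live on the phone; the decode-as-you-go loop is the part this sketch captures.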
So what’s the catch? Well, it only works in Gboard, Google’s keyboard app, and it only works on Pixels, and it only works in American English. So in a way this is just kind of a stress test for the real thing.
“Given the trends in the industry, with the convergence of specialized hardware and algorithmic improvements, we are hopeful that the techniques presented here can soon be adopted in more languages and across broader domains of application,” writes Google, as if it is the trends that need to do the hard work of localization.
Making speech recognition more responsive and having it work offline is a nice development. But it’s sort of funny considering hardly any of Google’s other products work offline. Are you going to dictate into a shared document while you’re offline? Write an email? Ask for a conversion between liters and cups? You’re going to need a connection for that! Of course this will also be better on slow and spotty connections, but you have to admit it’s a little ironic.
Google is expanding its suite of apps designed for the Indian market with today’s launch of a new language-learning app aimed at children, called Bolo. The app, which is aimed at elementary school-aged students, leverages technology like Google’s speech recognition and text-to-speech to help kids learn to read in both Hindi and English.
To do so, Bolo offers a catalog of 50 stories in Hindi and 40 in English, sourced from Storyweaver.org.in. The company says it plans to partner with other organizations in the future to expand the story selection further.
Included in the app is a reading buddy, “Diya,” who encourages and corrects the child as they read aloud. As kids read, Diya listens and responds with feedback. (Google notes all personal information remains on the device to protect kids’ privacy.) Diya can also read the text to the child and explain the meaning of English words. As children progress, they’ll be presented with word games that earn them in-app rewards and badges to keep them motivated.
The app works offline, a necessity in large parts of India, where internet access is not always available. Bolo can be used by multiple children as well, and will adjust itself to each child’s reading level.
Google says it had been trialing Bolo across 200 villages in Uttar Pradesh, India with the help of nonprofit ASER Centre. During testing, it found that 64 percent of children who used the app showed an improvement in reading proficiency in three months’ time.
To run the pilot, 920 children were given the app and 600 were in a control group without it, Google says.
In addition to improving their proficiency, more students in the group with the app (39%) reached the highest level of ASER’s reading assessment than those without it (28%), and parents also reported improvements in their children’s reading abilities.
Illiteracy remains a problem in India. The country has one of the largest illiterate populations in the world, with only 74 percent of its population able to read, according to a study by ASER Centre a few years back. The study found that more than half of fifth-grade students in rural state schools could not read second grade textbooks in 2014. By 2018, that figure hadn’t changed much: still, only about half can read at a second grade level, ASER now reports.
While Google today highlights its philanthropic efforts in education, it’s worth noting that Google’s interest in improving India’s literacy metrics benefits its bottom line, too. As the country continues to come online to become one of the largest internet markets in the world, literate users capable of using Google products like Search, Ads and Gmail are of increasing importance to Google’s business.
Chinese internet giant Tencent just lost a leading artificial intelligence figure. Zhang Tong, who previously worked at Yahoo, IBM and Baidu, has stepped down after directing Tencent’s AI Lab for nearly two years.
The scientist will return to academia and continue research in the AI field, Tencent confirmed with TechCrunch on Thursday, adding that it hasn’t appointed a successor.
“We are grateful for [Zhang]’s contributions to Tencent AI Lab and continue to explore fundamental and applied research that can make the benefits of AI accessible to everyone, everywhere,” Tencent said in a statement.
Talent is key to a tech firm’s AI endeavors, as a revered leader not only inspires employees but also boosts investor confidence. Baidu’s stock plunged following the exit of Qi Lu, its chief operating officer, as markets weighed the talent gap inside the company, which had poured resources into autonomous driving, smart speakers and other AI efforts. Tencent itself had poached Zhang from Baidu’s Big Data Lab to ramp up its own AI division.
Tencent is best known for its billion-user WeChat messenger and being the world’s largest video game publisher, but it’s also been doubling down on machine learning R&D to serve users and enterprise clients. It launched the AI Lab in April 2016 and opened its first U.S. research center in Seattle a year later to work on speech recognition and natural language processing (NLP).
The AI Lab dives into machine learning, computer vision, speech recognition and NLP. Meanwhile, the social and entertainment giant also works to put fundamental research to practical use, applying AI to its key businesses — content, social, online games and cloud computing.
One beneficiary has been WeChat, which applies NLP to enable seamless dialogues between users speaking different languages. Another case in point is Tencent’s news aggregator Tiantian Kuaibao, which deploys deep learning to recommend content based on readers’ past preferences. Kuaibao is a direct competitor to Jinri Toutiao, the popular AI-powered news app run by TikTok’s parent company ByteDance.
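Tencent doesn't disclose Kuaibao's models, but recommendation from past preferences can be sketched in its simplest form: represent each article and the reader as vectors, and rank articles by similarity to the reader's reading history. A toy sketch with hand-made topic vectors; all names and numbers below are invented:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Invented 3-dimensional "topic" vectors: [tech, sports, finance]
articles = {
    "new-gpu-launch":  [0.9, 0.0, 0.1],
    "league-final":    [0.0, 1.0, 0.0],
    "rate-cut-report": [0.1, 0.0, 0.9],
}

# Reader profile = mean of the vectors for articles they previously read.
history = [articles["new-gpu-launch"], articles["rate-cut-report"]]
profile = [sum(col) / len(history) for col in zip(*history)]

# Rank all articles by similarity to the profile; the sports story,
# unrelated to this reader's tech/finance history, lands last.
ranked = sorted(articles, key=lambda a: cosine(profile, articles[a]),
                reverse=True)
```

Systems like Kuaibao's learn such representations from user behavior with deep models rather than hand-crafting them, but ranking by similarity to a preference profile is the common core.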
To date, Tencent’s AI Lab has a team of 70 research scientists and 300 engineers, according to its website. Tencent operates another AI initiative called the Youtu Lab, which focuses on image understanding, face recognition, audio recognition and optical character recognition. While its sister AI Lab falls under Tencent’s research-focused Technology Engineering Group, Youtu is the brainchild of the Cloud & Smart Industries Group, a new unit that Tencent set up during its major organizational reshuffle in October to place more emphasis on enterprise businesses.