Breaking barriers to Internet access

Internet search engines are the most efficient method of acquiring information. In China, three barriers limit their use: 1.3 billion people cannot read English; 16 million are visually impaired and cannot use traditional search engines; and only 163 million Chinese use the Internet, although 580 million have mobile phones.

To overcome these barriers, commercial search engines translate complete web pages, often with unreadable results. An alternative is for the search engine to find the “answers” and translate only those short answers. To meet the needs of the visually impaired, the short answers should be converted to Braille or sound. Short answers would also allow mobile phone users to search the Internet.

The research 

The research team is developing a natural language search engine that gives concise answers and eliminates the multiple interactive steps that current search engines require. The technology builds on a novel information distance theory, as well as prior work on Braille systems and Question and Answer systems. To overcome limited Internet access, the team is capitalizing on the extensive use of text messaging among Chinese mobile phone users, letting them reach the Internet by text message.
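Information distance is defined in terms of Kolmogorov complexity, which cannot be computed exactly. A common practical stand-in is the normalized compression distance, which replaces Kolmogorov complexity with the output size of an ordinary compressor. The sketch below is illustrative only and is not the project's implementation: the use of zlib, the function names, and the idea of ranking candidate strings against a query are all assumptions made for the example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input (a rough proxy for complexity)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two strings.

    Values near 0 suggest the strings share most of their information;
    values near 1 suggest they share little. Very short inputs give noisy
    results because compressor overhead dominates.
    """
    bx, by = x.encode("utf-8"), y.encode("utf-8")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def rank_candidates(query: str, candidates: list[str]) -> list[str]:
    """Hypothetical helper: order candidate texts from closest to farthest."""
    return sorted(candidates, key=lambda c: ncd(query, c))
```

A real question-answering system would combine a measure like this with query analysis and indexing; the sketch only shows how a distance between two pieces of text can be estimated with a standard compressor.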

The project’s main goal is to minimize the barriers preventing Chinese citizens from acquiring information through the Internet. The specific objectives are to:

  • develop a natural language question-answering system that answers both simple questions and a reasonable portion of complex questions; has a Braille interface; has a Chinese-English interface enabling Chinese users to query in their mother tongue, search the English Internet, and receive English/Chinese answers; and is suitable for cellphone users with small display areas;
  • develop a Canada-China Centre of Excellence on Internet information acquisition at Tsinghua University, China;
  • perform research on factoid query problems, complex and list query problems, query analysis, solution (answer) analysis, and a new indexing data structure that is suitable for question answering; and
  • train Chinese and Canadian graduate students in computer science in the specific areas of data mining, search engines, natural language processing, and search algorithms.

Expected outcomes

The project aims to:

  • develop a new-generation, natural-language-based Internet search engine for the visually impaired, cell phone users, and Chinese citizens who cannot read English;
  • create a Canada-China Centre of Excellence in the field of Internet information acquisition; and
  • train graduate students in computer science in both Canada and China.

Lead Researchers 

Xiaoyan Zhu

IDRC Research Chair in Information Technology, Tsinghua University, China

Xiaoyan Zhu is a Professor in the Department of Computer Science and Technology, Tsinghua University, the deputy head of the State Key Lab of Intelligent Technology and Systems, and the head of the Tsinghua-HP Multimedia Research Lab. An acknowledged leader in areas of information processing, her research interests include natural language processing, Question and Answer systems, computer systems for the visually impaired, pattern recognition, neural networks, machine learning, and bioinformatics.

The results of Zhu’s research have been commercialized by Toshiba and Fujitsu and have been successfully applied to Chinese-Braille computer systems. Her work on natural language search engines opens new ways to search the Internet. She has successfully conducted research programs supported by China’s National Basic Research Program, the National High Technology Research and Development Program of China, and the National Natural Science Foundation of China.

Xiaoyan Zhu has authored or coauthored more than 100 papers and conference proceedings.

Ming Li

Canada Research Chair in Bioinformatics, University of Waterloo, Canada

Ming Li is a Professor of Computer Science at the University of Waterloo and a Canada Research Chair in Bioinformatics. The world’s leading expert on measuring information distance between two information-carrying entities, Li is a Fellow of the Royal Society of Canada, the Association for Computing Machinery, and the Institute of Electrical and Electronics Engineers, Inc. He received Canada’s E.W.R. Steacie Fellowship Award in 1996 and the Killam Fellowship in 2001.

Together with University of Amsterdam Professor Paul Vitanyi, Ming Li pioneered the applications of Kolmogorov complexity. His work on information distance and normalized information distance, in particular, has found many applications in document comparison, genome evolution, and time series analysis, as well as in a Question and Answer search engine on the Internet. He is a co-managing editor of the Journal of Bioinformatics and Computational Biology and an associate editor-in-chief of the Journal of Computer Science and Technology. He also serves as an associate editor for the Journal of Computer and System Sciences and the Journal of Combinatorial Optimization.

Project ID

104519-006

Project status

Active

Start Date

Monday, June 1, 2009

End Date

Sunday, June 1, 2014

Duration

60 months

IDRC Officer

David O'Brien

Total funding

CA$ 1,000,000

Countries

China, Canada

Program

International Research Chairs Initiative

Project Leader

Ming Li

Institution

Tsinghua University

Project Leader

Ming Li

Institution

University of Waterloo

Institution Country

Canada

Institution Website

http://www.uwaterloo.ca