QA may be an area we should pay attention to.
Below are several CMU CS research areas relevant to us:
http://www.csd.cs.cmu.edu/research/areas/ai/ (AI research group) AI: Planning, Knowledge Representation, and Game Theory
M. Bilotti, L. Zhao, E. Nyberg, and J. Callan. (To appear.) "Focused retrieval over richly-annotated collections." SIGIR 2008 Workshop on Focused Retrieval. Singapore.
http://www.cs.cmu.edu/~callan/Papers/
http://www.cs.cmu.edu/~callan This researcher works on IR and is worth following; he has also done some work on information extraction from blogs.
Pedro, Vasco, Eric Nyberg and Jaime Carbonell. 2006. "Federated Ontology Search". Proceedings of the 1st International Workshop on Semantic Information Integration on Knowledge Discovery (SIIK 2006), Yogyakarta, Indonesia.
Wang, Mengqiu, Kenji Sagae and Teruko Mitamura. 2006. "A Fast, Accurate Deterministic Parser for Chinese". Proceedings of COLING/ACL 2006. Sydney, Australia. (ZXW: worth a read?)
| JAVELIN Open-Domain Question Answering |
Typical IR systems return a set of documents, or perhaps a set of queries. LTI Question Answering software extracts information from documents in large, open-domain corpora to answer questions in subject areas that are not known in advance.
Contact: Eric Nyberg and Teruko Mitamura
| Utility-based Information Distillation |
We study supervised, unsupervised and semi-supervised learning techniques for automatically detecting novel events and tracking new trends in relevant events from temporally-ordered documents, for dynamically updating user profiles in context, and for optimizing the utility of passage selection and summarization based on relevance, novelty, readability and user cost (e.g., time). Collaborative and adaptive information filtering among multiple users is also part of the open challenge.
Contacts: Yiming Yang and Jaime Carbonell
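The relevance/novelty trade-off described above is the kind of objective formalized by Carbonell's Maximal Marginal Relevance (MMR) criterion. A minimal sketch, assuming toy relevance scores and a pairwise similarity function (both invented here for illustration, not the project's actual model):

```python
def mmr_select(candidates, relevance, similarity, lam=0.7, k=2):
    """Greedily pick k passages, trading off relevance against redundancy:
    score(c) = lam * relevance(c) - (1 - lam) * max similarity to already-selected."""
    selected = []
    pool = list(candidates)
    while pool and len(selected) < k:
        def score(c):
            redundancy = max((similarity(c, s) for s in selected), default=0.0)
            return lam * relevance[c] - (1 - lam) * redundancy
        best = max(pool, key=score)
        selected.append(best)
        pool.remove(best)
    return selected
```

With `lam` near 1 the selector behaves like plain relevance ranking; lowering `lam` penalizes near-duplicate passages, which is the "novelty" part of the utility described above.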
Knowledge Acquisition Projects
| Dark Matter Knowledge Acquisition from Text |
LTI is participating in Project Halo, a research effort to design and implement a "Digital Aristotle". Our focus is on the definition of KAL (Knowledge Acquisition Language), a form of controlled language that can be used to acquire domain knowledge from subject matter experts in domains such as Chemistry, Physics and Biology.
Contacts: Eric Nyberg and Teruko Mitamura
| IAMTC Interlingual Annotation of Multilingual Text Corpora |
IAMTC is a multi-site NSF ITR project focusing on the annotation of six sizable bilingual parallel corpora for interlingual content, with the goal of providing a significant data set for improving knowledge-based approaches to machine translation (MT) and a range of other Natural Language Processing (NLP) applications. The central goals of the project are: (1) to produce a practical, commonly-shared system for representing the information conveyed by a text, or interlingua (IL), (2) to develop a methodology for accurately and consistently assigning such representations to texts across languages and across annotators, and (3) to annotate a sizable multilingual parallel corpus of source-language texts and translations for IL content.
Contacts: Lori Levin and Teruko Mitamura
| Scone Symbolic Knowledge Base |
Scone is a symbolic knowledge representation system designed to run well on a standard workstation. Scone's primary design goals are ability to represent "common sense" knowledge, efficiency in performing inference and search, scalability to several million assertions, and ease of use.
Also see: Tutalk and A Shared Resource for Robust Semantic Interpretation for Both Linguists and Non-Linguists.
| A Shared Resource for Robust Semantic Interpretation for Both Linguists and Non-Linguists |
The majority of existing authoring tools for constructing advanced conversational interfaces were designed for use by computational linguists. Our research goal is to explore strategies for supporting the development of language understanding interfaces by non-linguists. In our previous work we developed Carmel-Tools, a behavior-oriented authoring environment for building semantic knowledge sources for the CARMEL core understanding engine. In our recent work, we have begun conducting user studies that aim to better understand how people process large amounts of corpus data when faced with a task comparable to programming a dialogue agent using a data-driven methodology. Our preliminary user study results hint that participants (1) introduce a bias when processing data sequentially (i.e., primacy effects) and (2) naturally represent semantic relatedness using spatial proximity. Based on these observations, we have developed the InfoMagnets interface, which provides a physical metaphor for exploratory data analysis that is consistent with user conceptions of semantic relatedness and helps users avoid primacy bias by giving them a bird's-eye view of their whole inventory of dialogue topics simultaneously.
| WebKB The World Wide Knowledge Base Project |
The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of this research project is to automatically create a computer-understandable knowledge base whose content mirrors that of the World Wide Web. If successful, this would lead to much more effective retrieval of information from the web, and to the use of this information to support new knowledge-based problem solvers. Our approach is to use machine learning algorithms to train the system to extract information of the desired types. Our web page describes the overall approach, plus several new algorithms we have developed that successfully extract information from the web.
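The "train the system to extract information" idea can be illustrated with the simplest text classifier of that era, multinomial Naive Bayes over page words. The class labels and training pages below are invented for illustration and are not WebKB's actual data or algorithm:

```python
import math
from collections import Counter

class NaiveBayesPageClassifier:
    """Toy multinomial Naive Bayes: label a page (bag of words) with the
    class whose word distribution makes the page most probable."""

    def fit(self, pages, labels):
        self.classes = set(labels)
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for words, c in zip(pages, labels):
            self.word_counts[c].update(words)
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict(self, words):
        def log_prob(c):
            # Log prior plus add-one-smoothed log likelihood of each word.
            total = sum(self.word_counts[c].values()) + len(self.vocab)
            lp = math.log(self.class_counts[c] / sum(self.class_counts.values()))
            for w in words:
                lp += math.log((self.word_counts[c][w] + 1) / total)
            return lp
        return max(self.classes, key=log_prob)
```

A real WebKB-style system would combine such page classifiers with link structure and relation extractors, but the supervised-learning core is the same shape.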
Research / AREAS
Computer Science Department (See Faculty Research Guide)
Search, planning, and knowledge representation
Another AI focus at Carnegie Mellon CSD is search, planning, and knowledge representation.
This is often intertwined with multiagent systems.
Search algorithms for market clearing
http://www.informatik.uni-trier.de/~ley/db/indices/a-tree/c/Carbonell:Jaime_G=.html
Sandholm pioneered the idea of mediated marketplaces that allow participants to use highly expressive preference-specification languages in order to reach better outcomes. He developed the world's fastest optimal algorithms for clearing such markets. On highly expressive real-world procurement auctions, these algorithms have optimally solved problems with over 2.6 million bids and over 160,000 items to be procured, as well as instances with over 320,000 side constraints. The techniques combine tree search from AI, mixed integer programming from operations research, and dozens of techniques he developed. Since 2002, he has used these techniques to clear over $20 billion of the most combinatorial procurement auctions ever conducted, resulting in value creation (through increased economic efficiency) of over $2 billion. He has applied many of the techniques to other search problems as well.
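The underlying optimization is the winner determination problem: choose a set of non-overlapping bids maximizing total price. A brute-force sketch, only to show the problem structure (real clearing engines use branch-and-bound tree search plus mixed integer programming, not enumeration):

```python
from itertools import combinations

def winner_determination(bids):
    """Exhaustively find the set of item-disjoint bids with maximum total price.
    bids: list of (item_set, price) pairs. Exponential in the number of bids."""
    best_value, best_combo = 0.0, ()
    for r in range(len(bids) + 1):
        for combo in combinations(bids, r):
            items = [i for item_set, _ in combo for i in item_set]
            if len(items) != len(set(items)):  # two accepted bids share an item
                continue
            value = sum(price for _, price in combo)
            if value > best_value:
                best_value, best_combo = value, combo
    return best_value, best_combo
```

Even this toy makes the combinatorial blow-up obvious: accepting one bundle bid changes which other bids remain feasible, which is why scalable solvers need the search and pruning machinery described above.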
Search for homeland security
Carbonell and Fink have studied techniques for fast identification of matches in multi-attribute exchange markets, which allow fast-paced trading of complex non-standardized goods. They have also applied these matching techniques to a homeland-security project, focused on identifying suspicious and unexpected patterns in massive structured databases. For example, the developed techniques may allow the detection of money-laundering patterns in banking transactions.
Distributed planning
Guestrin is working on efficient distributed multiagent coordination, planning and learning. Using the factored Markov decision processes representation, which exploits problem-specific structure using Bayesian networks, he designed efficient approximate planning algorithms, leveraged by a novel linear programming decomposition technique. The decomposition technique yields efficient distributed algorithms for planning and learning in collaborative multiagent settings, where multiple decision makers must coordinate their actions to maximize a common goal. Guestrin also works on wireless sensor networks using efficient inference methods from probabilistic graphical models.
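The coordination problem above can be stated concretely: agents must jointly maximize a team payoff that factors into local terms, each touching only a few agents. The sketch below solves a tiny instance by brute force over joint actions; Guestrin's coordination-graph approach reaches the same answer without enumerating the joint action space, via variable elimination (the payoff functions here are invented for illustration):

```python
from itertools import product

def best_joint_action(action_sets, local_payoffs):
    """Maximize a factored team payoff sum_j f_j(actions in scope_j).
    action_sets: per-agent action tuples.
    local_payoffs: list of (scope, fn), scope = tuple of agent indices."""
    best_value, best_joint = float("-inf"), None
    for joint in product(*action_sets):
        total = sum(fn(*(joint[i] for i in scope)) for scope, fn in local_payoffs)
        if total > best_value:
            best_value, best_joint = total, joint
    return best_joint, best_value
```

The factored structure (each `fn` depending on only a subset of agents) is exactly what the Bayesian-network representation and the LP decomposition exploit to avoid this exponential enumeration.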
Probabilistic replanning
Veloso et al. introduced extended rapidly-exploring random trees (E-RRT) as a novel reuse strategy that addresses the general replan/reuse question, in which a past plan probabilistically guides a new search. The replan algorithm considers an initial state, a path, and a goal to be achieved; from the initial state, it grows a search tree by extending towards the goal with some probability p, towards a point in the past path with probability r, and towards a random exploration target with the remaining probability 1 − p − r. The past (or failed) plan is effectively used as a bias in the new search, thereby solving the general reuse problem in a probabilistic manner.
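The biased target sampling at the heart of this idea fits in a few lines. A minimal sketch for a 2-D workspace, assuming illustrative probability values and a point-tuple representation (the full E-RRT planner also handles tree extension, collision checking, etc.):

```python
import random

def choose_extension_target(goal, old_path, bounds, p_goal=0.1, p_path=0.6,
                            rng=random):
    """Pick the target an RRT extension step grows toward: the goal with
    probability p_goal, a point on the previous (possibly failed) path with
    probability p_path, and a uniformly random point otherwise."""
    r = rng.random()
    if r < p_goal:
        return goal
    if r < p_goal + p_path and old_path:
        return rng.choice(old_path)
    (lo_x, hi_x), (lo_y, hi_y) = bounds
    return (rng.uniform(lo_x, hi_x), rng.uniform(lo_y, hi_y))
```

Setting `p_path` to zero recovers an ordinary goal-biased RRT; raising it makes the new search hug the old plan, which is exactly the probabilistic reuse bias described above.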
Learning domain-specific planners
Instead of hand-writing domain-specific planners to solve large-scale planning problems, Veloso uses example plans to demonstrate how to solve problems in a particular domain, and uses that information to automatically learn domain-specific planners that model the observed behavior. Her group developed the ITERANT algorithm for identifying repeated structures in observed plans and showed how to convert looping plans into domain-specific planners, or dsPlanners. Looping dsPlanners are able to apply experience acquired from the solutions to small problems to solve arbitrarily large ones. The automatically learned dsPlanners can solve large-scale problems much more effectively than state-of-the-art general-purpose planners, and can solve problems many orders of magnitude larger than general-purpose planners can handle.
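A stripped-down version of the "find repeated structure, roll it into a loop" idea: detect when an observed plan is k repetitions of a block and replace it with (block, k). This is only a sketch of the simplest case; the actual ITERANT algorithm handles far more general repeated structure:

```python
def roll_into_loop(plan):
    """Collapse a plan that is k repetitions of one action block into
    (block, k); return (plan, 1) if no such repetition exists."""
    n = len(plan)
    for size in range(1, n + 1):
        if n % size == 0 and plan == plan[:size] * (n // size):
            return plan[:size], n // size
    return plan, 1
```

Once a loop body is identified, the same body generalizes: a dsPlanner can execute it k times for any k, which is how experience from small problems transfers to arbitrarily large ones.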
Knowledge representation
Fahlman is working on Scone, a knowledge representation system. In addition to representing all kinds of knowledge (including "common sense" knowledge), Scone is designed to support efficient inference and search. Compared to some other knowledge-representation efforts, Scone's emphasis is on efficiency, scalability (up to a few million entities and statements about them), and ease of use. Members of the Scone research group are working on a natural-language front-end that will make it possible to add knowledge to Scone and to ask questions using simple English. Scone is intended to be a software component, useful in a large number of other software systems, much as databases are used today. As a longer-term goal, the Scone group is working to develop a flexible declarative representation for episodic memory, i.e., a hierarchical representation of action-types and event-types, along with the entities involved in and affected by each event.
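The flavor of inference such a system supports can be sketched with a toy semantic network: is-a links plus inheritance queries. This is only an illustration of the idea, invented here; Scone itself is far richer (exceptions, roles, contexts, and marker-passing-style search):

```python
class ToyKB:
    """Minimal is-a hierarchy with transitive ancestor queries."""

    def __init__(self):
        self.parents = {}  # node -> set of direct parents

    def add_is_a(self, child, parent):
        self.parents.setdefault(child, set()).add(parent)

    def is_a(self, node, ancestor):
        """True if ancestor is reachable from node via is-a links."""
        if node == ancestor:
            return True
        return any(self.is_a(p, ancestor) for p in self.parents.get(node, ()))
```

Scaling queries like this to millions of entities while keeping them fast is precisely the efficiency/scalability emphasis described above.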
- Computational Molecular Biology
- Computational Neuroscience
- Computer Architecture
- Databases
- Formal Methods
- Graphics
- Human-Computer Interaction
- Large-Scale Distributed Systems
- Machine Learning
- Mobile and Pervasive Computing
- Networking
- Principles of Programming
- Robotics
- Security
- Scientific Computing
- Software Engineering
- Technology and Society
- Vision, Speech, and Natural Languages
Robotics Institute (See Faculty Research Guide)
- Artificial Intelligence
- Graphics & Visualization
- Human-Computer Interaction
- Manipulation
- Manufacturing & Inspection
- Medical Applications
- Mobile Robotics
- Other Areas
- Space Robotics
- Systems & Control
- Vision, Perception & Sensors
Institute for Software Research, International
Human-Computer Interaction Institute (See All HCII Projects )
- Cognitive Modeling
- PACT Center
- Project LISTEN
- Computing Workshop
- Designing Interfaces to Support Human Attention
- Interaction Design
- Interactive Systems Laboratories (ISL)
- Societal Impacts of Computing
- Speech
- Usability Analysis
- User Interface Design
Language Technologies Institute (See All LTI Projects)
- Computer Aided Language Learning
- Computational Linguistics
- Information Retrieval
- Machine Translation
- Speech
Machine Learning Department (See Research Guide)