- Armonk NY, US Charles E. BELLER - Baltimore MD, US Edward Graham KATZ - Washington DC, US Christopher F. ACKERMANN - Fairfax VA, US
International Classification:
G06F 16/14 G06Q 10/10 G06N 5/04 G06F 40/30
Abstract:
Provided are a method, system, and computer program product in which operations are performed to receive a question that includes a descriptor and an indication that indicates that a unique answer to the question is expected. A determination is made of instances of matching descriptors and descriptor targets from a set of documents. The determined descriptor targets are compared for consistency. In response to determining that the determined descriptor targets are inconsistent, more restrictive descriptors are iteratively generated via a selection model based on metadata associated with the question until the descriptor targets are consistent. An answer to the question is returned from consistent descriptor targets.
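The refinement loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the documents are reduced to hypothetical (descriptor, target) pairs, and the selection model driven by question metadata is replaced by a fixed list of refinement terms. The names `find_targets` and `answer_unique` are invented for the sketch and do not come from the patent.

```python
def find_targets(documents, descriptor):
    """Return targets whose descriptor text contains every word of `descriptor`."""
    words = set(descriptor.lower().split())
    return {target for desc, target in documents
            if words <= set(desc.lower().split())}


def answer_unique(documents, descriptor, refinements):
    """Tighten the descriptor until its matching targets agree on one answer.

    `refinements` stands in for the selection model (an assumption): each
    term is prepended in turn to make the descriptor more restrictive.
    """
    targets = find_targets(documents, descriptor)
    for extra in refinements:
        if len(targets) <= 1:  # targets are already consistent (or empty)
            break
        descriptor = f"{extra} {descriptor}"
        targets = find_targets(documents, descriptor)
    answer = next(iter(targets)) if len(targets) == 1 else None
    return descriptor, answer


# Hypothetical data: two documents use the same base descriptor "ceo".
docs = [("acme ceo", "Alice"), ("globex ceo", "Bob")]
result = answer_unique(docs, "ceo", ["acme"])
```

Here the bare descriptor "ceo" matches two different targets, so one refinement term is applied before a single answer remains. If no refinement yields a unique target, the sketch returns `None` for the answer rather than guessing.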
A mechanism is provided in a data processing system to implement a cognitive natural language processing (NLP) system with descriptor uniqueness identification to support named entity mention clustering. The mechanism annotates a set of documents from a corpus of documents for entity types and mentions, collects descriptor usages from all documents in the corpus of documents, analyzes the descriptor usages to classify the descriptors as base terms or modifier terms, generates compatibility scores for the descriptors, and performs entity merging of entity clusters based on the compatibility scores.
- Armonk NY, US Christopher F. Ackermann - Fairfax VA, US Kristen Maria Summers - Takoma Park MD, US David McQueeney - Yorktown Heights NY, US Rob High - Round Rock TX, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/27 G06F 17/28 G06N 20/00
Abstract:
Embodiments relate to an intelligent computer platform to selectively amend one or more tokens in a document. A first document set is subjected to natural language processing (NLP) and a vector score is identified for two or more documents of the first document set. Upon receipt of a new document, the new document is subjected to NLP and a new document vector score is identified. The new document is analyzed against the first document set, and the identified vector score of the first document set is compared to the vector score of the new document. One or more tokens of the new document are amended responsive to the comparison, and a new document version is created from the selective amendment.
- Armonk NY, US Christopher F. Ackermann - Fairfax VA, US Kristen Maria Summers - Takoma Park MD, US David McQueeney - Yorktown Heights NY, US Rob High - Round Rock TX, US
Assignee:
International Business Machines Corporation - Armonk NY
International Classification:
G06F 17/27 G06F 21/62 G06N 5/04 G06F 16/93
Abstract:
Embodiments relate to an intelligent computer platform to selectively amend one or more document elements. A first document is subjected to natural language processing (NLP) and two or more document characteristics are subjected to an assessment to produce a characteristic value. The document characteristics and corresponding characteristic values are analyzed to produce a characteristic profile for each identified document characteristic. Upon receipt of a new document, document characteristic data and corresponding characteristic value(s) are identified. The corresponding characteristic value(s) of the new document is applied against the produced characteristic profile. New document characteristic data is selectively amended responsive to the comparison, and a new document version is created from the selective amendment.
Methods And Systems For Providing Suggestions To Complete Query Sessions
- Armonk NY, US Christopher ACKERMANN - Fairfax VA, US Kristen SUMMERS - Takoma Park MD, US Rob HIGH - Round Rock TX, US David MCQUEENEY - Yorktown Heights NY, US
Assignee:
INTERNATIONAL BUSINESS MACHINES CORPORATION - Armonk NY
International Classification:
G06F 16/9032 G06F 16/2457
Abstract:
Embodiments for identifying entities relevant to queries are provided. At least one query is received from a user. The at least one query is associated with at least one entity. Results of the at least one query are analyzed to identify related entities. The related entities are analyzed based on a relevancy score and an information enhancement score for each of the related entities to generate a ranking of the related entities. At least one of the related entities is provided to the user based on the ranking of the related entities.
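A minimal sketch of the ranking step, assuming the two scores are already computed and are combined by a weighted sum. The abstract does not say how the scores are combined, so the `weight` parameter and the `RelatedEntity` type are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RelatedEntity:
    name: str
    relevancy: float    # how closely the entity relates to the query
    enhancement: float  # how much new information the entity would add


def rank_entities(entities, weight=0.5):
    """Sort entities by a weighted sum of the two scores, best first."""
    def combined(entity):
        return weight * entity.relevancy + (1 - weight) * entity.enhancement
    return sorted(entities, key=combined, reverse=True)


# Hypothetical scores for three related entities.
candidates = [
    RelatedEntity("A", relevancy=0.9, enhancement=0.1),
    RelatedEntity("B", relevancy=0.6, enhancement=0.8),
    RelatedEntity("C", relevancy=0.2, enhancement=0.3),
]
ranking = [entity.name for entity in rank_entities(candidates)]
```

With equal weighting, entity B ranks first even though A is more relevant, because B contributes more new information.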
- Armonk NY, US Kristen Maria Summers - Takoma Park MD, US Christopher F. Ackermann - Fairfax VA, US Michael Drzewucki - Woodbridge VA, US Andrew Doyle - Mount Rainier MD, US
International Classification:
G06F 16/242
Abstract:
A method includes searching an initial set of documents for an initial set of query names. Each query name of the initial set of query names is associated with at least one document of the initial set of documents. The method also includes prioritizing the initial set of query names based on at least one topic label. The method also includes searching an additional set of documents to generate candidate query names. The method also includes prioritizing the candidate query names based on the at least one topic label. The method further includes applying a temporal search filter to each candidate query name to determine whether the candidate query name was processed within a time frame. The method further includes performing disambiguation processing on each candidate query name not processed within the time frame.
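The prioritization and temporal-filter steps can be sketched as below. The data structures are assumptions: `name_topics` maps each hypothetical query name to its topic labels, and `last_processed` records when a name was last handled. Neither appears in the abstract.

```python
from datetime import datetime, timedelta


def prioritize(names, topic_labels, name_topics):
    """Order names so those sharing more topic labels come first."""
    return sorted(
        names,
        key=lambda name: len(topic_labels & name_topics.get(name, set())),
        reverse=True,
    )


def select_for_disambiguation(candidates, last_processed, now, window):
    """Keep only candidates that were not processed within `window` of `now`."""
    return [
        name for name in candidates
        if name not in last_processed or now - last_processed[name] > window
    ]


# Hypothetical candidate names and processing history.
now = datetime(2024, 1, 10)
name_topics = {
    "J. Smith": {"finance"},
    "A. Jones": {"finance", "fraud"},
    "B. Lee": set(),
}
last_processed = {
    "A. Jones": datetime(2024, 1, 9),
    "J. Smith": datetime(2023, 12, 1),
}
ordered = prioritize(["J. Smith", "A. Jones", "B. Lee"], {"finance", "fraud"}, name_topics)
pending = select_for_disambiguation(ordered, last_processed, now, timedelta(days=7))
```

In this example "A. Jones" ranks highest but was processed the day before, so only the other two names proceed to disambiguation.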
Ranking Collections Of Document Passages Associated With An Entity Name By Relevance To A Query
- ARMONK NY, US CHRISTOPHER F. ACKERMANN - FAIRFAX VA, US ANDREW DOYLE - MOUNT RAINIER MD, US MICHAEL DRZEWUCKI - WOODBRIDGE VA, US CHARLES E. BELLER - BALTIMORE MD, US
International Classification:
G06F 16/2457 G06F 16/93 G06F 16/28 G06F 16/2458
Abstract:
Query service receives a query comprising at least a name component. The query service searches a document corpus to identify multiple passages, each comprising a mention of the name component within a selection of one or more documents of the document corpus. The query service collects bins, each bin comprising a distinct selection of the passages from the one or more documents, each of the bins identifying a separate relationship the name component participates in within the distinct selection of passages. The query service assesses a separate score of each respective bin reflecting the relevance of each respective bin to the query. The query service returns a response to the query with the bins each ranked according to each separate score.
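One way to picture the binning and ranking is the sketch below. It assumes each passage arrives already tagged with the relationship it expresses, and it scores a bin by the fraction of its words that appear in the query. Both the tagging and the scoring function are stand-ins chosen for brevity, not the method the patent claims.

```python
from collections import defaultdict


def rank_bins(passages, query):
    """Group passages by relationship, then rank the bins by query overlap."""
    bins = defaultdict(list)
    for relationship, text in passages:
        bins[relationship].append(text)

    query_terms = set(query.lower().split())

    def score(texts):
        words = [word for text in texts for word in text.lower().split()]
        if not words:
            return 0.0
        return sum(word in query_terms for word in words) / len(words)

    scored = [(relationship, score(texts)) for relationship, texts in bins.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical passages mentioning the same name in two different roles.
passages = [
    ("athlete", "jordan played basketball in chicago"),
    ("athlete", "jordan won a basketball title"),
    ("country", "jordan borders the dead sea"),
]
ranked = rank_bins(passages, "jordan basketball")
```

The "athlete" bin ranks above the "country" bin because a larger share of its words match the query terms.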
Query-Directed Discovery And Alignment Of Collections Of Document Passages For Improving Named Entity Disambiguation Precision
- ARMONK NY, US CHRISTOPHER F. ACKERMANN - FAIRFAX VA, US MICHAEL DRZEWUCKI - WOODBRIDGE VA, US ANDREW DOYLE - MOUNT RAINIER MD, US EDWARD G. KATZ - WASHINGTON DC, US KRISTEN M. SUMMERS - TAKOMA PARK MD, US
Abstract:
A query system identifies a collection of discovered entity bins each comprising unstructured documents with mentions of a name element from a name query and each identified with a particular named entity identifiable from the name element. The query system identifies, from a knowledge base of structured documents, based on identifier components with the name element, candidate records identifying the respective identifier components with the name element, the one or more identifier components identified among the discovery entity bins. For each respective selection of candidate records associated with each bin, the query system applies one or more alignment threshold rules to rank the likelihood that each candidate record within each respective selection matches one or more characteristics of the respective discovery entity bin. The query system aligns, with each of the discovery entity bins, a highest ranked record from among each respective selection of candidate records, where the respective aligned highest ranked record identifies a distinct named entity from among the named entities.
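The alignment step can be illustrated with a simple overlap rule. In this sketch each discovered bin and each knowledge-base record is reduced to a set of characteristic terms, and the single threshold `min_overlap` stands in for the alignment threshold rules. These representations are assumptions, and the sketch does not enforce that each record is aligned with only one bin.

```python
def align_bins(bins, records, min_overlap=1):
    """Map each bin id to its best-matching record id, if any passes the threshold."""
    aligned = {}
    for bin_id, traits in bins.items():
        # Rank candidate records by how many characteristics they share with the bin.
        ranked = sorted(
            records.items(),
            key=lambda item: len(traits & item[1]),
            reverse=True,
        )
        if ranked and len(traits & ranked[0][1]) >= min_overlap:
            aligned[bin_id] = ranked[0][0]
    return aligned


# Hypothetical bins and knowledge-base records.
bins = {
    "bin_a": {"chicago", "basketball"},
    "bin_b": {"amman", "kingdom"},
}
records = {
    "rec_player": {"basketball", "chicago", "1963"},
    "rec_country": {"kingdom", "amman", "asia"},
}
alignment = align_bins(bins, records)
```

A bin whose best candidate shares fewer than `min_overlap` characteristics is left unaligned rather than forced onto a weak match.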
IBM
Senior Managing Consultant
CACI International Inc Feb 2012 - Feb 2016
Principal Application Engineer
University of Maryland Jun 2010 - Feb 2012
Research Scientist
University of Maryland Jun 2010 - Feb 2012
Lecturer
Fraunhofer USA CESE (Center For Experimental Software Engineering) Aug 2005 - Jun 2010
Associate Research Scientist
Education:
University of Maryland 2010
University of Maryland 2006 - 2010
Doctorates, Doctor of Philosophy, Computer Science
University of Maryland 2006 - 2009
Masters, Computer Science
Hochschule Mannheim 2002 - 2006
Bachelors, Computer Science
Berufsbildende Schule Technik 2 Ludwigshafen 2000 - 2002
Berufsbildende Schule Technik 2 Ludwigshafen 1996 - 2000
Skills:
Software Development Testing Architecture C++ Software Engineering SQL Server Spring Hibernate Java Programming SQL Python HTML Process Development C JavaScript PHP XML Microsoft Office Research Engineering C# Computer Science Eclipse Databases UML Technical Writing Agile Visual Basic Machine Learning Visual Studio Subversion ASP.NET jQuery Process Improvement Software Project Management Teaching Proposal Writing Analytics Big Data Integration Distributed Systems Software Design Requirements Analysis Architectures Agile Methodologies
Languages:
German English
Certifications:
ITIL Foundation, SEI Introduction to CMMI for Development V1.2 (CMMI-DEV), SEI Services Supplement for CMMI V1.2 (CMMI-SVC), EXIN, Abridge Technology
Fraunhofer Center for Experimental Software Engineering
Mar 2004 to 2000 Research Scientist (Graduate Research Assistant, Intern)
HIMA Paul Hildebrandt GmbH Mannheim Jun 2003 to Mar 2004 Intern
Azteka Consulting GmbH Mannheim Mar 2002 to Jun 2003 Programmer
Education:
University of Maryland College Park, MD 2006 to 2010 Ph.D. in Computer Science
Hochschule Mannheim Mannheim Jan 2002 to Jan 2006 Bachelor in Computer Science