OpenEphyra is a question answering (QA) system. It retrieves answers to natural language questions from the Web and other sources. OpenEphyra comes with implementations of algorithms that proved effective in Carnegie Mellon's Ephyra system, which participated in the TREC evaluations. It is platform independent and can be set up in just a few minutes. The goal of this project is to give researchers the opportunity to develop new QA techniques without worrying about the end-to-end system.
Python Web Graph Generator is a threaded Web graph (power-law random graph) generator. It can generate a synthetic Web graph of about one million nodes in a few minutes on a desktop machine. It supports both directed and undirected graphs. It implements a threaded variant of the R-MAT algorithm. With a small tweak, it can produce graphs representing social networks or community networks. It can also output the connected components of a graph.
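The R-MAT idea mentioned in this entry can be illustrated with a minimal, single-threaded sketch. It is not the project's own code: the function names (`rmat_edge`, `rmat_graph`, `connected_components`) and the quadrant probabilities `a`, `b`, `c` (with `d = 1 - a - b - c`) are assumptions chosen for illustration. Each edge is placed by recursively picking one quadrant of the adjacency matrix, which is what tends to produce a skewed, power-law-like degree distribution.

```python
import random


def rmat_edge(scale, a=0.57, b=0.19, c=0.19, rng=random):
    """Pick one directed edge (src, dst) among 2**scale nodes.

    a, b, c are assumed quadrant probabilities; d = 1 - a - b - c.
    """
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src <<= 1
        dst <<= 1
        if r < a:
            pass                # top-left quadrant: both bits stay 0
        elif r < a + b:
            dst |= 1            # top-right quadrant
        elif r < a + b + c:
            src |= 1            # bottom-left quadrant
        else:
            src |= 1            # bottom-right quadrant
            dst |= 1
    return src, dst


def rmat_graph(scale, num_edges, directed=True, seed=None):
    """Return a set of distinct edges without self-loops."""
    n = 1 << scale
    max_edges = n * (n - 1) if directed else n * (n - 1) // 2
    if num_edges > max_edges:
        raise ValueError("more edges requested than the graph can hold")
    rng = random.Random(seed)
    edges = set()
    while len(edges) < num_edges:
        s, t = rmat_edge(scale, rng=rng)
        if s == t:
            continue            # skip self-loops
        if not directed and s > t:
            s, t = t, s         # store undirected edges in canonical order
        edges.add((s, t))
    return edges


def connected_components(num_nodes, edges):
    """Group nodes into (weakly) connected components using union-find."""
    parent = list(range(num_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for s, t in edges:
        rs, rt = find(s), find(t)
        if rs != rt:
            parent[rs] = rt

    groups = {}
    for v in range(num_nodes):
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())
```

For example, `rmat_graph(10, 5000, directed=False, seed=42)` would yield 5000 undirected edges over 1024 nodes, and `connected_components(1024, edges)` would list the node groups, with isolated nodes appearing as singleton components.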
Isobel is a framework for building complex information retrieval and analysis systems. Isobel is functionally divided into two subsystems: Isobel Gatherer (the crawling and filtering subsystem) and Isobel Analyzer (the analysis subsystem). The two subsystems can also be used separately. Isobel Gatherer offers ready-to-use services such as content fetching, scheduling, document format conversion, hyperlink graph storage and analysis, and content storage and indexing. A programmer can easily add new services. Isobel Analyzer uses the IBM UIMA architecture so that it can reuse analysis components developed for that architecture.
SENTENSA Knowledge Miner is a platform-independent tool for searching any text. SENTENSA uses robust methods of indexing and searching text, leveraging more than 20 years of experience in information retrieval. SENTENSA products offer advanced text retrieval for large databases, intended to make searches for key information fast and effective. You can index on one platform and query on another.
LinkGrammar-WN is a lexicon expansion for the Link Grammar Parser (LGP). The Link Grammar Parser is a syntactic parser for English that handles a wide variety of syntactic constructions and is considered quite robust. The LinkGrammar-WN project aims to import lexical information from WordNet in order to increase the size of the LGP lexicon. This project is relevant to anyone working on natural language processing (NLP) of English text.
Mindmeld is an enterprise-capable knowledge sharing system designed for any Web community that needs to capture and share information. It is unique in that the knowledge base grows smarter every time it's used. It incorporates terms used in each search into a contextual map of the answer itself, continually improving its ability to derive contextual information from a given search. The system learns how people typically search for an answer by identifying which terms are most valuable in any specific context.