Please feel free to contribute by suggesting new tools or by pointing out mistakes in the data.
|Sketch Engine||A corpus manager and text analysis software developed by Lexical Computing.||annotation, concordancer, tagging, sampling, search, visualization, wordlists, keywords, compilation, text analysis, n-grams, collocation, statistics, segmentation, analysis, crawler, parallel, colligation, annotations, tokenization, query, ngrams, boilerplate remover, comparison, frequency analysis, information retrieval, data, sentence boundary, corpus creation, duplicate remover, regex, thesaurus, meta modelling, dictionary, text-processing, xml, frequency, trends patterns, web-based, collocates, collocation analysis, word cloud, co-occurrence, KWIC, corpus management, multilingual, NLP, diachronic analysis, term extraction, keyword extraction, bilingual term extraction||30-day free trial then starts at 4.83 €/month|
|The Simple Corpus Tool||A corpus analysis toolkit that supports XML annotations.||concordancer, annotation, xml, frequency||Windows||Free|
|TXM||XML- and TEI-compatible text analysis software based on TreeTagger, the CQP search engine, and the R statistical environment.||text analysis, concordancer, r, statistics, search tool, tokenizer, xml||Windows, Mac, Linux, Tomcat||Free|
|Xaira||Indexing and analysis of XML resources.||indexing, xml||Windows||Free, Open Source|
|Trafilatura||Trafilatura is a Python package and command-line tool which seamlessly downloads, parses, and scrapes web page data.||corpus creation, python, R, compilation, crawler, boilerplate remover, data, xml, scraping||Python||Free, Open Source|