A hopefully comprehensive list of currently 284 tools used in corpus compilation and analysis.
This list is kept up to date by its users. Hence, please feel free to contribute by suggesting new tools.
You can also make suggestions, e.g., corrections, regarding individual tools by clicking the ✎ symbol. As this is a non-commercial side (side, side) project, checking and incorporating updates usually takes some time.
There is also a comprehensive list of all tags in the database.
Tool | Description | Tags | Platforms | Pricing |
---|---|---|---|---|
aConCorde ✎ | Multilingual concordance tool (English and Arabic) | concordancer | Linux, Mac, Windows | Free |
AntConc ✎ | Corpus analysis toolkit | wordlists, concordancer, keywords | Linux, Mac, Windows | Free |
AntPConc ✎ | Corpus analysis toolkit designed for working with parallel corpora. | wordlists, concordancer | Windows, Mac | Free |
BFSU ParaConc ✎ | A parallel concordancer | concordancer, parallel | Windows | Free |
BFSU PowerConc ✎ | A fairly powerful concordancer | concordancer | Windows | Free |
BNCWeb ✎ | BNCweb is a web-based client program for searching and retrieving lexical, grammatical and textual data from the British National Corpus (BNC). | analysis, concordancer | Web | Free |
buzz ✎ | A Python-based linguistic analysis tool. | parsing, concordancer, visualization | Python | Free, Open Source |
CasualConc ✎ | CasualConc is a concordance program that runs natively on macOS. | concordancer | OSX | Free |
CLiC ✎ | A corpus tool to support the analysis of literary texts. | concordancer | Web | Free |
Collocate ✎ | Tool for the extraction of concordances and collocations | concordancer | Windows | 35 USD |
Concordance Randomizer ✎ | A concordance randomizer | concordancer | Windows | Free |
Concordancer ✎ | Online tool for frequency counts and text clouds | concordancer | Web | Free |
CorpKit ✎ | An advanced modern corpus toolkit with an emphasis on visualization and annotated corpora. | wordlists, parsing, concordancer, visualization | Linux, Mac, Windows (Python) | Free |
Corpus Presenter ✎ | Tree tagger and corpus analysis software | wordlists, parsing, concordancer, visualization | Windows | Free |
gwic ✎ | A very basic KWIC tool written in Go. | concordancer, KWIC | Windows, Mac, Linux | Open Source |
HeidelGram Web-Based Tools ✎ | Basic corpus analysis toolkit for the HeidelGram Corpus | wordlists, concordancer | Web | Free |
IMS Corpus Workbench ✎ | Tool for sorting frequencies in corpora | wordlists, concordancer | Web and local version | Free |
KAT Tool ✎ | Grouping patterns based on search terms | patterns, concordancer | Windows | Free |
KWords ✎ | A tool for keyword identification and analysis. | keywords, CADS, concordancer, collocation analysis | Windows, Linux, Mac | Free |
Lextutor Web Concordancers ✎ | Web concordancers targeted towards DDL | collocations, concordancer, DDL | Web | Free |
MLCT ✎ | Tool for building and processing corpora | concordancer, sentence boundary detector | | Free |
MonoConc Esy ✎ | Concordancing and text search tool that allows primary and secondary concordancing | concordancer, sentence boundary detector | | Free for non-commercial research |
OpenConc ✎ | Tool for concordancing | concordancer | | Free |
ParaConc ✎ | A bilingual/multilingual concordancer | concordancer | | Non-Free |
PhraseContext ✎ | Tool for wordlists, concordancing, collocation, TTR | wordlists, concordancer | | 35€ |
Praaline ✎ | Praaline is a system for metadata management, annotation, visualisation and analysis of spoken language corpora. | speech, prosody, spoken, annotation, concordancer, search, visualization, converter, analysis | Windows, Mac, Linux | Free / Open Source (GPL3) |
PyXMLConc ✎ | Concordancer for XML files with automatic tag and attribute detection. | concordancer | Multi (Python), Windows | Free, Open Source |
Shinyconc ✎ | ShinyConc is a framework for generating custom web-based concordancers and is written in R and R Shiny. | concordancer, kwic, r | Open Source / R | Free |
Simple Concordance Program ✎ | Tool for concordance and word listing that works with many languages | concordancer | Windows, Mac | Free |
Sketch Engine ✎ | A corpus manager and text analysis software developed by Lexical Computing. | annotation, concordancer, tagging, sampling, search, visualization, wordlists, keywords, compilation, text analysis, n-grams, collocation, statistics, segmentation, analysis, crawler, parallel, colligation, annotations, tokenization, query, ngrams, boilerplate remover, comparison, frequency analysis, information retrieval, data, sentence boundary, corpus creation, duplicate remover, regex, thesaurus, meta modelling, dictionary, text-processing, xml, frequency, trends patterns, web-based, collocates, collocation analysis, word cloud, coocurence, KWIC, corpus management, multilingual, NLP, diachronic analysis, term extraction, keyword extraction, bilingual term extraction | | 30-day free trial then starts at 4.83 €/month |
Text Analysis Computing Tools (TACT) ✎ | A simple, fairly old concordancer. | concordancer | | Commercial |
Textanz ✎ | Language analysis program that produces frequency lists, word lists, parts of speech tags. | wordlists, concordancer, pos tagger, dictionary | Any OS | Free, Open Source |
TextSTAT ✎ | Tool for creation and manipulation of linguistic data from different languages | corpus creation, concordancer | Windows, GNU/Linux and MacOS | Free |
The Prime Machine ✎ | A user- and mobile-friendly corpus analysis toolkit (primarily concordancing) initially designed for English language teaching. | concordancer, language teaching, wordlist, keywords, efl, esl | MacOS, Windows, iOS, Android | Free |
The Simple Corpus Tool ✎ | A corpus analysis toolkit that supports XML annotations. | concordancer, annotation, xml, frequency | Windows | Free |
The SPAADIA concordancer ✎ | A concordancer for the SPAADIA corpus | concordancer, SPAADIA | Windows | Free |
The Text Feature Analyser ✎ | A tool for investigating textual features and various measures | text analysis, concordancer | Windows | Free |
TXM ✎ | XML & TEI compatible text analysis software based on TreeTagger, the CQP search engine and the R statistical environment. | text analysis, concordancer, r, statistics, search tool, tokenizer, xml | Windows, Mac, Linux, Tomcat | Free |
WConcord 3.0 ✎ | A fully featured concordancer | concordancer | | Free |
Wmatrix ✎ | Tool for corpus analysis and comparison. Provides access to CLAWS and USAS. | wordlists, concordancer, pos tagger, semantic tagger, keywords, web-based | Web | £50 per username per year |
WordCruncher ✎ | A tool for searching, studying, and analyzing digital texts and corpora. The tool has been tested for corpora up to a billion words. | concordancer, wordlists, collocates, n-grams, keywords, key phrases, ebooks | Windows, Mac, iOS | Free |
Wordsmith ✎ | One of the most established corpus toolkits providing a variety of functionality | concordancer, wordlists, statistics, keywords | Windows | 60€ per licence |
Wordstatix ✎ | Corpus analysis tool | concordancer | | Free |
WordWanderer ✎ | A web-based visualization/analysis tool which allows its users to "wander" a text. | visualization, concordancer | Web | Free |
Just the Word ✎ | A simple web interface for BNC data | concordancer, frequency analysis, BNC | Web | Free |
Wordless ✎ | An integrated corpus tool with multilingual support for the study of language, literature, and translation. | concordancer, text analysis, statistics, readability | Windows, Mac, Linux, Python | Free, Open Source |
CorpusMate ✎ | A web-based, streamlined, and simplified language data analysis experience for younger learners. | language learning, language teaching, concordancer, frequency analysis, pattern | Web | Free |
AutoSearch ✎ | A cloud-based corpus query engine that supports the upload of corpora. | concordancer, corpus query engine | Web | Free |
LogiTerm Pro ✎ | A powerful commercial multilingual concordancer especially geared towards translators and terminologists. | concordancer, terminology management, terminology | Windows | 875 $CAN |
Last Updated: December 11, 2024.
In case you are interested, the data is also available in JSON format.
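For anyone who wants to work with the JSON data programmatically, here is a minimal Python sketch of filtering tools by tag and platform. The structure of the actual JSON file is not documented here, so the key names (`tool`, `tags`, `platforms`, `pricing`) and the use of lists for tags and platforms are assumptions that simply mirror the table columns above. The sample records are copied from the table, and the helper `filter_tools` is hypothetical.

```python
import json

# Hypothetical sample mirroring the table columns above; the real JSON
# export may use different key names or a different structure.
SAMPLE_JSON = """
[
  {"tool": "AntConc",
   "tags": ["wordlists", "concordancer", "keywords"],
   "platforms": ["Linux", "Mac", "Windows"],
   "pricing": "Free"},
  {"tool": "BFSU ParaConc",
   "tags": ["concordancer", "parallel"],
   "platforms": ["Windows"],
   "pricing": "Free"},
  {"tool": "Wmatrix",
   "tags": ["wordlists", "concordancer", "pos tagger",
            "semantic tagger", "keywords", "web-based"],
   "platforms": ["Web"],
   "pricing": "£50 per username per year"}
]
"""


def filter_tools(records, tag=None, platform=None):
    """Return the names of tools matching the given tag and/or platform.

    A filter set to None is ignored, so calling the function with no
    filters returns every tool name in the original order.
    """
    matches = []
    for record in records:
        if tag is not None and tag not in record.get("tags", []):
            continue
        if platform is not None and platform not in record.get("platforms", []):
            continue
        matches.append(record["tool"])
    return matches


records = json.loads(SAMPLE_JSON)
print(filter_tools(records, tag="keywords"))
print(filter_tools(records, tag="concordancer", platform="Windows"))
```

If the real file stores tags or platforms as comma-separated strings instead of lists, splitting them on `", "` before filtering should be enough to reuse the same helper.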