Tools for Corpus Linguistics

A hopefully comprehensive list of currently 280 tools used in corpus compilation and analysis.

This list is kept up to date by its users. Hence, please feel free to contribute by suggesting new tools.

You can also make suggestions, e.g., corrections, regarding individual tools by clicking the symbol. As this is a non-commercial side (side, side) project, checking and incorporating updates usually takes some time.

Top 25 Tags

concordancer 47
annotation 43
visualization 29
tagging 20
text analysis 20
pos tagger 18
wordlists 16
statistics 12
compilation 11
keywords 11
collocation 10
qda 10
readability 8
lexis 8
parser 8
tokenizer 8
frequency analysis 7
language learning 6
analysis 6
web-based 6
mixed methods 6
spoken 6
crawler 5
language teaching 5
xml 5

There is also a comprehensive list of all tags in the database.

Tools [visualization]

Tool | Description | Tags | Platforms | Pricing
ANNIS | Search and visualization tool for multi-layer linguistic corpora with diverse types of annotation | search, visualization | Web (or Linux, Mac, Windows) | Free
buzz | A Python-based linguistic analysis tool. | parsing, concordancer, visualization | Python | Free, Open Source
CATMA (Computer Assisted Text Markup and Analysis) | An undogmatic, complex annotation and analysis package. | markup, analysis, visualization, annotation | Web | Free
Coquery | A free corpus query tool to search, analyze, and visualize corpora | query, visualization | Linux, Mac, Windows | Free
CorpKit | An advanced modern corpus toolkit with an emphasis on visualization and annotated corpora. | wordlists, parsing, concordancer, visualization | Linux, Mac, Windows (Python) | Free
Corpus Presenter | Tree tagger and corpus analysis software | wordlists, parsing, concordancer, visualization | Windows | Free
CorpusExplorer | A complex corpus analysis toolkit combining 45 interactive tools. | visualization, exploration, tagging, text analysis | Windows | Free, Open Source
Cortext Manager | A scriptable "ecosystem" for modeling and exploring corpora. Especially useful for creating topic models and co-occurrence networks. | NER, topic models, visualization, word2vec, collocation, keywords | Web | Free
DocuScope | A tool for computer-aided rhetorical analysis | rhetorical analysis, text analysis, visualization | Windows (Java) | Free
GraphColl | Tool for building and exploring networks of linguistic collocations | visualization | Windows, Mac | Free
ICARUS | Search and visualization tool for dependency trees | visualization |  | Free
Kaleidographic | A dynamic and interactive visualization tool for multivariate data. | visualization | Web | Free
Khepri | A view-based tool for exploring (historical sociolinguistic) data | sociolinguistics, visualization | JavaScript, Web | Free, Open Source
Praaline | Praaline is a system for metadata management, annotation, visualisation and analysis of spoken language corpora. | speech, prosody, spoken, annotation, concordancer, search, visualization, converter, analysis | Windows, Mac, Linux | Free / Open Source (GPL3)
Sketch Engine | A corpus manager and text analysis software developed by Lexical Computing. | annotation, concordancer, tagging, sampling, search, visualization, wordlists, keywords, compilation, text analysis, n-grams, collocation, statistics, segmentation, analysis, crawler, parallel, colligation, annotations, tokenization, query, ngrams, boilerplate remover, comparison, frequency analysis, information retrieval, data, sentence boundary, corpus creation, duplicate remover, regex, thesaurus, meta modelling, dictionary, text-processing, xml, frequency, trends patterns, web-based, collocates, collocation analysis, word cloud, coocurence, KWIC, corpus management, multilingual, NLP, diachronic analysis, term extraction, keyword extraction, bilingual term extraction |  | 30-day free trial then starts at 4.83 €/month
TagCrowd | A simple tool for generating tag/word clouds online | word clouds, visualization | Web | Free
Tagxedo | A tool for generating word clouds. | word clouds, visualization | Web | Free
Text Variation Explorer | The Text Variation Explorer (TVE) is a tool for exploring the effect of window size on various common linguistic measures. It visualizes these measures and allows for PCA/cluster analysis. | visualization, variation analysis | Java | Free
Text Visualization Browser | A survey/gallery of text visualizations | visualization | Web | Free
TextArc | A tool for visualizing the structure of texts. | visualization |  | 
Textplot | A tool for mapping a document into a network of terms in order to visualize the topic structure. | visualization, network analysis, semantics, graphs | Python | Free, Open Source
Tree Editor TrEd 2.0 | Graphical editor and viewer for tree-like structures. | visualization | Windows, GNU/Linux and macOS | Free
Voyant Tools | A web-based reading/analysis toolkit for digital texts. | reading, text analysis, visualization, trends patterns | Web | Free, Open Source
Wordle | A tool for generating word clouds. | word clouds, visualization | Web | Free
WordMap | A simple web-based word-map / word cloud generator. | visualization, web-based | Web | Free
WordWanderer | A web-based visualization/analysis tool which allows its users to "wander" a text. | visualization, concordancer | Web | Free
Worldbuilder | Tool for annotation and visualisation in analysis applying text-world-theory | annotation, visualization |  | 
Orange Data Mining | An open source machine learning and data visualization platform based on workflows. | text analysis, visualization, time series | Windows, Unix, Linux, Mac | Free, Open Source
TEITOK | A web-based platform for viewing, creating, and editing corpora with rich textual mark-up and linguistic annotation. | visualization, TEI, mark-up, annotation | Linux, Mac | Free, Open Source

Last Updated: February 29, 2024.

In case you are interested, the data is also available in JSON format.
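
For readers who want to work with the JSON data programmatically, the following is a minimal Python sketch of filtering tools by tag and counting tag frequencies. The structure of the actual JSON export is not documented here, so the field names (`tool`, `tags`, `platforms`, `pricing`) and the inline sample are assumptions modeled on the table columns above; adjust them to whatever the real file contains.

```python
import json
from collections import Counter

# Hypothetical sample mirroring the table columns above.
# The real export's field names and layout may differ.
SAMPLE_JSON = """
[
  {"tool": "ANNIS", "tags": ["search", "visualization"],
   "platforms": "Web (or Linux, Mac, Windows)", "pricing": "Free"},
  {"tool": "buzz", "tags": ["parsing", "concordancer", "visualization"],
   "platforms": "Python", "pricing": "Free, Open Source"},
  {"tool": "TagCrowd", "tags": ["word clouds", "visualization"],
   "platforms": "Web", "pricing": "Free"}
]
"""


def tools_with_tag(records, tag):
    """Return the names of all tools that carry the given tag."""
    return [r["tool"] for r in records if tag in r.get("tags", [])]


def tag_counts(records):
    """Count how many tools carry each tag."""
    return Counter(tag for r in records for tag in r.get("tags", []))


records = json.loads(SAMPLE_JSON)
print(tools_with_tag(records, "concordancer"))
print(tag_counts(records).most_common(3))
```

To use this with the real data, replace `SAMPLE_JSON` with the contents of the downloaded file and rename the keys to match its actual fields.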