Tools for Corpus Linguistics

A comprehensive list of 188 tools used in corpus analysis.

Please feel free to contribute by suggesting new tools or by pointing out mistakes in the data.


Tags

Everything
annotation
concordancer
parser
pos tagger
search
visualization
wordlists
compilation
text analysis
converter
n-grams
p-frames
lexical bundles
lexical frames
text complexity
collocation
statistics
segmentation
coding
concordaner
ddl
pedagogy
analysis
crawler
parallel
tagging
colligation
parsing
collocations
exploration
searching
database
dialogues
cleaning
annotations
tokenization
transcription
downloader
readability
semantic parser
word2vec
ngrams
pattern matching
temporal tagger
timex3
network analysis
semantic tagger
ICE
tokenizer
boilerplate remover
patterns
comparison
keywords
sociolinguistics
frequency analysis
lexis
lemmatizer
news
data
machine learning
morphological tagger
statistical nlp
MDA
sentence boundary ...
tagger
multilevel tagger
corpus creation
semantic analysis
duplicate remover
editing
vocabulary
constructions
regex
conversion
phonology
speech
prosody
spoken
phonetics
query
thesaurus
meta modelling
tokenizing
kwic
r
topic modeling
cohesion
lexical sophistication
word clouds
variation analysis
dictionary
text-processing
python
phraseology
xml
frequency
SPAADIA
efl
esl
linguistics
search tool
multi-layer
variant detector
reading
metaphor identific ...
metaphors
ebooks
political science
indexing
chinese
graphs
rhetorical analysis
textual criticism
witnesses
close reading
stylometry
management
twitter
web-based
coherence
lexical analysis
style
video
discourse
images
multilevel
qda
mixed methods
markup
anc
sampling
matching

Tools

Tool | Description | Categories | Platform | Pricing
ANNIS | Search and visualization tool for multi-layer linguistic corpora with diverse types of annotation | search, visualization | Web (or Linux, Mac, Windows) | Free
CorpKit | An advanced modern corpus toolkit with an emphasis on visualization and annotated corpora. | wordlists, parsing, concordancer, visualization | Linux, Mac, Windows (Python) | Free
Corpus Presenter | Tree tagger and corpus analysis software | wordlists, parsing, concordancer, visualization | Windows | Free
CorpusExplorer | A complex corpus analysis toolkit combining 45 interactive tools. | visualization, exploration, tagging, text analysis | Windows | Free, Open Source
GraphColl | Tool for building and exploring networks of linguistic collocations | visualization | Windows, Mac | Free
ICARUS | Search and visualization tool for dependency trees | visualization | | Free
Kaleidographic | A dynamic and interactive visualization tool for multivariate data. | visualization | Web | Free
Khepri | A view-based tool for exploring (historical sociolinguistic) data | sociolinguistics, visualization | JavaScript, Web | Free, Open Source
Praaline | Praaline is a system for metadata management, annotation, visualisation and analysis of spoken language corpora. | speech, prosody, spoken, annotation, concordancer, search, visualization, converter, analysis | Windows, Mac, Linux | Free / Open Source (GPL3)
Tagxedo | A tool for generating word clouds. | word clouds, visualization | Web | Free
Text Variation Explorer | The Text Variation Explorer (TVE) is a tool for exploring the effect of window size on various common linguistic measures. It visualizes these measures and allows for PCA/cluster analysis. | visualization, variation analysis | Java | Free
Text Visualization Browser | A survey/gallery of text visualizations | visualization | Web | Free
TextArc | A tool for visualizing the structure of texts. | visualization | |
Textplot | A tool for mapping a document into a network of terms in order to visualize the topic structure. | visualization, network analysis | Python | Free, Open Source
Tree Editor TrEd 2.0 | Graphical editor and viewer for tree-like structures. | visualization | Windows, GNU/Linux and MacOS | Free
Worldbuilder | Tool for annotation and visualisation in analyses applying Text World Theory | annotation, visualization | |
Wordle | A tool for generating word clouds. | word clouds, visualization | Web | Free
DocuScope | A tool for computer-aided rhetorical analysis | rhetorical analysis, text analysis, visualization | Windows (Java) | Free
TagCrowd | A simple tool for generating tag/word clouds online | word clouds, visualization | Web | Free
CATMA (Computer Assisted Text Markup and Analysis) | A complex annotation and analysis package | markup, analysis, visualization | Web | Commercial