Text and Data Mining
Find text and data sets and explore related tools, methods, and analysis through our workshops, class instruction, and consultations.
What do you want to do?
Whether text and data mining calls for a computational process, as opposed to analysis by hand, depends on the complexity of what you want to do.
Review these questions as you consider what you want to find out:
- Do you need to analyze an entire corpus (body of work), or just selected items from it?
- Do you need the content to be easily read by humans, or only by machines?
- Do you need to download the entire contents, or can your analysis be conducted on the platform where the content already resides?
- What kind of analysis do you want to do?
Find resources for mining
We can help you find text and data sets to use for text and data mining.
The following research guides contain text and data in broad categories or genres — such as content from the last year of Twitter, Congressional hearings, or particular newspapers. Most are licensed for use only by U-M researchers.
- English Language and Literature
- News Content
- Social Media Content
- Government-produced data
- Linguistics/Text Corpora
- Bibliometrics and Citation Analysis
You aren’t limited to these resources. We can help get the data you need by locating additional text or data sources, advising on whether licensed content is available to mine, and, in some cases, negotiating access to datasets and text collections for U-M researchers.
Contact our digital scholarship team at library-ds@umich.edu with questions or with requests for additional resources for mining.
Request a consultation
We can help identify tools, methodologies, and technologies that you can use in all phases of text and data mining, from cleaning and processing to text analysis and data visualization.
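As a rough illustration of those phases, the sketch below cleans a short text, counts word frequencies, and prints a simple text bar chart. It is a minimal example using only the Python standard library, not a recommended toolchain: the sample sentence, the stop-word list, and the function names `clean` and `term_frequencies` are all hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical sample text; in practice this would be read from a corpus.
SAMPLE = "The hearing began. The committee heard testimony; the hearing ended."

# A tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in"}


def clean(text):
    """Cleaning: lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(text, stop_words=STOP_WORDS):
    """Analysis: count how often each non-stop-word token appears."""
    return Counter(tok for tok in clean(text) if tok not in stop_words)


freqs = term_frequencies(SAMPLE)

# Visualization: a plain-text bar chart of the most frequent terms.
for word, count in freqs.most_common(3):
    print(f"{word:<12}{'#' * count} ({count})")
```

Real projects usually need more careful tokenization, a fuller stop-word list, and a proper plotting library, which is the kind of choice a consultation can help with.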
Contact our digital scholarship team at library-ds@umich.edu with specific questions about using tools or methods.
Request a workshop or class instruction
Contact us to request a workshop or in-class instruction focused on hands-on text analysis.
We can help you work out what kinds of information are feasible to capture from a text and how best to capture it. We can also help you write keying or coding instructions or procedures for students to use, and weigh the likely costs and benefits of each method.
Contact our digital scholarship team at library-ds@umich.edu with your request.