Text and Data Mining

Find text and data sets, and explore related tools, methods, and analysis through our workshops, class instruction, and consultations.

Text and data mining is the process of deriving insights by analyzing large, machine-readable text or data sets to identify patterns, relationships, and other structured information that is valuable for research and analysis.
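As a small illustration of what identifying patterns in machine-readable text can look like, the sketch below counts the most frequent words across a tiny corpus using only Python's standard library. The sample documents, the function names `tokenize` and `top_terms`, and the simple letters-only tokenizing rule are assumptions made for this example, not a recommended method.

```python
import re
from collections import Counter

# Hypothetical sample corpus, invented for illustration only.
DOCUMENTS = [
    "The committee held a hearing on energy policy.",
    "Energy prices rose after the hearing.",
    "The newspaper reported on energy policy and prices.",
]


def tokenize(text):
    """Lowercase the text and return its runs of ASCII letters as words."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(documents, n=3):
    """Return the n most frequent words across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(tokenize(doc))
    return counts.most_common(n)


if __name__ == "__main__":
    # Print each frequent word with the number of times it appears.
    for word, count in top_terms(DOCUMENTS):
        print(word, count)
```

Counting by hand works for three sentences. The same few lines apply unchanged to thousands of documents, which is the point at which a computational approach starts to pay off.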

What do you want to do?

Whether you need a computational process for text and data mining, rather than analysis by hand, depends on the complexity of what you want to find out.

Review these questions as you consider what you want to find out:

  • Do you need to analyze an entire corpus (body of work), or just selected items from it? 
  • Do you need the content to be easily read by humans, or only by machines?
  • Do you need to download the entire contents? Or can your analysis be conducted on the platform where the content already resides?
  • What kind of analysis do you want to do?

Find resources for mining

We can help you find text and data sets to use for text and data mining. 

Our research guides contain text and data in broad categories or genres, such as content from the last year of Twitter, Congressional hearings, or particular newspapers. Most of these resources are licensed for use only by U-M researchers.

You aren’t limited to these resources. We can help you get the data you need by locating additional text or data sources, advising on whether licensed content is available to mine, and, in some cases, negotiating access to datasets and text collections for U-M researchers.

Contact our digital scholarship team at library-ds@umich.edu with questions or with requests for additional resources for mining.

Request a consultation

We can help identify tools, methodologies, and technologies that you can use in all phases of text and data mining, from cleaning and processing to text analysis and data visualization.
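To give a sense of the cleaning and processing phase, here is a minimal sketch in Python that turns raw text into a list of lowercase content words. The function name `clean_text`, the very short stop word list, and the choice to discard numbers and leftover HTML tags are all assumptions for illustration; the right cleaning steps depend on your sources and research question.

```python
import re
import unicodedata

# A tiny illustrative stop word list; real projects usually use a fuller one.
STOP_WORDS = {"a", "an", "and", "of", "on", "the", "to"}


def clean_text(raw):
    """Normalize raw text into a list of lowercase content words."""
    # Normalize Unicode so visually identical characters compare equal.
    text = unicodedata.normalize("NFKC", raw)
    # Replace anything that looks like an HTML tag with a space.
    text = re.sub(r"<[^>]+>", " ", text)
    # Lowercase, then keep only runs of letters (digits and punctuation drop out).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Remove very common words that carry little meaning on their own.
    return [word for word in words if word not in STOP_WORDS]


if __name__ == "__main__":
    print(clean_text("<p>The Hearing  on ENERGY policy, 2023.</p>"))
```

Cleaned word lists like this are a typical input for later phases such as text analysis and data visualization.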

Contact our digital scholarship team at library-ds@umich.edu with specific questions about using tools or methods.

Request a workshop or class instruction

Contact us to request a workshop or in-class instruction focused on hands-on text analysis. 

We can help you figure out what kinds of information are feasible to capture from a text, how best to capture it, how to write a set of keying or coding instructions or procedures for students to use, and what costs and benefits are likely to be involved with each method. 

Contact our digital scholarship team at library-ds@umich.edu with your request.