Tuesday, March 18, 2025

DATA in AI

In this module, you will learn how data can be classified as structured, semi-structured, or unstructured, and what challenges arise when working with unstructured data.

Learning objectives

After completing this module, you should be able to:

  • Differentiate between structured, semi-structured, and unstructured data
  • Identify challenges that come with working with unstructured data

Data is raw information. Data might be facts, statistics, opinions, or any kind of content that is recorded in some format. This could include voices, photos, names, and even dance moves!

Data can be organized into the following three types.

  • Structured data is typically categorized as quantitative data and is highly organized. It is information that can be arranged in rows and columns. Perhaps you've seen structured data in a spreadsheet, like Google Sheets or Microsoft Excel. Examples of structured data include names, dates, addresses, credit card numbers, and stock information.
  • Unstructured data, sometimes called dark data, is typically categorized as qualitative data. It lacks any built-in organization, or structure, so it cannot easily be processed and analyzed by conventional data tools and methods. Examples of unstructured data include images, free text, customer comments, medical records, and even song lyrics.
  • Semi-structured data is the “bridge” between structured and unstructured data. It doesn't have a predefined data model, but it combines features of both: it's more complex than structured data, yet easier to store than unstructured data. Semi-structured data uses metadata to identify specific data characteristics and to scale data into records and preset fields. That metadata ultimately lets semi-structured data be cataloged, searched, and analyzed more easily than unstructured data. An example of semi-structured data is a video on a social media site. The video by itself is unstructured data, but it typically comes with text, such as a hashtag that identifies a location, which makes it easy to categorize.
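To make the three types concrete, here is a minimal Python sketch. All of the sample values (the names, amounts, file name, caption, and hashtags) are made up for illustration, and the helper `has_tag` is a hypothetical name, not part of any library.

```python
import csv
import io
import json

# Structured: rows and columns, and every record has the same fields.
structured_csv = "name,date,amount\nAda,2025-03-18,42.50\nGrace,2025-03-19,17.00\n"
rows = list(csv.DictReader(io.StringIO(structured_csv)))
total = sum(float(row["amount"]) for row in rows)  # easy to query by column

# Semi-structured: an unstructured payload (the video file) plus metadata
# (caption, hashtags) that makes it searchable. Fields can vary per record.
video_post = json.loads(
    '{"file": "clip.mp4", "caption": "Sunset over the bay",'
    ' "hashtags": ["#lisbon", "#sunset"]}'
)

def has_tag(post, tag):
    """Return True if the post's metadata contains the given hashtag."""
    return tag in post.get("hashtags", [])

# Unstructured: free text with no fields to query at all.
comment = "Loved the product, but shipping took forever."

print(total)                           # sum of the amount column
print(has_tag(video_post, "#lisbon"))  # metadata lookup
print(len(comment.split()))            # about all we can do without more tooling
```

Notice that the structured and semi-structured records can be queried directly, while the comment offers nothing to query beyond its raw characters.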

Now, imagine a programmable computer trying to extract meaning from billions of pieces of data like this! What kind of program could someone write to sort out every eventuality among the clutter? How would someone build a long enough list of keywords to find anything useful? Unstructured data hides answers about disease prevention, criminal activity, stock markets: almost every aspect of civilization today. Without those answers, people and organizations cannot make useful predictions or recommendations.
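The keyword problem is easy to see in a small sketch. The keyword list, the sample comments, and the function name `mentions_shipping_problem` below are all hypothetical, chosen only to illustrate the idea.

```python
# A hand-written keyword list: a naive way to find complaints about slow shipping.
KEYWORDS = {"late", "delayed", "slow"}

def mentions_shipping_problem(comment):
    """Return True if any keyword appears as a whole word in the comment."""
    words = {word.strip(".,!?").lower() for word in comment.split()}
    return not KEYWORDS.isdisjoint(words)

comments = [
    "My order was delayed by a week.",
    "Shipping took forever!",
    "Still waiting on my package...",
]

flags = [mentions_shipping_problem(c) for c in comments]
print(flags)  # [True, False, False]
```

All three comments describe the same problem, but only the first one uses a word from the list. Adding "forever" and "waiting" would catch these two and still miss the next phrasing someone comes up with.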

But AI can shed light on unstructured data! AI uses new kinds of computing—some modeled on the human brain—to rapidly give dark data structure, and from it, make new discoveries. AI can even learn things—by itself—from the data it manages and teach itself how to make better predictions over time. This is the Era of AI, and it changes everything!

 
