Data in AI
In this module, you will learn how data can be classified as structured, semi-structured, or unstructured, and what challenges arise when working with unstructured data.
Learning objectives
After completing this module, you should be able to:
- Differentiate between structured, semi-structured, and unstructured data
- Identify challenges that come with working with unstructured data
Data is raw information. It might be facts, statistics, opinions, or any kind
of content that is recorded in some format. This could include voices, photos,
names, and even dance moves!
Data can be organized into the following three types.
- Structured data is typically categorized as quantitative data and is highly
organized. It is information that can be arranged in rows and columns. Perhaps
you've seen structured data in a spreadsheet, like Google Sheets or Microsoft
Excel. Examples of structured data include names, dates, addresses, credit
card numbers, and stock information.
- Unstructured data, sometimes called dark data, is typically categorized as
qualitative data. It lacks any built-in organization or structure, so
conventional data tools and methods cannot readily process and analyze it.
Examples of unstructured data include images, text, customer comments,
medical records, and even song lyrics.
- Semi-structured data is the "bridge" between structured and unstructured
data. It doesn't have a predefined data model, but it combines features of
both: it's more complex than structured data, yet easier to store than
unstructured data. Semi-structured data uses metadata to identify specific
data characteristics and to organize data into records and preset fields.
Metadata ultimately enables semi-structured data to be cataloged, searched,
and analyzed more easily than unstructured data. An example of semi-structured
data is a video on a social media site. The video by itself is unstructured
data, but it typically comes with text that makes it easier to categorize,
such as a hashtag that identifies a location.
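To make the three types concrete, here is a minimal Python sketch. All of the sample values (the names, the video post, the customer comment, and the field names) are made up for illustration and do not come from any real dataset. It suggests why rows and columns are easy to query, why metadata helps with semi-structured data, and why a simple keyword match on unstructured text can miss the answer.

```python
import csv
import io
import json

# Structured: fixed rows and columns, as in a spreadsheet (hypothetical data).
structured_csv = (
    "name,date,city\n"
    "Ada,2024-01-05,London\n"
    "Grace,2024-02-11,New York\n"
)
rows = list(csv.DictReader(io.StringIO(structured_csv)))
cities = [row["city"] for row in rows]  # every row has a "city" column

# Semi-structured: the video itself is unstructured, but the post carries
# metadata (caption, hashtags) that can be searched (hypothetical fields).
video_post = json.loads(
    '{"video_file": "clip_001.mp4",'
    ' "caption": "Sunset over the bay",'
    ' "hashtags": ["#Lisbon", "#sunset"]}'
)
location_tags = [tag for tag in video_post["hashtags"] if tag == "#Lisbon"]

# Unstructured: free text with no fields at all (hypothetical comment).
customer_comment = "Loved the view from our hotel in lisbon, but the wifi was awful."
exact_match = "Lisbon" in customer_comment           # misses the lowercase spelling
loose_match = "lisbon" in customer_comment.lower()   # finds it, but only this one keyword

print(cities)
print(location_tags)
print(exact_match, loose_match)
```

Even in this tiny example, the unstructured comment needs special handling for something as small as capitalization, while the structured and semi-structured records can be queried by field name.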
Now, imagine a programmable computer trying to extract meaning from billions
of data items like these! What kind of program would someone write that could
sort out every eventuality among the clutter? How would someone build a long
enough list of keywords to find anything useful? Unstructured data hides
answers about disease prevention, criminal activity, stock markets, and almost
every other aspect of civilization today. Without those answers, people and
organizations cannot make useful predictions or recommendations.
But AI can shed light on unstructured data! AI uses new kinds of computing,
some modeled on the human brain, to rapidly give dark data structure and, from
it, make new discoveries. AI can even learn things by itself from the data it
manages and teach itself how to make better predictions over time.
This is the Era of AI, and it changes everything!