What is metadata, and is it safe?

Anton P. | February 8, 2021

What is metadata exactly? Despite its vague description as “data about data,” metadata is a powerful concept. It can be just as damaging to our privacy as the content it describes, despite the carefully crafted narratives claiming otherwise. Many governments do not label metadata as personal. Thus, it falls outside the scope of privacy regulations, even though its immense potential can gradually turn invasive. A closer look at what metadata means shows how mining it fuels behavioral pattern recognition. Hence, we end up facing a thriving data accumulation model that slips through the cracks.

Metadata definition and significance

Metadata refers to relatively harmless information about a specific set of data. In other words, it is a summary of the data, typically useful for sorting, classifying, and organizing information.

Each file on your device contains details that describe it in terms of type or size. Some services automatically remove metadata from images, and other tools allow users to perform this manually. Libraries categorize books according to their genre, authors, publishers, or release date. All of this information describing either a digital or physical product is metadata. You will even encounter it while running Google searches, as each listing will present meta titles and descriptions. Thus, we use or generate “data about data” every day. Sometimes, that automatic generation ends up painting a relatively clear picture of our digital activities.
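To make the idea of file details concrete, here is a minimal Python sketch that reads a file's metadata without opening its content. It relies on the standard library's os.stat; the helper name describe_file and the temporary sample file are assumptions made for illustration only.

```python
import os
import tempfile
from datetime import datetime, timezone


def describe_file(path):
    """Return basic metadata about a file without reading its content."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "extension": os.path.splitext(path)[1],
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(
            info.st_mtime, tz=timezone.utc
        ).isoformat(),
    }


# Create a throwaway sample file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
    sample_path = handle.name

file_metadata = describe_file(sample_path)
print(file_metadata)

# Clean up the sample file.
os.remove(sample_path)
```

Even this small record reveals the file type, its size, and when it was last changed, all without touching what the file actually says.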

Metadata management refers to the processes involving data governance, data modeling, and metadata administration. It basically refers to the standards followed when handling and retaining metadata. Experts emphasize that such information is critical in decision making, research, and analytics. Hence, while the actual content is valuable, metadata management supplies context to the ever-increasing volumes of accumulated data.

However, while metadata is essential to businesses sustaining systems and workflows, experts note that consumers might be in the dark about it. In the online world, we generate metadata without realizing it. Due to the lack of deliberate input and consent, transparency regarding metadata management is crucial. You should know how long your data resides in databases and how it is used. Depending on the way entities oversee metadata, they can retain logs for several years. Over time, it becomes a goldmine for companies and governments alike. Gradual metadata collection can reveal anything from your browsing patterns to your social circles and beliefs.

Metadata standards

Metadata standards govern the categorization of metadata in terms of its purpose. Here are some of the prevalent schemes:

  • Dublin Core covers 15 metadata components concerning content. It indicates creators, date, format, language, source, subject, title, type, publisher, etc.
  • DDI (Data Documentation Initiative) describes data collected through surveys and other observational methods. It is common in behavioral, economic, social, and health sciences.
  • DOI (Digital Object Identifier) is a character string that uniquely identifies an object. It ties an article or document to a persistent link on the web.
  • ISO 19115 defines the standards for describing geospatial information.
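As an illustration of how such a scheme is applied, the following Python sketch describes this article with a handful of Dublin Core elements and checks that every key belongs to the 15-element set. The record and the helper unknown_elements are hypothetical; the "language", "type", and "format" values are assumptions rather than details stated by the publisher.

```python
# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def unknown_elements(record):
    """Return the keys of a record that are not Dublin Core elements."""
    return sorted(set(record) - DUBLIN_CORE_ELEMENTS)


# Hypothetical record describing this article; some values are assumed.
article_record = {
    "title": "What is metadata, and is it safe?",
    "creator": "Anton P.",
    "date": "2021-02-08",
    "language": "en",        # assumed
    "type": "Text",          # assumed
    "format": "text/html",   # assumed
}

print(unknown_elements(article_record))
print(unknown_elements({"title": "x", "mood": "y"}))
```

A shared vocabulary like this is what lets different catalogs and search tools sort and exchange records consistently.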

How can metadata be intrusive?

Privacy advocates emphasize that metadata can be just as valuable as the actual data. While it won’t give away the content, metadata alone can supply insights into people’s routines and intimate relationships. Let’s take WhatsApp as an example. The popular messaging app protects your communications with end-to-end encryption but leaves metadata out of the equation. It basically means that WhatsApp knows who you text, how frequently, from what location, and for how long. Since WhatsApp holds the right to share such data with Facebook, the parent company can combine it with the data it already holds. As a result, companies build extensive profiles even from minimal records of your online conversations.

Web browsing generates metadata as well. Each click and visit supplies companies with information about your location, device type, timestamps, and searches you make. Combine it all, and you have an insightful user profile generated without taking a deeper look at your activities. Thus, even relatively minimalistic records of your online actions become incredibly valuable to businesses and even law enforcement agencies. Despite claims that metadata poses little to no privacy risk, it can reveal a lot about your preferences, whereabouts, social relationships, and behavior. It is clear that metadata deserves properly crafted legislation that addresses the prevalent issues and potential misuse.

What metadata do you generate?

Depending on the context, different activities online can generate a range of metadata.

Browsing:

  • Logs of visited websites and pages
  • Your search queries
  • Search query results
  • Websites you visit from search engines
  • IP addresses
  • Device types
  • Internet Service Providers (ISPs)
  • Cookies and cached data
  • Timestamps

Email correspondence:

  • Date, time, and timezone
  • Senders’ names, IP, and email addresses
  • Recipients’ names and email addresses
  • Content types
  • Mail client header formats
  • Email subjects
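The fields listed above live in the email headers, separate from the message body. A minimal Python sketch using the standard library's email package shows how they can be read without ever inspecting the content. The raw message below is hypothetical, and its addresses use the reserved example.com domain.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical raw message; names and addresses are made up.
raw_email = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Subject: Lunch on Friday?
Date: Mon, 08 Feb 2021 09:30:00 +0200
Content-Type: text/plain; charset="utf-8"

The body is never inspected in this example.
"""

message = message_from_string(raw_email)

# Collect only header fields; the body stays untouched.
email_metadata = {
    "from": message["From"],
    "to": message["To"],
    "subject": message["Subject"],
    "date": message["Date"],
    "content_type": message.get_content_type(),
}

# The Date header also exposes the sender's timezone offset.
sent_at = parsedate_to_datetime(message["Date"])

print(email_metadata)
print("UTC offset of sender:", sent_at.utcoffset())
```

Note that the subject line and the sender's timezone are both available here, which already hints at what the conversation is about and roughly where the sender is.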

Instant messaging:

  • Message or call timestamps
  • User IDs
  • IP addresses
  • Frequency and duration of interactions
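To show how little is needed to infer relationships, here is a hedged Python sketch that aggregates a call log into per-contact statistics. The log entries, user IDs, and the 22:00 "late call" threshold are all invented for illustration; real services store and process such records in their own ways.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical call log: (caller, callee, start time, duration in seconds).
call_log = [
    ("user_1", "user_2", "2021-02-01T22:15:00", 1800),
    ("user_1", "user_2", "2021-02-02T22:05:00", 2400),
    ("user_1", "user_3", "2021-02-03T12:00:00", 60),
    ("user_1", "user_2", "2021-02-04T22:30:00", 1500),
]


def summarize_contacts(log):
    """Aggregate call count, total duration, and late calls per contact pair."""
    summary = defaultdict(
        lambda: {"calls": 0, "total_seconds": 0, "late_calls": 0}
    )
    for caller, callee, started, seconds in log:
        entry = summary[(caller, callee)]
        entry["calls"] += 1
        entry["total_seconds"] += seconds
        # Calls starting at 22:00 or later count as late (arbitrary threshold).
        if datetime.fromisoformat(started).hour >= 22:
            entry["late_calls"] += 1
    return dict(summary)


contact_summary = summarize_contacts(call_log)
for pair, stats in contact_summary.items():
    print(pair, stats)
```

Without reading a single message, the summary suggests that user_1 and user_2 talk often, at length, and late in the evening, while user_3 is a brief daytime contact.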

Final notes on metadata

Metadata is everywhere. You generate subtle records with each online activity you perform. Applications, computer systems, mobile devices, and websites all rely on it to sustain their operations or improve their quality. However, governments and businesses seem to dance around the question of whether metadata can have negative implications for privacy. Unfortunately, it can, as metadata can be as informative as the content itself.

For instance, online retailers turn to metadata to recognize consumers’ interests, location, and shopping patterns. Furthermore, some governments have made metadata collection and retention compulsory so that law enforcement and federal agencies can use such insights.

While there are options to remove metadata from documents, it is challenging to escape it altogether. You can, however, limit the data you share online. A VPN encrypts and encapsulates your web traffic, meaning that online entities won’t accumulate as much data about you. For instance, your actual location may no longer be a variable collected. Still, while a VPN can help, it addresses only a small part of the problem. Metadata does not have the same legal protections as personal information. However, entities collecting and retaining metadata must disclose how they manage such logs. Additionally, proper legislation should outline what counts as metadata misuse and the consequences for it.

Anton P.

Former chef and head of the Atlas VPN blog team. He's an experienced cybersecurity expert with a background in technical content writing.

Tags:

ip address, cookies