How data is stored in WordPress

To understand how this plugin operates, you need to know how WordPress and plugins store data in the database.

Suppose you have a plugin that stores form submissions of business leads, including contact information. For example:

  1. email address, john.lead@somedomain.xyz
  2. first name, John
  3. last name, Lead

In addition, the form submission contains data about John’s current car:

  1. brand, Volvo
  2. model, XC60
  3. license plate, GST-25-JHA

Depending on the WordPress feature or plugin, this data can be stored in three different ways.

Dedicated tables

The most straightforward way is to store data in a table that is specific to the type of data.

In this example there is a table called persons with three columns: email, first_name, last_name.

A second table is called cars with three columns: brand, model, license_plate. We assume that a car has a person as owner.

It is common for each row in a table to have a unique id stored in the first column.

persons

  id  email                     first_name  last_name
  1   john.lead@somedomain.xyz  John        Lead

Dedicated table for persons. Data per person is stored in one row.

cars

  id  brand  model  license_plate  owner_id
  1   Volvo  XC60   GST-25-JHA     1

Dedicated table for cars. Data per car is stored in one row. The owner_id column refers to the id column in the persons table.
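
The two dedicated tables can be sketched in a few lines of Python. This is only an illustration: it uses an in-memory SQLite database as a stand-in for the MySQL database behind WordPress, and the column types are assumptions. The table and column names are the ones from the example.

```python
import sqlite3

# In-memory SQLite database standing in for the WordPress database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,
        email TEXT,
        first_name TEXT,
        last_name TEXT
    );
    CREATE TABLE cars (
        id INTEGER PRIMARY KEY,
        brand TEXT,
        model TEXT,
        license_plate TEXT,
        owner_id INTEGER REFERENCES persons(id)
    );
    INSERT INTO persons VALUES (1, 'john.lead@somedomain.xyz', 'John', 'Lead');
    INSERT INTO cars VALUES (1, 'Volvo', 'XC60', 'GST-25-JHA', 1);
""")

# Follow owner_id from the cars table back to the matching row in persons.
car_owner = conn.execute("""
    SELECT persons.first_name, persons.last_name, cars.brand, cars.model
    FROM cars
    JOIN persons ON persons.id = cars.owner_id
""").fetchone()

print(car_owner)
```

Because every column has one fixed meaning, a tool that processes this data only needs to know which columns hold personal information.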

Metadata tables

Some WordPress functions and plugins store data in a more flexible structure: metadata tables. Each row in a metadata table states what the data in that row means. This lets you store different types of data in the same table.

In this example, we have a table anydata with the columns key and value. A third column, person_id, acts as an identifier: rows with the same person_id belong together.

anydata

  person_id  key            value
  1          email          john.lead@somedomain.xyz
  1          first_name     John
  1          last_name      Lead
  1          brand          Volvo
  1          model          XC60
  1          license_plate  GST-25-JHA
  2          email          sam.prospect@someotherdomain.abc
  2          first_name     Sam
  2

Metadata table that can store all types of (text) data using key-value pairs. Related data has the same id.
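
As a rough sketch of how such a table is read, the Python snippet below collects all rows that share a person_id into one record. It again uses in-memory SQLite as a stand-in for the WordPress database, and the helper name load_person is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "key" and "value" are quoted so they cannot clash with SQL keywords.
conn.execute('CREATE TABLE anydata (person_id INTEGER, "key" TEXT, "value" TEXT)')

rows = [
    (1, "email", "john.lead@somedomain.xyz"),
    (1, "first_name", "John"),
    (1, "last_name", "Lead"),
    (1, "brand", "Volvo"),
    (1, "model", "XC60"),
    (1, "license_plate", "GST-25-JHA"),
    (2, "email", "sam.prospect@someotherdomain.abc"),
    (2, "first_name", "Sam"),
]
conn.executemany("INSERT INTO anydata VALUES (?, ?, ?)", rows)


def load_person(person_id):
    """Collect all key-value rows that share one person_id into a dict."""
    cursor = conn.execute(
        'SELECT "key", "value" FROM anydata WHERE person_id = ?', (person_id,)
    )
    return dict(cursor.fetchall())


john = load_person(1)
sam = load_person(2)
print(john["email"], john["brand"])
```

Note that the meaning of a value is only known by looking at its key, so the same value column holds email addresses, names and car brands.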

Table with JSON or serialized arrays

WordPress and plugins can also store data in a hybrid way using a database table and JSON or serialized arrays. This provides the most flexibility.

JSON stands for JavaScript Object Notation. It is a standardized, human-readable text format for storing data. It allows for substructures, so you can store data within data.

Serialized arrays are an internal data format of PHP, the programming language behind WordPress and plugins. They also store data as text, but they are less readable than JSON.

Example

In JSON, the data of the example can be stored as follows:

{
  "person": [
    {
      "email": "john.lead@somedomain.xyz",
      "firstname": "John",
      "lastname": "Lead"
    }
  ],
  "car": [
    {
      "brand": "Volvo",
      "model": "XC60",
      "license_plate": "GST-25-JHA"
    }
  ]
}
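
To illustrate the "data within data" idea, here is a minimal Python sketch that parses the example with the standard json module and reads a nested value. The variable names are chosen for illustration only.

```python
import json

# The example lead as a JSON text.
lead_json = """
{
  "person": [
    {"email": "john.lead@somedomain.xyz", "firstname": "John", "lastname": "Lead"}
  ],
  "car": [
    {"brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA"}
  ]
}
"""

lead = json.loads(lead_json)

# "car" holds a list, so the first car is at index 0.
first_car = lead["car"][0]
print(first_car["brand"], first_car["license_plate"])
```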

In a serialized array, this example is less readable, but it contains the same data with the same flexibility:

a:2:{s:6:"person";a:3:{s:5:"email";s:24:"john.lead@somedomain.xyz";s:9:"firstname";s:4:"John";s:8:"lastname";s:4:"Lead";}s:3:"car";a:3:{s:5:"brand";s:5:"Volvo";s:5:"model";s:4:"XC60";s:13:"license_plate";s:10:"GST-25-JHA";}}

Items such as email, firstname and brand are called array keys.

Items such as john.lead@somedomain.xyz, John and Volvo are called array values.
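
The structure of a serialized array becomes clearer when you build one by hand. In PHP itself this is done with serialize(). The Python function below is a simplified re-implementation for illustration only: the name php_serialize is hypothetical, and it assumes that all keys and values are strings or nested arrays.

```python
def php_serialize(value):
    """Produce PHP's serialized text format for strings and nested dicts.

    Only strings and string-keyed dicts are supported (an assumption made
    to keep this sketch short).
    """
    if isinstance(value, str):
        # The length prefix counts bytes, not characters.
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, dict):
        body = "".join(
            php_serialize(key) + php_serialize(item) for key, item in value.items()
        )
        # a:<number of elements>:{<key><value><key><value>...}
        return "a:%d:{%s}" % (len(value), body)
    raise TypeError("unsupported type: %r" % type(value))


lead = {
    "person": {
        "email": "john.lead@somedomain.xyz",
        "firstname": "John",
        "lastname": "Lead",
    },
    "car": {"brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA"},
}

serialized = php_serialize(lead)
print(serialized)
```

Each string is stored together with its length, which is why a value inside a serialized array cannot simply be overwritten with a longer or shorter text: the length prefix has to change as well.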

In a database, JSON or serialized arrays can be combined with table rows and columns. Data that requires no flexibility (because it will always be present) is stored in dedicated columns. More fluid data is stored as JSON or as a serialized array. Suppose that an email address and a name are always required in the form submission, while additional data is optional and can contain different fields depending on what the business lead wants to provide.

  id  email                     first_name  last_name  lead_data
  1   john.lead@somedomain.xyz  John        Lead       { "car": [ { "brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA" } ] }

A table with fixed columns combined with JSON
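
The hybrid approach can be sketched as follows. The table name leads is an assumption (the example does not name the table), and in-memory SQLite again stands in for the WordPress database. The fixed fields are read as normal columns, while the optional data has to be decoded from the lead_data column first.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Fixed, always-present fields get their own columns; optional data goes
# into lead_data as JSON text. The table name "leads" is hypothetical.
conn.execute("""
    CREATE TABLE leads (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL,
        lead_data TEXT
    )
""")

optional_data = {
    "car": [{"brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA"}]
}
conn.execute(
    "INSERT INTO leads VALUES (?, ?, ?, ?, ?)",
    (1, "john.lead@somedomain.xyz", "John", "Lead", json.dumps(optional_data)),
)

# The email comes straight from its column; the car must be decoded first.
email, raw_lead_data = conn.execute(
    "SELECT email, lead_data FROM leads WHERE id = 1"
).fetchone()
car = json.loads(raw_lead_data)["car"][0]
print(email, car["license_plate"])
```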

Summary

WordPress and its plugins use three methods to store data in a database. If you want to anonymize a WordPress database, you need a plugin that supports all three methods.

The plugin Database Anonymization can process:

  • Data stored in dedicated tables.
  • Data stored in metadata tables.
  • Data stored in tables as JSON or serialized arrays.
