To understand how this plugin operates, you need to know how WordPress and plugins store data in the database.
Suppose you have a plugin that stores form submissions of business leads, including contact information. For example:
- email address, john.lead@somedomain.xyz
- first name, John
- last name, Lead
In addition, this form submission contains data about John’s current car:
- brand, Volvo
- model, XC60
- license plate, GST-25-JHA
Depending on the WordPress feature or plugin, this data can be stored in three different ways.
Dedicated tables
The most straightforward way is to store data in a table that is specific to the type of data.
In this example there is a table called persons with three columns: email, first_name, last_name.
A second table is called cars with three columns: brand, model, license_plate. We assume that a car has a person as owner.
It is common for each row in a table to have a unique id stored in the first column.
persons
id | email | first_name | last_name |
1 | john.lead@somedomain.xyz | John | Lead |
… | … | … | … |
cars
id | brand | model | license_plate | owner_id |
1 | Volvo | XC60 | GST-25-JHA | 1 |
… | … | … | … | … |
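The two tables above can be sketched in a few lines of code. This is only an illustration: it uses an in-memory SQLite database in place of the MySQL database that WordPress normally runs on, and the column types are assumptions. It shows how the owner_id column links a car to its owner.

```python
import sqlite3

# An in-memory SQLite database stands in for the WordPress database here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons (
        id INTEGER PRIMARY KEY,
        email TEXT,
        first_name TEXT,
        last_name TEXT
    );
    CREATE TABLE cars (
        id INTEGER PRIMARY KEY,
        brand TEXT,
        model TEXT,
        license_plate TEXT,
        owner_id INTEGER REFERENCES persons(id)
    );
    INSERT INTO persons VALUES (1, 'john.lead@somedomain.xyz', 'John', 'Lead');
    INSERT INTO cars VALUES (1, 'Volvo', 'XC60', 'GST-25-JHA', 1);
""")

# Join the two tables through owner_id to find the owner of each car.
owner_rows = conn.execute("""
    SELECT persons.first_name, persons.last_name, cars.brand, cars.model
    FROM cars
    JOIN persons ON persons.id = cars.owner_id
""").fetchall()
print(owner_rows)
```

Because every column has a fixed meaning, each piece of personal data can be found by table name and column name alone.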
Metadata tables
Some WordPress functions and plugins store data in a more flexible structure: metadata tables. Each row in a metadata table must specify the meaning of the data in that row. In this way you can store different types of data in the same table.
In this example, we have a table anydata that has the columns key and value. A third column, person_id, acts as an identifier that marks which rows belong together.
anydata
person_id | key | value |
1 | email | john.lead@somedomain.xyz |
1 | first_name | John |
1 | last_name | Lead |
1 | brand | Volvo |
1 | model | XC60 |
1 | license_plate | GST-25-JHA |
2 | email | sam.prospect@someotherdomain.abc |
2 | first_name | Sam |
2 | … | … |
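To read a metadata table, the rows that share an identifier have to be collected back into one record. The sketch below shows that step on a hard-coded copy of the example rows. It assumes that the email address is stored under the key email, and the function name group_by_person is made up for this illustration.

```python
from collections import defaultdict

# Rows as they might come back from the example anydata table:
# (person_id, key, value)
anydata_rows = [
    (1, "email", "john.lead@somedomain.xyz"),
    (1, "first_name", "John"),
    (1, "last_name", "Lead"),
    (1, "brand", "Volvo"),
    (1, "model", "XC60"),
    (1, "license_plate", "GST-25-JHA"),
    (2, "email", "sam.prospect@someotherdomain.abc"),
    (2, "first_name", "Sam"),
]

def group_by_person(rows):
    """Collect all key/value pairs that share a person_id into one dict."""
    records = defaultdict(dict)
    for person_id, key, value in rows:
        records[person_id][key] = value
    return dict(records)

records = group_by_person(anydata_rows)
print(records[1]["brand"])
print(records[2])
```

Note that the meaning of a value is only known from the key in the same row, so finding all email addresses means filtering on the key column rather than picking a column by name.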
Table with JSON or serialized arrays
WordPress and plugins can also store data in a hybrid way using a database table and JSON or serialized arrays. This provides the most flexibility.
JSON stands for JavaScript Object Notation and provides a standardized format to store data in a readable format as text. It allows for substructures, so you can store data within data.
Serialized arrays are an internal data format of PHP, the programming language behind WordPress and plugins. This format also stores data as structured text, but it is less readable than JSON.
Example
In JSON, the data of the example can be stored as follows:
{
  "person": [
    {
      "email": "john.lead@somedomain.xyz",
      "firstname": "John",
      "lastname": "Lead"
    }
  ],
  "car": [
    {
      "brand": "Volvo",
      "model": "XC60",
      "license_plate": "GST-25-JHA"
    }
  ]
}
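As an illustration of the substructures, the sketch below decodes the JSON text of this example and reads two nested values. It uses Python's standard json module; the variable names are chosen for this example only.

```python
import json

# The JSON text of the example, stored as one string.
lead_json = """
{
  "person": [
    {"email": "john.lead@somedomain.xyz", "firstname": "John", "lastname": "Lead"}
  ],
  "car": [
    {"brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA"}
  ]
}
"""

lead = json.loads(lead_json)

# "person" and "car" each hold a list, so index 0 selects the first entry.
email = lead["person"][0]["email"]
brand = lead["car"][0]["brand"]
print(email, brand)
```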
In a serialized array, this example is less readable, but it contains the same data with the same flexibility:
"a:2:{s:6:"person";a:3:{s:5:"email";s:24:"john.lead@somedomain.xyz";s:9:"firstname";s:4:"John";s:8:"lastname";s:4:"Lead";}s:3:"car";a:3:{s:5:"brand";s:5:"Volvo";s:5:"model";s:4:"XC60";s:13:"license_plate";s:10:"GST-25-JHA";}}"
Items such as email, firstname and brand are called array keys.
Items such as john.lead@somedomain.xyz, John and Volvo are called array values.
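The serialized format is easier to follow once you see how it is built up. The sketch below is a simplified re-implementation in Python of the layout that PHP's serialize() function produces, limited to strings, whole numbers and nested arrays; it is not PHP's own code, and the function name php_serialize is made up for this illustration. Every string is written as s, its length in bytes, and the text; every array as a, the number of entries, and the keys and values in order.

```python
def php_serialize(value):
    """Write strings, ints and dicts in the layout of PHP's serialize().

    Only a small subset is covered; floats, booleans, null and objects
    are left out of this sketch.
    """
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, str):
        # The length prefix counts bytes, not characters.
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, dict):
        body = "".join(php_serialize(k) + php_serialize(v) for k, v in value.items())
        return "a:%d:{%s}" % (len(value), body)
    raise TypeError("unsupported type: %r" % type(value))

car = {"brand": "Volvo", "model": "XC60"}
serialized_car = php_serialize(car)
print(serialized_car)

# A value of a different length needs a different length prefix.
print(php_serialize("Saab"))
```

The length prefixes are the reason a serialized value cannot simply be edited as plain text: replacing Volvo with a shorter or longer value without also updating the number in front of it leaves a string that PHP can no longer read.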
In a database, JSON or serialized arrays can be combined with table rows and columns. Data that requires no flexibility (because it will always be present) is stored in dedicated columns. More fluid data is stored as JSON or as a serialized array. Let’s suppose that an email address and a name are always required in the form submission, while additional data is optional and can contain different fields depending on what the business lead wants to provide.
id | email | first_name | last_name | lead_data |
1 | john.lead@somedomain.xyz | John | Lead | { "car": [ { "brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA" } ] } |
… | … | … | … | … |
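The hybrid layout can be sketched as follows. The example again uses an in-memory SQLite database as a stand-in for the WordPress database, and the table name leads is an assumption made for this illustration. The fixed columns are read directly, while the flexible lead_data column has to be decoded before its fields can be reached.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE leads (id INTEGER PRIMARY KEY, email TEXT, "
    "first_name TEXT, last_name TEXT, lead_data TEXT)"
)

# The optional car data is stored as JSON text in a single column.
lead_data = {"car": [{"brand": "Volvo", "model": "XC60", "license_plate": "GST-25-JHA"}]}
conn.execute(
    "INSERT INTO leads VALUES (?, ?, ?, ?, ?)",
    (1, "john.lead@somedomain.xyz", "John", "Lead", json.dumps(lead_data)),
)

# Fixed columns are read directly; the flexible part must be decoded first.
email, raw = conn.execute(
    "SELECT email, lead_data FROM leads WHERE id = 1"
).fetchone()
decoded = json.loads(raw)
plate = decoded["car"][0]["license_plate"]
print(email, plate)
```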
Summary
WordPress and its plugins apply three methods to store data in a database. If you want to anonymize a WordPress database, you need a plugin that supports all three methods.
The plugin Database Anonymization can process:
- Data stored in dedicated tables.
- Data stored in metadata tables.
- Data stored in tables as JSON or serialized arrays.